[feat] Do not fetch unknown Remove.object #1052

Open
opened 2026-01-21 06:33:53 +00:00 by a · 5 comments
Contributor

The idea

When: receiving Remove activities (targeting the http://joinmastodon.org/ns#featured collection, or in general?)
Then: Do not fetch the Remove.object
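
A minimal sketch of the proposed behaviour, written in Python for illustration rather than in Akkoma's own code. The names `NO_FETCH_TYPES`, `resolve_object`, `known_objects` and `fetch_remote` are hypothetical and do not correspond to actual Akkoma functions; the sketch only shows the rule "skip the fetch when a Remove points at an object we do not already have".

```python
from typing import Callable, Optional

# Activity types whose object is not fetched when it is unknown locally.
# Delete is listed because the issue notes it already avoids fetching.
NO_FETCH_TYPES = {"Remove", "Delete"}


def resolve_object(
    activity: dict,
    known_objects: dict,
    fetch_remote: Callable[[str], dict],
) -> Optional[dict]:
    """Return the activity's object, fetching it only when the type allows it."""
    object_id = activity["object"]
    if object_id in known_objects:
        return known_objects[object_id]
    if activity["type"] in NO_FETCH_TYPES:
        # Unknown object on a Remove/Delete: there is nothing to unpin or
        # delete locally, so skip the network request entirely.
        return None
    fetched = fetch_remote(object_id)
    known_objects[object_id] = fetched
    return fetched
```

With this shape, a Remove for an unknown post returns `None` without calling `fetch_remote`, while other activity types keep fetching as before.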

The reasoning

In particular, the most common usage of Remove is for actions like "unpinned a post", where the post being unpinned is not socially relevant at the time of the Remove activity. A similar optimization occurs with Delete when the Delete.object is not known.

One problem that can be observed with an unknown object being fetched is that Akkoma will wrap it in a fake Create, and set the published datetime of that Create to the current datetime. The timeline function will then show this object at the top of the timeline, since it "arrived" just now. Unpinning a lot of posts can end up inadvertently spamming followers using Akkoma instances in the following corner case:

  • Old posts have been purged after a profile has been discovered;
  • ObjectAgePolicy is either disabled, or it is configured with a cutoff higher than the age of the discovered posts

(Somewhat relatedly, I would appreciate a feature to have "fake Creates" tagged somehow and excluded from timelines. #1053 covers that separate feature.)

Have you searched for this feature request?

  • I have double-checked and have not found this feature request mentioned anywhere.
  • This feature is related to the Akkoma backend specifically, and not pleroma-fe.
Owner

Generally reasonable. Currently Remove shares a high-level function and error handling with other activity types which actually want or need to fetch, so this will require splitting it up with the least amount of code duplication.

Some more tangential clarifications:

> an unknown object being fetched is that Akkoma will wrap it in a fake Create, and set the published datetime of that Create to the current datetime.

This was brought up before, but let me clarify again: this is not the case. Akkoma never mangles incoming published timestamps nor generates ones for synthesised Creates. It’s just that the database Flake ID (which contains time information) used for sorting is generated for the fetch time. The actual published time always shows and showed up correctly in API responses.
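
To illustrate the distinction, here is a small Python sketch. It is not Akkoma code: `make_id` and `timeline` are hypothetical helpers, and placing the receive time in the high 64 bits of the ID is an assumption made only for this example. It shows an old post that is fetched late sorting first by ID while its `published` value stays untouched.

```python
from datetime import datetime, timezone


def make_id(received_at: datetime, sequence: int = 0) -> int:
    """Hypothetical 128-bit ID with the receive time (ms) in the high 64 bits."""
    millis = int(received_at.timestamp() * 1000)
    return (millis << 64) | sequence


def timeline(rows):
    """Newest first by ID, i.e. by time of arrival, not by `published`."""
    return sorted(rows, key=lambda row: row["id"], reverse=True)


# An old post that was only fetched just now (e.g. triggered by a Remove).
old_post = {
    "id": make_id(datetime(2026, 1, 21, 6, 30, tzinfo=timezone.utc)),
    "published": datetime(2019, 5, 1, tzinfo=timezone.utc),
}
# A genuinely recent post that arrived half an hour earlier.
recent_post = {
    "id": make_id(datetime(2026, 1, 21, 6, 0, tzinfo=timezone.utc)),
    "published": datetime(2026, 1, 21, 6, 0, tzinfo=timezone.utc),
}

ordered = timeline([recent_post, old_post])
```

Here `ordered[0]` is `old_post`: it tops the timeline because of its ID, even though its `published` timestamp is still from 2019.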

> A similar optimization occurs with Delete when the Delete.object is not known.

Nit: for Deletes this is not an optimisation but avoiding fetches is strictly necessary since an already deleted object cannot (or at least should not be able to) be fetched.

> Old posts have been purged after a profile has been discovered;

Just for clarification: this requires --prune-pinned to be set and --keep-followed {post,all} to not be set when pruning. In all other configurations these posts will be preserved.

Author
Contributor

> Akkoma never mangles incoming published timestamps nor generates ones for synthesised Creates. It’s just that the database Flake ID (which contains time information) used for sorting is generated for the fetch time. The actual published time always shows and showed up correctly in API responses

and to clarify what i meant as well ^_^ -- it's not about the api responses, it's about the sorting. when you sort by flake id, you are effectively inferring the date received (or discovered) from the id, as opposed to using the date published on the object (or activity). "date received" is actually not a property of the activity, but rather a property of the graph or http request/resource.

> for Deletes this is not an optimisation but avoiding fetches is strictly necessary since an already deleted object cannot (or at least should not be able to) be fetched.

a Delete might be stale and an object might exist at that identifier, but i get what you mean.

Owner

> and to clarify what i meant as well ^_^ -- it's not about the api responses, it's about the sorting.

not sure what you mean; the sorting is part of the API response of timelines (but not part of the individual objects themselves)

Author
Contributor

> not sure what you mean

something like this:

  1. the thing is discovered
  2. it is assigned a flake id based on the timestamp at which it was discovered, not the as:published or anything else
  3. the sorting is done by flake ids lexicographically

correct so far?

what i am talking about is taking that flake id, extracting the timestamp it represents, and setting that timestamp as the value of a jsonld property on an item node instead of on the activity:

type: _:TimelineItem, _:CollectionItem  # whatever really
_:date: <now>  # this is where the flake id is assigned
_:content:
  id: <the activity>
  published: <3 seconds ago>  # this can be a real activity or fake activity, whatever
  type: Create
  object: <the thing discovered>  # if the create is fake

in other words, what you describe as "not part of the individual objects themselves" can in fact be part of some other object -- that is the information being encoded into the flake id but not the json.

this is more in the realm of #1053 though. for #1052 we can proceed based on your earlier comments:

> Generally reasonable. Currently Remove shares a high-level function and error handling with other activity types which actually want or need to fetch, so this will require splitting it up with the least amount of code duplication.

Owner
> 3. the sorting is done by flake ids lexicographically
>
> correct so far?

Not quite, but it doesn’t matter. FlakeIDs are 128-bit UUIDs and thus numeric(ally sorted) in code. They are only encoded (such that their lexicographical ordering is identical to the numeric ordering) for the API.
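
As a rough illustration of an encoding whose lexicographical order matches numeric order, here is a Python sketch. The alphabet, the fixed width of 22 characters and the name `encode` are assumptions chosen for the example and are not necessarily what Akkoma uses. The property holds because the alphabet is in ascending ASCII order and every string is zero-padded to the same length.

```python
import string

# Digits, then uppercase, then lowercase: ascending ASCII order, so comparing
# two characters agrees with comparing their digit values.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase
BASE = len(ALPHABET)  # 62
WIDTH = 22            # 62**22 > 2**128, so every 128-bit value fits


def encode(value: int) -> str:
    """Encode a 128-bit integer as a fixed-width base-62 string."""
    if not 0 <= value < 2**128:
        raise ValueError("value must fit in 128 bits")
    chars = []
    for _ in range(WIDTH):
        value, remainder = divmod(value, BASE)
        chars.append(ALPHABET[remainder])
    # Digits were produced least significant first, so reverse them.
    return "".join(reversed(chars))
```

Sorting a list of IDs by `encode(id)` as strings then gives the same order as sorting the integers directly.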

I’m still not sure what you mean, but if it’s not relevant for this issue then it doesn’t matter ig

No milestone
No project
No assignees
2 participants
Reference
AkkomaGang/akkoma#1052