TODO: unify and more resiliant collection handling #950

Open
opened 2025-07-21 17:53:47 +00:00 by Oneric · 0 comments
Owner

We already have a common Collections.Fetcher, but in the process of converting old code to it some Collection logic remained at the calling sites and the capabilities of the common fetcher not always fully leveraged.
There should be no special logic gated behind checks for "(Ordered)CollecctionPage", "(ordered)Items", "first" etc left outside of the common collection fetcher code. And no assumption of collection-like elements being just bare IDs — just pass them through the collection fetcher and/or get_ap_id before use.

This lead to bugs like the interop issue when bridgy introduced an inlined featured collection and might be responsible for the request fetch spams which we alleviated but had trouble reproducing and thus never fully fixed. Notably most reported instances of request spamming are related to collections (with some inline logic) (with the remaining being related to sync-user refreshes and too-strict validation).

Implementing the changes below should make things both more resilient as well as cleaner and easier to understand with everything actually using the same logic. It can be piecewise and shouldn’t be too hard; #949 can serve of an example of how such a change might look like:

  • move "private collection" logic into Collections.Fetcher
    And also interpret "403" and "404" responses as a “private” collection; currently only 200 responses with an empty item list are considered “private”.
    (This is currently used to decide whether to display follow(er|ing) counts and/or content; treating 403|404 as private collection instead of error also will reduce spurious error logs for other collection types)
  • use plain Collections.Fetcher for followers/following (requires the above)
  • use plain Collections.Fetcher for replies (fix_replies in ArticleNotePageValidator)
    This or the next item has a good chance of fixing the original issue reported in #831
  • fetch (if not inlined) and process replies collection async
  • keep inlined objects if passing auth check (instead of throwing away and refetching)
    (in particular replies, featured or outbox may inline objects or activities instead of only showing the AP id)
  • i only glanced at this, but it seems like we associating featured entries with a date, but just replace the date for all entries on each refresh from remotes. This seems kinda pointless and we could stop doing that (and purge it from db)
    I guess this is done for consistently with local pins, but do we actually need the date here or just an order? Arrays remain ordered anyway...
  • until a recent bugfix, we sometimes only used the first page or only inlined items of some kinds of collections.
    If we want to reintroduce this instead of having a fixed item count limit for all scenarios, we need to be able to pass a item count and/or page limit to the fetch function.
    Useful e.g. to limit featured items to the max amount of allowed pins (which is typically much smaller than the genreic item limit) etc
We already have a common `Collections.Fetcher`, but in the process of converting old code to it some Collection logic remained at the calling sites and the capabilities of the common fetcher not always fully leveraged. There should be no special logic gated behind checks for `"(Ordered)CollecctionPage"`, `"(ordered)Items"`, `"first"` etc left outside of the common collection fetcher code. And no assumption of collection-like elements being just bare IDs — just pass them through the collection fetcher and/or `get_ap_id` before use. This lead to bugs like the interop issue when bridgy introduced an inlined `featured` collection and might be responsible for the request fetch spams which we alleviated but had trouble reproducing and thus never fully fixed. Notably most reported instances of request spamming are related to collections (with some inline logic) *(with the remaining being related to sync-user refreshes and too-strict validation)*. Implementing the changes below should make things both more resilient as well as cleaner and easier to understand with everything actually using the same logic. It can be piecewise and shouldn’t be too hard; #949 can serve of an example of how such a change might look like: - [ ] move "private collection" logic into Collections.Fetcher And also interpret "403" and "404" responses as a “private” collection; currently only 200 responses with an empty item list are considered “private”. *(This is currently used to decide whether to display follow(er|ing) counts and/or content; treating 403|404 as private collection instead of error also will reduce spurious error logs for other collection types)* - [ ] use plain Collections.Fetcher for followers/following *(requires the above)* - [ ] use plain Collections.Fetcher for replies (`fix_replies` in `ArticleNotePageValidator`) This or the next item has a good chance of fixing the original issue reported in https://akkoma.dev/AkkomaGang/akkoma/issues/831 - [ ] fetch (if not inlined) and process replies collection async - [ ] keep inlined objects if passing auth check (instead of throwing away and refetching) *(in particular `replies`, `featured` or `outbox` may inline objects or activities instead of only showing the AP id)* - [ ] i only glanced at this, but it seems like we associating `featured` entries with a date, but just replace the date for all entries on each refresh from remotes. This seems kinda pointless and we could stop doing that (and purge it from db) I guess this is done for consistently with local pins, but do we actually need the date here or just an order? Arrays remain ordered anyway... - [ ] until a recent bugfix, we sometimes only used the first page or only inlined items of some kinds of collections. If we want to reintroduce this instead of having a fixed item count limit for all scenarios, we need to be able to pass a item count and/or page limit to the fetch function. Useful e.g. to limit `featured` items to the max amount of allowed pins *(which is typically much smaller than the genreic item limit)* etc
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
AkkomaGang/akkoma#950
No description provided.