provide full replies collection in ActivityPub objects #904

Merged
Oneric merged 9 commits from Oneric/akkoma:ap_replies_collection into develop 2025-06-07 19:33:40 +00:00
Owner

Until now only a limited number of self-replies were inlined as an anonymous, unordered ActivityPub collection. Notably the advertised replies might be private posts. (For reference; this restrricted inline version of the collection was added in early 2020 via pleroma#2129)

However, providing all (non-private) replies allows for better thread consistency across instances if the remote server cooperates.
The collection existing as a standalone object has two advantages for this. For one, if it was still anonymous, all replies would need to be inlined, which might be too bloated in pathological cases.
Secondly, it allows remote servers to update the thread by traversing the reply collection independent of the original post. (If the remote part knows about chronological ordering, it can in theory even efficiently resume from where it previously stopped)

NOTE:

  • everything up to and including commit tmp is just cherry -picked from other pending PRs so i don’t have to do annoying migration reverts/applications and dep refetches and recompiles over and over again; manging so many branches is already annoying enough as is. They should be ignored when looking at this

TODO:

  • figure out why the uuid path pramter ends up as a query paramter in page URIs which supposedly (untested) doesn't happen for in/outboxes.
    Update: this also affected in/outboxes afterall, since they too are using string paramters. Avoided now by never including path params in the first place
  • test this in prod for a while for potential unforeseen consequences.
    Update: seems all good
  • merge #874 and rebase (dropping temporary commits cherry-picked from there)

Tangential remark

  • something recent'ish apparently broke outbox fetches with an OAuth token; it always just returns the top-level outbox collection never a page with the actual items. This is because no user is assigned to the conn. Fetches with a HTTP signatures appear to still work.
    I suspect this fell victim to the hotfixes in the 3.15.x series; specifically d62808e4b though idk what it was supposed to fix given there won't be any user assign if neither an oauth header nor a HTTP signature is present
Until now only a limited number of self-replies were inlined as an anonymous, unordered ActivityPub collection. Notably the advertised replies might be private posts. (For reference; this restrricted inline version of the collection was added in early 2020 via [pleroma#2129](https://git.pleroma.social/pleroma/pleroma/-/merge_requests/2129)) However, providing all (non-private) replies allows for better thread consistency across instances if the remote server cooperates. The collection existing as a standalone object has two advantages for this. For one, if it was still anonymous, _all_ replies would need to be inlined, which might be too bloated in pathological cases. Secondly, it allows remote servers to update the thread by traversing the reply collection independent of the original post. *(If the remote part knows about chronological ordering, it can in theory even efficiently resume from where it previously stopped)* **NOTE**: - everything up to and including commit `tmp` is just cherry -picked from other pending PRs so i don’t have to do annoying migration reverts/applications and dep refetches and recompiles over and over again; manging so many branches is already annoying enough as is. They should be ignored when looking at this **TODO**: - [x] figure out why the `uuid` path pramter ends up as a query paramter in page URIs ~~which supposedly (untested) doesn't happen for in/outboxes~~. *Update:* this also affected in/outboxes afterall, since they too are using string paramters. Avoided now by never including path params in the first place - [x] test this in prod for a while for potential unforeseen consequences. *Update:* seems all good - [x] merge #874 and rebase *(dropping temporary commits cherry-picked from there)* **Tangential remark** - something recent'ish apparently broke outbox fetches with an OAuth token; it always just returns the top-level outbox collection never a page with the actual items. This is because no user is assigned to the conn. Fetches with a HTTP signatures appear to still work. I suspect this fell victim to the hotfixes in the 3.15.x series; specifically [d62808e4b](https://akkoma.dev/AkkomaGang/akkoma/commit/d62808e4b64eca2b12379f2a4ba2e9a29828626a) though idk what it was supposed to fix given there won't be any `user` assign if neither an oauth header nor a HTTP signature is present
Oneric added 20 commits 2025-04-20 19:10:04 +00:00
It is not needed since fetch_public_keys will already
initiate remote lookup if necessary
This property was introduced as a way to gauge whether and
how much enabling authfetch might break passive federation in
#312.

However, with the db field defaulting to false, there’s no distinction
between instances without valid signatures and those which just never
attempted to fetch anything from the local instance.
Furthermore, this was never exposed anywhere and required manually
checking the database or cachex state via a remote shell.

Given the above it appears this doesn't actually
provide anything useful, thus drop it.
This avoids spurious key refetches on each failing alias
And adjust log details
Activity db queries are not cached
and most request will not actually need these aliases
Most headers are automatically checked by the library after this
upgrade. But since digest is only required for requests with a body
and body processing is handled outside the lib atm, we need to
explicity pass the presence or absence along or not get feedback
about creating broken signatures.

This makes bugs in our signatures more apparent
allowing faster discovery and fixing
This requirement was originally added together with splicing the
inbox owner into the non b* addressing fields to make bcc transports
work in https://git.pleroma.social/pleroma/pleroma/-/merge_requests/390.
Later on this was relaxed to always allow deliveries devoid of any
addressing at all in f6cb963df2
and always allow deliveries from actors the owner is following in
750b369d04 to fix interop issues with
Mastodon and Honk respectively.

The justification for both the filtering and splicing comes from
one sentence in AP spec’s inbox section:
> In general, the owner of an inbox is likely
> to be able to access all of their inbox contents.

While this may provide plausible justification for splicing the owner
into cc, it is less clear how this requires or justifies the set of
filtering rules employed here.
Surveying a few other implementations no similar
filtering or splicing appears to be employed.

Furthermore, spec-compliant servers will strip bto/bcc _before_
delivery to remote servers, meaning any compliant bcc transport
out there will NOT contain any explicit addressing of the inbox owner.
Thus the addressing requirement directly opposes
the goal of the original patch.

Currently the requirement for the owner to be addressed once again
is causing interop issues. It turns out to be the root cause of
a long-standing (2+ years) bug preventing meaningful federation.
Bridgy sends e.g. Follow activities and Accepts for Follows directly
to the affected user’s personal inbox while solely addressing
the public scope in the to field. Notably follow relations never
getting established prevented the "accept if followed" allow rule
to ever come into effect.

To make matters worse non-addressed messages simply lead to a
vague "internal server error" response being sent back
which likely slowed down locating the issue.
Furthermore additional issues wrt to signatures cropped up after
the 500-response issues wa first reported, but they seem to have
already been fixed in the meantime, possibly with the signature
handling overhaul in Akkoma.

Given it repeatedly caused issues, does not appear to align with common
practice in the wider fedi ecosystem and apparently contradicts its
original intention, simply remove the requirement.

This is confirmed to fix bridgy interop.

The addressing splicing actually should also add the inbox owner to bto
or bcc instead of cc, but for now this is not changed and in practice
bto/bcc delivery appears to be basically unused anyway.
The initial info message listing all found packs ought to be sufficient
and with many packs installed thiscan create multiple pages of log
messages on each emoji reload or server start.
Any errors or non-indexed packs are still logged to higher levels.
And off-by-one error caused the last page
to always advertise one more page eventhough
the absence of further items is known
Up until now queries were always forced into descending ID order
(reverse chronological order with our ID schemes).
Now it’s possible to request the reverse by passing `oder_asc: true`.
Ecto.cast is will convert valid string keys to atoms, but can
only deal with inputs which use either string keys everywhere
or atom keys everywhere.
Since :id_type is used before the case it must be an atom,
thus it was impossible to use it with string paramteres before.
And use more precise name.
The enxt commit will add another user of the new function.
federation/out: add full replies collection to objects
Some checks are pending
ci/woodpecker/pr/build-amd64 Pipeline is pending approval
ci/woodpecker/pr/build-arm64 Pipeline is pending approval
ci/woodpecker/pr/docs Pipeline is pending approval
ci/woodpecker/pr/lint Pipeline is pending approval
ci/woodpecker/pr/test/1 Pipeline is pending approval
ci/woodpecker/pr/test/2 Pipeline is pending approval
177d3bf935
Until now only a limited number of self-replies were inlined as an
anonymous, unordered ActivityPub collection. Notably the advertised
replies might be private posts.

However, providing all (non-private) replies allows for better thread
consistency across instances if the remote server cooperates.
The collection existing as a stndalone object has two advantages
for this. For one, if it was still anonymous, _all_ replies would need
to be inlined, which might be too bloated in pathological cases.
Secondly, it allows remote servers to update the thread by traversing
the reply collection independent of the original post. (If the remote
part knows about chronological ordering, it can in theory
even efficiently resume from where it previously stopped)
Oneric force-pushed ap_replies_collection from 177d3bf935 to 82fc8c8696 2025-04-20 19:57:46 +00:00 Compare
Oneric force-pushed ap_replies_collection from 82fc8c8696 to a56f4186c3 2025-04-20 20:09:29 +00:00 Compare
Oneric force-pushed ap_replies_collection from a56f4186c3 to ba11f499f7 2025-04-20 23:43:45 +00:00 Compare
Oneric force-pushed ap_replies_collection from ba11f499f7 to ceb05f6bbd 2025-04-21 01:17:59 +00:00 Compare
Oneric changed title from WIP: provide full replies collection in ActivityPub objects to provide full replies collection in ActivityPub objects 2025-04-26 21:49:29 +00:00
Oneric force-pushed ap_replies_collection from ceb05f6bbd to b41a13df56 2025-06-07 19:03:04 +00:00 Compare
Oneric merged commit 401883b230 into develop 2025-06-07 19:33:40 +00:00
Oneric deleted branch ap_replies_collection 2025-06-07 19:33:41 +00:00
Sign in to join this conversation.
No description provided.