Since we now remember the final location redirects lead to
and use it for all further checks since
3e134b07fa, these redirects
can no longer be exploited to serve counterfeit objects.
This fixes:
- display URLs from independent webapp clients
redirecting to the canonical domain
- Peertube display URLs for remote content
(acting like the above)
As hinted at in the commit message when strict checking
was added in 8684964c5d,
refetching is more robust than display URL comparison
but in exchange is harder to implement correctly.
A similar refetch approach is also employed by
e.g. Mastodon, IceShrimp and FireFish.
To make sure no checks can be bypassed by forcing
a refetch, id checking is placed at the very end.
This will fix:
- Peertube display URL arrays our transmogrifier fails to normalise
- non-canonical display URLs from alternative frontends
(theoretical; we didnt’t get any actual reports about this)
It will also be helpful in the planned key handling overhaul.
The modified user collision test was introduced in
https://git.pleroma.social/pleroma/pleroma/-/merge_requests/461
and unfortunately the issues this fixes aren’t public.
Afaict it was just meant to guard against someone serving
faked data belonging to an unrelated domain. Since we now
refetch and the id actually is mocked, lookup now succeeds
but will use the real data from the authorative server
making it unproblematic. Instead modify the fake data further
and make sure we don’t end up using the spoofed version.
These tests relied on the removed Fetcher.fetch_object_from_id!/2 function injecting the error tuple into a log message with the exact words "Object containment failed."
We will keep this behavior by generating a similar log message, but perhaps this should do a better job of matching on the error tuple returned by Transmogrifier.handle_incoming/1
This protects us from falling for obvious spoofs as from the current
upload exploit (unfortunately we can’t reasonably do anything about
spoofs with exact matches as was possible via emoji and proxy).
Such objects being invalid is supported by the spec, sepcifically
sections 3.1 and 3.2: https://www.w3.org/TR/activitypub/#obj-id
Anonymous objects are not relevant here (they can only exists within
parent objects iiuc) and neither is client-to-server or transient objects
(as those cannot be fetched in the first place).
This leaves us with the requirement for `id` to (a) exist and
(b) be a publicly dereferencable URI from the originating server.
This alone does not yet demand strict equivalence, but the spec then
further explains objects ought to be fetchable _via their ID_.
Meaning an object not retrievable via its ID, is invalid.
This reading is supported by the fact, e.g. GoToSocial (recently) and
Mastodon (for 6+ years) do already implement such strict ID checks,
additionally proving this doesn’t cause federation issues in practice.
However, apart from canonical IDs there can also be additional display
URLs. *omas first redirect those to their canonical location, but *keys
and Mastodon directly serve the AP representation without redirects.
Mastodon and GTS deal with this in two different ways,
but both constitute an effective countermeasure:
- Mastodon:
Unless it already is a known AP id, two fetches occur.
The first fetch just reads the `id` property and then refetches from
the id. The last fetch requires the returned id to exactly match the
URL the content was fetched from. (This can be optimised by skipping
the second fetch if it already matches)
05eda8d193/app/helpers/jsonld_helper.rb (L168)63f0979799
- GTS:
Only does a single fetch and then checks if _either_ the id
_or_ url property (which can be an object) match the original fetch
URL. This relies on implementations always including their display URL
as "url" if differing from the id. For actors this is true for all
investigated implementations, for posts only Mastodon includes an
"url", but it is also the only one with a differing display URL.
2bafd7daf5 (diff-943bbb02c8ac74ac5dc5d20807e561dcdfaebdc3b62b10730f643a20ac23c24fR222)
Albeit Mastodon’s refetch offers higher compatibility with theoretical
implmentations using either multiple different display URL or not
denoting any of them as "url" at all, for now we chose to adopt a
GTS-like refetch-free approach to avoid additional implementation
concerns wrt to whether redirects should be allowed when fetching a
canonical AP id and potential for accidentally loosening some checks
(e.g. cross-domain refetches) for one of the fetches.
This may be reconsidered in the future.
Since we reject cross-domain redirects, this doesn’t yet
make a difference, but it’s requried for stricter checking
subsequent commits will introduce.
To make sure (and in case we ever decide to reallow
cross-domain redirects) also use the final location
for containment and reachability checks.
If it’s not already in the database,
it must be counterfeit (or just not exists at all)
Changed test URLs were only ever used from "local: false" users anyway.
This brings it in line with its name and closes an,
in practice harmless, verification hole.
This was/is the only user of contain_origin making it
safe to change the behaviour on actor-less objects.
Until now refetched objects did not ensure the new actor matches the
domain of the object. We refetch polls occasionally to retrieve
up-to-date vote counts. A malicious AP server could have switched out
the poll after initial posting with a completely different post
attribute to an actor from another server.
While we indeed fell for this spoof before the commit,
it fortunately seems to have had no ill effect in practice,
since the asociated Create activity is not changed. When exposing the
actor via our REST API, we read this info from the activity not the
object.
This at first thought still keeps one avenue for exploit open though:
the updated actor can be from our own domain and a third server be
instructed to fetch the object from us. However this is foiled by an
id mismatch. By necessity of being fetchable and our longstanding
same-domain check, the id must still be from the attacker’s server.
Even the most barebone authenticity check is able to sus this out.
Such redirects on AP queries seem most likely to be a spoofing attempt.
If the object is legit, the id should match the final domain anyway and
users can directly use the canonical URL.
The lack of such a check (and use of the initially queried domain’s
authority instead of the final domain) was enabling the current exploit
to even affect instances which already migrated away from a same-domain
upload/proxy setup in the past, but retained a redirect to not break old
attachments.
(In theory this redirect could, with some effort, have been limited to
only old files, but common guides employed a catch-all redirect, which
allows even future uploads to be reachable via an initial query to the
main domain)
Same-domain redirects are valid and also used by ourselves,
e.g. for redirecting /notice/XXX to /objects/YYY.
Only real change here is making MRF rejects log as debug instead of info (#234)
I don't know if it's the best way to do it, but it seems it's just MRF using this and almost always this is intended.
The rest are just minor docs changes and syncing the restricted nicknames stuff.
I compiled and ran my changes with Docker and they all work.
Co-authored-by: r3g_5z <june@terezi.dev>
Reviewed-on: #313
Co-authored-by: @r3g_5z@plem.sapphic.site <june@girlboss.ceo>
Co-committed-by: @r3g_5z@plem.sapphic.site <june@girlboss.ceo>
Current FedSocket implementation has a bunch of problems. It doesn't
have proper error handling (in case of an error the server just doesn't
respond until the connection is closed, while the client doesn't match
any error messages and just assumes there has been an error after 15s)
and the code is full of bad descisions (see: fetch registry which uses
uuids for no reason and waits for a response by recursively querying a
ets table until the value changes, or double JSON encoding).
Sometime ago I almost completed rewriting fedsockets from scrach to
adress these issues. However, while doing so, I realized that fedsockets
are just too overkill for what they were trying to accomplish, which is
reduce the overhead of federation by not signing every message.
This could be done without reimplementing failure states and endpoint
logic we already have with HTTP by, for example, using TLS cert auth,
or switching to a more performant signature algorithm. I opened
https://git.pleroma.social/pleroma/pleroma/-/issues/2262 for further
discussion on alternatives to fedsockets.
From discussions I had with other Pleroma developers it seems like they
would approve the descision to remove them as well,
therefore I am submitting this patch.
Validate the content-type of the response when fetching an object,
according to https://www.w3.org/TR/activitypub/#x3-2-retrieving-objects.
content-type headers had to be added to many mocks in order to support
this, some of this was done with a regex. While I did go over the
resulting files to check I didn't modify anything unrelated, there is a
possibility I missed something.
Closes pleroma#1948