Commit graph

9335 commits

Author SHA1 Message Date
timorl
cd7af81896
Rename StripLocation to StripMetadata for temporal-proofing reasons 2024-04-16 20:37:00 +02:00
ddb8a5ef73 yeet AP C2S support
literally nothing uses C2S AP, and it's another route into core
systems which requires analysis and maintenance. A second API
is just extra surface for potentially bad things so let's take
it out back and obliterate it
2024-04-16 13:55:03 +01:00
123db1abc4 Merge branch 'develop' into failed-fetch-processing 2024-04-16 12:35:54 +01:00
b2c29527fb make xmerl shut up about markup 2024-04-16 10:19:30 +01:00
timorl
59d32c10d9
Formatting 2024-04-16 08:02:13 +02:00
d2cee15c15 mix format says no 2024-04-16 03:07:28 +01:00
d70fa16383 oban options should be a keyword list 2024-04-16 02:58:50 +01:00
5043571084 Enable oban job uniqueness
by default just prevent job floods with a 1-seconds
uniqueness check, but override in RemoteFetcherWorker
for 5 minute uniqueness check over all states

:infinity is an option we can go for maybe at some point,
but that would prevent any refetches so maybe not idk.
2024-04-16 02:53:24 +01:00
b7dd739de1 Make sure we return the right format for oban 2024-04-16 02:35:21 +01:00
timorl
b144218dce
Merge branch 'develop' into elseinspe 2024-04-14 20:31:33 +02:00
2fc25980d1 fix pattern matching in fetch errors 2024-04-13 23:55:26 +01:00
c1f0b6b875 Merge pull request 'Accept body parameters for /api/pleroma/notification_settings' (#738) from Oneric/akkoma:notif-setting-parameters into develop
Reviewed-on: AkkomaGang/akkoma#738
2024-04-13 22:55:02 +00:00
33fb74043d Bring our adjustments into line with atom-failure 2024-04-13 22:56:04 +01:00
49ed27cd96 require logger 2024-04-13 22:25:31 +01:00
2e369aef71 Allow the Remote Fetcher to attempt fetching an unreachable instance 2024-04-12 20:33:21 +01:00
fed7a78c77 Oban jobs should be discarded on permanent errors 2024-04-12 20:33:17 +01:00
c0532bcae0 Handle 401s as I have observed it in the wild 2024-04-12 20:33:11 +01:00
ff515c05c3 Prevent requeuing Remote Fetcher jobs that exceed thread depth 2024-04-12 20:32:31 +01:00
7e5004b3e2 Leverage existing atoms as return errors for the object fetcher 2024-04-12 20:32:13 +01:00
53a9413b95 Formatting 2024-04-12 20:31:40 +01:00
d69cba1b93 Remove duplicate log messages from Transmogrifier
Object fetch errors are logged in the fetcher module
2024-04-12 20:31:31 +01:00
3c54f407c5 Conslidate log messages for object fetcher failures and leverage Logger.metadata 2024-04-12 20:30:38 +01:00
825ae46bfa Set Logger level to error 2024-04-12 20:29:33 +01:00
eeed051a0f Fix detection of user follower collection being private
We were overzealous with matching on a raw error from the object fetch that should have never been relied on like this. If we can't fetch successfully we should assume that the collection is private.

Building a more expressive and universal error struct to match on may be something to consider.
2024-04-12 20:29:11 +01:00
30d63aaa6e Revert "Mark instances as unreachable when returning a 403 from an object fetch"
This reverts commit d472bafec19cee269e7c943bafae7c805785acd7.
2024-04-12 20:28:56 +01:00
e2b04fac5a Skip remote fetch jobs for unreachable instances 2024-04-12 20:28:36 +01:00
6d368808d3 Remove mistaken duplicate fetch 2024-04-12 20:28:31 +01:00
132036f951 Cancel remote fetch jobs for deleted objects 2024-04-12 20:28:21 +01:00
4ff22a409a Consolidate the HTTP status code checking into the private get_object/1 2024-04-12 20:28:16 +01:00
4c29366fe5 Mark instances as unreachable when returning a 403 from an object fetch
This is a definite sign the instance is blocked and they are enforcing authorized_fetch
2024-04-12 20:27:33 +01:00
ac4cc619ea Fix Transmogrifier tests
These tests relied on the removed Fetcher.fetch_object_from_id!/2 function injecting the error tuple into a log message with the exact words "Object containment failed."

We will keep this behavior by generating a similar log message, but perhaps this should do a better job of matching on the error tuple returned by Transmogrifier.handle_incoming/1
2024-04-12 20:26:56 +01:00
c241b5b09f Remove Fetcher.fetch_object_from_id!/2
It was only being called once and can be replaced with a case statement.
2024-04-12 20:26:28 +01:00
6f3c955aa0 Merge pull request 'elixir1.16 testing' (#742) from elixir1.16 into develop
Reviewed-on: AkkomaGang/akkoma#742
2024-04-12 18:49:33 +00:00
024ffadd80 Merge pull request 'Don't list old accounts as aliases in WebFinger' (#713) from erincandescent/akkoma:no-old-account-alias into develop
Reviewed-on: AkkomaGang/akkoma#713
2024-04-12 18:34:14 +00:00
e2e4f53585 Merge pull request 'Use standard-compliant Accept header when fetching' (#740) from Oneric/akkoma:fetch_std-accept-hdr into develop
Reviewed-on: AkkomaGang/akkoma#740
2024-04-12 18:28:26 +00:00
df25d86999 Cleaned up FEP-fffd commits a bit 2024-04-12 18:50:57 +01:00
4887df12d7 Merge pull request 'Allow for url to be a list' (#718) from helge/akkoma:develop into develop
Reviewed-on: AkkomaGang/akkoma#718
2024-04-12 17:39:38 +00:00
e6ca2b4d2a Merge pull request 'Fix array-less EmojiReacts' (#739) from Oneric/akkoma:tag-arrayless into develop
Reviewed-on: AkkomaGang/akkoma#739
2024-04-12 17:26:07 +00:00
6ba80aaff5 Merge pull request 'Check if data is visible before embedding it in OG tags' (#741) from ograph-restrictions into develop
Reviewed-on: AkkomaGang/akkoma#741
2024-04-12 17:22:59 +00:00
8e60177466 Merge pull request 'MRF.InlineQuotePolicy: Add link to post URL, not ID' (#733) from erincandescent/akkoma:quote-url into develop
Reviewed-on: AkkomaGang/akkoma#733
2024-04-12 17:02:52 +00:00
75d9e2b375 MRF.InlineQuotePolicy: Add link to post URL, not ID
"id" is used for the canonical link to the AS2 representation of an object.
"url" is typically used for the canonical link to the HTTP representation.
It is what we use, for example, when following the "external source" link
in the frontend. However, it's not the link we include in the post contents
for quote posts.

Using URL instead means we include a more user-friendly URL for Mastodon,
and a working (in the browser) URL for Threads
2024-04-12 13:23:50 +02:00
05f8179d08 check if data is visible before embedding it in OG tags
previously we would uncritically take data and format it into
tags for static-fe and the like - however, instances can be
configured to disallow unauthenticated access to these resources.

this means that OG tags as a vector for information leakage.

_technically_ this should only occur if you have both
restrict_unauthenticated *AND* you run static-fe, which makes no
sense since static-fe is for unauthenticated people in particular,
but hey ho.
2024-04-12 05:16:47 +01:00
fae0a14ee8 Use standard-compliant Accept header when fetching
Spec says clients MUST use this header and servers MUST respond to it,
while servers merely SHOULD respond to the one we used before.
https://www.w3.org/TR/activitypub/#retrieving-objects

The old value is kept as a fallback since at least two years ago
not every implementation correctly dealt with the spec-compliant
variant, see: https://github.com/owncast/owncast/issues/1827

Fixes: AkkomaGang/akkoma#730
2024-04-12 00:22:37 +02:00
1135935cbe Merge remote-tracking branch 'oneric/ipv6' into develop 2024-04-11 20:59:49 +01:00
bd74ad9ce4 Accept body parameters for /api/pleroma/notification_settings
This brings it in line with its documentation and akkoma-fe’s
expectations. For backwards compatibility URL parameters are still
accept with lower priority. Unfortunately this means duplicating
parameters and descriptions in the API spec.

Usually Plug already pre-merges parameters from different sources into
the plain 'params' parameter which then gets forwarded by Phoenix.
However, OpenApiSpex 3.x prevents this; 4.x is set to change this
  https://github.com/open-api-spex/open_api_spex/issues/334
  https://github.com/open-api-spex/open_api_spex/issues/92

Fixes: AkkomaGang/akkoma#691
Fixes: AkkomaGang/akkoma#722
2024-04-09 04:11:28 +02:00
462225880a Accept EmojiReacts with non-array tag
JSON-LD compaction strips the array since it’s just one object

Fixes: AkkomaGang/akkoma#720
2024-04-09 04:04:16 +02:00
9598137d32 Drop base_url special casing in test env
61621ebdbc already explicitly added
the uploader base url to config/test.exs and it reduces differences
from prod.
2024-04-07 00:20:12 +02:00
554f19a9ed Merge pull request 'Refresh Users much more aggressively when processing Move activities' (#714) from erincandescent/akkoma:move-bust-cache into develop
Reviewed-on: AkkomaGang/akkoma#714
2024-04-03 10:03:14 +00:00
b5d97e7d85 Don't error out if we're not using the local uploader 2024-04-02 11:36:26 +01:00
f592090206 Fix tests that relied on no base_url in the uploader 2024-04-02 11:23:57 +01:00
61621ebdbc Add tests for extra warnings about media subdomains 2024-04-02 10:54:53 +01:00
4cd299bd83 Add extra warnings if the uploader is on the same domain as the main application 2024-04-02 10:20:59 +01:00
464db9ea0b Don't list old accounts as aliases in WebFinger
Per the XRD specification:

> 2.4. Element <Alias>
>
> The <Alias> element contains a URI value that is an additional
> identifier for the resource described by the XRD. This value
> MUST be an absolute URI. The <Alias> element does not identify
> additional resources the XRD is describing, **but rather provides
> additional identifiers for the same resource.**

(http://docs.oasis-open.org/xri/xrd/v1.0/os/xrd-1.0-os.html#element.alias, emphasis mine)

In other words, the alias list is expected to link to things which are
not just semantically the same, but exactly the same. Old user accounts
don't do that

This change should not pose a compatibility issue: Mastodon does not
list old accounts here (See e1fcb02867/app/serializers/webfinger_serializer.rb (L12))

The use of as:alsoKnownAs is also not quite semantically right here
(see https://www.w3.org/TR/did-core/#dfn-alsoknownas, which defines
it to be used to refer to identifiers which are interchangable) but
that's what DID get for reusing a property definition that Mastodon
already squatted long before they got to it
2024-04-01 13:34:58 +02:00
2d439034ca Ensure that spoof-inserted does not time out 2024-03-30 12:55:22 +00:00
0648d9ebaa Add mix tasks to detect spoofed posts and users
At least as far as we can
2024-03-26 16:05:20 -01:00
d441101200 Add mix task to detect uploaded spoof payloads 2024-03-26 16:05:20 -01:00
61ec592d66 Drop obsolete pixelfed workaround
This pixelfed issue was fixed in 2022-12 in
https://github.com/pixelfed/pixelfed/pull/3932

Co-authored-by: FloatingGhost <hannah@coffee-and-dreams.uk>
2024-03-26 15:11:06 -01:00
8684964c5d Only allow exact id matches
This protects us from falling for obvious spoofs as from the current
upload exploit (unfortunately we can’t reasonably do anything about
spoofs with exact matches as was possible via emoji and proxy).

Such objects being invalid is supported by the spec, sepcifically
sections 3.1 and 3.2: https://www.w3.org/TR/activitypub/#obj-id

Anonymous objects are not relevant here (they can only exists within
parent objects iiuc) and neither is client-to-server or transient objects
(as those cannot be fetched in the first place).
This leaves us with the requirement for `id` to (a) exist and
(b) be a publicly dereferencable URI from the originating server.
This alone does not yet demand strict equivalence, but the spec then
further explains objects ought to be fetchable _via their ID_.
Meaning an object not retrievable via its ID, is invalid.

This reading is supported by the fact, e.g. GoToSocial (recently) and
Mastodon (for 6+ years) do already implement such strict ID checks,
additionally proving this doesn’t cause federation issues in practice.

However, apart from canonical IDs there can also be additional display
URLs. *omas first redirect those to their canonical location, but *keys
and Mastodon directly serve the AP representation without redirects.

Mastodon and GTS deal with this in two different ways,
but both constitute an effective countermeasure:
 - Mastodon:
   Unless it already is a known AP id, two fetches occur.
   The first fetch just reads the `id` property and then refetches from
   the id. The last fetch requires the returned id to exactly match the
   URL the content was fetched from. (This can be optimised by skipping
   the second fetch if it already matches)
   05eda8d193/app/helpers/jsonld_helper.rb (L168)
   63f0979799

 - GTS:
   Only does a single fetch and then checks if _either_ the id
   _or_ url property (which can be an object) match the original fetch
   URL. This relies on implementations always including their display URL
   as "url" if differing from the id. For actors this is true for all
   investigated implementations, for posts only Mastodon includes an
   "url", but it is also the only one with a differing display URL.
   2bafd7daf5 (diff-943bbb02c8ac74ac5dc5d20807e561dcdfaebdc3b62b10730f643a20ac23c24fR222)

Albeit Mastodon’s refetch offers higher compatibility with theoretical
implmentations using either multiple different display URL or not
denoting any of them as "url" at all, for now we chose to adopt a
GTS-like refetch-free approach to avoid additional implementation
concerns wrt to whether redirects should be allowed when fetching a
canonical AP id and potential for accidentally loosening some checks
(e.g. cross-domain refetches) for one of the fetches.
This may be reconsidered in the future.
2024-03-25 14:05:05 -01:00
48b3a35793 Update user reference after fetch
Since we always followed redirects (and until recently allowed fuzzy id
matches), the ap_id of the received object might differ from the iniital
fetch url. This lead to us mistakenly trying to insert a new user with
the same nickname, ap_id, etc as an existing user (which will fail due
to uniqueness constraints) instead of updating the existing one.
2024-03-25 14:05:05 -01:00
9061d148be Ensure object id doesn’t change on refetch 2024-03-25 14:05:05 -01:00
3e134b07fa fetcher: return final URL after redirects from get_object
Since we reject cross-domain redirects, this doesn’t yet
make a difference, but it’s requried for stricter checking
subsequent commits will introduce.

To make sure (and in case we ever decide to reallow
cross-domain redirects) also use the final location
for containment and reachability checks.
2024-03-25 14:05:05 -01:00
f07eb4cb55 Sanity check fetched user data
In order to properly process incoming notes we need
to be able to map the key id back to an actor.
Also, check collections actually belong to the same server.

Key ids of Hubzilla and Bridgy samples were updated to what
modern versions of those output. If anything still uses the
old format, we would not be able to verify their posts anyway.
2024-03-25 14:05:05 -01:00
59a142e0b0 Never fetch resource from ourselves
If it’s not already in the database,
it must be counterfeit (or just not exists at all)

Changed test URLs were only ever used from "local: false" users anyway.
2024-03-25 14:05:05 -01:00
fee57eb376 Move actor check into fetch_and_contain_remote_object_from_id
This brings it in line with its name and closes an,
in practice harmless, verification hole.

This was/is the only user of contain_origin making it
safe to change the behaviour on actor-less objects.

Until now refetched objects did not ensure the new actor matches the
domain of the object. We refetch polls occasionally to retrieve
up-to-date vote counts. A malicious AP server could have switched out
the poll after initial posting with a completely different post
attribute to an actor from another server.
While we indeed fell for this spoof before the commit,
it fortunately seems to have had no ill effect in practice,
since the asociated Create activity is not changed. When exposing the
actor via our REST API, we read this info from the activity not the
object.

This at first thought still keeps one avenue for exploit open though:
the updated actor can be from our own domain and a third server be
instructed to fetch the object from us. However this is foiled by an
id mismatch. By necessity of being fetchable and our longstanding
same-domain check, the id must still be from the attacker’s server.
Even the most barebone authenticity check is able to sus this out.
2024-03-25 14:05:05 -01:00
c4cf4d7f0b Reject cross-domain redirects when fetching AP objects
Such redirects on AP queries seem most likely to be a spoofing attempt.
If the object is legit, the id should match the final domain anyway and
users can directly use the canonical URL.

The lack of such a check (and use of the initially queried domain’s
authority instead of the final domain) was enabling the current exploit
to even affect instances which already migrated away from a same-domain
upload/proxy setup in the past, but retained a redirect to not break old
attachments.

(In theory this redirect could, with some effort, have been limited to
 only old files, but common guides employed a catch-all redirect, which
 allows even future uploads to be reachable via an initial query to the
 main domain)

Same-domain redirects are valid and also used by ourselves,
e.g. for redirecting /notice/XXX to /objects/YYY.
2024-03-25 14:05:05 -01:00
2bcf633dc2 Document Pleroma.Object.Fetcher 2024-03-25 14:05:05 -01:00
c806adbfdb Refactor Fetcher.get_object for readability
Apart from slightly different error reasons wrt content-type,
this does not change functionality in any way.
2024-03-18 22:40:43 -01:00
ddd79ff22d Proactively harden emoji pack against path traversal
No new path traversal attacks are known. But given the many entrypoints
and code flow complexity inside pack.ex, it unfortunately seems
possible a future refactor or addition might reintroduce one.
Furthermore, some old packs might still contain traversing path entries
which could trigger undesireable actions on rename or delete.

To ensure this can never happen, assert safety during path construction.

Path.safe_relative was introduced in Elixir 1.14, but
fortunately, we already require at least 1.14 anyway.
2024-03-18 22:33:10 -01:00
d6d838cbe8 StealEmoji: check remote size before downloading
To save on bandwith and avoid OOMs with large files.
Ofc, this relies on the remote server
 (a) sending a content-length header and
 (b) being honest about the size.

Common fedi servers seem to provide the header and (b) at least raises
the required privilege of an malicious actor to a server infrastructure
admin of an explicitly allowed host.

A more complete defense which still works when faced with
a malicious server requires changes in upstream Finch;
see https://github.com/sneako/finch/issues/224
2024-03-18 22:33:10 -01:00
a4fa2ec9af StealEmoji: make final paths infeasible to predict
Certain attacks rely on predictable paths for their payloads.
If we weren’t so overly lax in our (id, URL) check, the current
counterfeit activity exploit would be one of those.
It seems plausible for future attacks to hinge on
or being made easier by predictable paths too.

In general, letting remote actors place arbitrary data at
a path within our domain of their choosing (sans prefix)
just doesn’t seem like a good idea.

Using fully random filenames would have worked as well, but this
is less friendly for admins checking emoji dirs.
The generated suffix should still be more than enough;
an attacker needs on average 140 trillion attempts to
correctly guess the final path.
2024-03-18 22:33:10 -01:00
d1c4d07404 Convert StealEmoji to pack.json
This will decouple filenames from shortcodes and
allow more image formats to work instead of only
those included in the auto-load glob. (Albeit we
still saved other formats to disk, wasting space)

Furthermore, this will allow us to make
final URL paths infeasible to predict.
2024-03-18 22:33:10 -01:00
fa98b44acf Fill out path for newly created packs
Before this was only filled on loading the pack again,
preventing the created pack from being used directly.
2024-03-18 22:33:10 -01:00
5b126567bb StealEmoji: drop superfluous basename
Since 3 commits ago we restrict shortcodes to a subset of
the POSIX Portable Filename Character Set, therefore
this can never have a directory component.
2024-03-18 22:33:10 -01:00
a8c6c780b4 StealEmoji: use Content-Type and reject non-images
E.g. *key’s emoji URLs typically don’t have file extensions, but
until now we just slapped ".png" at its end hoping for the best.

Furthermore, this gives us a chance to actually reject non-images,
which before was not feasible exatly due to those extension-less URLs
2024-03-18 22:33:10 -01:00
111cdb0d86 Split steal_emoji function for better readability 2024-03-18 22:33:10 -01:00
af041db6dc Limit emoji stealer to alphanum, dash, or underscore characters
As suggested in b387f4a1c1, only steal
emoji with alphanumerc, dash, or underscore characters.

Also consolidate all validation logic into a single function.

===

Taken from akkoma#703 with cosmetic tweaks

This matches our existing validation logic from Pleroma.Emoji,
and apart from excluding the dot also POSIX’s Portable Filename
Character Set making it always safe for use in filenames.

Mastodon is even stricter also disallowing U+002D HYPEN-MINUS
and requiring at least two characters.

Given both we and Mastodon reject shortcodes excluded
by this anyway, this doesn’t seem like a loss.
2024-03-18 22:33:10 -01:00
fc36b04016 Drop media proxy same-domain default for base_url
Even more than with user uploads, a same-domain proxy setup bears
significant security risks due to serving untrusted content under
the main domain space.

A risky setup like that should never be the default.
2024-03-18 22:33:10 -01:00
11ae8344eb Sanitise Content-Type of media proxy URLs
Just as with uploads and emoji before, this can otherwise be used
to place counterfeit AP objects or other malicious payloads.
In this case, even if we never assign a priviliged type to content,
the remote server can and until now we just mimcked whatever it told us.

Preview URLs already handle only specific, safe content types
and redirect to the external host for all else; thus no additional
sanitisiation is needed for them.

Non-previews are all delegated to the modified ReverseProxy module.
It already has consolidated logic for building response headers
making it easy to slip in sanitisation.

Although proxy urls are prefixed by a MAC built from a server secret,
attackers can still achieve a perfect id match when they are able to
change the contents of the pointed to URL. After sending an posts
containing an attachment at a controlled destination, the proxy URL can
be read back and inserted into the payload. After injection of
counterfeits in the target server the content can again be changed
to something innocuous lessening chance of detection.
2024-03-18 22:33:10 -01:00
e88d0a2853 Fix Content-Type of our schema
Strict servers fail to process anything from us otherwise.

Fixes: akkoma#716
2024-03-18 22:33:10 -01:00
ba558c0c24 Limit instance emoji to image types
Else malicious emoji packs or our EmojiStealer MRF can
put payloads into the same domain as the instance itself.
Sanitising the content type should prevent proper clients
from acting on any potential payload.

Note, this does not affect the default emoji shipped with Akkoma
as they are handled by another plug. However, those are fully trusted
and thus not in needed of sanitisation.
2024-03-18 22:33:10 -01:00
0ec62acb9d Always insert Dedupe upload filter
This actually was already intended before to eradict all future
path-traversal-style exploits and to fix issues with some
characters like akkoma#610 in 0b2ec0ccee. However, Dedupe and
AnonymizeFilename got mixed up. The latter only anonymises the name
in Content-Disposition headers GET parameters (with link_name),
_not_ the upload path.

Even without Dedupe, the upload path is prefixed by an UUID,
so it _should_ already be hard to guess for attackers. But now
we actually can be sure no path shenanigangs occur, uploads
reliably work and save some disk space.

While this makes the final path predictable, this prediction is
not exploitable. Insertion of a back-reference to the upload
itself requires pulling off a successfull preimage attack against
SHA-256, which is deemed infeasible for the foreseeable futures.

Dedupe was already included in the default list in config.exs
since 28cfb2c37a, but this will get overridde by whatever the
config generated by the "pleroma.instance gen" task chose.

Upload+delete tests running in parallel using Dedupe might be flaky, but
this was already true before and needs its own commit to fix eventually.
2024-03-18 22:33:10 -01:00
fef773ca35 Drop media base_url default and recommend different domain
Same-domain setups enabled now at least two exploits,
so they ought to be discouraged and definitely not be the default.
2024-03-18 22:33:10 -01:00
bdefbb8fd9 plug/upload_media: query config only once on init 2024-03-18 22:33:10 -01:00
f7c9793542 Sanitise Content-Type of uploads
The lack thereof enables spoofing ActivityPub objects.

A malicious user could upload fake activities as attachments
and (if having access to remote search) trick local and remote
fedi instances into fetching and processing it as a valid object.

If uploads are hosted on the same domain as the instance itself,
it is possible for anyone with upload access to impersonate(!)
other users of the same instance.
If uploads are exclusively hosted on a different domain, even the most
basic check of domain of the object id and fetch url matching should
prevent impersonation. However, it may still be possible to trick
servers into accepting bogus users on the upload (sub)domain and bogus
notes attributed to such users.
Instances which later migrated to a different domain and have a
permissive redirect rule in place can still be vulnerable.
If — like Akkoma — the fetching server is overly permissive with
redirects, impersonation still works.

This was possible because Plug.Static also uses our custom
MIME type mappings used for actually authentic AP objects.

Provided external storage providers don’t somehow return ActivityStream
Content-Types on their own, instances using those are also safe against
their users being spoofed via uploads.

Akkoma instances using the OnlyMedia upload filter
cannot be exploited as a vector in this way — IF the
fetching server validates the Content-Type of
fetched objects (Akkoma itself does this already).

However, restricting uploads to only multimedia files may be a bit too
heavy-handed. Instead this commit will restrict the returned
Content-Type headers for user uploaded files to a safe subset, falling
back to generic 'application/octet-stream' for anything else.
This will also protect against non-AP payloads as e.g. used in
past frontend code injection attacks.

It’s a slight regression in user comfort, if say PDFs are uploaded,
but this trade-off seems fairly acceptable.

(Note, just excluding our own custom types would offer no protection
 against non-AP payloads and bear a (perhaps small) risk of a silent
 regression should MIME ever decide to add a canonical extension for
 ActivityPub objects)

Now, one might expect there to be other defence mechanisms
besides Content-Type preventing counterfeits from being accepted,
like e.g. validation of the queried URL and AP ID matching.
Inserting a self-reference into our uploads is hard, but unfortunately
*oma does not verify the id in such a way and happily accepts _anything_
from the same domain (without even considering redirects).
E.g. Sharkey (and possibly other *keys) seem to attempt to guard
against this by immediately refetching the object from its ID, but
this is easily circumvented by just uploading two payloads with the
ID of one linking to the other.

Unfortunately *oma is thus _both_ a vector for spoofing and
vulnerable to those spoof payloads, resulting in an easy way
to impersonate our users.

Similar flaws exists for emoji and media proxy.

Subsequent commits will fix this by rigorously sanitising
content types in more areas, hardening our checks, improving
the default config and discouraging insecure config options.
2024-03-18 22:33:10 -01:00
6116f81546
Don't strip newlines in the Atom feed 2024-03-11 12:50:14 +01:00
7ef93c0b6d Add set_content_type to Plug.StaticNoCT 2024-03-04 17:50:20 +01:00
dbb6091d01 Import copy of Plug.Static from Plug 1.15.3
The following commit will apply the needed patch
2024-03-04 17:50:20 +01:00
Helge
5d89e0c917 Allow for url to be a list
This solves interoperability issues, see:
- https://git.pleroma.social/pleroma/pleroma/-/issues/3253
- https://socialhub.activitypub.rocks/t/fep-fffd-proxy-objects/3172/30?u=helge
- https://data.funfedi.dev/0.1.1/#url-parameter
2024-03-03 09:11:45 +01:00
f18e2ba42c Refresh Users much more aggressively when processing Move activities
The default refresh interval of 1 day is woefully inadequate here;
users expect to be able to add the alias to their new account and
press the move button on their old account and have it work.

This allows callers to specify a maximum age before a refetch is
triggered. We set that to 5s for the move code, as a nice compromise
between Making Things Work and ensuring that this can't be used
to hammer a remote server
2024-02-29 21:14:53 +01:00
fc95519dbf Allow fetching over IPv6
Mint/Finch disable IPv6 by default preventing us from
fetching anything from IPv6-only hosts without this.
2024-02-25 23:50:51 +01:00
7d61fb0906 Merge pull request 'Fix static-fe Twitter metadata / URL previews' (#700) from Oneric/akkoma:staticfe-metadata into develop
Reviewed-on: AkkomaGang/akkoma#700
2024-02-24 13:42:55 +00:00
3111181d3c mix format 2024-02-20 15:09:04 +00:00
b387f4a1c1 Don't steal emoji who's shortcodes have dots or colons in their name
Mastodon at the very least seems to prevent the creation of emoji with
dots in their name (and refuses to accept them in federation). It feels
like being cautious in what we accept is reasonable here.

Colons are the emoji separator and so obviously should be blocked.

Perhaps instead of filtering out things like this we should just
do a regex match on `[a-zA-Z0-9_-]`? But that's plausibly a decision
for another day

    Perhaps we should also have a centralised "is this a valid emoji shortcode?"
    function
2024-02-20 11:33:55 +01:00
Haelwenn (lanodan) Monnier
7d94476dd6 StealEmojiPolicy: Sanitize shortcodes
Closes: https://git.pleroma.social/pleroma/pleroma/-/issues/3245
2024-02-20 11:19:00 +01:00
37e2a35b86 Fix Twitter metadata
This partly reverts 1d884fd914
while fixing both the issue it addressed and the issue it caused.

The above commit successfully fixed OpenGraph metadata tags
which until then always showed the user bio instead of post content
by handing the activities AP ID as url to the Metadata builder
_instead_ of passing the internal ID as activity_id.
However, in doing so the commit instead inflicted this very problem
onto Twitter metadata tags which ironically are used by akkoma-fe.

This is because while the OpenGraph builder wants an URL as url,
the Twitter builder needs the internal ID to build the URL to the
embedded player for videos and has no URL property.

Thanks to twpol for tracking down this root cause in #644.

Now, once identified the problem is simple, but this simplicity
invites multiple possible solutions to bikeshed about.

 1. Just pass both properties to the builder and let them pick

 2. Drop the url parameter from the OpenGraph builder and instead
     a) build static-fe URL of the post from the ID (like Twitter)
     b) use the passed-in object’s AP ID as an URL

Approach 2a has the disadvantage of hardcoding the expected URL outside
the router, which will be problematic should it ever change.
Approach 2b is conceptually similar to how the builder works atm.
However, the og:url is supposed to be a _permanent_ ID, by changing it
we might, afaiui, technically violate OpenGraph specs(?). (Though its
real-world consequence may very well be near non-existent.)

This leaves just approach 1, which this commit implements.
Albeit it too is not without nits to pick, as it leaves the metadata
builders with an inconsistent interface.

Additionally, this will resolve the subotpimal Discord previews for
content-less image posts reported in #664.
Discord already prefers OpenGraph metadata, so it’s mostly unaffected.
However, it appears when encountering an explicitly empty OpenGraph
description and a non-empty Twitter description, it replaces just the
empty field with its Twitter counterpart, resulting in the user’s bio
slipping into the preview.
Secondly, regardless of any OpenGraph tags, Discord uses twitter:card to
decide how prominently images should be, but due to the bug the card
type was stuck as "summary", forcing images to always remain small.

Root cause identified by: twpol

Fixes: AkkomaGang/akkoma#644
Fixes: AkkomaGang/akkoma#664
2024-02-19 21:09:43 +00:00
3e24210e9f Merge pull request 'Prune old Update activities' (#683) from Oneric/akkoma:db-prune-old-updates into develop
Reviewed-on: AkkomaGang/akkoma#683
2024-02-19 13:59:33 +00:00
551ae69541 Merge pull request 'Fix and provide sane defaults for SMTP' (#686) from Oneric/akkoma:smtp-defaults into develop
Reviewed-on: AkkomaGang/akkoma#686
2024-02-19 13:39:15 +00:00
1a7839eaf2 Prune old Update activities
Once processed they serve no purpose anymore afaict.
Therefor, lets prune them like other transient activities
to not unnecessarily bloat the table.
2024-02-17 16:57:40 +01:00
755c75d8a4 Merge pull request 'Clean up warnings (+fallback metrics)' (#685) from Oneric/akkoma:metrics into develop
Reviewed-on: AkkomaGang/akkoma#685
2024-02-17 11:41:10 +00:00
289f93f5a2 Merge pull request 'Return last_status_at as date, not datetime' (#681) from katafrakt/akkoma:fix-last-status-at into develop
Reviewed-on: AkkomaGang/akkoma#681
2024-02-17 11:37:19 +00:00