Commit graph

1389 commits

Author SHA1 Message Date
219fffa0c3 Allow deleting uploaded files after media migration
Until now only the current base_url was checked.
After a media domain migration all pre-existing files
thus turned undeletable. Fix this with a new config
option allowing to list all old media base urls.
(This may also come in handy for a later db refactor, see
 AkkomaGang/akkoma#765)
2024-07-05 22:21:38 +02:00
34cb5b350c [TMP] following commits depend on #789
this is AkkomaGang/akkoma#789
squahsed into a single commit
2024-07-05 22:21:38 +02:00
83aab0859a Move prune changelog entries to correct version 2024-06-17 22:41:40 -04:00
59bfdf2ca4 Merge pull request 'Add limit CLI flags to prune jobs' (#655) from Oneric/akkoma:prune-batch into develop
Reviewed-on: AkkomaGang/akkoma#655
2024-06-17 20:47:53 +00:00
16bed0562d Fix tests 2024-06-09 18:28:00 +01:00
a3840e7d1f Raise minimum PostgreSQL version to 12
This lets us:
 - avoid issues with broken hash indices for PostgreSQL <10
 - drop runtime checks and legacy codepaths for <11 in db search
 - always enable custom query plans for performance optimisation

PostgreSQL 11 is already EOL since 2023-11-09, so
in theory everyone should already have moved on to 12 anyway.
2024-06-07 16:21:09 +02:00
b17d3dc6d8 Fix changelog
Apparently got jumbled during some rebase(s)
2024-06-07 16:20:34 +02:00
225f87ad62 Also allow limiting the initial prune_object
May sometimes be helpful to get more predictable runtime
than just with an age-based limit.

The subquery for the non-keep-threads path is required
since delte_all does not directly accept limit().

Again most of the diff is just adjusting indentation, best
hide whitespace-only changes with git diff -w or similar.
2024-05-31 17:16:40 +02:00
fa52093bac Add standalone prune_orphaned_activities CLI task
This part of pruning can be very expensive and bog down the whole
instance to an unusable sate for a long time. It can thus be desireable
to split it from prune_objects and run it on its own in smaller limited batches.

If the batches are smaller enough and spaced out a bit, it may even be possible
to avoid any downtime. If not, the limit can still help to at least make the
downtime duration somewhat more predictable.
2024-05-31 17:16:40 +02:00
fc7e07f424 meilisearch: enable using search_key
Using only the admin key works as well currently
and Akkoma needs to know the admin key to be able
to add new entries etc. However the Meilisearch
key descriptions suggest the admin key is not
supposed to be used for searches, so let’s not.

For compatibility with existings configs, search_key remains optional.
2024-05-29 23:17:27 +00:00
65aeaefa41 meilisearch: respect meili’s result ranking
Meilisearch is already configured to return results sorted by a
particular ranking configured in the meilisearch CLI task.
Resorting the returned top results by date partially negates this and
runs counter to what someone with tweaked settings expects.

Issue and fix identified by AdamK2003 in
AkkomaGang/akkoma#579
But instead of using a O(n^2) resorting, this commit directly
retrieves results in the correct order from the database.

Closes: AkkomaGang/akkoma#579
2024-05-29 23:17:27 +00:00
05eda169fe Document AP and nodeinfo extensions
And while add it point to this via a top-level
FEDERATION.md document as standardised by FEP-67ff.

Also add a few missing descriptions to the config cheatsheet
and move the recently removed C2S extension into an appropiate
subsection.
2024-05-26 19:04:06 +02:00
5e92f955ac bump version 2024-05-22 19:42:25 +01:00
9a91299f96 Don't try to handle non-media objects as media
Trying to display non-media as media crashed the renderer,
but when posting a status with a valid, non-media object id
the post was still created, but then crashed e.g. timeline rendering.
It also crashed C2S inbox reads, so this could not be used to leak
private posts.
2024-05-22 20:30:23 +02:00
0c2b33458d Restrict media usage to owners
In Mastodon media can only be used by owners and only be associated with
a single post. We currently allow media to be associated with several
posts and until now did not limit their usage in posts to media owners.
However, media update and GET lookup was already limited to owners.
(In accordance with allowing media reuse, we also still allow GET
lookups of media already used in a post unlike Mastodon)

Allowing reuse isn’t problematic per se, but allowing use by non-owners
can be problematic if media ids of private-scoped posts can be guessed
since creating a new post with this media id will reveal the uploaded
file content and alt text.
Given media ids are currently just part of a sequentieal series shared
with some other objects, guessing media ids is with some persistence
indeed feasible.

E.g. sampline some public media ids from a real-world
instance with 112 total and 61 monthly-active users:

  17.465.096  at  t0
  17.472.673  at  t1 = t0 + 4h
  17.473.248  at  t2 = t1 + 20min

This gives about 30 new ids per minute of which most won't be
local media but remote and local posts, poll answers etc.
Assuming the default ratelimit of 15 post actions per 10s, scraping all
media for the 4h interval takes about 84 minutes and scraping the 20min
range mere 6.3 minutes. (Until the preceding commit, post updates were
not rate limited at all, allowing even faster scraping.)
If an attacker can infer (e.g. via reply to a follower-only post not
accessbile to the attacker) some sensitive information was uploaded
during a specific time interval and has some pointers regarding the
nature of the information, identifying the specific upload out of all
scraped media for this timerange is not impossible.

Thus restrict media usage to owners.

Checking ownership just in ActivitDraft would already be sufficient,
since when a scheduled status actually gets posted it goes through
ActivityDraft again, but would erroneously return a success status
when scheduling an illegal post.

Independently discovered and fixed by mint in Pleroma
1afde067b1
2024-05-22 20:30:18 +02:00
76ded10a70 Merge pull request 'Backoff on HTTP requests when 429 is recieved' (#762) from backoff-http into develop
Reviewed-on: AkkomaGang/akkoma#762
2024-05-11 04:38:47 +00:00
7038b60ab5 bump version 2024-04-27 15:08:21 +01:00
9671cdecdf changelog entry 2024-04-26 19:10:17 +01:00
828158ef49 Merge remote-tracking branch 'oneric/fedfix-public-ld' into develop 2024-04-26 18:49:31 +01:00
c7276713e0 Merge remote-tracking branch 'oneric/changelog-3.13' into develop 2024-04-26 18:43:39 +01:00
5bc64c5753 changelog: add note about StripMetadata and ReadDescription order 2024-04-26 18:57:28 +02:00
5ee0fb18cb exiftool: make stripped tags configurable 2024-04-26 18:57:24 +02:00
12db5c23f2 Add missing changelog entries 2024-04-26 00:51:45 +02:00
b0a46c1e2e Normalise public adressing to fix federation
Due to JSON-LD compaction the full address of public scope
may also occur in shorter forms and the spec requires us to treat them
all equivalently. To save us the pain of repeatedly checking for all
variants internally, normalise inbound data to just one form.
See note at: https://www.w3.org/TR/activitypub/#public-addressing

This needs to happen very early, even before the other addressing fixes
else an earlier validator will reject the object. This in turn required
to move the list-tpye normalisation earlier as well, but since I was
unsure about putting empty lists into the data when no such field
existed before, I excluded this case and thus the later fixing had to be
kept as well.

Fixes: AkkomaGang/akkoma#670
2024-04-25 18:45:16 +02:00
timorl
2a9db73b4c
Merge branch 'develop' into elseinspe 2024-04-19 17:11:55 +02:00
timorl
cd7af81896
Rename StripLocation to StripMetadata for temporal-proofing reasons 2024-04-16 20:37:00 +02:00
1896ff1ab0 changelog entry 2024-04-16 02:35:59 +01:00
timorl
b144218dce
Merge branch 'develop' into elseinspe 2024-04-14 20:31:33 +02:00
d910e8d7d1 Add test suite for elixir1.16 2024-04-12 19:13:33 +01:00
df25d86999 Cleaned up FEP-fffd commits a bit 2024-04-12 18:50:57 +01:00
05f8179d08 check if data is visible before embedding it in OG tags
previously we would uncritically take data and format it into
tags for static-fe and the like - however, instances can be
configured to disallow unauthenticated access to these resources.

this means that OG tags as a vector for information leakage.

_technically_ this should only occur if you have both
restrict_unauthenticated *AND* you run static-fe, which makes no
sense since static-fe is for unauthenticated people in particular,
but hey ho.
2024-04-12 05:16:47 +01:00
1135935cbe Merge remote-tracking branch 'oneric/ipv6' into develop 2024-04-11 20:59:49 +01:00
b5d97e7d85 Don't error out if we're not using the local uploader 2024-04-02 11:36:26 +01:00
4cd299bd83 Add extra warnings if the uploader is on the same domain as the main application 2024-04-02 10:20:59 +01:00
3650bb0370 Changelog entry 2024-03-30 11:44:34 +00:00
ee7d98b093 Update Changelog 2024-03-29 08:35:15 -01:00
0ec62acb9d Always insert Dedupe upload filter
This actually was already intended before to eradict all future
path-traversal-style exploits and to fix issues with some
characters like akkoma#610 in 0b2ec0ccee. However, Dedupe and
AnonymizeFilename got mixed up. The latter only anonymises the name
in Content-Disposition headers GET parameters (with link_name),
_not_ the upload path.

Even without Dedupe, the upload path is prefixed by an UUID,
so it _should_ already be hard to guess for attackers. But now
we actually can be sure no path shenanigangs occur, uploads
reliably work and save some disk space.

While this makes the final path predictable, this prediction is
not exploitable. Insertion of a back-reference to the upload
itself requires pulling off a successfull preimage attack against
SHA-256, which is deemed infeasible for the foreseeable futures.

Dedupe was already included in the default list in config.exs
since 28cfb2c37a, but this will get overridde by whatever the
config generated by the "pleroma.instance gen" task chose.

Upload+delete tests running in parallel using Dedupe might be flaky, but
this was already true before and needs its own commit to fix eventually.
2024-03-18 22:33:10 -01:00
fef773ca35 Drop media base_url default and recommend different domain
Same-domain setups enabled now at least two exploits,
so they ought to be discouraged and definitely not be the default.
2024-03-18 22:33:10 -01:00
f7c9793542 Sanitise Content-Type of uploads
The lack thereof enables spoofing ActivityPub objects.

A malicious user could upload fake activities as attachments
and (if having access to remote search) trick local and remote
fedi instances into fetching and processing it as a valid object.

If uploads are hosted on the same domain as the instance itself,
it is possible for anyone with upload access to impersonate(!)
other users of the same instance.
If uploads are exclusively hosted on a different domain, even the most
basic check of domain of the object id and fetch url matching should
prevent impersonation. However, it may still be possible to trick
servers into accepting bogus users on the upload (sub)domain and bogus
notes attributed to such users.
Instances which later migrated to a different domain and have a
permissive redirect rule in place can still be vulnerable.
If — like Akkoma — the fetching server is overly permissive with
redirects, impersonation still works.

This was possible because Plug.Static also uses our custom
MIME type mappings used for actually authentic AP objects.

Provided external storage providers don’t somehow return ActivityStream
Content-Types on their own, instances using those are also safe against
their users being spoofed via uploads.

Akkoma instances using the OnlyMedia upload filter
cannot be exploited as a vector in this way — IF the
fetching server validates the Content-Type of
fetched objects (Akkoma itself does this already).

However, restricting uploads to only multimedia files may be a bit too
heavy-handed. Instead this commit will restrict the returned
Content-Type headers for user uploaded files to a safe subset, falling
back to generic 'application/octet-stream' for anything else.
This will also protect against non-AP payloads as e.g. used in
past frontend code injection attacks.

It’s a slight regression in user comfort, if say PDFs are uploaded,
but this trade-off seems fairly acceptable.

(Note, just excluding our own custom types would offer no protection
 against non-AP payloads and bear a (perhaps small) risk of a silent
 regression should MIME ever decide to add a canonical extension for
 ActivityPub objects)

Now, one might expect there to be other defence mechanisms
besides Content-Type preventing counterfeits from being accepted,
like e.g. validation of the queried URL and AP ID matching.
Inserting a self-reference into our uploads is hard, but unfortunately
*oma does not verify the id in such a way and happily accepts _anything_
from the same domain (without even considering redirects).
E.g. Sharkey (and possibly other *keys) seem to attempt to guard
against this by immediately refetching the object from its ID, but
this is easily circumvented by just uploading two payloads with the
ID of one linking to the other.

Unfortunately *oma is thus _both_ a vector for spoofing and
vulnerable to those spoof payloads, resulting in an easy way
to impersonate our users.

Similar flaws exists for emoji and media proxy.

Subsequent commits will fix this by rigorously sanitising
content types in more areas, hardening our checks, improving
the default config and discouraging insecure config options.
2024-03-18 22:33:10 -01:00
fc95519dbf Allow fetching over IPv6
Mint/Finch disable IPv6 by default preventing us from
fetching anything from IPv6-only hosts without this.
2024-02-25 23:50:51 +01:00
889b57df82 2024.02 release 2024-02-24 13:54:21 +00:00
e99e2407f3 Add background_removal to SimplePolicy MRF 2024-02-16 16:36:45 +01:00
7622aa27ca Federate user profile background
Currently our own frontend doesn’t show backgrounds of other users, this
property is already publicly readable via REST API and likely was always
intended to be shown and federated.

Recently Sharkey added support for profile backgrounds and
immediately made them federate and be displayed to others.
We use the same AP field as Sharkey here which should make
it interoperable both ways out-of-the-box.

Ref.: 4e64397635
2024-02-16 16:35:51 +01:00
0ed815b8a1 Merge branch 'followback' into develop 2024-02-16 13:27:40 +00:00
6bb455702d Document Akkoma API 2024-02-15 16:04:33 +01:00
376f6b15ca Add ability to auto-approve followbacks
Resolves: AkkomaGang/akkoma#148
2024-02-13 15:42:37 +01:00
e47c50666d Fix obfuscation of short domains
Fixes AkkomaGang/akkoma#645
2024-02-02 14:50:13 +00:00
2858cd81e1 Move changelog into our format 2023-12-15 16:32:41 +00:00
d1af78aba1 changelog 2023-09-15 12:00:45 +01:00
063e3c0d34 Disallow nil hosts in should_federate 2023-08-15 23:12:04 +01:00