This gives feedback when to stop rerunning limited batches.
Most of the diff is just adjusting indentation; best reviewed
with whitespace-only changes hidden, e.g. `git diff -w`.
This part of pruning can be very expensive and bog down the whole
instance to an unusable sate for a long time. It can thus be desireable
to split it from prune_objects and run it on its own in smaller limited batches.
If the batches are smaller enough and spaced out a bit, it may even be possible
to avoid any downtime. If not, the limit can still help to at least make the
downtime duration somewhat more predictable.
Using only the admin key works as well currently
and Akkoma needs to know the admin key to be able
to add new entries etc. However the Meilisearch
key descriptions suggest the admin key is not
supposed to be used for searches, so let’s not.
For compatibility with existings configs, search_key remains optional.
This makes show-key’s output match our documentation as of Meilisearch
1.8.0-8-g4d5971f343c00d45c11ef0cfb6f61e83a8508208. Since I’m not sure
if older versions maybe only provided description, it will fallback to
the latter if no name parameter exists.
Meilisearch is already configured to return results sorted by a
particular ranking configured in the meilisearch CLI task.
Resorting the returned top results by date partially negates this and
runs counter to what someone with tweaked settings expects.
Issue and fix identified by AdamK2003 in
AkkomaGang/akkoma#579
But instead of using a O(n^2) resorting, this commit directly
retrieves results in the correct order from the database.
Closes: AkkomaGang/akkoma#579
And while add it point to this via a top-level
FEDERATION.md document as standardised by FEP-67ff.
Also add a few missing descriptions to the config cheatsheet
and move the recently removed C2S extension into an appropiate
subsection.
Trying to display non-media as media crashed the renderer,
but when posting a status with a valid, non-media object id
the post was still created, but then crashed e.g. timeline rendering.
It also crashed C2S inbox reads, so this could not be used to leak
private posts.
Afaict this was never used, but keeping this (in theory) possible
hinders detecting which objects are actually media uploads and
which proper ActivityPub objects.
It was originally added as part of upload support itself in
02d3dc6869 without being used
and `git log -S:activity_type` and `git log -Sactivity_type:`
don't find any other commits using this.
In Mastodon media can only be used by owners and only be associated with
a single post. We currently allow media to be associated with several
posts and until now did not limit their usage in posts to media owners.
However, media update and GET lookup was already limited to owners.
(In accordance with allowing media reuse, we also still allow GET
lookups of media already used in a post unlike Mastodon)
Allowing reuse isn’t problematic per se, but allowing use by non-owners
can be problematic if media ids of private-scoped posts can be guessed
since creating a new post with this media id will reveal the uploaded
file content and alt text.
Given media ids are currently just part of a sequentieal series shared
with some other objects, guessing media ids is with some persistence
indeed feasible.
E.g. sampline some public media ids from a real-world
instance with 112 total and 61 monthly-active users:
17.465.096 at t0
17.472.673 at t1 = t0 + 4h
17.473.248 at t2 = t1 + 20min
This gives about 30 new ids per minute of which most won't be
local media but remote and local posts, poll answers etc.
Assuming the default ratelimit of 15 post actions per 10s, scraping all
media for the 4h interval takes about 84 minutes and scraping the 20min
range mere 6.3 minutes. (Until the preceding commit, post updates were
not rate limited at all, allowing even faster scraping.)
If an attacker can infer (e.g. via reply to a follower-only post not
accessbile to the attacker) some sensitive information was uploaded
during a specific time interval and has some pointers regarding the
nature of the information, identifying the specific upload out of all
scraped media for this timerange is not impossible.
Thus restrict media usage to owners.
Checking ownership just in ActivitDraft would already be sufficient,
since when a scheduled status actually gets posted it goes through
ActivityDraft again, but would erroneously return a success status
when scheduling an illegal post.
Independently discovered and fixed by mint in Pleroma
1afde067b1
In MastoAPI media descriptions are updated via the
media update API not upon post creation or post update.
This functionality was originally added about 6 years ago in
ba93396649 which was part of
https://git.pleroma.social/pleroma/pleroma/-/merge_requests/626 and
https://git.pleroma.social/pleroma/pleroma-fe/-/merge_requests/450.
They introduced image descriptions to the front- and backend,
but predate adoption of Mastodon API.
For a while adding an `descriptions` array on post creation might have
continued to work as an undocumented Pleroma extension to Masto API, but
at latest when OpenAPI specs were added for those endpoints four years
ago in 7803a85d2c, these codepaths ceased
to be used. The API specs don’t list a `descriptions` parameter and
any unknown parameters are stripped out.
The attachments_from_ids function is only called from
ScheduledActivity and ActivityDraft.create with the latter
only being called by CommonAPI.{post,update} whihc in turn
are only called from ScheduledActivity again, MastoAPI controller
and without any attachment or description parameter WelcomeMessage.
Therefore no codepath can contain a descriptions parameter.
Direct users to add in the appropriate headers and update the listening
port instead of copy/pasting a config that's already outdated and
probably would otherwise have to be synced with the main example nginx
config.
Since the configuration options on the nginx side already exist in the
sample config, there's no need to tell users to copy-paste those
settings in again.
The /var/tmp directory is not mounted as tmpfs unlike /tmp which is
mounted as such on some distros like Fedora or Arch. Since there isn't
really a benefit to having the cache on tmpfs, this change should allow
for a larger cache if needed without worrying about running out of RAM.
Applying works fine with a 20220220135625 version, but it won’t be
rolled back in the right order. Fortunately this action is idempotent
so we can just rename and reapply it with a new id.
To also not break large-scale rollbacks past 2022 for anyone
who already applied it with the old id, keep a stub migration.