Merge 2024-11 stable #13

Merged
fedward merged 175 commits from AkkomaGang/akkoma:stable into stable 2024-11-26 21:26:45 +00:00
fedward added 175 commits 2024-11-26 21:26:27 +00:00
CREATE DATABASE was running in a transaction block with CREATE USER. This isn't allowed (any more?).
This is now two separate commands.

I also did some other touch-ups, including
* making it OTP-first,
* adding a backup of the static directory, because it contains e.g. custom emoji, and
* removing the suggestion to use the setup_db.psql file, because I fear it causes more confusion than it's worth.
    * Firstly, OTP installations won't have this file because it's created in /tmp.
    * Secondly, if the instance has been reinstalled, a new setup_db.psql with a different password may have been created, causing only more confusion.
There were two warnings; these are now fixed.

I moved the fonts folder into the css folder. Another option was to change the relative path,
but it seems that after changing it in the css file, the path got changed back when rebuilding the site.
Maybe it needs to be changed somewhere else; in any case, this worked.
This promotes and expands our existing optional migration.
Based on usage statistics from several instances, see:
AkkomaGang/akkoma#764

activities_hosts is now retained after all since it’s essential
for the "instance" query parameter of *oma’s public timeline to
reliably work in a reasonable amount of time. (Although akkoma-fe has
no support for this feature and apparently barely anyone uses it.)

activities_actor_index was already dropped before in
20221211234352_remove_unused_indices; no need to drop it again.

Birthday indices were introduced in pleroma starting with
20220116183110_add_birthday_to_users which is past the
last common migration 20210416051708.
The /var/tmp directory is not mounted as tmpfs, unlike /tmp, which is
mounted as such on some distros like Fedora or Arch. Since there isn't
really a benefit to having the cache on tmpfs, this change should allow
for a larger cache if needed without worrying about running out of RAM.
Since the configuration options on the nginx side already exist in the
sample config, there's no need to tell users to copy-paste those
settings in again.
Direct users to add in the appropriate headers and update the listening
port instead of copy/pasting a config that's already outdated and
probably would otherwise have to be synced with the main example nginx
config.
Fixes: AkkomaGang/akkoma#773
Reviewed-on: AkkomaGang/akkoma#782
And while at it, point to this via a top-level
FEDERATION.md document as standardised by FEP-67ff.

Also add a few missing descriptions to the config cheatsheet
and move the recently removed C2S extension into an appropriate
subsection.
Reviewed-on: AkkomaGang/akkoma#778
Reviewed-on: AkkomaGang/akkoma#783
Reviewed-on: AkkomaGang/akkoma#785
Reviewed-on: AkkomaGang/akkoma#767
Reviewed-on: AkkomaGang/akkoma#776
Reviewed-by: floatingghost <hannah@coffee-and-dreams.uk>
Meilisearch is already configured to return results sorted by a
particular ranking configured in the meilisearch CLI task.
Resorting the returned top results by date partially negates this and
runs counter to what someone with tweaked settings expects.

Issue and fix identified by AdamK2003 in
AkkomaGang/akkoma#579
But instead of using an O(n^2) resort, this commit directly
retrieves results in the correct order from the database.

Closes: AkkomaGang/akkoma#579
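A minimal sketch of the idea, assuming the ranked hit IDs arrive from Meilisearch as a plain list (query shape and names are illustrative, not the actual implementation):

```elixir
import Ecto.Query

# `ids` is the ranked list of activity IDs returned by Meilisearch;
# array_position/2 has Postgres return rows in exactly that order,
# so no O(n^2) re-sort is needed on the Elixir side
hits =
  from(a in Pleroma.Activity,
    where: a.id in ^ids,
    order_by: fragment("array_position(?, ?)", ^ids, a.id)
  )
  |> Pleroma.Repo.all()
```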
This makes show-key’s output match our documentation as of Meilisearch
1.8.0-8-g4d5971f343c00d45c11ef0cfb6f61e83a8508208. Since I’m not sure
whether older versions only provided a description, it will fall back to
the latter if no name parameter exists.
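In effect the fallback amounts to something like this (field names per Meilisearch's /keys response):

```elixir
# prefer the newer "name" field; older Meilisearch may only have "description"
label = key["name"] || key["description"]
```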
Using only the admin key works as well currently
and Akkoma needs to know the admin key to be able
to add new entries etc. However the Meilisearch
key descriptions suggest the admin key is not
supposed to be used for searches, so let’s not.

For compatibility with existing configs, search_key remains optional.
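A sketch of the resulting key selection (the config key names here are assumptions, not necessarily the exact ones used):

```elixir
# use the restricted search key when configured, otherwise fall back
# to the old behaviour of searching with the admin (private) key
key =
  Pleroma.Config.get([Pleroma.Search.Meilisearch, :search_key]) ||
    Pleroma.Config.get([Pleroma.Search.Meilisearch, :private_key])
```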
Reviewed-on: AkkomaGang/akkoma#788
Co-authored-by: Floatingghost <hannah@coffee-and-dreams.uk>
Co-committed-by: Floatingghost <hannah@coffee-and-dreams.uk>
Reviewed-on: AkkomaGang/akkoma#772
No logic changes. Preparation for standalone orphan pruning.
This part of pruning can be very expensive and bog down the whole
instance to an unusable state for a long time. It can thus be desirable
to split it from prune_objects and run it on its own in smaller, limited batches.

If the batches are small enough and spaced out a bit, it may even be possible
to avoid any downtime. If not, the limit can still help to at least make the
downtime duration somewhat more predictable.
This gives feedback on when to stop rerunning limited batches.

Most of the diff is just adjusting indentation; best reviewed
with whitespace-only changes hidden, e.g. `git diff -w`.
May sometimes be helpful for getting a more predictable runtime
than just an age-based limit.

The subquery for the non-keep-threads path is required
since delete_all does not directly accept limit().

Again, most of the diff is just adjusting indentation; it's best
to hide whitespace-only changes with `git diff -w` or similar.
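A minimal sketch of the subquery workaround (the where clause is a stand-in for the real orphan check):

```elixir
import Ecto.Query

# delete_all cannot take limit() directly, so select the batch of ids
# in a subquery and delete by membership
batch =
  from(a in Pleroma.Activity,
    where: is_nil(a.data),
    select: a.id,
    limit: ^limit
  )

from(a in Pleroma.Activity, where: a.id in subquery(batch))
|> Pleroma.Repo.delete_all()
```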
This brought down query costs from 7,953,740.90 to 47,600.97.
Pruning can go on for a long time; give admins some insight that
something is happening, to make it less frustrating, and to make it
easier to tell which part of the process is stalled should this happen.

Again most of the changes are merely reindents;
review with whitespace changes hidden recommended.
The former is typically just a few reports; it doesn't make sense to
rerun it over and over again in batched prunes or if a full prune OOMed.
This query is less costly; if something goes wrong or gets aborted later,
at least this part will already be done.
Whether Logger output is visible depends on user configuration, but most of
the prints in mix tasks should always be shown. When running inside a
mix shell, it’s probably preferable to send output directly to it rather
than using raw IO.puts and we already have shell_* functions for this,
let’s use them everywhere.
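For reference, the helper pattern looks roughly like this (mirroring the existing shell_* helpers in the shared mix task module):

```elixir
# send output to the Mix shell when one exists (i.e. inside a mix task),
# falling back to raw IO for environments without Mix (e.g. OTP releases)
def shell_info(message) do
  if mix_shell?() do
    Mix.shell().info(message)
  else
    IO.puts(message)
  end
end

# a Mix shell is only available when Mix itself is loaded
def mix_shell?, do: :erlang.function_exported(Mix, :shell, 0)
```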
All headers are strings, always.
In this case it didn't matter atm,
but let’s not provide confusing examples.
Headers are strings, but this code expected to already get an int,
thus always failing the comparison whenever the header was set.

Fixes mistake in d6d838cbe8
Fixes omission in d6d838cbe8
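A sketch of the corrected comparison, using content-length as a stand-in for the header involved (names hypothetical):

```elixir
# headers always arrive as strings; parse before comparing to an int
with [value | _] <- Plug.Conn.get_req_header(conn, "content-length"),
     {length, ""} <- Integer.parse(value) do
  length <= max_length
else
  # header absent or not a clean integer
  _ -> false
end
```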
From experience, setting DB type to "Online transaction processing
system" seems to give the most optimal configuration in terms of
performance.

I also increased the recommended max connections to 25-30, as that leaves
some room for maintenance tasks to run without running out of
connections.

Finally, I removed the example configs since they're probably out of
date and I think it's better to direct people to use PGTune instead.
Reviewed-on: AkkomaGang/akkoma#795
Reviewed-on: AkkomaGang/akkoma#791
Apparently got jumbled during some rebase(s)
This lets us:
 - avoid issues with broken hash indices for PostgreSQL <10
 - drop runtime checks and legacy codepaths for <11 in db search
 - always enable custom query plans for performance optimisation

PostgreSQL 11 has been EOL since 2023-11-09, so
in theory everyone should already have moved on to 12 anyway.
Reviewed-on: AkkomaGang/akkoma#786
Rich Media parsing was previously handled on-demand with a 2 second HTTP request timeout and retained only in Cachex. Every time a Pleroma instance is restarted, it will have to request and parse the data for each status with a URL detected. When fetching a batch of statuses they were processed in parallel to attempt to keep the maximum latency at 2 seconds, but this often resulted in a timeline appearing to hang during loading due to a URL that could not be successfully reached. URLs which had image links that expire (Amazon AWS) were parsed and inserted with a TTL to ensure the image link would not break.

Rich Media data is now cached in the database and fetched asynchronously. Cachex is used as a read-through cache. When the data becomes available we stream an update to the clients. If the result is returned quickly the experience is almost seamless. Activities were already processed for their Rich Media data during ingestion to warm the cache, so users should not normally encounter the asynchronous loading of the Rich Media data.

Implementation notes:

- The async worker is a Task with a globally unique process name to prevent duplicate processing of the same URL
- The Task will attempt to fetch the data 3 times with increasing sleep time between attempts
- The HTTP request obeys the default HTTP request timeout value instead of 2 seconds
- URLs that cannot be successfully parsed due to an unexpected error receive a negative cache entry for 15 minutes
- URLs that fail with an expected error receive a negative cache entry with no TTL
- Activities that have no detected URLs insert a nil value in the Cachex :scrubber_cache so we do not repeat parsing the object content with Floki every time the activity is rendered
- Expiring image URLs are handled with an Oban job
- There is no automatic cleanup of the Rich Media data in the database, but it is safe to delete at any time
- The post draft/preview feature makes the URL processing synchronous so the rendered post preview will have an accurate rendering

Overall performance of timelines and creating new posts which contain URLs is greatly improved.
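A rough sketch of the read-through flow described above (module and function names are assumptions, not the actual implementation):

```elixir
def get_card(url) do
  Cachex.fetch(:rich_media_cache, url, fn _key ->
    case RichMedia.Card.get_by_url(url) do
      # no cached row yet: kick off the async fetch; the task's globally
      # unique name means a second start for the same URL is a no-op
      nil ->
        Task.start(fn -> fetch_parse_and_store(url) end)
        {:ignore, nil}

      # row exists in the database: commit it so later reads hit Cachex
      card ->
        {:commit, card}
    end
  end)
end
```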
Websites are getting increasingly bloated with tricks like inlining content (e.g., CNN.com), which puts pages at or above 5MB. This value may still be too low.
Removed back in 2019

https://github.com/mastodon/mastodon/pull/11213
warning: "else" clauses will never match because all patterns in "with" will always match
  lib/pleroma/web/rich_media/parser/ttl/opengraph.ex:10
Reviewed-on: AkkomaGang/akkoma#796
Reviewed-on: AkkomaGang/akkoma#793
Ever since 364b6969eb
this setting wasn't used by the backend and was a noop.
The stated use case is better served by setting the base_url
to a local subdomain and using proxying in nginx/Caddy/...
Reviewed-on: AkkomaGang/akkoma#805
Reviewed-on: AkkomaGang/akkoma#655
Reviewed-on: AkkomaGang/akkoma#800
Updated by "Update PO files to match POT (msgmerge)" hook in Weblate.

Co-authored-by: Weblate <noreply@weblate.org>
Translate-URL: http://translate.akkoma.dev/projects/akkoma/akkoma-backend-config-descriptions/
Translation: Pleroma fe/Akkoma Backend (Config Descriptions)
Updated by "Update PO files to match POT (msgmerge)" hook in Weblate.

Co-authored-by: Weblate <noreply@weblate.org>
Translate-URL: http://translate.akkoma.dev/projects/akkoma/akkoma-backend-config-descriptions/
Translation: Pleroma fe/Akkoma Backend (Config Descriptions)
Updated by "Update PO files to match POT (msgmerge)" hook in Weblate.

Co-authored-by: Weblate <noreply@weblate.org>
Translate-URL: http://translate.akkoma.dev/projects/akkoma/akkoma-backend-config-descriptions/
Translation: Pleroma fe/Akkoma Backend (Config Descriptions)
Updated by "Update PO files to match POT (msgmerge)" hook in Weblate.

Co-authored-by: Weblate <noreply@weblate.org>
Translate-URL: http://translate.akkoma.dev/projects/akkoma/akkoma-backend-config-descriptions/
Translation: Pleroma fe/Akkoma Backend (Config Descriptions)
Updated by "Update PO files to match POT (msgmerge)" hook in Weblate.

Co-authored-by: Weblate <noreply@weblate.org>
Translate-URL: http://translate.akkoma.dev/projects/akkoma/akkoma-backend-config-descriptions/
Translation: Pleroma fe/Akkoma Backend (Config Descriptions)
Updated by "Update PO files to match POT (msgmerge)" hook in Weblate.

Co-authored-by: Weblate <noreply@weblate.org>
Translate-URL: http://translate.akkoma.dev/projects/akkoma/akkoma-backend-config-descriptions/
Translation: Pleroma fe/Akkoma Backend (Config Descriptions)
Updated by "Update PO files to match POT (msgmerge)" hook in Weblate.

Co-authored-by: Weblate <noreply@weblate.org>
Translate-URL: http://translate.akkoma.dev/projects/akkoma/akkoma-backend-config-descriptions/
Translation: Pleroma fe/Akkoma Backend (Config Descriptions)
Updated by "Update PO files to match POT (msgmerge)" hook in Weblate.

Co-authored-by: Weblate <noreply@weblate.org>
Translate-URL: http://translate.akkoma.dev/projects/akkoma/akkoma-backend-config-descriptions/
Translation: Pleroma fe/Akkoma Backend (Config Descriptions)
Updated by "Update PO files to match POT (msgmerge)" hook in Weblate.

Co-authored-by: Weblate <noreply@weblate.org>
Translate-URL: http://translate.akkoma.dev/projects/akkoma/akkoma-backend-config-descriptions/
Translation: Pleroma fe/Akkoma Backend (Config Descriptions)
Updated by "Update PO files to match POT (msgmerge)" hook in Weblate.

Co-authored-by: Weblate <noreply@weblate.org>
Translate-URL: http://translate.akkoma.dev/projects/akkoma/akkoma-backend-config-descriptions/
Translation: Pleroma fe/Akkoma Backend (Config Descriptions)
Updated by "Update PO files to match POT (msgmerge)" hook in Weblate.

Co-authored-by: Weblate <noreply@weblate.org>
Translate-URL: http://translate.akkoma.dev/projects/akkoma/akkoma-backend-config-descriptions/
Translation: Pleroma fe/Akkoma Backend (Config Descriptions)
Updated by "Update PO files to match POT (msgmerge)" hook in Weblate.

Co-authored-by: Weblate <noreply@weblate.org>
Translate-URL: http://translate.akkoma.dev/projects/akkoma/akkoma-backend-config-descriptions/
Translation: Pleroma fe/Akkoma Backend (Config Descriptions)
Updated by "Update PO files to match POT (msgmerge)" hook in Weblate.

Co-authored-by: Weblate <noreply@weblate.org>
Translate-URL: http://translate.akkoma.dev/projects/akkoma/akkoma-backend-config-descriptions/
Translation: Pleroma fe/Akkoma Backend (Config Descriptions)
Updated by "Update PO files to match POT (msgmerge)" hook in Weblate.

Co-authored-by: Weblate <noreply@weblate.org>
Translate-URL: http://translate.akkoma.dev/projects/akkoma/akkoma-backend-config-descriptions/
Translation: Pleroma fe/Akkoma Backend (Config Descriptions)
Updated by "Update PO files to match POT (msgmerge)" hook in Weblate.

Co-authored-by: Weblate <noreply@weblate.org>
Translate-URL: http://translate.akkoma.dev/projects/akkoma/akkoma-backend-config-descriptions/
Translation: Pleroma fe/Akkoma Backend (Config Descriptions)
Updated by "Update PO files to match POT (msgmerge)" hook in Weblate.

Co-authored-by: Weblate <noreply@weblate.org>
Translate-URL: http://translate.akkoma.dev/projects/akkoma/akkoma-backend-config-descriptions/
Translation: Pleroma fe/Akkoma Backend (Config Descriptions)
Updated by "Update PO files to match POT (msgmerge)" hook in Weblate.

Co-authored-by: Weblate <noreply@weblate.org>
Translate-URL: http://translate.akkoma.dev/projects/akkoma/akkoma-backend-config-descriptions/
Translation: Pleroma fe/Akkoma Backend (Config Descriptions)
Updated by "Update PO files to match POT (msgmerge)" hook in Weblate.

Co-authored-by: Weblate <noreply@weblate.org>
Translate-URL: http://translate.akkoma.dev/projects/akkoma/akkoma-backend-config-descriptions/
Translation: Pleroma fe/Akkoma Backend (Config Descriptions)
Now that a media subdomain is strongly recommended for security reasons,
there is no reason for them to be commented out by default.
Currently Akkoma doesn't have any proper mitigations against BREACH,
which exploits the use of HTTP compression to exfiltrate sensitive data.
(see: AkkomaGang/akkoma#721 (comment))

To err on the side of caution, disable gzip compression for now until we
can confirm that there's some sort of mitigation in place (whether that
would be Heal-The-Breach on the Caddy side or any Akkoma-side
mitigations).
Since those old migrations will now most likely only run during db init,
there’s not much point in running them in the background concurrently
anyway, so just drop the concurrent setting rather than disabling
migration locks.
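Roughly, the change in those old migrations amounts to this (index details illustrative):

```elixir
# before: building concurrently required disabling the migration lock
create_if_not_exists(index(:activities, [:actor], concurrently: true))

# after: a plain, locking build is fine for migrations that now
# effectively only run during initial db setup
create_if_not_exists(index(:activities, [:actor]))
```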
We’ve received reports of some specific instances slowly accumulating
more and more binary data over time, up to the point of OOMs, and globally
setting ERL_FULLSWEEP_AFTER=0 has proven to be an effective countermeasure.
However, this incurs increased CPU costs everywhere and is
thus not suitable to apply out of the box.

Apparently long-lived Phoenix websocket processes are known to
often cause exactly this by getting into a state unfavourable
for the garbage collector.
Therefore it seems likely affected instances are using timeline
streaming and do so in just the right way to trigger this. We
can tune the garbage collector just for websocket processes
and use a more lenient value of 20 to keep the added perf cost
in check.

Testing on one affected instance appears to confirm this theory.
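The per-process tuning boils down to one flag set when the websocket process starts (exact placement assumed):

```elixir
# fully sweep this process after 20 generational GCs, instead of
# forcing ERL_FULLSWEEP_AFTER=0 across the whole VM
:erlang.process_flag(:fullsweep_after, 20)
```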

Ref.:
  https://www.erlang.org/doc/man/erlang#ghlink-process_flag-2-idp226
  https://blog.guzman.codes/using-phoenix-channels-high-memory-usage-save-money-with-erlfullsweepafter
  https://git.pleroma.social/pleroma/pleroma/-/merge_requests/4060

Tested-by: bjo
Reviewed-on: AkkomaGang/akkoma#810
Reviewed-on: AkkomaGang/akkoma#806
Reviewed-by: floatingghost <hannah@coffee-and-dreams.uk>
Reviewed-on: AkkomaGang/akkoma#809
Reviewed-on: AkkomaGang/akkoma#808
Usually an id should point to another AP object
and the image file isn’t an AP object. We currently
do not provide standalone AP objects for emoji and
don't keep track of remote emoji at all.
Thus just federate them as anonymous objects,
i.e. objects only existing within a parent context
and using an explicit null id.
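As a sketch, a federated emoji tag then looks something like this (values illustrative):

```elixir
%{
  "type" => "Emoji",
  # explicit null id: an anonymous object that only exists
  # within its parent context
  "id" => nil,
  "name" => ":blobcat:",
  "icon" => %{
    "type" => "Image",
    "url" => "https://example.tld/emoji/blobcat.png"
  }
}
```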

IceShrimp.NET previously adopted anonymous objects
for remote emoji without any apparent issues. See:
333611f65e

Fixes: AkkomaGang/akkoma#694
Reviewed-on: AkkomaGang/akkoma#554
Fixes a deprecation warning showing up on each mix call
when using Elixir 1.17.
Fragments are already always stripped anyway,
so listing one specific fragment here is
unnecessary and potentially confusing.

This effectively reverts
4457928e32
but keeps the added bridgy testcase.
Fixes: AkkomaGang/akkoma#748
We have a bunch of mysterious sporadic failures which usually disappear
when rerunning failed jobs only. Ideally we should locate and fix the
cause of those sporadic failures, but until we figure this out, retrying
once makes CI status less useless.
Currently `mix test` prints a slew of logs in the terminal,
with messages from different tests interspersed. Globally
enabling log capture hides log messages unless a test fails,
reducing noise and making it easier to analyse the important
messages (those from failed tests).

Compiler warnings and a few messages not printed via Logger
still show up, but it's much more readable than before.

Ported from: 3aed111a42
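Enabling this globally is a one-liner in the test helper:

```elixir
# test/test_helper.exs — capture log output, showing it only for failed tests
ExUnit.start(capture_log: true)
```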
The debug logs are very noisy and can be enabled during analysis
of a specific error believed to be SQL-related.

--

Before log capturing those debug messages were still hidden,
but with log capturing they show up in the output of failed
tests unless disabled.

Cherry-picked-from: e628d00a81
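The knob involved is the repo's log level in the test config, roughly:

```elixir
# config/test.exs — silence Ecto's per-query debug logging; flip back to
# :debug temporarily when chasing a suspected SQL-related failure
config :pleroma, Pleroma.Repo, log: false
```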
Not _yet_ supported as of exiftool 12.87, though
at first glance it seems like standard BMP files
can't store any metadata besides colour profiles

Fixes the specific case from
AkkomaGang/akkoma-fe#396
although the frontend shouldn’t get bricked regardless.
Multiple profiles can be specified as a space-separated list,
and the possibility of additional profiles is explicitly brought up
in the ActivityStreams spec.
The Mastodon API demands this be null unless it’s a multi-selection poll.
Not abiding by this can mess up display in some clients.

Fixes: AkkomaGang/akkoma#190
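Assuming this concerns the poll's voters_count field, which the Mastodon API defines only for multiple-choice polls, the rendering rule is roughly (helper name hypothetical):

```elixir
# null for single-choice polls; an actual count only when multiple
# selections are allowed
voters_count = if poll.multiple, do: count_unique_voters(poll), else: nil
```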
Ever since the browser frontend switcher was introduced in
de64c6c54a, /akkoma counts as
an API prefix and thus gets skipped by frontend plugs,
breaking the old swagger ui path of /akkoma/swagger-ui.

Do the simple thing and change the frontend path to
/pleroma/swaggerui, which isn't an API path and can't collide
with frontend user paths given pleroma is a reserved nickname.

Reported in
  https://meta.akkoma.dev/t/view-all-endpoints/269/7
  https://meta.akkoma.dev/t/swagger-ui-not-loading/728
Reviewed-on: AkkomaGang/akkoma#804
- pass env vars the proper™ way
- write log to file
- drop superfluous command_background
- make settings easily overwritable via conf.d
  to avoid needing to edit the service file directly
  if e.g. Akkoma was installed to another location
Reviewed-on: AkkomaGang/akkoma#834
Reviewed-on: AkkomaGang/akkoma#832
As hinted at in the commit message when strict checking
was added in 8684964c5d,
refetching is more robust than display URL comparison
but in exchange is harder to implement correctly.

A similar refetch approach is also employed by
e.g. Mastodon, IceShrimp and FireFish.

To make sure no checks can be bypassed by forcing
a refetch, id checking is placed at the very end.

This will fix:
 - Peertube display URL arrays our transmogrifier fails to normalise
 - non-canonical display URLs from alternative frontends
   (theoretical; we didn’t get any actual reports about this)

It will also be helpful in the planned key handling overhaul.
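Schematically, the lookup then becomes (function names assumed):

```elixir
# refetch from the claimed id and only verify the id at the very end,
# so forcing a refetch cannot be used to bypass the earlier checks
with {:ok, data} <- fetch_remote_object(ap_id),
     true <- data["id"] == ap_id do
  {:ok, data}
else
  _ -> {:error, :id_mismatch}
end
```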

The modified user collision test was introduced in
https://git.pleroma.social/pleroma/pleroma/-/merge_requests/461
and unfortunately the issues this fixes aren’t public.
Afaict it was just meant to guard against someone serving
faked data belonging to an unrelated domain. Since we now
refetch and the id actually is mocked, lookup now succeeds
but will use the real data from the authoritative server,
making it unproblematic. Instead, modify the fake data further
and make sure we don’t end up using the spoofed version.
Since we now remember the final location redirects lead to
and use it for all further checks as of
3e134b07fa, these redirects
can no longer be exploited to serve counterfeit objects.

This fixes:
 - display URLs from independent webapp clients
   redirecting to the canonical domain
 - Peertube display URLs for remote content
   (acting like the above)
Reviewed-on: AkkomaGang/akkoma#819
Reviewed-on: AkkomaGang/akkoma#815
Reviewed-on: AkkomaGang/akkoma#839
Reviewed-on: AkkomaGang/akkoma#841
Reviewed-on: AkkomaGang/akkoma#816
fedward merged commit 63673f61be into stable 2024-11-26 21:26:45 +00:00