akkoma

Author	SHA1	Message	Date
Oneric	940792f8ba	Refetch on AP ID mismatch As hinted at in the commit message when strict checking was added in `8684964c5d`, refetching is more robust than display URL comparison but in exchange is harder to implement correctly. A similar refetch approach is also employed by e.g. Mastodon, IceShrimp and FireFish. To make sure no checks can be bypassed by forcing a refetch, id checking is placed at the very end. This will fix: - Peertube display URL arrays our transmogrifier fails to normalise - non-canonical display URLs from alternative frontends (theoretical; we didn’t get any actual reports about this) It will also be helpful in the planned key handling overhaul. The modified user collision test was introduced in https://git.pleroma.social/pleroma/pleroma/-/merge_requests/461 and unfortunately the issues this fixes aren’t public. Afaict it was just meant to guard against someone serving faked data belonging to an unrelated domain. Since we now refetch and the id actually is mocked, lookup now succeeds but will use the real data from the authoritative server making it unproblematic. Instead modify the fake data further and make sure we don’t end up using the spoofed version.	2024-10-14 01:42:43 +02:00
floatingghost	3bb31117e6	Merge pull request 'Handle domain mutes on the backend' (#804) from domain-mute-backend-processing into develop Reviewed-on: AkkomaGang/akkoma#804	2024-08-20 10:32:47 +00:00
Floatingghost	2c5c531c35	readd comment about domain mutes	2024-08-20 11:05:36 +01:00
floatingghost	f66135ed08	Merge pull request 'Avoid accumulation of stale data in websockets' (#806) from Oneric/akkoma:websocket_fullsweep into develop Reviewed-on: AkkomaGang/akkoma#806 Reviewed-by: floatingghost <hannah@coffee-and-dreams.uk>	2024-06-23 02:19:36 +00:00
Oneric	13e2a811ec	Avoid accumulation of stale data in websockets We’ve received reports of some specific instances slowly accumulating more and more binary data over time up to OOMs and globally setting ERL_FULLSWEEP_AFTER=0 has proven to be an effective countermeasure. However, this incurs increased cpu perf costs everywhere and is thus not suitable to apply out of the box. Apparently long-lived Phoenix websocket processes are known to often cause exactly this by getting into a state unfavourable for the garbage collector. Therefore it seems likely affected instances are using timeline streaming and do so in just the right way to trigger this. We can tune the garbage collector just for websocket processes and use a more lenient value of 20 to keep the added perf cost in check. Testing on one affected instance appears to confirm this theory. Ref.: https://www.erlang.org/doc/man/erlang#ghlink-process_flag-2-idp226 https://blog.guzman.codes/using-phoenix-channels-high-memory-usage-save-money-with-erlfullsweepafter https://git.pleroma.social/pleroma/pleroma/-/merge_requests/4060 Tested-by: bjo	2024-06-22 22:22:33 +02:00
Oneric	c3069b9478	cosmetic: fix elixir 1.17 compiler warnings in main application	2024-06-19 01:49:59 +02:00
floatingghost	5992e8bb16	Merge pull request 'Update http-signatures dep, allow created header' (#800) from created-pseudoheader into develop Reviewed-on: AkkomaGang/akkoma#800	2024-06-17 21:52:59 +00:00
Floatingghost	57273754b7	we may as well handle (expires) as well	2024-06-17 22:30:14 +01:00
floatingghost	59bfdf2ca4	Merge pull request 'Add limit CLI flags to prune jobs' (#655) from Oneric/akkoma:prune-batch into develop Reviewed-on: AkkomaGang/akkoma#655	2024-06-17 20:47:53 +00:00
Oneric	bf8f493ffd	Remove proxy_remote vestiges Ever since `364b6969eb` this setting wasn't used by the backend and was a no-op. The stated usecase is better served by setting the base_url to a local subdomain and using proxying in nginx/Caddy/...	2024-06-16 01:21:52 +02:00
Floatingghost	3b197503d2	me me stupid person	2024-06-15 15:30:02 +01:00
Floatingghost	c0b2bba55e	revert subdomain change until i can look at why i did that	2024-06-15 15:14:42 +01:00
Floatingghost	4b765b1886	mix format	2024-06-15 15:06:28 +01:00
Floatingghost	cba2c5725f	Filter emoji reaction accounts by domain blocks	2024-06-15 15:05:52 +01:00
Floatingghost	2b96c3b224	Update http-signatures dep, allow created header	2024-06-12 18:40:44 +01:00
floatingghost	b03edb4ff4	Merge pull request 'Fix StealEmoji’s max size check' (#793) from Oneric/akkoma:emojistealer_contentlength into develop Reviewed-on: AkkomaGang/akkoma#793	2024-06-12 17:09:05 +00:00
Floatingghost	4d6fb43cbd	No need to spawn() any more	2024-06-12 02:09:24 +01:00
Floatingghost	ad52135bf5	Convert rich media backfill to oban task	2024-06-11 18:06:51 +01:00
Floatingghost	9c5feb81aa	fix tests	2024-06-09 21:26:29 +01:00
Floatingghost	a360836ce3	fix oembed test	2024-06-09 21:17:12 +01:00
Floatingghost	840c70c4fa	remove prints	2024-06-09 18:52:09 +01:00
Floatingghost	c65379afea	attempt to fix some tests	2024-06-09 18:45:38 +01:00
Floatingghost	16bed0562d	Fix tests	2024-06-09 18:28:00 +01:00
Mark Felder	a801dd7b07	Fix module struct matching	2024-06-09 17:38:28 +01:00
Mark Felder	1e86da43f5	Credo	2024-06-09 17:38:24 +01:00
Mark Felder	411831458c	Credo	2024-06-09 17:38:18 +01:00
Mark Felder	56463b2121	Fix compile warning warning: "else" clauses will never match because all patterns in "with" will always match lib/pleroma/web/rich_media/parser/ttl/opengraph.ex:10	2024-06-09 17:38:12 +01:00
Mark Felder	2f5eb79473	Mastodon API: Remove deprecated GET /api/v1/statuses/:id/card endpoint Removed back in 2019 https://github.com/mastodon/mastodon/pull/11213	2024-06-09 17:38:06 +01:00
Mark Felder	4746f98851	Fix broken Rich Media parsing when the image URL is a relative path	2024-06-09 17:36:28 +01:00
Mark Felder	765c7e98d2	Respect the TTL returned in OpenGraph tags	2024-06-09 17:36:15 +01:00
Floatingghost	4a3dd5f65e	lost in cherry-pick	2024-06-09 17:34:41 +01:00
Mark Felder	bfe4152385	Increase the :max_body for Rich Media to 5MB Websites are increasingly getting more bloated with tricks like inlining content (e.g., CNN.com) which puts pages at or above 5MB. This value may still be too low.	2024-06-09 17:34:29 +01:00
Mark Felder	5da9cbd8a5	RichMedia refactor Rich Media parsing was previously handled on-demand with a 2 second HTTP request timeout and retained only in Cachex. Every time a Pleroma instance is restarted it will have to request and parse the data for each status with a URL detected. When fetching a batch of statuses they were processed in parallel to attempt to keep the maximum latency at 2 seconds, but often resulted in a timeline appearing to hang during loading due to a URL that could not be successfully reached. URLs which had image links that expire (Amazon AWS) were parsed and inserted with a TTL to ensure the image link would not break. Rich Media data is now cached in the database and fetched asynchronously. Cachex is used as a read-through cache. When the data becomes available we stream an update to the clients. If the result is returned quickly the experience is almost seamless. Activities were already processed for their Rich Media data during ingestion to warm the cache, so users should not normally encounter the asynchronous loading of the Rich Media data. Implementation notes: - The async worker is a Task with a globally unique process name to prevent duplicate processing of the same URL - The Task will attempt to fetch the data 3 times with increasing sleep time between attempts - The HTTP request obeys the default HTTP request timeout value instead of 2 seconds - URLs that cannot be successfully parsed due to an unexpected error receive a negative cache entry for 15 minutes - URLs that fail with an expected error will receive a negative cache with no TTL - Activities that have no detected URLs insert a nil value in the Cachex :scrubber_cache so we do not repeat parsing the object content with Floki every time the activity is rendered - Expiring image URLs are handled with an Oban job - There is no automatic cleanup of the Rich Media data in the database, but it is safe to delete at any time - The post draft/preview feature makes the URL processing synchronous so the rendered post preview will have an accurate rendering Overall performance of timelines and creating new posts which contain URLs is greatly improved.	2024-06-09 17:33:48 +01:00
Floatingghost	a924e117fd	Add pool timeouts	2024-06-09 17:20:29 +01:00
Oneric	2180d068ae	Raise log level for start failures	2024-06-07 16:21:21 +02:00
Oneric	a3840e7d1f	Raise minimum PostgreSQL version to 12 This lets us: - avoid issues with broken hash indices for PostgreSQL <10 - drop runtime checks and legacy codepaths for <11 in db search - always enable custom query plans for performance optimisation PostgreSQL 11 is already EOL since 2023-11-09, so in theory everyone should already have moved on to 12 anyway.	2024-06-07 16:21:09 +02:00
Oneric	df27567d99	mrf/steal_emoji: display download_unknown_size in admin-fe Fixes omission in `d6d838cbe8`	2024-06-05 20:14:10 +02:00
Oneric	be5440c5e8	mrf/steal_emoji: fix size limit check Headers are strings, but this expected to already get an int thus always failing the comparison if the header was set. Fixes mistake in `d6d838cbe8`	2024-06-05 20:11:53 +02:00
Floatingghost	0f65dd3ebe	remove pointless logger	2024-06-04 14:34:59 +01:00
Floatingghost	38d09cb0ce	remove now-pointless clause	2024-06-04 14:34:18 +01:00
Floatingghost	c9a03af7c1	Move rescue to the HTTP request itself	2024-06-04 14:30:16 +01:00
Floatingghost	0f7ae0fa21	am i baka	2024-06-04 14:26:33 +01:00
Floatingghost	30e13a8785	Don't error on rich media fail	2024-06-04 14:21:40 +01:00
Floatingghost	778b213945	enqueue pin fetches after changeset validation	2024-06-01 08:25:35 +01:00
Oneric	bed7ff8e89	mix: consistently use shell_info and shell_error Logger output being visible depends on user configuration, but most of the prints in mix tasks should always be shown. When running inside a mix shell, it’s probably preferable to send output directly to it rather than using raw IO.puts and we already have shell_* functions for this, let’s use them everywhere.	2024-05-31 17:17:42 +02:00
Oneric	70cd5f91d8	dbprune/activities: prune array activities first This query is less costly; if something goes wrong or gets aborted later at least this part will already be done.	2024-05-31 17:16:40 +02:00
Oneric	aeaebb566c	dbprune: allow splitting array and single activity prunes The former is typically just a few reports; it doesn't make sense to rerun it over and over again in batched prunes or if a full prune OOMed.	2024-05-31 17:16:40 +02:00
Oneric	5751637926	dbprune: use query!	2024-05-31 17:16:40 +02:00
Oneric	24bab63cd8	dbprune: add more logs Pruning can go on for a long time; give admins some insight that something is happening to make it less frustrating and to make it easier to tell which part of the process is stalled should this happen. Again most of the changes are merely reindents; review with whitespace changes hidden recommended.	2024-05-31 17:16:40 +02:00
Oneric	1d4c212441	dbprune: shortcut array activity search This brought down query costs from 7,953,740.90 to 47,600.97	2024-05-31 17:16:40 +02:00

9387 commits