akkoma

Author	SHA1	Message	Date
Oneric	bed7ff8e89	mix: consistently use shell_info and shell_error Logger output being visible depends on user configuration, but most of the prints in mix tasks should always be shown. When running inside a mix shell, it’s probably preferable to send output directly to it rather than using raw IO.puts and we already have shell_* functions for this, let’s use them everywhere.	2024-05-31 17:17:42 +02:00
Oneric	70cd5f91d8	dbprune/activities: prune array activities first This query is less costly; if something goes wrong or gets aborted later at least this part will already be done.	2024-05-31 17:16:40 +02:00
Oneric	aeaebb566c	dbprune: allow splitting array and single activity prunes The former is typically just a few reports; it doesn't make sense to rerun it over and over again in batched prunes or if a full prune OOMed.	2024-05-31 17:16:40 +02:00
Oneric	5751637926	dbprune: use query!	2024-05-31 17:16:40 +02:00
Oneric	24bab63cd8	dbprune: add more logs Pruning can go on for a long time; give admins some insight that something is happening to make it less frustrating and to make it easier to tell which part of the process is stalled should this happen. Again most of the changes are merely reindents; review with whitespace changes hidden recommended.	2024-05-31 17:16:40 +02:00
Oneric	1d4c212441	dbprune: shortcut array activity search This brought down query costs from 7,953,740.90 to 47,600.97	2024-05-31 17:16:40 +02:00
Oneric	225f87ad62	Also allow limiting the initial prune_object May sometimes be helpful to get more predictable runtime than just with an age-based limit. The subquery for the non-keep-threads path is required since delete_all does not directly accept limit(). Again most of the diff is just adjusting indentation, best hide whitespace-only changes with git diff -w or similar.	2024-05-31 17:16:40 +02:00
Oneric	e64f031167	Log number of deleted rows in prune_orphaned_activities This gives feedback when to stop rerunning limited batches. Most of the diff is just adjusting indentation; best reviewed with whitespace-only changes hidden, e.g. `git diff -w`.	2024-05-31 17:16:40 +02:00
Oneric	fa52093bac	Add standalone prune_orphaned_activities CLI task This part of pruning can be very expensive and bog down the whole instance to an unusable state for a long time. It can thus be desirable to split it from prune_objects and run it on its own in smaller limited batches. If the batches are small enough and spaced out a bit, it may even be possible to avoid any downtime. If not, the limit can still help to at least make the downtime duration somewhat more predictable.	2024-05-31 17:16:40 +02:00
Oneric	3126d15ffc	refactor: move prune_orphaned_activities into own function No logic changes. Preparation for standalone orphan pruning.	2024-05-31 17:16:39 +02:00
floatingghost	8f97c15b07	Merge pull request 'Preserve Meilisearch’s result ranking' (#772) from Oneric/akkoma:search-meili-order into develop Reviewed-on: AkkomaGang/akkoma#772	2024-05-31 14:12:05 +00:00
Floatingghost	3af0c53a86	use proper workers for fetching pins instead of an ad-hoc task (#788) Reviewed-on: AkkomaGang/akkoma#788 Co-authored-by: Floatingghost <hannah@coffee-and-dreams.uk> Co-committed-by: Floatingghost <hannah@coffee-and-dreams.uk>	2024-05-31 08:58:52 +00:00
Oneric	59685e25d2	meilisearch: show keys by name not description This makes show-key’s output match our documentation as of Meilisearch 1.8.0-8-g4d5971f343c00d45c11ef0cfb6f61e83a8508208. Since I’m not sure if older versions maybe only provided description, it will fallback to the latter if no name parameter exists.	2024-05-29 23:17:27 +00:00
Oneric	a95af3ee4c	exiftool: strip all non-essential tags Documentation was already clear on this only stripping GPS tags. But there are more potentially sensitive metadata tags (e.g. author and possibly description) and the name alone suggests a broader effect. Thus change the filter to strip all metadata except for colourspace info and orientation (technically it strips everything and then readds selected tags). Explicitly stripping CommonIFD0 is needed since -all does not modify IFD0 due to TIFF storing some actual image data there. CommonIFD0 then strips a bunch of commonly used actual metadata tags from IFD0, to my understanding leaving TIFF image data and custom metadata tags intact.	2024-04-25 23:00:42 +02:00
timorl	09d3ccf770	Read description before stripping metadata	2024-04-19 20:51:54 +02:00
timorl	cd7af81896	Rename StripLocation to StripMetadata for temporal-proofing reasons	2024-04-16 20:37:00 +02:00
timorl	b144218dce	Merge branch 'develop' into elseinspe	2024-04-14 20:31:33 +02:00
FloatingGhost	2d439034ca	Ensure that spoof-inserted does not time out	2024-03-30 12:55:22 +00:00
Oneric	0648d9ebaa	Add mix tasks to detect spoofed posts and users At least as far as we can	2024-03-26 16:05:20 -01:00
Oneric	d441101200	Add mix task to detect uploaded spoof payloads	2024-03-26 16:05:20 -01:00
Oneric	0ec62acb9d	Always insert Dedupe upload filter This actually was already intended before to eradicate all future path-traversal-style exploits and to fix issues with some characters like akkoma#610 in `0b2ec0ccee`. However, Dedupe and AnonymizeFilename got mixed up. The latter only anonymises the name in Content-Disposition headers GET parameters (with link_name), _not_ the upload path. Even without Dedupe, the upload path is prefixed by an UUID, so it _should_ already be hard to guess for attackers. But now we actually can be sure no path shenanigans occur, uploads reliably work and save some disk space. While this makes the final path predictable, this prediction is not exploitable. Insertion of a back-reference to the upload itself requires pulling off a successful preimage attack against SHA-256, which is deemed infeasible for the foreseeable future. Dedupe was already included in the default list in config.exs since `28cfb2c37a`, but this will get overridden by whatever the config generated by the "pleroma.instance gen" task chose. Upload+delete tests running in parallel using Dedupe might be flaky, but this was already true before and needs its own commit to fix eventually.	2024-03-18 22:33:10 -01:00
Oneric	fef773ca35	Drop media base_url default and recommend different domain Same-domain setups enabled now at least two exploits, so they ought to be discouraged and definitely not be the default.	2024-03-18 22:33:10 -01:00
FloatingGhost	6cb40bee26	Migrate to phoenix 1.7 (#626) Closes #612 Co-authored-by: tusooa <tusooa@kazv.moe> Reviewed-on: AkkomaGang/akkoma#626 Co-authored-by: FloatingGhost <hannah@coffee-and-dreams.uk> Co-committed-by: FloatingGhost <hannah@coffee-and-dreams.uk>	2023-08-15 10:22:18 +00:00
floatingghost	0b32beb051	Merge pull request 'meilisearch: Move published date to lower priority' (#623) from norm/akkoma:meilisearch-order into develop Reviewed-on: AkkomaGang/akkoma#623	2023-08-12 14:36:53 +00:00
floatingghost	7bb41bffb3	Merge pull request 'Reload emoji when using mix pleroma.emoji gen-pack and get-packs' (#563) from norm/akkoma:emoji-reload into develop Reviewed-on: AkkomaGang/akkoma#563	2023-08-12 14:07:23 +00:00
Norm	d79c92f9c6	meilisearch: Move published date to lower priority Currently, Akkoma sorts by published date first before everything else. This however makes search results pretty bad since Meilisearch uses a bucket sort algorithm in order of the ranking rules specified: https://www.meilisearch.com/docs/learn/core_concepts/relevancy#behavior Since the `published` attribute is a unix timestamp, the resulting buckets are pretty small so the other rules essentially have little to no effect on the rankings of search results. This fixes that issue by moving the `published:desc` rule further down so it still sorts by date, but only after considering everything else. AFAIK attribute and sort doesn't really affect results for Akkoma since the only attribute considered is the `content` attribute and the `sort` parameter isn't used in Akkoma searches. Everything else is made to match more closely to Meilisearch's defaults.	2023-08-11 11:07:14 -04:00
Haelwenn (lanodan) Monnier	4f57c87be4	instance gen: Reduce permissions of pleroma directories and config files Original: `69caedc591`	2023-08-04 14:13:50 -04:00
FloatingGhost	98cb255d12	Support elixir1.15 OTP builds to 1.15 Changelog entry Ensure policies are fully loaded Fix :warn use main branch for linkify Fix warn in tests Migrations for phoenix 1.17 Revert "Migrations for phoenix 1.17" This reverts commit 6a3b2f15b74ea5e33150529385215b7a531f3999. Oban upgrade Add default empty whitelist mix format limit test to amd64 OTP 26 tests for 1.15 use OTP_VERSION tag baka just 1.15 Massive deps update Update locale, deps Mix format shell???? multiline??? ? max cases 1 use assert_receive don't put_env in async tests don't async conn/fs tests mix format Fix some uploader issues Fix tests	2023-08-03 17:44:09 +01:00
Norm	b99053d2c2	Reload emoji when using mix pleroma.emoji gen-pack and get-packs I think it makes more sense that the emoji cache gets reloaded in Akkoma if you add or create emoji packs.	2023-06-04 02:43:18 +00:00
floatingghost	6225f24f5f	Merge pull request 'Clean up bookmarks after prune_objects' (#544) from ilja/akkoma:clean_up_bookmarks_after_prune_objects into develop Reviewed-on: AkkomaGang/akkoma#544	2023-05-22 21:28:48 +00:00
ilja	f49e9e6d4c	Clean up bookmarks after prune_objects When doing prune_objects, it's possible that bookmarked objects are deleted. This gave problems when fetching the bookmark TL. Here we clean up the bookmarks during pruning in the case where it's possible that bookmarked objects are deleted.	2023-05-21 13:02:28 +02:00
FloatingGhost	522221f7fb	Mix format	2023-04-14 17:56:34 +01:00
FloatingGhost	2a8c1f4192	Add extra diagnostic tasks in	2023-03-29 14:11:00 +01:00
ilja	57eef6d764	prune_objects can prune orphaned activities who reference an array of objects E.g. Flag activities have an array of objects We prune the activity when NONE of the objects can be found Note that the cost of finding and deleting these is ~4x higher than finding and deleting the non-array ones Only string: Delete on activities (cost=506573.48..506580.38 rows=0 width=0) Only Array: Delete on activities (cost=3570359.68..4276365.34 rows=0 width=0) (They are still executed separately, so the total cost is the sum of the two)	2023-02-26 14:41:50 +01:00
ilja	a7ec6e039c	prune_objects can prune orphaned activities We add an option to also prune remote activities whose referenced objects no longer exist. Right now, we only check activities that reference a single object, not an array or embedded object.	2023-02-26 14:41:50 +01:00
ilja	7695010268	Prune Objects --keep-threads option (#350) This adds an option to the prune_objects mix task. The original way deleted all non-local public posts older than a certain time frame. Here we add a different query which you can call using the option --keep-threads. We query from the activities table all context id's where 1. the newest activity with this context is still old 2. none of the activities with this context is local 3. none of the activities with this context is bookmarked and delete all objects with these contexts. The idea is that posts with local activities (posts, replies, likes, repeats...) may be interesting to keep. Besides that, a post lives in a certain context (the thread), so we keep the whole thread as well. Caveats: * ~~Quotes have a different context. Therefore, when someone quotes a post, it's possible the quoted post will still be deleted.~~ fixed in AkkomaGang/akkoma#379 * Although undocumented (in docs/docs/administration/CLI_tasks/database.md/#prune-old-remote-posts-from-the-database), the 'normal' delete action still kept old remote non-public posts. I added an option to keep this behaviour, but this also means that you now have to explicitly provide that option. This could be considered a breaking change! * ~~Note that this removes from the objects table, but not from the activities.~~ See AkkomaGang/akkoma#427 for that. Some statistics from explain analyse: (cost=1402845.92..1933782.00 rows=3810907 width=62) (actual time=2562455.486..2562455.495 rows=0 loops=1) Planning Time: 505.327 ms Trigger for constraint chat_message_references_object_id_fkey: time=651939.797 calls=921740 Trigger for constraint deliveries_object_id_fkey: time=52036.009 calls=921740 Trigger for constraint hashtags_objects_object_id_fkey: time=20665.778 calls=921740 Execution Time: 3287933.902 ms * TODO 1. [x] Question: Is it OK to keep it like this in regard to quote posts? If not (ie post quoted by local users should also be kept), should we give quotes the same context as the post they are quoting? (If we don't want to give them the same context, I'll have to see how/if I can do it without being too costly) * See AkkomaGang/akkoma#379 2. [x] Question: the "original" query only deletes public posts (this is undocumented, but you can check the code). This new one doesn't care for scope. From the docs I get that the idea is that posts can be refetched when needed. But I have from a trusted source that Pleroma can't refetch non-public posts. I assume that's the reason why they are kept here. I see different options to deal with this 1. ~~We keep it as currently implemented and just don't care about scope with this option~~ 2. ~~We add logic to not delete non-public posts either (I'll have to see how costly that becomes)~~ 3. We add an extra --keep-non-public parameter. This is technically speaking breakage (you didn't have to provide a param before for this, now you do), but I'm inclined to not care much because it wasn't documented nor tested in the first place. 3. [x] See if we can do the query using Elixir 4. [x] Test on a bigger DB to see that we don't run into a timeout 5. [x] Add docs Co-authored-by: ilja <git@ilja.space> Reviewed-on: AkkomaGang/akkoma#350 Co-authored-by: ilja <akkoma.dev@ilja.space> Co-committed-by: ilja <akkoma.dev@ilja.space>	2023-01-09 22:15:41 +00:00
floatingghost	9be6caf125	argon2 password hashing (#406) Co-authored-by: FloatingGhost <hannah@coffee-and-dreams.uk> Reviewed-on: AkkomaGang/akkoma#406	2022-12-30 02:46:58 +00:00
FloatingGhost	5a405bdadf	document dump_to_file and load_from_file	2022-12-29 20:00:04 +00:00
FloatingGhost	d1bf8aa9ed	Add dump_to_file and load_from_file tasks	2022-12-29 19:56:35 +00:00
floatingghost	07a48b9293	giant massive dep upgrade and dialyxir-found error emporium (#371) Co-authored-by: FloatingGhost <hannah@coffee-and-dreams.uk> Reviewed-on: AkkomaGang/akkoma#371	2022-12-14 12:38:48 +00:00
FloatingGhost	e6da301296	Add diagnostics http	2022-12-11 22:57:18 +00:00
floatingghost	09326ffa56	Diagnostics tasks (#348) a bunch of ways to get query plans to help with debugging Co-authored-by: FloatingGhost <hannah@coffee-and-dreams.uk> Reviewed-on: AkkomaGang/akkoma#348	2022-12-07 11:12:34 +00:00
floatingghost	d55de5debf	Remerge of hashtag following (#341) this time with less idiot Co-authored-by: FloatingGhost <hannah@coffee-and-dreams.uk> Reviewed-on: AkkomaGang/akkoma#341	2022-12-05 12:58:48 +00:00
floatingghost	ec6bf8c3f7	revert `4a94c9a31e` revert Add ability to follow hashtags (#336) Co-authored-by: FloatingGhost <hannah@coffee-and-dreams.uk> Reviewed-on: AkkomaGang/akkoma#336	2022-12-04 20:04:09 +00:00
floatingghost	4a94c9a31e	Add ability to follow hashtags (#336) Co-authored-by: FloatingGhost <hannah@coffee-and-dreams.uk> Reviewed-on: AkkomaGang/akkoma#336	2022-12-04 17:36:59 +00:00
floatingghost	6b882a2c0b	Purge Rejected Follow requests in daily task (#334) Co-authored-by: FloatingGhost <hannah@coffee-and-dreams.uk> Reviewed-on: AkkomaGang/akkoma#334	2022-12-03 23:17:43 +00:00
floatingghost	db60640c5b	Fixing up deletes a bit (#327) Co-authored-by: FloatingGhost <hannah@coffee-and-dreams.uk> Reviewed-on: AkkomaGang/akkoma#327	2022-12-01 15:00:53 +00:00
floatingghost	e3085c495c	fix tests broken by relay defaults changing (#314) Co-authored-by: FloatingGhost <hannah@coffee-and-dreams.uk> Reviewed-on: AkkomaGang/akkoma#314	2022-11-26 20:45:47 +00:00
Ilja	338612d72b	Use EXIF data of image to prefill image description During attachment upload Pleroma returns a "description" field. * This MR allows Pleroma to read the EXIF data during upload and return the description to the FE using this field. * If a description is already present (e.g. because a previous module added it), it will use that * Otherwise it will read from the EXIF data. First it will check -ImageDescription, if that's empty, it will check -iptc:Caption-Abstract * If no description is found, it will simply return nil, which is the default value * When people set up a new instance, they will be asked if they want to read metadata and this module will be activated if so There was an Exiftool module, which has now been renamed to Exiftool.StripLocation	2022-10-23 14:46:16 +02:00
FloatingGhost	856c57208b	Ensure deletes are handled after everything else	2022-10-11 14:30:08 +01:00

713 commits