Add standalone prune_orphaned_activities CLI task

This part of pruning can be very expensive and bog down the whole
instance to an unusable state for a long time. It can thus be desirable
to split it from prune_objects and run it on its own in smaller, limited batches.

If the batches are small enough and spaced out a bit, it may even be possible
to avoid any downtime. If not, the limit can still help to at least make the
downtime duration somewhat more predictable.
Oneric 2023-10-23 01:01:07 +02:00
parent 3126d15ffc
commit fa52093bac
3 changed files with 60 additions and 1 deletions

@ -106,6 +106,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
- Akkoma API is now documented
- ability to auto-approve follow requests from users you are already following
- The SimplePolicy MRF can now strip user backgrounds from selected remote hosts
- New standalone `prune_orphaned_activities` mix task with configurable batch limit
## Changed
- OTP builds are now built on erlang OTP26

@ -53,6 +53,28 @@ This will prune remote posts older than 90 days (configurable with [`config :ple
- `--prune-orphaned-activities` - Also prune orphaned activities afterwards. Activities are things like Like, Create, Announce, Flag (aka reports)... They can significantly help reduce the database size.
- `--vacuum` - Run `VACUUM FULL` after the objects are pruned. This should not be used on a regular basis, but is useful if your instance has been running for a long time before pruning.
## Prune orphaned activities from the database
This will prune activities which are no longer referenced by anything.
Such activities might be the result of running `prune_objects` without `--prune-orphaned-activities`.
The same notes and warnings apply as for `prune_objects`.
=== "OTP"
```sh
./bin/pleroma_ctl database prune_orphaned_activities [option ...]
```
=== "From Source"
```sh
mix pleroma.database prune_orphaned_activities [option ...]
```
### Options
- `--limit n` - Only delete up to `n` activities in each query making up this job, i.e. if this job runs two queries, at most `2n` activities will be deleted. Running this task repeatedly in limited batches can help maintain the instance's responsiveness while still freeing up some space.
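Running the task repeatedly in limited batches, as the `--limit` option suggests, could be scripted roughly as follows. This is only a minimal sketch and not part of the task itself: the `run_batches` helper, its arguments, and the batch count, pause and limit values in the usage comment are all hypothetical and would need tuning for a given instance.

```sh
#!/bin/sh
# Hypothetical helper (not shipped with the task): run a command a fixed
# number of times, pausing between runs so the instance can catch up.
run_batches() {
    batches="$1"   # how many times to run the command
    pause="$2"     # seconds to sleep between runs
    shift 2        # the remaining arguments are the command to run

    i=1
    while [ "$i" -le "$batches" ]; do
        echo "batch $i of $batches"
        "$@" || return 1            # stop early if a batch fails
        if [ "$i" -lt "$batches" ]; then
            sleep "$pause"
        fi
        i=$((i + 1))
    done
}

# Example for a source install (all numbers are arbitrary assumptions):
# 10 batches, 60 seconds apart, each deleting at most 2 * 1000 activities.
# run_batches 10 60 mix pleroma.database prune_orphaned_activities --limit 1000
```

For OTP installs the same helper could wrap `./bin/pleroma_ctl database prune_orphaned_activities --limit 1000` instead. Stopping at the first failed batch is a deliberate choice in this sketch, so that a problem with the database does not get retried blindly.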
## Create a conversation for all existing DMs
Can be safely re-run

@ -20,7 +20,14 @@ defmodule Mix.Tasks.Pleroma.Database do
@shortdoc "A collection of database related tasks"
@moduledoc File.read!("docs/docs/administration/CLI_tasks/database.md")
def prune_orphaned_activities() do
def prune_orphaned_activities(limit \\ 0) when is_number(limit) do
limit_arg =
if limit > 0 do
"LIMIT #{limit}"
else
""
end
# Prune activities who link to a single object
"""
delete from public.activities
@ -34,6 +41,7 @@ def prune_orphaned_activities() do
and o.id is null
and a2.id is null
and u.id is null
#{limit_arg}
)
"""
|> Repo.query([], timeout: :infinity)
@ -51,6 +59,7 @@ def prune_orphaned_activities() do
having max(o.data ->> 'id') is null
and max(a2.data ->> 'id') is null
and max(u.ap_id) is null
#{limit_arg}
)
"""
|> Repo.query([], timeout: :infinity)
@ -98,6 +107,33 @@ def run(["update_users_following_followers_counts"]) do
)
end
def run(["prune_orphaned_activities" | args]) do
{options, [], []} =
OptionParser.parse(
args,
strict: [
limit: :integer
]
)
start_pleroma()
limit = Keyword.get(options, :limit, 0)
log_message = "Pruning orphaned activities"
log_message =
if limit > 0 do
log_message <> ", limiting deletion to #{limit} rows"
else
log_message
end
Logger.info(log_message)
prune_orphaned_activities(limit)
end
def run(["prune_objects" | args]) do
{options, [], []} =
OptionParser.parse(