Add limit CLI flags to prune jobs #655
Reference: AkkomaGang/akkoma#655
The prune tasks can incur heavy database load and take a long time, grinding the instance to a halt for the entire duration.
To still free up some space while lessening, or ideally avoiding, downtime, limits on the prune queries (especially the one for orphaned activities) have proven useful in recent tests. With these patches, after the initial `prune_objects` (without `--prune-orphaned-activities`), a script that repeatedly runs the limited prune could be used to free up space while keeping the instance reasonably responsive (any parameters need adjusting for the specific instance).

Resolves #653; cc @norm
Best reviewed commit by commit. As noted in the commit messages, many of the diff lines are just indentation adjustments, so for review it's probably a good idea to hide whitespace-only changes.
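The batched approach described above could be sketched roughly as follows. This is a minimal illustration, not the script from the PR: the function name `prune_in_batches`, the `PRUNE_CMD` variable and all parameter values are hypothetical, and the invocation `mix pleroma.database prune_orphaned_activities` is an assumption about how the new task is called on a source install (OTP releases would need a different entry point). It stops on a time budget or a run cap only, since the task's output format is not assumed here.

```shell
#!/bin/sh
# Sketch: run the standalone orphan prune in small batches until a single
# run becomes too slow or a maximum number of runs is reached.
# ASSUMPTION: the task is invoked like this; override PRUNE_CMD otherwise.
PRUNE_CMD="${PRUNE_CMD:-mix pleroma.database prune_orphaned_activities}"

prune_in_batches() {
    limit="$1"        # rows per DELETE query, passed as --limit
    max_seconds="$2"  # stop once a single run takes longer than this
    max_runs="$3"     # hard cap on the number of runs
    pause="$4"        # seconds to sleep between runs

    run=0
    while [ "$run" -lt "$max_runs" ]; do
        start=$(date +%s)
        # PRUNE_CMD is deliberately unquoted so it splits into command + args.
        $PRUNE_CMD --limit "$limit" || return 1
        elapsed=$(( $(date +%s) - start ))
        run=$(( run + 1 ))
        echo "run $run took ${elapsed}s"
        if [ "$elapsed" -gt "$max_seconds" ]; then
            echo "run exceeded ${max_seconds}s, stopping"
            return 0
        fi
        sleep "$pause"
    done
}
```

For example, `prune_in_batches 1000 30 50 10` would run up to 50 batches with `--limit 1000`, pause 10 seconds between them, and stop early once a batch takes more than 30 seconds. The numbers are placeholders and need tuning per instance.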
@@ -56,0 +79,4 @@
### Options
- `--limit n` - Only delete up to `n` activities in each query. Running this task in limited batches can help maintain the instance’s responsiveness while still freeing up some space.
I'm a little confused about whether there's a difference in behavior between this and prune_objects.
"in each query" I would understand as limiting the database lock by having smaller, limited delete operations.
For prune_objects it says "limits how many remote objects get pruned initially". What does initially mean here?
The task executes multiple DELETE queries on the database, and each of these queries has the given limit applied. Currently it executes two queries, so running the task once with `--limit 100` will delete at most 200 rows.

It would be possible to limit the overall number of deleted rows to at most the given amount, but this gives preferential treatment to the first queries, and since the purpose is just to limit the load and allow breaks in between, I figured this is not needed. But if there’s a reason to, this could be changed.
`prune_objects` first deletes remote posts, then (optionally, if such flags were passed) it runs more cleanup jobs. Only the initial prune is affected by the limit; the cleanup is not. The reason is that, except for `prune_orphaned_activities`, those cleanup jobs are comparatively cheap anyway.

And `prune_orphaned_activities` now has its own task. So if you want to clean up some space while not continuously hogging the db, you can first (repeatedly) run `prune_objects --limit n` without `--prune-orphaned-activities`, but with all other desired cleanups in the last run. Then afterwards, repeatedly run the standalone `prune_orphaned_activities --limit n` as long as a single run finishes fast enough.

I pushed a new rebased version with tweaked documentation (and a typo in a commit message was fixed). Can you take a look if it’s clearer now?
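The ordering described in this comment could be sketched as below. It is only an illustration under stated assumptions: `staged_prune`, `MIX_DB` and the round counts are hypothetical names and values, `mix pleroma.database` is assumed as the entry point, and `final_flags` stands for whatever other cleanup flags are wanted on the last `prune_objects` run (it should not contain `--prune-orphaned-activities`, which is handled by the standalone task afterwards).

```shell
#!/bin/sh
# Sketch of the two-phase order: limited prune_objects runs first (extra
# cleanup flags only on the last one), then limited standalone orphan prunes.
# ASSUMPTION: both tasks live under this entry point; override MIX_DB otherwise.
MIX_DB="${MIX_DB:-mix pleroma.database}"

staged_prune() {
    limit="$1"          # value passed to --limit in both phases
    object_rounds="$2"  # number of limited prune_objects runs
    orphan_rounds="$3"  # number of limited prune_orphaned_activities runs
    final_flags="$4"    # other cleanup flags, applied to the last prune_objects run only

    i=1
    while [ "$i" -le "$object_rounds" ]; do
        if [ "$i" -eq "$object_rounds" ]; then
            # final_flags is deliberately unquoted so several flags can be passed.
            $MIX_DB prune_objects --limit "$limit" $final_flags || return 1
        else
            $MIX_DB prune_objects --limit "$limit" || return 1
        fi
        i=$(( i + 1 ))
    done

    i=1
    while [ "$i" -le "$orphan_rounds" ]; do
        $MIX_DB prune_orphaned_activities --limit "$limit" || return 1
        i=$(( i + 1 ))
    done
}
```

Fixed round counts keep the sketch short; in practice the second phase would more likely be bounded by how long a single run takes, as the comment suggests.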
I see what you mean, and the docs updates are clearer, thanks! The steps you describe are how I was running it: I did a few `prune_objects` runs and then a few `prune_orphaned_activities` runs.

This seems to be working for me! Usually pruning makes the RAM fill up on my small VPS and the instance crashes, but this is running well.
80ba73839c to 3bc63afbe0
3bc63afbe0 to 732bc96493
732bc96493 to 800acfa81d
Rebased this with two updates:
- `IO.puts` is no longer needed and has been dropped. This change also slightly confused the script from the comments; I updated it to work with the new output and made it a bit more robust wrt ordering.
- `prune_orphaned_activities` is now used in one of the orphan-pruning tests. Since both modes use the same function and the only difference is the argument parser, I figured it wasn’t worth duplicating the test setup and instead switched one of the two orphan tests to the standalone task.

Also, just because, there’s an alternative version of the script which tries to scale the batch size down between some max and min value instead of immediately ceasing the prune. It may be more convenient in some cases, though too low min values probably don’t make much sense (and as before, time and batch sizes need tweaking for real instances).
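A scaling variant of this kind might look like the sketch below. It is not the author's script: halving the batch size after a slow run is just one possible scaling rule, the names `next_limit`, `adaptive_prune` and `PRUNE_CMD` are hypothetical, and the `mix pleroma.database prune_orphaned_activities` invocation is an assumption.

```shell
#!/bin/sh
# Sketch: start at a maximum batch size and halve it whenever a run is too
# slow; stop once the batch size would fall below the minimum.
# ASSUMPTION: the task is invoked like this; override PRUNE_CMD otherwise.
PRUNE_CMD="${PRUNE_CMD:-mix pleroma.database prune_orphaned_activities}"

# Print the next batch size: halved after a slow run, unchanged otherwise.
next_limit() {
    current="$1"; elapsed="$2"; max_seconds="$3"
    if [ "$elapsed" -gt "$max_seconds" ]; then
        echo $(( current / 2 ))
    else
        echo "$current"
    fi
}

adaptive_prune() {
    limit="$1"        # starting (maximum) batch size
    min_limit="$2"    # stop once the batch size drops below this (use >= 1)
    max_seconds="$3"  # a run longer than this counts as slow
    max_runs="$4"     # hard cap on the number of runs
    pause="$5"        # seconds to sleep between runs

    run=0
    while [ "$run" -lt "$max_runs" ] && [ "$limit" -ge "$min_limit" ]; do
        start=$(date +%s)
        # PRUNE_CMD is deliberately unquoted so it splits into command + args.
        $PRUNE_CMD --limit "$limit" || return 1
        elapsed=$(( $(date +%s) - start ))
        run=$(( run + 1 ))
        echo "run $run: limit=$limit took ${elapsed}s"
        limit=$(next_limit "$limit" "$elapsed" "$max_seconds")
        sleep "$pause"
    done
}
```

A call such as `adaptive_prune 5000 250 30 100 10` would begin with `--limit 5000` and keep halving after every run that exceeds 30 seconds, ending once the next batch would be smaller than 250 or after 100 runs. All values are placeholders.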
afa01cb8dd to 790b552030