Massive Performance impact when delivery Queue starts working #249

Closed
opened 2022-11-24 10:43:00 +00:00 by puniko · 6 comments
Contributor

Every time a bunch of jobs have piled up as delayed and the delivery queue starts rerunning them, the frontend gets very unresponsive, to the point where sending a post takes minutes.

i upped the worker count, i lowered the worker count, but regardless of what i do, it keeps getting unresponsive. even now, where foundkey only manages to use 40 - 50 % of cpu for the worker stuff, it still locks up despite there being enough resources available to serve the frontend normally.

memory, disk io, network and stuff all seem fine and there is plenty for foundkey to use, but it seems to refuse using it.

Sooooo, i'm out of ideas. idk how to fix this, idk how to work around this, let alone why this happens.

Johann150 added a new dependency 2022-11-25 12:03:17 +00:00
Owner

I separated web and queue workers in #252 but I can't check if it really helps. If this doesn't do the trick, the queue workers could be re-niced even more from `PRIORITY_BELOW_NORMAL` to `PRIORITY_LOW`.
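
For readers unfamiliar with re-nicing, here is a rough sketch of what giving queue workers a lower scheduling priority could look like in Node.js. It is not the code from #252. The role names `web` and `queue`, the helpers `priorityFor` and `applyPriority`, and the `demoteQueueFurther` flag are all made up for illustration. Only `os.setPriority` and the `os.constants.priority` values are real Node.js APIs.

```typescript
import * as os from "node:os";

// Hypothetical role names: the thread only says that web and queue
// workers were separated, not how they are labelled internally.
type WorkerRole = "web" | "queue";

// Choose a scheduling priority for a worker role.
// Web workers stay at normal priority. Queue workers run below normal,
// or at the lowest priority when demoteQueueFurther is set
// (an assumed knob, not a FoundKey setting).
function priorityFor(role: WorkerRole, demoteQueueFurther = false): number {
  const p = os.constants.priority;
  if (role === "web") return p.PRIORITY_NORMAL;
  return demoteQueueFurther ? p.PRIORITY_LOW : p.PRIORITY_BELOW_NORMAL;
}

// Apply the chosen priority to the current process.
// A pid of 0 means "this process" for os.setPriority.
function applyPriority(role: WorkerRole, demoteQueueFurther = false): void {
  try {
    os.setPriority(0, priorityFor(role, demoteQueueFurther));
  } catch (err) {
    // Changing priority can fail on some platforms or sandboxes;
    // the worker should keep running at its current priority.
    console.warn(`could not set priority for ${role} worker:`, err);
  }
}
```

A queue worker process would call `applyPriority("queue")` once at startup. Since priority only matters when processes compete for CPU, this may not help if the slowdown comes from somewhere else (for example the database).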

Author
Contributor

Thanks, will merge it in tomorrow or maybe on Sunday and let you know.

Author
Contributor

Just for completeness: merged it yesterday, seems to run well so far. will keep it for a week without the charts stuff, then enable charts again to see how it runs with them.

Author
Contributor

can confirm that the split works well on my instance. no lockdowns of the frontend anymore, even when the queue is doing its thing.
i still have to disable charts tho, but that's another issue. at least it also doesn't block web when gathering stuff for the charts (but it does stop the queue while it does so)

Owner

Can you make a new issue for charts?
In the meantime, closing this one since we got there 🎉

Owner

@toast #252 is not merged yet because there is an outstanding review comment by you. After that is merged this can be closed.

Chart problems are already tracked in #237 and #253.

Reference: FoundKeyGang/FoundKey#249