Compare commits

...

2 commits

Author SHA1 Message Date
f66135ed08 Merge pull request 'Avoid accumulation of stale data in websockets' (#806) from Oneric/akkoma:websocket_fullsweep into develop
Some checks are pending
ci/woodpecker/push/build-amd64 Pipeline is pending
ci/woodpecker/push/build-arm64 Pipeline is pending
ci/woodpecker/push/docs Pipeline is pending
ci/woodpecker/push/lint Pipeline is pending
ci/woodpecker/push/test Pipeline is pending
Reviewed-on: #806
Reviewed-by: floatingghost <hannah@coffee-and-dreams.uk>
2024-06-23 02:19:36 +00:00
13e2a811ec Avoid accumulation of stale data in websockets
Some checks are pending
ci/woodpecker/pr/build-amd64 Pipeline is pending
ci/woodpecker/pr/build-arm64 Pipeline is pending
ci/woodpecker/pr/docs Pipeline is pending
ci/woodpecker/pr/lint Pipeline is pending
ci/woodpecker/pr/test Pipeline is pending
We’ve received reports of some specific instances slowly accumulating
more and more binary data over time up to OOMs and globally setting
ERL_FULLSWEEP_AFTER=0 has proven to be an effective countermeasure.
However, this incurs increased cpu perf costs everywhere and is
thus not suitable to apply out of the box.

Apparently long-lived Phoenix websocket processes are known to
often cause exactly this by getting into a state unfavourable
for the garbage collector.
Therefore it seems likely affected instances are using timeline
streaming and do so in just the right way to trigger this. We
can tune the garbage collector just for websocket processes
and use a more lenient value of 20 to keep the added perf cost
in check.

Testing on one affected instance appears to confirm this theory

Ref.:
  https://www.erlang.org/doc/man/erlang#ghlink-process_flag-2-idp226
  https://blog.guzman.codes/using-phoenix-channels-high-memory-usage-save-money-with-erlfullsweepafter
  https://git.pleroma.social/pleroma/pleroma/-/merge_requests/4060

Tested-by: bjo
2024-06-22 22:22:33 +02:00

View file

@ -18,6 +18,8 @@ defmodule Pleroma.Web.MastodonAPI.WebsocketHandler do
@timeout :timer.seconds(60) @timeout :timer.seconds(60)
# Hibernate every X messages # Hibernate every X messages
@hibernate_every 100 @hibernate_every 100
# Tune garabge collect for long-lived websocket process
@fullsweep_after 20
def init(%{qs: qs} = req, state) do def init(%{qs: qs} = req, state) do
with params <- Enum.into(:cow_qs.parse_qs(qs), %{}), with params <- Enum.into(:cow_qs.parse_qs(qs), %{}),
@ -59,6 +61,10 @@ def websocket_init(state) do
"#{__MODULE__} accepted websocket connection for user #{(state.user || %{id: "anonymous"}).id}, topic #{state.topic}" "#{__MODULE__} accepted websocket connection for user #{(state.user || %{id: "anonymous"}).id}, topic #{state.topic}"
) )
# process is long-lived and can sometimes accumulate stale data in such a way it's
# not freed by young garbage cycles, thus make full collection sweeps more frequent
:erlang.process_flag(:fullsweep_after, @fullsweep_after)
Streamer.add_socket(state.topic, state.oauth_token) Streamer.add_socket(state.topic, state.oauth_token)
{:ok, %{state | timer: timer()}} {:ok, %{state | timer: timer()}}
end end