[bug] Active processes get terminated by nimble_pool idle shutdown #880

Open
opened 2025-03-14 15:00:56 +00:00 by Oneric · 0 comments
Member

Your setup

From source

Extra details

Alpine 3.21

Version

current master (74182abb5b4b1186e4c68c426ad6ee680ebc804d)

PostgreSQL version

17

What’s up?

I’ve recently seen /inbox requests failing for the first time with several errors like the following being logged:

2025-03-14 02:15:21.503 request_id=GCyJqoyectBATr8AAB8x [error] Internal server error: {{:shutdown, :idle_timeout}, {NimblePool, :checkout, [#PID<0.3967.0>]}}

2025-03-14 02:15:21.508 [error] #PID<0.3976.0> running Pleroma.Web.Endpoint (connection #PID<0.3975.0>, stream id 1) terminated
Server: mydomain.example:80 (http)
Request: POST /inbox
** (exit) exited in: NimblePool.checkout(#PID<0.3967.0>)
    ** (EXIT) shutdown: :idle_timeout

This appears to be caused by a bug in finch (https://github.com/sneako/finch/issues/291) that terminates the whole pool when only some of its connections are idle. We use finch via Tesla for outgoing HTTP requests; for /inbox requests, it was probably in the middle of fetching the remote public key for signature verification when the pool got terminated.

The weird thing is that neither nimble_pool nor finch was recently upgraded; perhaps it’s just a coincidence that it happened now and not before.

There’s an open PR in finch that is supposed to fix this, but it remains unmerged so far for lack of confirmation that it actually resolves the problem: https://github.com/sneako/finch/issues/292

If you see errors like this with any frequency in your logs, it’d be great if you could try out the finch patch so we can provide this confirmation, or at least report back that it is not sufficient.
To use the patched version, replace {:finch, "~> 0.18.0"} in your mix.exs with the following, then run mix deps.get, recompile and restart.

{:finch,
  git: "https://github.com/oliveigah/finch.git",
  # this is upstream finch v0.19.0 with some additional upstream commits since and the proposed fix
  ref: "af60cc38c6d42fd154d80d9c6a5872d3683a1953",
  override: true
}
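
To judge how often the error shows up before and after switching to the patched finch, something like the following could be run with `elixir scan.exs`. This is only a rough sketch: the module name `IdleTimeoutScan`, the `AKKOMA_LOG` environment variable and the default log path are hypothetical, and it assumes the logs end up in a plain text file (not syslog or journald). It counts only the "Internal server error" line quoted above, so each failed request is counted once.

```elixir
defmodule IdleTimeoutScan do
  # Marker taken from the "Internal server error" line quoted in this issue.
  @marker "{:shutdown, :idle_timeout}"

  # Counts the lines in any enumerable of strings that contain the marker.
  def count(lines) do
    Enum.count(lines, &String.contains?(&1, @marker))
  end
end

# Assumption: logs are written to a plain file; this path is hypothetical.
log_path = System.get_env("AKKOMA_LOG", "/var/log/akkoma/akkoma.log")

if File.exists?(log_path) do
  hits = log_path |> File.stream!() |> IdleTimeoutScan.count()
  IO.puts("#{hits} pool idle_timeout errors in #{log_path}")
else
  IO.puts("no log file at #{log_path}; set AKKOMA_LOG to your log location")
end
```

Comparing the count over a similar time span before and after the switch should give a first hint whether the patch helps.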

Severity

I can manage

Have you searched for this issue?

  • I have double-checked and have not found this issue mentioned anywhere.
Oneric added the bug label 2025-03-14 15:00:56 +00:00
Reference: AkkomaGang/akkoma#880