wf_akkoma/patches/wip_17_stats-estimate-remote-user-count.patch
Oneric 0e55849a54 Refresh and extend WIP perf patches
This should hopefully fix valid Announces/Likes/etc being dropped
sometimes due to intermittent network errors while resolving the
referenced object.

Additionally this now includes ap_enabled purge, since it proved buggy
and the resulting logs are distracting while screening for issues with
the WIP patches.
2024-12-14 05:00:23 +01:00

From 28cb7dd970245e579ee7b79db95682af024febb6 Mon Sep 17 00:00:00 2001
From: Oneric <oneric@oneric.stub>
Date: Wed, 11 Dec 2024 03:03:14 +0100
Subject: [PATCH 17/22] stats: estimate remote user count
This value is currently only used by Prometheus metrics
but (after optimising the peer query in the preceding commit)
is the most costly part of instance stats.
---
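Note: a minimal sketch for sanity-checking the estimate by hand, assuming
the migration below has been applied and a psql session is connected to
the instance database. Only the function name, table and filter columns
come from this patch; running ANALYZE is a suggestion, not something the
patch itself does.

```sql
-- Planner-based estimate, via the function created by the migration below
SELECT estimate_remote_user_count();

-- Exact count for comparison (the expensive full table scan this patch avoids)
SELECT count(*)
FROM public.users
WHERE local = false
  AND is_active = true
  AND invisible = false
  AND nickname IS NOT NULL;

-- The estimate is derived from planner statistics; if the two numbers
-- diverge a lot, refreshing the statistics may bring them closer
ANALYZE public.users;
```

Since the value only feeds Prometheus metrics, a moderate gap between the
two results is presumably acceptable.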
lib/pleroma/stats.ex | 20 +++++-----
...00_remote_user_count_estimate_function.exs | 37 +++++++++++++++++++
2 files changed, 46 insertions(+), 11 deletions(-)
create mode 100644 priv/repo/migrations/20241211000000_remote_user_count_estimate_function.exs
diff --git a/lib/pleroma/stats.ex b/lib/pleroma/stats.ex
index 7bbe089f8..f33c378dd 100644
--- a/lib/pleroma/stats.ex
+++ b/lib/pleroma/stats.ex
@@ -79,24 +79,22 @@ def calculate_stat_data do
     status_count = Repo.aggregate(User.Query.build(%{local: true}), :sum, :note_count)
-    users_query =
+    # there are few enough local users for postgres to use an index scan
+    # (also here an exact count is a bit more important)
+    user_count =
       from(u in User,
         where: u.is_active == true,
         where: u.local == true,
         where: not is_nil(u.nickname),
         where: not u.invisible
       )
+      |> Repo.aggregate(:count, :id)
-    remote_users_query =
-      from(u in User,
-        where: u.is_active == true,
-        where: u.local == false,
-        where: not is_nil(u.nickname),
-        where: not u.invisible
-      )
-
-    user_count = Repo.aggregate(users_query, :count, :id)
-    remote_user_count = Repo.aggregate(remote_users_query, :count, :id)
+    # but remote users are far more numerous, leading to a full table scan
+    # (ecto currently doesn't allow building queries without explicit table)
+    %{rows: [[remote_user_count]]} =
+      "SELECT estimate_remote_user_count();"
+      |> Pleroma.Repo.query!()
     %{
       peers: peers,
diff --git a/priv/repo/migrations/20241211000000_remote_user_count_estimate_function.exs b/priv/repo/migrations/20241211000000_remote_user_count_estimate_function.exs
new file mode 100644
index 000000000..e67da1058
--- /dev/null
+++ b/priv/repo/migrations/20241211000000_remote_user_count_estimate_function.exs
@@ -0,0 +1,37 @@
+defmodule Pleroma.Repo.Migrations.RemoteUserCountEstimateFunction do
+  use Ecto.Migration
+
+  @function_name "estimate_remote_user_count"
+
+  # real time and cost estimate:
+  #   count(*) query:   47.777ms   14053.98
+  #   explain estimate:  0.980ms       0.26
+  def up() do
+    # yep, this EXPLAIN (ab)use is blessed by the PostgreSQL wiki:
+    #   https://wiki.postgresql.org/wiki/Count_estimate
+    """
+    CREATE OR REPLACE FUNCTION #{@function_name}()
+    RETURNS integer
+    LANGUAGE plpgsql AS $$
+    DECLARE plan jsonb;
+    BEGIN
+      EXECUTE '
+        EXPLAIN (FORMAT JSON)
+        SELECT *
+        FROM public.users
+        WHERE local = false AND
+              is_active = true AND
+              invisible = false AND
+              nickname IS NOT NULL;
+      ' INTO plan;
+      RETURN plan->0->'Plan'->'Plan Rows';
+    END;
+    $$;
+    """
+    |> execute()
+  end
+
+  def down() do
+    execute("DROP FUNCTION IF EXISTS #{@function_name}()")
+  end
+end
--
2.39.5