Compare commits

...

10 Commits

Author SHA1 Message Date
Norm 841b3d88e4 Merge Oneric's prune-batch patches 2024-02-10 00:00:02 +00:00
floatingghost e97d08ee98 Merge pull request 'MRF transparency: don’t forget to obfuscate short domains' (#676) from Oneric/akkoma:mrf-obfuscation into develop
Reviewed-on: AkkomaGang/akkoma#676
2024-02-05 08:43:43 +00:00
Oneric 3cd882528e More prominently document MRF transparency and obfuscation
And point to the cheat sheet for all other MRF policies
and their configuration details.
2024-02-02 14:50:21 +00:00
Oneric e47c50666d Fix obfuscation of short domains
Fixes AkkomaGang/akkoma#645
2024-02-02 14:50:13 +00:00
floatingghost b4ccddab39 Merge pull request 'Fix OAuth consumer mode' (#668) from tcmal/akkoma:develop into develop
Reviewed-on: AkkomaGang/akkoma#668
2024-02-02 10:05:42 +00:00
Oneric 732bc96493 Also allow limiting the initial prune_object
May sometimes be helpful to get more predictable runtime
than just with an age-based limit.

The subquery for the non-keep-threads path is required
since delte_all does not directly accept limit().

Again most of the diff is just adjusting indentation, best
hide whitespace-only changes with git diff -w or similar.
2024-01-31 17:44:23 +01:00
Oneric 044ad034e6 Log number of deleted rows in prune_orphaned_activities
This gives feedback when to stop rerunning limited batches.

Since the Logger’s output does not show up on stdout, but we
want users to be able to easily see and process this in scripts
the information is additionally printed with IO.puts.

Most of the diff is just adjusting indentation; best reviewed
with whitespace-only changes hidden, e.g. `git diff -w`.
2024-01-31 17:43:37 +01:00
Oneric 5d269cb55a Add standalone prune_orphaned_activities CLI task
This part of pruning can be very expensive and bog down the whole
instance to an unusable sate for a long time. It can thus be desireable
to split it from prune_objects and run it on its own in smaller limited batches.

If the batches are smaller enough and spaced out a bit, it may even be possible
to avoid any downtime. If not, the limit can still help to at least make the
downtime duration somewhat more predictable.
2024-01-31 17:43:31 +01:00
Oneric c93b113624 refactor: move prune_orphaned_activities into own function
No logic changes. Preparation for standalone orphan pruning.
2023-12-25 00:16:51 +01:00
Aria 77000b8ffd update tests for oauth consumer 2023-12-17 21:48:19 +00:00
8 changed files with 250 additions and 77 deletions

View File

@ -9,6 +9,8 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
## Added
- Full compatibility with Erlang OTP26
- handling of GET /api/v1/preferences
- New standalone `prune_orphaned_activities` mix task with configurable batch limit
- The `prune_objects` mix task now accepts a `--limit` parameter for initial object pruning
## Changed
- OTP builds are now built on erlang OTP26
@ -19,6 +21,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
- Documentation issue in which a non-existing nginx file was referenced
- Issue where a bad inbox URL could break federation
- Issue where hashtag rel values would be scrubbed
- Issue where short domains listed in `transparency_obfuscate_domains` were not actually obfuscated
## 2023.08

View File

@ -50,9 +50,37 @@ This will prune remote posts older than 90 days (configurable with [`config :ple
- `--keep-threads` - Don't prune posts when they are part of a thread where at least one post has seen local interaction (e.g. one of the posts is a local post, or is favourited by a local user, or has been repeated by a local user...). It also wont delete posts when at least one of the posts in that thread is kept (e.g. because one of the posts has seen recent activity).
- `--keep-non-public` - Keep non-public posts like DM's and followers-only, even if they are remote.
- `--limit` - limits how many remote posts get pruned. This limit does **not** apply to any of the follow up jobs. If wanting to keep the database load in check it is thus advisable to run the standalone `prune_orphaned_activities` task with a limit afterwards instead of passing `--prune-orphaned-activities` to this task.
- `--prune-orphaned-activities` - Also prune orphaned activities afterwards. Activities are things like Like, Create, Announce, Flag (aka reports)... They can significantly help reduce the database size.
- `--vacuum` - Run `VACUUM FULL` after the objects are pruned. This should not be used on a regular basis, but is useful if your instance has been running for a long time before pruning.
## Prune orphaned activities from the database
This will prune activities which are no longer referenced by anything.
Such activities might be the result of running `prune_objects` without `--prune-orphaned-activities`.
The same notes and warnings apply as for `prune_objects`.
The task will print out how many rows were freed in total in its last
line of output in the form `Deleted 345 rows`.
When running the job in limited batches this can be used to determine
when all orphaned activities have been deleted.
=== "OTP"
```sh
./bin/pleroma_ctl database prune_orphaned_activities [option ...]
```
=== "From Source"
```sh
mix pleroma.database prune_orphaned_activities [option ...]
```
### Options
- `--limit n` - Only delete up to `n` activities in each query making up this job, i.e. if this job runs two queries at most `2n` activities will be deleted. Running this task repeatedly in limited batches can help maintain the instances responsiveness while still freeing up some space.
## Create a conversation for all existing DMs
Can be safely re-run

View File

@ -61,6 +61,32 @@ config :pleroma, :mrf_simple,
The effects of MRF policies can be very drastic. It is important to use this functionality carefully. Always try to talk to an admin before writing an MRF policy concerning their instance.
## Hiding or Obfuscating Policies
You can opt out of publicly displaying all MRF policies or only hide or obfuscate selected domains.
To just hide everything set:
```elixir
config :pleroma, :mrf,
...
transparency: false,
```
To hide or obfuscate only select entries, use:
```elixir
config :pleroma, :mrf,
...
transparency_obfuscate_domains: ["handholdi.ng", "badword.com"],
transparency_exclusions: [{"ghost.club", "even a fragment is too spoopy for humans"}]
```
## More MRF Policies
See the [documentation cheatsheet](cheatsheet.md)
for all available MRF policies and their options.
## Writing your own MRF Policy
As discussed above, the MRF system is a modular system that supports pluggable policies. This means that an admin may write a custom MRF policy in Elixir or any other language that runs on the Erlang VM, by specifying the module name in the `policies` config setting.

View File

@ -20,6 +20,63 @@ defmodule Mix.Tasks.Pleroma.Database do
@shortdoc "A collection of database related tasks"
@moduledoc File.read!("docs/docs/administration/CLI_tasks/database.md")
def maybe_limit(query, limit_cnt) do
if is_number(limit_cnt) and limit_cnt > 0 do
limit(query, [], ^limit_cnt)
else
query
end
end
def prune_orphaned_activities(limit \\ 0) when is_number(limit) do
limit_arg =
if limit > 0 do
"LIMIT #{limit}"
else
""
end
# Prune activities who link to a single object
{:ok, %{:num_rows => del_single}} =
"""
delete from public.activities
where id in (
select a.id from public.activities a
left join public.objects o on a.data ->> 'object' = o.data ->> 'id'
left join public.activities a2 on a.data ->> 'object' = a2.data ->> 'id'
left join public.users u on a.data ->> 'object' = u.ap_id
where not a.local
and jsonb_typeof(a."data" -> 'object') = 'string'
and o.id is null
and a2.id is null
and u.id is null
#{limit_arg}
)
"""
|> Repo.query([], timeout: :infinity)
# Prune activities who link to an array of objects
{:ok, %{:num_rows => del_array}} =
"""
delete from public.activities
where id in (
select a.id from public.activities a
join json_array_elements_text((a."data" -> 'object')::json) as j on jsonb_typeof(a."data" -> 'object') = 'array'
left join public.objects o on j.value = o.data ->> 'id'
left join public.activities a2 on j.value = a2.data ->> 'id'
left join public.users u on j.value = u.ap_id
group by a.id
having max(o.data ->> 'id') is null
and max(a2.data ->> 'id') is null
and max(u.ap_id) is null
#{limit_arg}
)
"""
|> Repo.query([], timeout: :infinity)
del_single + del_array
end
def run(["remove_embedded_objects" | args]) do
{options, [], []} =
OptionParser.parse(
@ -62,6 +119,36 @@ defmodule Mix.Tasks.Pleroma.Database do
)
end
def run(["prune_orphaned_activities" | args]) do
{options, [], []} =
OptionParser.parse(
args,
strict: [
limit: :integer
]
)
start_pleroma()
limit = Keyword.get(options, :limit, 0)
log_message = "Pruning orphaned activities"
log_message =
if limit > 0 do
log_message <> ", limiting deletion to #{limit} rows"
else
log_message
end
Logger.info(log_message)
deleted = prune_orphaned_activities(limit)
Logger.info("Deleted #{deleted} rows")
IO.puts("Deleted #{deleted} rows")
end
def run(["prune_objects" | args]) do
{options, [], []} =
OptionParser.parse(
@ -70,7 +157,8 @@ defmodule Mix.Tasks.Pleroma.Database do
vacuum: :boolean,
keep_threads: :boolean,
keep_non_public: :boolean,
prune_orphaned_activities: :boolean
prune_orphaned_activities: :boolean,
limit: :integer
]
)
@ -79,6 +167,8 @@ defmodule Mix.Tasks.Pleroma.Database do
deadline = Pleroma.Config.get([:instance, :remote_post_retention_days])
time_deadline = NaiveDateTime.utc_now() |> NaiveDateTime.add(-(deadline * 86_400))
limit_cnt = Keyword.get(options, :limit, 0)
log_message = "Pruning objects older than #{deadline} days"
log_message =
@ -110,6 +200,13 @@ defmodule Mix.Tasks.Pleroma.Database do
log_message
end
log_message =
if limit_cnt > 0 do
log_message <> ", limiting to #{limit_cnt} rows"
else
log_message
end
Logger.info(log_message)
if Keyword.get(options, :keep_threads) do
@ -143,31 +240,38 @@ defmodule Mix.Tasks.Pleroma.Database do
|> having([a], max(a.updated_at) < ^time_deadline)
|> having([a], not fragment("bool_or(?)", a.local))
|> having([_, b], fragment("max(?::text) is null", b.id))
|> maybe_limit(limit_cnt)
|> select([a], fragment("? ->> 'context'::text", a.data))
Pleroma.Object
|> where([o], fragment("? ->> 'context'::text", o.data) in subquery(deletable_context))
else
if Keyword.get(options, :keep_non_public) do
Pleroma.Object
deletable =
if Keyword.get(options, :keep_non_public) do
Pleroma.Object
|> where(
[o],
fragment(
"?->'to' \\? ? OR ?->'cc' \\? ?",
o.data,
^Pleroma.Constants.as_public(),
o.data,
^Pleroma.Constants.as_public()
)
)
else
Pleroma.Object
end
|> where([o], o.updated_at < ^time_deadline)
|> where(
[o],
fragment(
"?->'to' \\? ? OR ?->'cc' \\? ?",
o.data,
^Pleroma.Constants.as_public(),
o.data,
^Pleroma.Constants.as_public()
)
fragment("split_part(?->>'actor', '/', 3) != ?", o.data, ^Pleroma.Web.Endpoint.host())
)
else
Pleroma.Object
end
|> where([o], o.updated_at < ^time_deadline)
|> where(
[o],
fragment("split_part(?->>'actor', '/', 3) != ?", o.data, ^Pleroma.Web.Endpoint.host())
)
|> maybe_limit(limit_cnt)
|> select([o], o.id)
Pleroma.Object
|> where([o], o.id in subquery(deletable))
end
|> Repo.delete_all(timeout: :infinity)
@ -187,39 +291,7 @@ defmodule Mix.Tasks.Pleroma.Database do
end
if Keyword.get(options, :prune_orphaned_activities) do
# Prune activities who link to a single object
"""
delete from public.activities
where id in (
select a.id from public.activities a
left join public.objects o on a.data ->> 'object' = o.data ->> 'id'
left join public.activities a2 on a.data ->> 'object' = a2.data ->> 'id'
left join public.users u on a.data ->> 'object' = u.ap_id
where not a.local
and jsonb_typeof(a."data" -> 'object') = 'string'
and o.id is null
and a2.id is null
and u.id is null
)
"""
|> Repo.query([], timeout: :infinity)
# Prune activities who link to an array of objects
"""
delete from public.activities
where id in (
select a.id from public.activities a
join json_array_elements_text((a."data" -> 'object')::json) as j on jsonb_typeof(a."data" -> 'object') = 'array'
left join public.objects o on j.value = o.data ->> 'id'
left join public.activities a2 on j.value = a2.data ->> 'id'
left join public.users u on j.value = u.ap_id
group by a.id
having max(o.data ->> 'id') is null
and max(a2.data ->> 'id') is null
and max(u.ap_id) is null
)
"""
|> Repo.query([], timeout: :infinity)
prune_orphaned_activities()
end
"""

View File

@ -314,6 +314,20 @@ defmodule Pleroma.Web.ActivityPub.MRF.SimplePolicy do
def filter(object), do: {:ok, object}
defp obfuscate(string) when is_binary(string) do
# Want to strip at least two neighbouring chars
# to ensure at least one non-dot char is in the obfuscation area
stripped = String.length(string) - 6
{keepstart, keepend} =
if stripped > 1 do
{3, 3}
else
{
2 - div(1 - stripped, 2),
2 + div(stripped, 2)
}
end
string
|> to_charlist()
|> Enum.with_index()
@ -322,7 +336,7 @@ defmodule Pleroma.Web.ActivityPub.MRF.SimplePolicy do
?.
{char, index} ->
if 3 <= index && index < String.length(string) - 3, do: ?*, else: char
if keepstart <= index && index < String.length(string) - keepend, do: ?*, else: char
end)
|> to_string()
end

View File

@ -39,6 +39,7 @@ defmodule Pleroma.Web.OAuth.OAuthController do
action_fallback(Pleroma.Web.OAuth.FallbackController)
@oob_token_redirect_uri "urn:ietf:wg:oauth:2.0:oob"
@state_cookie_name "akkoma_oauth_state"
# Note: this definition is only called from error-handling methods with `conn.params` as 2nd arg
def authorize(%Plug.Conn{} = conn, %{"authorization" => _} = params) do
@ -445,7 +446,7 @@ defmodule Pleroma.Web.OAuth.OAuthController do
# Handing the request to Ueberauth
conn
|> put_resp_cookie("akkoma_oauth_state", state)
|> put_resp_cookie(@state_cookie_name, state)
|> redirect(to: ~p"/oauth/#{provider}")
end
@ -469,12 +470,18 @@ defmodule Pleroma.Web.OAuth.OAuthController do
messages = for e <- Map.get(failure, :errors, []), do: e.message
message = Enum.join(messages, "; ")
conn
|> put_flash(
:error,
dgettext("errors", "Failed to authenticate: %{message}.", message: message)
)
|> redirect(external: redirect_uri(conn, params["redirect_uri"]))
error_message = dgettext("errors", "Failed to authenticate: %{message}.", message: message)
if params["redirect_uri"] do
conn
|> put_flash(
:error,
error_message
)
|> redirect(external: redirect_uri(conn, params["redirect_uri"]))
else
send_resp(conn, :bad_request, error_message)
end
end
def callback(%Plug.Conn{} = conn, params) do
@ -510,7 +517,7 @@ defmodule Pleroma.Web.OAuth.OAuthController do
defp callback_params(%Plug.Conn{} = conn, params) do
fetch_cookies(conn)
Map.merge(params, Jason.decode!(Map.get(conn.req_cookies, "akkoma_oauth_state", "{}")))
Map.merge(params, Jason.decode!(Map.get(conn.req_cookies, @state_cookie_name, "{}")))
end
def registration_details(%Plug.Conn{} = conn, %{"authorization" => auth_attrs}) do

View File

@ -283,7 +283,7 @@ defmodule Pleroma.Web.ActivityPub.MRF.SimplePolicyTest do
assert {:ok,
%{
mrf_simple: %{reject: ["rem***.*****nce", "a.b"]},
mrf_simple: %{reject: ["rem***.*****nce", "*.b"]},
mrf_simple_info: %{reject: %{"rem***.*****nce" => %{}}}
}} = SimplePolicy.describe()
end

View File

@ -81,9 +81,7 @@ defmodule Pleroma.Web.OAuth.OAuthControllerTest do
assert html_response(conn, 302)
redirect_query = URI.parse(redirected_to(conn)).query
assert %{"state" => state_param} = URI.decode_query(redirect_query)
assert {:ok, state_components} = Jason.decode(state_param)
assert {:ok, state_components} = Jason.decode(conn.resp_cookies["akkoma_oauth_state"].value)
expected_client_id = app.client_id
expected_redirect_uri = app.redirect_uris
@ -97,7 +95,7 @@ defmodule Pleroma.Web.OAuth.OAuthControllerTest do
end
test "with user-bound registration, GET /oauth/<provider>/callback redirects to `redirect_uri` with `code`",
%{app: app, conn: conn} do
%{app: app, conn: _} do
registration = insert(:registration)
redirect_uri = OAuthController.default_redirect_uri(app)
@ -109,15 +107,17 @@ defmodule Pleroma.Web.OAuth.OAuthControllerTest do
}
conn =
conn
build_conn()
|> put_req_cookie("akkoma_oauth_state", Jason.encode!(state_params))
|> Plug.Session.call(Plug.Session.init(@session_opts))
|> fetch_session()
|> assign(:ueberauth_auth, %{provider: registration.provider, uid: registration.uid})
|> get(
"/oauth/twitter/callback",
%{
"oauth_token" => "G-5a3AAAAAAAwMH9AAABaektfSM",
"oauth_verifier" => "QZl8vUqNvXMTKpdmUnGejJxuHG75WWWs",
"provider" => "twitter",
"state" => Jason.encode!(state_params)
"provider" => "twitter"
}
)
@ -162,15 +162,42 @@ defmodule Pleroma.Web.OAuth.OAuthControllerTest do
test "on authentication error, GET /oauth/<provider>/callback redirects to `redirect_uri`", %{
app: app,
conn: conn
conn: _
} do
state_params = %{
"scope" => Enum.join(app.scopes, " "),
"client_id" => app.client_id,
"redirect_uri" => OAuthController.default_redirect_uri(app),
"state" => ""
"redirect_uri" => OAuthController.default_redirect_uri(app)
}
conn =
build_conn()
|> put_req_cookie("akkoma_oauth_state", Jason.encode!(state_params))
|> Plug.Session.call(Plug.Session.init(@session_opts))
|> fetch_session()
|> assign(:ueberauth_failure, %{errors: [%{message: "(error description)"}]})
|> get(
"/oauth/twitter/callback",
%{
"oauth_token" => "G-5a3AAAAAAAwMH9AAABaektfSM",
"oauth_verifier" => "QZl8vUqNvXMTKpdmUnGejJxuHG75WWWs",
"provider" => "twitter",
"state" => ""
}
)
assert html_response(conn, 302)
assert redirected_to(conn) == app.redirect_uris
assert Phoenix.Flash.get(conn.assigns.flash, :error) ==
"Failed to authenticate: (error description)."
end
test "on authentication error with no prior state, GET /oauth/<provider>/callback returns 400",
%{
app: _,
conn: conn
} do
conn =
conn
|> assign(:ueberauth_failure, %{errors: [%{message: "(error description)"}]})
@ -180,15 +207,11 @@ defmodule Pleroma.Web.OAuth.OAuthControllerTest do
"oauth_token" => "G-5a3AAAAAAAwMH9AAABaektfSM",
"oauth_verifier" => "QZl8vUqNvXMTKpdmUnGejJxuHG75WWWs",
"provider" => "twitter",
"state" => Jason.encode!(state_params)
"state" => ""
}
)
assert html_response(conn, 302)
assert redirected_to(conn) == app.redirect_uris
assert Phoenix.Flash.get(conn.assigns.flash, :error) ==
"Failed to authenticate: (error description)."
assert response(conn, 400)
end
test "GET /oauth/registration_details renders registration details form", %{