[bug] Detecting public posts is being mishandled #670

Closed
opened 2023-12-20 15:53:23 +00:00 by helge · 4 comments
Contributor

Your setup

Docker

Extra details

No response

Version

No response

PostgreSQL version

No response

What were you trying to do?

Federate a post to akkoma that is Public. To federate with akkoma, one needs to use "https://www.w3.org/ns/activitystreams#Public" instead of "as:Public" or "Public". My understanding from the Note at https://www.w3.org/TR/activitypub/#public-addressing is that ActivityPub requires all three to valid.

This means that your public post detecting needs to also check for "as:Public" and "Public"

What did you expect to happen?

No response

What actually happened?

No response

Logs

See:

https://codeberg.org/bovine/bovine/issues/53

https://codeberg.org/bovine/bovine/issues/130

Severity

I cannot use it as easily as I'd like

Have you searched for this issue?

  • I have double-checked and have not found this issue mentioned anywhere.
### Your setup Docker ### Extra details _No response_ ### Version _No response_ ### PostgreSQL version _No response_ ### What were you trying to do? Federate a post to akkoma that is Public. To federate with akkoma, one needs to use "https://www.w3.org/ns/activitystreams#Public" instead of "as:Public" or "Public". My understanding from the Note at https://www.w3.org/TR/activitypub/#public-addressing is that ActivityPub requires all three to valid. This means that your public post detecting needs to also check for "as:Public" and "Public" ### What did you expect to happen? _No response_ ### What actually happened? _No response_ ### Logs ```shell See: https://codeberg.org/bovine/bovine/issues/53 https://codeberg.org/bovine/bovine/issues/130 ``` ### Severity I cannot use it as easily as I'd like ### Have you searched for this issue? - [x] I have double-checked and have not found this issue mentioned anywhere.
helge added the
bug
label 2023-12-20 15:53:23 +00:00

This logic that it can only be one thing is littered everywhere; from elixir code to SQL with the string https://www.w3.org/ns/activitystreams#Public; there is a constant which is exported from Constants.as_public which then spreads the single-ness check.

It feels like, a migration edit would solve postgres classifying; using RIGHT(field, 6) == "Public"; while broader than the spec; I think it still allows for a single operation to verify; and should be trivial to express s a shared predicate function which returns a boolean, simplifying the usage everywhere.

(Pleroma.Constants.as_public() in object.data["to"] or
            Pleroma.Constants.as_public() in object.data["cc"])

would simplify to

(isPublic(object.data["to"]) or isPublic(object.data["cc"]))

which could further simplify (taking into account Elixir and Erlang lack of variadic arguments)

isPublic([object.data["to"], object.data["cc"]])

so that isPublic then knows, or encodes that it expects a series of things to check and is doing an is/areAny check (default return false, loop over results if one is public then return true.

This logic that it can only be one thing is littered everywhere; from elixir code to SQL with the string `https://www.w3.org/ns/activitystreams#Public`; there is a constant which is exported from `Constants.as_public` which then spreads the single-ness check. It feels like, a migration edit would solve postgres classifying; using `RIGHT(field, 6) == "Public"`; while broader than the spec; I think it still allows for a single operation to verify; and should be trivial to express s a shared predicate function which returns a boolean, simplifying the usage everywhere. ``` (Pleroma.Constants.as_public() in object.data["to"] or Pleroma.Constants.as_public() in object.data["cc"]) ``` would simplify to ``` (isPublic(object.data["to"]) or isPublic(object.data["cc"])) ``` which could further simplify (taking into account Elixir and Erlang lack of variadic arguments) ``` isPublic([object.data["to"], object.data["cc"]]) ``` so that isPublic then knows, or encodes that it expects a series of things to check and is doing an is/areAny check (default return false, loop over results if one is public then return `true`.
Member

imho it makes things more straightforward if there’s only a single representation of public scope internally, thus it would make sense to normalise the other ones when received.

The below can be ignored see next post should fix the issue for new objects I think, but since I don’t run an actual (federating) instance I can for now only confirm for sure that it builds. If anyone can test this feel free to open a PR with this yourself, but if you prefer you can also let me know you successfully tested the patch and I’ll handle the PR.

From 05d11f097d1860df2962c2ecd6ebf70438804609 Mon Sep 17 00:00:00 2001
From: Oneric <oneric@oneric.stub>
Date: Tue, 22 Jan 2024 00:00:00 +0000
Subject: [PATCH] Normalise public addressing to fix federation

Due to JSON LD compacting the full address of public scope
may also occur in shorter forms and the spec requires us to treat them
all equivalently. To save us the pain internally, normalise all inbound
data to just one form.
See note at: https://www.w3.org/TR/activitypub/#public-addressing
---
 lib/pleroma/web/activity_pub/transmogrifier.ex | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/lib/pleroma/web/activity_pub/transmogrifier.ex b/lib/pleroma/web/activity_pub/transmogrifier.ex
index 435343bf0..4301ae3e2 100644
--- a/lib/pleroma/web/activity_pub/transmogrifier.ex
+++ b/lib/pleroma/web/activity_pub/transmogrifier.ex
@@ -73,6 +73,22 @@ def fix_addressing_list(map, field) do
     end
   end
 
+  # Due to JSON-LD simply "Public" and "as:Public" are equivalent to the full URI
+  # but to simplify later checks we only want to deal with one reperesentation internally
+  def fix_addressing_public_list(map, field) do
+    full_uri = "https://www.w3.org/ns/activitystreams#Public"
+
+    Map.put(map, field, Enum.map(map, fn
+      "Public" -> full_uri
+      "as:Public" -> full_uri
+      x -> x
+    end))
+  end
+
+  def fix_addressing_public(map, fields) do
+    Enum.each(fields, fn field -> fix_addressing_public_list(map, field) end)
+  end
+
   # if directMessage flag is set to true, leave the addressing alone
   def fix_explicit_addressing(%{"directMessage" => true} = object, _follower_collection),
     do: object
@@ -107,6 +123,7 @@ def fix_addressing(object) do
     |> fix_addressing_list("cc")
     |> fix_addressing_list("bto")
     |> fix_addressing_list("bcc")
+    |> fix_addressing_public(["to", "cc", "bto", "bcc"])
     |> fix_explicit_addressing(follower_collection)
     |> CommonFixes.fix_implicit_addressing(follower_collection)
   end
-- 
2.39.2

As for already received objects, this query should be able to fix them and seems to work in simple tests with manually edited objects:

--- Normalise retroactively
START TRANSACTION;

UPDATE activities
SET
  data['to'] = to_jsonb(array_replace(array_replace(
     ARRAY(SELECT jsonb_array_elements(data['to'])),
    '"Public"', '"https://www.w3.org/ns/activitystreams#Public"'),
    '"as:Public"', '"https://www.w3.org/ns/activitystreams#Public"')
  ),
  data['cc'] = to_jsonb(array_replace(array_replace(
     ARRAY(SELECT jsonb_array_elements(data['cc'])),
    '"Public"', '"https://www.w3.org/ns/activitystreams#Public"'),
    '"as:Public"', '"https://www.w3.org/ns/activitystreams#Public"')
  ),
  data['bto'] = to_jsonb(array_replace(array_replace(
     ARRAY(SELECT jsonb_array_elements(data['bto'])),
    '"Public"', '"https://www.w3.org/ns/activitystreams#Public"'),
    '"as:Public"', '"https://www.w3.org/ns/activitystreams#Public"')
  ),
  data['bcc'] = to_jsonb(array_replace(array_replace(
     ARRAY(SELECT jsonb_array_elements(data['bcc'])),
    '"Public"', '"https://www.w3.org/ns/activitystreams#Public"'),
    '"as:Public"', '"https://www.w3.org/ns/activitystreams#Public"')
  ) 
WHERE  -- while we need double quotes in replace, for ?| we MUST NOT use quotes!
  data['to']  ?| array['Public', 'as:Public'] OR
  data['cc']  ?| array['Public', 'as:Public'] OR
  data['bto'] ?| array['Public', 'as:Public'] OR
  data['bcc'] ?| array['Public', 'as:Public']
;

--- Repeat exact same query but for objects table
UPDATE objects
...
;

COMMIT TRANSACTION;

This will have to check all stored objects, so idk if it makes sense to force this onto all admins with a migration, some (many?) prune old remote objects regularly anyway), but it’d presumably be simple to turn this into one.

imho it makes things more straightforward if there’s only a single representation of public scope internally, thus it would make sense to normalise the other ones when received. The below *can be ignored see next post* ~~*should* fix the issue for new objects I think, but since I don’t run an actual (federating) instance I can for now only confirm for sure that it builds. If anyone can test this feel free to open a PR with this yourself, but if you prefer you can also let me know you successfully tested the patch and I’ll handle the PR.~~ ```patch From 05d11f097d1860df2962c2ecd6ebf70438804609 Mon Sep 17 00:00:00 2001 From: Oneric <oneric@oneric.stub> Date: Tue, 22 Jan 2024 00:00:00 +0000 Subject: [PATCH] Normalise public addressing to fix federation Due to JSON LD compacting the full address of public scope may also occur in shorter forms and the spec requires us to treat them all equivalently. To save us the pain internally, normalise all inbound data to just one form. See note at: https://www.w3.org/TR/activitypub/#public-addressing --- lib/pleroma/web/activity_pub/transmogrifier.ex | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/lib/pleroma/web/activity_pub/transmogrifier.ex b/lib/pleroma/web/activity_pub/transmogrifier.ex index 435343bf0..4301ae3e2 100644 --- a/lib/pleroma/web/activity_pub/transmogrifier.ex +++ b/lib/pleroma/web/activity_pub/transmogrifier.ex @@ -73,6 +73,22 @@ def fix_addressing_list(map, field) do end end + # Due to JSON-LD simply "Public" and "as:Public" are equivalent to the full URI + # but to simplify later checks we only want to deal with one reperesentation internally + def fix_addressing_public_list(map, field) do + full_uri = "https://www.w3.org/ns/activitystreams#Public" + + Map.put(map, field, Enum.map(map, fn + "Public" -> full_uri + "as:Public" -> full_uri + x -> x + end)) + end + + def fix_addressing_public(map, fields) do + Enum.each(fields, fn field -> fix_addressing_public_list(map, field) end) + end + # if directMessage flag is set to true, leave the addressing alone def fix_explicit_addressing(%{"directMessage" => true} = object, _follower_collection), do: object @@ -107,6 +123,7 @@ def fix_addressing(object) do |> fix_addressing_list("cc") |> fix_addressing_list("bto") |> fix_addressing_list("bcc") + |> fix_addressing_public(["to", "cc", "bto", "bcc"]) |> fix_explicit_addressing(follower_collection) |> CommonFixes.fix_implicit_addressing(follower_collection) end -- 2.39.2 ``` As for already received objects, this query should be able to fix them and seems to work in simple tests with manually edited objects: ```sql --- Normalise retroactively START TRANSACTION; UPDATE activities SET data['to'] = to_jsonb(array_replace(array_replace( ARRAY(SELECT jsonb_array_elements(data['to'])), '"Public"', '"https://www.w3.org/ns/activitystreams#Public"'), '"as:Public"', '"https://www.w3.org/ns/activitystreams#Public"') ), data['cc'] = to_jsonb(array_replace(array_replace( ARRAY(SELECT jsonb_array_elements(data['cc'])), '"Public"', '"https://www.w3.org/ns/activitystreams#Public"'), '"as:Public"', '"https://www.w3.org/ns/activitystreams#Public"') ), data['bto'] = to_jsonb(array_replace(array_replace( ARRAY(SELECT jsonb_array_elements(data['bto'])), '"Public"', '"https://www.w3.org/ns/activitystreams#Public"'), '"as:Public"', '"https://www.w3.org/ns/activitystreams#Public"') ), data['bcc'] = to_jsonb(array_replace(array_replace( ARRAY(SELECT jsonb_array_elements(data['bcc'])), '"Public"', '"https://www.w3.org/ns/activitystreams#Public"'), '"as:Public"', '"https://www.w3.org/ns/activitystreams#Public"') ) WHERE -- while we need double quotes in replace, for ?| we MUST NOT use quotes! data['to'] ?| array['Public', 'as:Public'] OR data['cc'] ?| array['Public', 'as:Public'] OR data['bto'] ?| array['Public', 'as:Public'] OR data['bcc'] ?| array['Public', 'as:Public'] ; --- Repeat exact same query but for objects table UPDATE objects ... ; COMMIT TRANSACTION; ``` This will have to check _all_ stored objects, so idk if it makes sense to force this onto all admins with a migration, some (many?) prune old remote objects regularly anyway), but it’d presumably be simple to turn this into one.
Member

I realised there are still tests for federation and I can use those to test the change. The first draft from above was placed to late and had some other runtime issues. A fixed up patch, adding a new test and passing mix test can be found at: #675

I realised there are still tests for federation and I can use those to test the change. The first draft from above was placed to late and had some other runtime issues. A fixed up patch, adding a new test and passing `mix test` can be found at: https://akkoma.dev/AkkomaGang/akkoma/pulls/675

This issue is currently preventing implementations running full json-ld (iceshrimp.net, takahe, ...) from federating with akkoma. The compaction step at the end of JSON-LD processing replaces https://www.w3.org/ns/activitystreams#Public with as:Public.

This issue is currently preventing implementations running full json-ld (iceshrimp.net, takahe, ...) from federating with akkoma. The compaction step at the end of JSON-LD processing replaces `https://www.w3.org/ns/activitystreams#Public` with `as:Public`.
Sign in to join this conversation.
No Milestone
No project
No Assignees
4 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: AkkomaGang/akkoma#670
No description provided.