Normalise JSON-LD compacted public addressing #675

Closed
Oneric wants to merge 0 commits from Oneric:fedfix-public-ld into develop
Member

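Fixes federation with bovine and implements the note in https://www.w3.org/TR/activitypub/#public-addressing, which explicitly points out that all three forms of the public collection identifier (the full URI `https://www.w3.org/ns/activitystreams#Public`, `Public` and `as:Public`) are valid and equivalent.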

This normalisation has to run early, even before the other `fix_addressing_*` calls; otherwise the object is rejected before it gets far enough. Since I’m not sure which AP data types can contain addressing lists, the most reliable way seemed to be inserting JSON-LD normalisation as an earlier step before the pre-existing `handle_incoming`.
To correctly get lists instead of binaries etc., this also required moving some addressing fixes to this earlier step.
If there’s a more targeted way to integrate this without code duplication, let me know.
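
For readers unfamiliar with the problem, here is a minimal, language-agnostic sketch of the idea in Python. It is not the Elixir code from this PR: the function name, the list of addressing fields and the string-to-list coercion are illustrative assumptions.

```python
# Illustrative sketch only; names and the field list are assumptions.
AS_PUBLIC = "https://www.w3.org/ns/activitystreams#Public"
COMPACTED_PUBLIC = {"Public", "as:Public"}
ADDRESSING_FIELDS = ("to", "cc", "bto", "bcc")


def normalise_public_addressing(obj: dict) -> dict:
    """Return a copy of obj whose addressing fields use the full public URI."""
    result = dict(obj)
    for field in ADDRESSING_FIELDS:
        if field not in result:
            continue  # do not invent addressing fields that were absent
        value = result[field]
        if isinstance(value, str):
            value = [value]  # a single recipient may arrive as a bare string
        if isinstance(value, list):
            result[field] = [
                AS_PUBLIC if isinstance(v, str) and v in COMPACTED_PUBLIC else v
                for v in value
            ]
    return result
```

Running this before any other addressing checks means later steps only ever see the full URI, which is the reason for doing the normalisation early.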

See #670 for more details.

Note: I only have a local, non-federating test instance, so this change has only been tested with `mix test`.

To retroactively normalise already received objects and activities, the following query can be used (tested by manually editing some activities):

CREATE FUNCTION pg_temp.fix_list(jsonb) RETURNS jsonb LANGUAGE plpgsql AS
$$ BEGIN
  RETURN to_jsonb(array_replace(array_replace(
    ARRAY(SELECT jsonb_array_elements($1)),
    '"Public"', '"https://www.w3.org/ns/activitystreams#Public"'),
    '"as:Public"', '"https://www.w3.org/ns/activitystreams#Public"')
  );
END; $$
;

CREATE FUNCTION pg_temp.affected(jsonb) RETURNS boolean LANGUAGE plpgsql AS
$$ BEGIN
  --- while replace above needs double quotes, for ?| we MUST NOT use quotes!
  RETURN $1 ?| array['Public', 'as:Public'];
END; $$
;

--- Normalise retroactively
START TRANSACTION;

UPDATE activities
SET
  data['to'] = pg_temp.fix_list(data['to']),
  data['cc'] = pg_temp.fix_list(data['cc']),
  data['bto'] = pg_temp.fix_list(data['bto']),
  data['bcc'] = pg_temp.fix_list(data['bcc'])
WHERE
  pg_temp.affected(data['to'])  OR pg_temp.affected(data['cc']) OR
  pg_temp.affected(data['bto']) OR pg_temp.affected(data['bcc'])
;

--- Repeat exact same query but for objects table
UPDATE objects
SET
  data['to'] = pg_temp.fix_list(data['to']),
  data['cc'] = pg_temp.fix_list(data['cc']),
  data['bto'] = pg_temp.fix_list(data['bto']),
  data['bcc'] = pg_temp.fix_list(data['bcc'])
WHERE
  pg_temp.affected(data['to'])  OR pg_temp.affected(data['cc']) OR
  pg_temp.affected(data['bto']) OR pg_temp.affected(data['bcc'])
;

COMMIT TRANSACTION;

Or, without temporary functions:

--- Normalise retroactively
START TRANSACTION;

UPDATE activities
SET
  data['to'] = to_jsonb(array_replace(array_replace(
     ARRAY(SELECT jsonb_array_elements(data['to'])),
    '"Public"', '"https://www.w3.org/ns/activitystreams#Public"'),
    '"as:Public"', '"https://www.w3.org/ns/activitystreams#Public"')
  ),
  data['cc'] = to_jsonb(array_replace(array_replace(
     ARRAY(SELECT jsonb_array_elements(data['cc'])),
    '"Public"', '"https://www.w3.org/ns/activitystreams#Public"'),
    '"as:Public"', '"https://www.w3.org/ns/activitystreams#Public"')
  ),
  data['bto'] = to_jsonb(array_replace(array_replace(
     ARRAY(SELECT jsonb_array_elements(data['bto'])),
    '"Public"', '"https://www.w3.org/ns/activitystreams#Public"'),
    '"as:Public"', '"https://www.w3.org/ns/activitystreams#Public"')
  ),
  data['bcc'] = to_jsonb(array_replace(array_replace(
     ARRAY(SELECT jsonb_array_elements(data['bcc'])),
    '"Public"', '"https://www.w3.org/ns/activitystreams#Public"'),
    '"as:Public"', '"https://www.w3.org/ns/activitystreams#Public"')
  ) 
WHERE  -- while we need double quotes in replace, for ?| we MUST NOT use quotes!
  data['to']  ?| array['Public', 'as:Public'] OR
  data['cc']  ?| array['Public', 'as:Public'] OR
  data['bto'] ?| array['Public', 'as:Public'] OR
  data['bcc'] ?| array['Public', 'as:Public']
;

--- Repeat exact same query but for objects table
UPDATE objects
...
;

COMMIT TRANSACTION;
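
The quoting remark in the queries above can be illustrated outside the database. The following is a rough Python emulation, not PostgreSQL's implementation: `array_replace` on a `jsonb[]` compares whole JSON values, so the needle is the JSON text `"Public"` including its double quotes, while `?|` takes plain text strings. The helper names mirror the temporary functions, input is assumed to be a JSON array, and only the array case of `?|` is modelled.

```python
import json

AS_PUBLIC = "https://www.w3.org/ns/activitystreams#Public"


def fix_list(raw: str) -> str:
    """Emulate fix_list: replace compacted forms in a JSON array given as text."""
    # Keep every element as JSON text, similar to the jsonb[] built in the query.
    elements = [json.dumps(e) for e in json.loads(raw)]
    # The needles are JSON values, hence the embedded double quotes.
    needles = {'"Public"', '"as:Public"'}
    fixed = [json.dumps(AS_PUBLIC) if e in needles else e for e in elements]
    return "[" + ", ".join(fixed) + "]"


def affected(raw: str) -> bool:
    """Emulate ?| for arrays: match top-level string elements by plain text."""
    value = json.loads(raw)
    if not isinstance(value, list):
        return False
    return any(isinstance(e, str) and e in ("Public", "as:Public") for e in value)
```

A list such as `["as:Public", "https://example.org/u/a"]` is reported as affected before `fix_list` and no longer affected afterwards.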

Fixes #670

Oneric force-pushed fedfix-public-ld from f8696d1204 to fd4cc5b17d 2024-01-31 16:30:45 +00:00 Compare
Oneric force-pushed fedfix-public-ld from fd4cc5b17d to 73a531a2d0 2024-02-02 16:02:53 +00:00 Compare
Author
Member

Applied `mix format` (sorry, forgot before), which led to more lines being re-indented due to the `def handle_incoming` → `defp handle_incoming_normalised` rename. Probably best to review with whitespace changes hidden to reduce clutter.
Oneric force-pushed fedfix-public-ld from 73a531a2d0 to 505e498b1a 2024-02-19 18:37:39 +00:00 Compare
floatingghost reviewed 2024-04-24 17:34:28 +00:00
floatingghost left a comment
Owner

seems sensible, just one cleanup you could do to remove hardcoded values

@ -76,0 +71,4 @@
defp normalise_addressing_public_list(map, all_fields)
defp normalise_addressing_public_list(%{} = map, [field | fields]) do
full_uri = "https://www.w3.org/ns/activitystreams#Public"
you can get this from Pleroma.Constants https://akkoma.dev/AkkomaGang/akkoma/src/branch/develop/lib/pleroma/constants.ex#L8
Author
Member

fixed and rebased

Oneric marked this conversation as resolved
Oneric force-pushed fedfix-public-ld from 505e498b1a to b0a46c1e2e 2024-04-25 16:50:35 +00:00 Compare

merge conflict fixed and merged via 828158ef49d7f39661ba112727f384638ee277fb
floatingghost closed this pull request 2024-04-26 17:49:56 +00:00
Oneric deleted branch fedfix-public-ld 2024-04-26 18:24:51 +00:00