Add htmlMfm key when relevant #878

Merged
Oneric merged 4 commits from ilja/akkoma:add_fep-c16b_discovery_mechanism_to_not_always_reparse_mfm into develop 2025-06-22 14:31:33 +00:00
Contributor

Related: #381

Now that our content uses html for expressing MFM (#823), FEP-c16b allows us to use a detection mechanism to indicate that the MFM does not require reparsing. This PR is the implementation for that.

I first thought about always adding the key, but then realised that older posts aren't FEP-c16b formatted. I therefor decided to explicitly add the key to newly created MFM statusses. There will be some posts between latest release and this PR who are properly formatted while not having this discovery mechanism, but that's OK. Receiving servers will re-parse the content just like what happened previously.

TODO in this PR

  • Add the key to newly created MFM statusses
    • Still needs proper tests
  • Adapt the @context to inlude the htmlMfm key
    • Still needs proper tests
  • Don't reparse incomming MFM statusses who have this key
    • Still needs proper tests
  • Adapt FEDERATION.md
  • 🐶, running on ilja.space since 2025-04-06 2025-06-21 (in latest form)
Related: https://akkoma.dev/AkkomaGang/akkoma/issues/381 Now that our `content` uses html for expressing MFM (https://akkoma.dev/AkkomaGang/akkoma/pulls/823), FEP-c16b allows us to use [a detection mechanism to indicate that the MFM does not require reparsing](https://codeberg.org/fediverse/fep/src/branch/main/fep/c16b/fep-c16b.md#discovery). This PR is the implementation for that. I first thought about always adding the key, but then realised that older posts aren't FEP-c16b formatted. I therefor decided to explicitly add the key to newly created MFM statusses. There will be some posts between latest release and this PR who are properly formatted while not having this discovery mechanism, but that's OK. Receiving servers will re-parse the content just like what happened previously. TODO in this PR - [x] Add the key to newly created MFM statusses - [x] Still needs proper tests - [x] Adapt the `@context` to inlude the htmlMfm key - [x] Still needs proper tests - [x] Don't reparse incomming MFM statusses who have this key - [x] Still needs proper tests - [x] Adapt FEDERATION.md - [x] 🐶, running on ilja.space since ~~2025-04-06~~ 2025-06-21 (in latest form)
ilja added 1 commit 2025-03-09 19:32:41 +00:00
Add htmlMfm key when relevant
Some checks are pending
ci/woodpecker/pr/build-amd64 Pipeline is pending approval
ci/woodpecker/pr/build-arm64 Pipeline is pending approval
ci/woodpecker/pr/docs Pipeline is pending approval
ci/woodpecker/pr/lint Pipeline is pending approval
ci/woodpecker/pr/test Pipeline is pending approval
16bb564380
This is just a first commit for this. We now add the htmlMfm key when relevant,
store this in the database, and we see it when fetching using
`curl -L -H 'Accept: application/activity+json' "$ap_id"`

we still need
- tests (add with posts MFM, and only with MFM)
- `@context` requires the `htmlMfm: https://w3id.org/fep/c16b#htmlMfm`
- don't reparse incomming MFM statusses who have this key (and tests for that obviously)
ilja force-pushed add_fep-c16b_discovery_mechanism_to_not_always_reparse_mfm from 16bb564380 to 47766297e3 2025-04-06 15:35:37 +00:00 Compare
ilja force-pushed add_fep-c16b_discovery_mechanism_to_not_always_reparse_mfm from 47766297e3 to 1b12168063 2025-04-06 15:36:25 +00:00 Compare
ilja force-pushed add_fep-c16b_discovery_mechanism_to_not_always_reparse_mfm from 1b12168063 to 454244d13a 2025-04-06 15:37:00 +00:00 Compare
ilja force-pushed add_fep-c16b_discovery_mechanism_to_not_always_reparse_mfm from 454244d13a to 4a57821093 2025-04-06 15:37:51 +00:00 Compare
ilja force-pushed add_fep-c16b_discovery_mechanism_to_not_always_reparse_mfm from 4a57821093 to d64b730714 2025-04-06 15:39:36 +00:00 Compare
ilja force-pushed add_fep-c16b_discovery_mechanism_to_not_always_reparse_mfm from d64b730714 to 908ec0a738 2025-04-06 15:42:25 +00:00 Compare
ilja force-pushed add_fep-c16b_discovery_mechanism_to_not_always_reparse_mfm from 908ec0a738 to 5c8537aa25 2025-04-06 17:01:52 +00:00 Compare
ilja force-pushed add_fep-c16b_discovery_mechanism_to_not_always_reparse_mfm from 5c8537aa25 to f82fbb431a 2025-04-06 17:02:43 +00:00 Compare
ilja force-pushed add_fep-c16b_discovery_mechanism_to_not_always_reparse_mfm from f82fbb431a to 90ec2168fe 2025-04-06 17:09:04 +00:00 Compare
ilja force-pushed add_fep-c16b_discovery_mechanism_to_not_always_reparse_mfm from 90ec2168fe to 3b701e9974 2025-04-06 17:21:38 +00:00 Compare
ilja changed title from WIP: Add htmlMfm key when relevant to Add htmlMfm key when relevant 2025-04-06 17:25:21 +00:00
ilja force-pushed add_fep-c16b_discovery_mechanism_to_not_always_reparse_mfm from 3b701e9974 to c9a36e4340 2025-04-06 17:57:00 +00:00 Compare
Oneric approved these changes 2025-04-06 23:21:40 +00:00
Dismissed
Oneric left a comment
Owner

looks good to me; just some cosmetic nits

will real-world test it too now

looks good to me; just some cosmetic nits will real-world test it too now
CHANGELOG.md Outdated
@ -8,1 +8,4 @@
### Added
- We mark our MFM posts as FEP-c16b compliant, and do not re-parsed when receiving an MFM post who is marked as FEP-c16b compliant.
Owner

nit: currently the second part of the sentence is ungrammatical; consider e.g..

[...] and retain remote HTML representations for incoming posts marked as FEP-c16b-compliant. (Safety scrubbers are still applied)

nit: currently the second part of the sentence is ungrammatical; consider e.g.. *[...] and retain remote HTML representations for incoming posts marked as FEP-c16b-compliant. (Safety scrubbers are still applied)*
Author
Contributor

This indeed sounds better thk you! Fixed now

This indeed sounds better thk you! Fixed now
ilja marked this conversation as resolved
FEDERATION.md Outdated
@ -36,2 +36,3 @@
The optional extension term `htmlMfm` is currently not used.
We set the optional extension term `htmlMfm: true` when using content type "text/x.misskeymarkdown".
Incomming messages containing `htmlMfm: true` will not have their content re-parsed.
Owner

nit: Incoming

nit: Incoming
ilja marked this conversation as resolved
@ -49,3 +49,2 @@
def create(user, params) do
user
|> new(params)
new(user, params)
Owner

unrelated cosmetic change and the previous user |> new(params) form also seems more common in the current code base

unrelated cosmetic change and the previous `user |> new(params)` form also seems more common in the current code base
Author
Contributor

ah, yes. I changed this to make it easier for myself to reason about the pipeline, and then kept it bc it seemed an improvement for readability.
The way it was written (user |> new(params) |> ...) gives an initial impression that you have a User struct who's then being changed. But that's not what's happening; The pipeline is about adapting an ActivityDraft struct. Starting the first line with the creation of said ActivityDraft (new(user, params) |> ...) makes this more explicit to whoever reads it.

If you still prefer me to change it back, let me know and I'll change it back.

ah, yes. I changed this to make it easier for myself to reason about the pipeline, and then kept it bc it seemed an improvement for readability. The way it was written (`user |> new(params) |> ...`) gives an initial impression that you have a User struct who's then being changed. But that's not what's happening; The pipeline is about adapting an ActivityDraft struct. Starting the first line with the creation of said ActivityDraft (`new(user, params) |> ...`) makes this more explicit to whoever reads it. If you still prefer me to change it back, let me know and I'll change it back.
Owner

The reasoning for the change is convincing.

As a general remark, I believe it’s good to separate functional and unrelated cosmetic changes (as well as distinct functional changes) into different commits. This makes reviewing easier (no need to wonder how a change relates to the rest and knowing whether something is supposed to change the logic or merely refactoring it without affecting the overall logic also helps), can make backporting easier if needed (and for distinct functional changes allows bisects to be more effective).
Admittedly this isn’t really followed in this project though so its probably fine as is (i cannot mark this thread as resolved, otherwise i would)

The reasoning for the change is convincing. As a general remark, I believe it’s good to separate functional and unrelated cosmetic changes (as well as distinct functional changes) into different commits. This makes reviewing easier (no need to wonder how a change relates to the rest and knowing whether something is _supposed_ to change the logic or merely refactoring it without affecting the overall logic also helps), can make backporting easier if needed (and for distinct functional changes allows bisects to be more effective). Admittedly this isn’t really followed in this project though so its probably fine as is *(i cannot mark this thread as resolved, otherwise i would)*
Author
Contributor

This is good feedback, and of a type I often feel I miss (and probably explains why not often followed), thank you. I'll leave it as is now bc you said you'll also be testing it, and otherwise it may give unwanted problems due to diverting branches, but I'll keep it in mind for the future. I approached this one a bit more chaotic than I usually do, but it would not have been difficult to keep the changes better separated.

This is good feedback, and of a type I often feel I miss (and probably explains why not often followed), thank you. I'll leave it as is now bc you said you'll also be testing it, and otherwise it may give unwanted problems due to diverting branches, but I'll keep it in mind for the future. I approached this one a bit more chaotic than I usually do, but it would not have been difficult to keep the changes better separated.
ilja marked this conversation as resolved
ilja added 1 commit 2025-04-30 09:53:06 +00:00
Fix grammatical errors after code review
Some checks are pending
ci/woodpecker/pr/build-amd64 Pipeline is pending approval
ci/woodpecker/pr/build-arm64 Pipeline is pending approval
ci/woodpecker/pr/docs Pipeline is pending approval
ci/woodpecker/pr/lint Pipeline is pending approval
ci/woodpecker/pr/test/1 Pipeline is pending approval
ci/woodpecker/pr/test/2 Pipeline is pending approval
39d768af82
Owner

Can you fix the merge conflict?

I believe I tested this before applying it myself, but it would be reassuring to have a test making sure any received HTML is still passed through the sanitizer to remove potential malicious bits

Can you fix the merge conflict? I believe I tested this before applying it myself, but it would be reassuring to have a test making sure any received HTML is still passed through the sanitizer to remove potential malicious bits
ilja changed title from Add htmlMfm key when relevant to WIP: Add htmlMfm key when relevant 2025-06-21 08:31:02 +00:00
ilja added 2 commits 2025-06-21 09:34:58 +00:00
Add extra tests + fix storing htmlMfm in database when federated
All checks were successful
ci/woodpecker/pr/lint Pipeline was successful
ci/woodpecker/pr/test/2 Pipeline was successful
ci/woodpecker/pr/test/1 Pipeline was successful
ci/woodpecker/pr/build-arm64 Pipeline was successful
ci/woodpecker/pr/build-amd64 Pipeline was successful
ci/woodpecker/pr/docs Pipeline was successful
ci/woodpecker/pull_request_closed/lint Pipeline was successful
ci/woodpecker/pull_request_closed/build-arm64 Pipeline was successful
ci/woodpecker/pull_request_closed/test/1 Pipeline was successful
ci/woodpecker/pull_request_closed/test/2 Pipeline was successful
ci/woodpecker/pull_request_closed/build-amd64 Pipeline was successful
ci/woodpecker/pull_request_closed/docs Pipeline was successful
d200c7e11c
- This adds extra tests to be sure that scrubbing still happens.
- When doing this I notices that the htmlMfm key wasn't stored in the database when comming through the federator. This has been now been fixed too.
- We also test that values true, false or no attribute all work for incomming messages.
ilja changed title from WIP: Add htmlMfm key when relevant to Add htmlMfm key when relevant 2025-06-21 09:35:30 +00:00
Author
Contributor

@Oneric wrote in #878 (comment):

Can you fix the merge conflict?

I believe I tested this before applying it myself, but it would be reassuring to have a test making sure any received HTML is still passed through the sanitizer to remove potential malicious bits

Merge conflict resolved + I added extra tests. Doing this, I also noticed that the key was not stored in the database for incoming messages. This is now fixed and I added tests for the key having value true, false, or not be present.

@Oneric wrote in https://akkoma.dev/AkkomaGang/akkoma/pulls/878#issuecomment-14194: > Can you fix the merge conflict? > > I believe I tested this before applying it myself, but it would be reassuring to have a test making sure any received HTML is still passed through the sanitizer to remove potential malicious bits Merge conflict resolved + I added extra tests. Doing this, I also noticed that the key was not stored in the database for incoming messages. This is now fixed and I added tests for the key having value true, false, or not be present.
Oneric approved these changes 2025-06-22 14:31:12 +00:00
Oneric merged commit f2c2ec5e27 into develop 2025-06-22 14:31:33 +00:00
Owner

thanks!

I deduplicated the test payloads in #943 which makes it easier to see at a glance how they differ and makes it imho easier to read in general

thanks! I deduplicated the test payloads in #943 which makes it easier to see at a glance how they differ and makes it imho easier to read in general
Sign in to join this conversation.
No description provided.