Federate MFM in content field using HTML #343

Open
opened 2023-02-18 15:39:51 +00:00 by ilja · 14 comments

See Akkoma issue AkkomaGang/akkoma#381

Generally, fedi servers use HTML for content, which can then be used directly. For example, if you write *something* in Markdown, it is federated as <i>something</i> in the content field. The advantage is that implementations only need to understand HTML, not every input method some server comes up with.

In the case of MFM this isn't always true, however. This causes compatibility problems for other software, which now has to fully implement a new parser and re-parse the incoming source if it's MFM.

Example (from https://ilja.space/notice/ASo2tpidQ5yVRUYxG4):
When you post

\(x= \frac{-b' \pm \sqrt{(b')^2-ac}}{a}\)
$[flip.h,v FoundKey expands the world of the Fediverse]

It now federates as

<code>x= \\frac{-b' \\pm \\sqrt{(b')^2-ac}}{a}</code><span><br></span><i><span>FoundKey expands the world of the Fediverse</span></i>

I expect it to be something more like how the html looks when I check the post through the Foundkey FE

show example
<span><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>x</mi><mo>=</mo><mfrac><mrow><mo>−</mo><msup><mi>b</mi><mo lspace="0em" rspace="0em" mathvariant="normal">′</mo></msup><mo>±</mo><msqrt><mrow><mo stretchy="false">(</mo><msup><mi>b</mi><mo lspace="0em" rspace="0em" mathvariant="normal">′</mo></msup><msup><mo stretchy="false">)</mo><mn>2</mn></msup><mo>−</mo><mi>a</mi><mi>c</mi></mrow></msqrt></mrow><mi>a</mi></mfrac></mrow><annotation encoding="application/x-tex">x= \frac{-b' \pm \sqrt{(b')^2-ac}}{a}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em;"></span><span class="mord mathnormal">x</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1.6746em;vertical-align:-0.345em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3296em;"><span style="top:-2.655em;"><span class="pstrut" style="height:3em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">a</span></span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.6038em;"><span class="pstrut" style="height:3em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">−</span><span class="mord mtight"><span class="mord mathnormal mtight">b</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8278em;"><span style="top:-2.931em;margin-right:0.0714em;"><span class="pstrut" 
style="height:2.5em;"></span><span class="sizing reset-size3 size1 mtight"><span class="mord mtight"><span class="mord mtight">′</span></span></span></span></span></span></span></span></span><span class="mbin mtight">±</span><span class="mord sqrt mtight"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.0369em;"><span class="svg-align" style="top:-3.4286em;"><span class="pstrut" style="height:3.4286em;"></span><span class="mord mtight" style="padding-left:1.19em;"><span class="mopen mtight">(</span><span class="mord mtight"><span class="mord mathnormal mtight">b</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.6828em;"><span style="top:-2.786em;margin-right:0.0714em;"><span class="pstrut" style="height:2.5em;"></span><span class="sizing reset-size3 size1 mtight"><span class="mord mtight"><span class="mord mtight">′</span></span></span></span></span></span></span></span></span><span class="mclose mtight"><span class="mclose mtight">)</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.7463em;"><span style="top:-2.786em;margin-right:0.0714em;"><span class="pstrut" style="height:2.5em;"></span><span class="sizing reset-size3 size1 mtight"><span class="mord mtight">2</span></span></span></span></span></span></span></span><span class="mbin mtight">−</span><span class="mord mathnormal mtight">a</span><span class="mord mathnormal mtight">c</span></span></span><span style="top:-3.0089em;"><span class="pstrut" style="height:3.4286em;"></span><span class="hide-tail mtight" style="min-width:0.853em;height:1.5429em;"><svg xmlns="http://www.w3.org/2000/svg" width="400em" height="1.5429em" viewBox="0 0 400000 1080" preserveAspectRatio="xMinYMin slice"><path d="M95,702
c-2.7,0,-7.17,-2.7,-13.5,-8c-5.8,-5.3,-9.5,-10,-9.5,-14
c0,-2,0.3,-3.3,1,-4c1.3,-2.7,23.83,-20.7,67.5,-54
c44.2,-33.3,65.8,-50.3,66.5,-51c1.3,-1.3,3,-2,5,-2c4.7,0,8.7,3.3,12,10
s173,378,173,378c0.7,0,35.3,-71,104,-213c68.7,-142,137.5,-285,206.5,-429
c69,-144,104.5,-217.7,106.5,-221
l0 -0
c5.3,-9.3,12,-14,20,-14
H400000v40H845.2724
s-225.272,467,-225.272,467s-235,486,-235,486c-2.7,4.7,-9,7,-19,7
c-6,0,-10,-1,-12,-3s-194,-422,-194,-422s-65,47,-65,47z
M834 80h400000v40h-400000z"></path></svg></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.4197em;"><span></span></span></span></span></span></span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.345em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span></span></span></span></span><br><span style="display: inline-block; transform: scale(-1);">FoundKey expands the world of the Fediverse</span>
Owner

general case

As you have probably noticed, in general we do federate HTML markup for "normal" things.

line breaks

From looking at https://ilja.space/notice/ASo4tEHCpV7nE1ZaEK I see that you are ignoring the line break markup, though: in the snippet you gave there is <span><br></span> to mark the newline, which is apparently ignored. I'm not sure why the line break is inside of a span; if that is causing issues I could remove it.

reparsing MFM

If you are re-parsing MFM, note that every line break in the input will result in a line break in the output.

Side note, because I also learned this while embarking on the MFM parser journey in #338: a blockquote ends at a line break as well. However, a single empty line between blockquotes is treated as if it didn't exist:

> a

> b
c

means

<blockquote>a<br>b</blockquote><br>c

KaTeX

The reason we do not federate as in your example is simple: We wouldn't understand it if we received it and thus don't expect others to understand it. As far as I am aware there is no KaTeX in Pleroma (and I think Misskey even removed it?).

Misskey and derivatives do not store HTML. Incoming notes will be converted from HTML to MFM. As you can probably guess from the monstrosity of the generated HTML you cannot transform that back into KaTeX, much less MFM. And HTML tags are not supported in MFM (apart from some exceptions). Thus if you were to federate the last example to Foundkey (and there was no MFM), it would throw out the KaTeX part completely.

(In theory I would like to fix this and store HTML at some point. But probably in the very distant future because it would be a HUGE refactor.)

MFM functions

Regarding $[flip.h,v example] and its transformation to <span style="display: inline-block; transform: scale(-1);">example</span>: I would be surprised if any HTML sanitizer of fediverse software would allow the style attribute to pass through.

I am currently working on using an "actual" Markdown parser. Taking inspiration from sfr's MFM extension for marked.js it would render this example as:

<span class="mfm-flip" data-mfm-h data-mfm-v>example</span>

My idea is that this output would eventually also be federated like that. However I'm only beginning to experiment with that for Foundkey Pages, see also #338.

Author

> you are ignoring the line break markup though, in the snippet you gave there is <span><br></span>

Indeed, we do ignore this! Akkoma checks whether the source is of mediaType text/x.misskeymarkdown. If it is, then we can't use the content (hence this issue), so we reprocess the MFM source. So we don't just ignore the <span><br></span>, but actually the entire content!
The reason we don't have a newline is that we interpret this as Markdown, and Markdown doesn't consider \n to be a newline. For a newline in Markdown you need either \n\n (which is more like a paragraph because it also adds a blank line) or \n preceded by two trailing spaces.
I'm sure we can fix this on Akkoma's side for MFM, though. (Done.)
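
To make the receiving-side behaviour described above concrete, here is a minimal sketch in Python. It is not Akkoma's actual code; the function names and the dict shape of the note are assumptions made for illustration. It picks the MFM source when the mediaType matches and turns every single newline into a Markdown hard break (two trailing spaces), so a plain \n survives Markdown rendering.

```python
MFM_MEDIA_TYPE = "text/x.misskeymarkdown"


def mfm_newlines_to_markdown(text: str) -> str:
    """Make every newline a Markdown hard break (two trailing spaces)."""
    return text.replace("\n", "  \n")


def pick_text_to_render(note: dict) -> tuple:
    """Return (kind, text): the MFM source if present, else the HTML content.

    `note` is assumed to be an already-decoded ActivityPub object.
    """
    source = note.get("source")
    if isinstance(source, dict) and source.get("mediaType") == MFM_MEDIA_TYPE:
        # The federated HTML content is ignored entirely in this branch.
        return ("mfm", mfm_newlines_to_markdown(source.get("content", "")))
    return ("html", note.get("content", ""))
```

With this approach the <span><br></span> in content never matters for MFM notes, because only the source is looked at.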

You don't really need to store the HTML I think, you just process it on sending out (which already happens), but you do make a good point about the HTML sanitizer. I never actually worked with that, so I'm unsure how fine-grained you can tell it to go 🤔
Using custom classes/attributes instead of what we have now does indeed sound like a good solution to me. In practice remote instances still need to add some support, but then it's mostly just some CSS and not a whole parser.

I understand this isn't something that will happen very soon and I understand why, so thank you for a quick response 🤗❤️

Owner

> Using classes instead of attributes indeed seems like a good solution to me. In practice remote instance still need to add some support, but then it's mostly just some css and not a whole parser.

Well, the nice thing is that we would have to do the same, so we have to write such a style sheet anyway, which you could of course use. But there would still be some additional support required because some data-mfm-... attributes would have to be made accessible to CSS. (Unless CSS attr() is suddenly implemented by all major browsers.) But that would be minor compared with having to reparse MFM.

As I said the eventual goal is to federate proper markup for everything but as you noted it's probably going to take a long time.

Johann150 added this to the replace MFM with HTML milestone 2023-02-18 19:18:55 +00:00
Author

Maybe relevant for the KaTeX part:

In Akkoma chat someone mentioned this https://codeberg.org/fediverse/fep/src/branch/main/fep/dc88/fep-dc88.md

Apparently there's something called MathML, which is basically some extra elements in HTML. And it seems browsers already support it. I don't know if it can go as complex as what MFM currently allows, but if so, maybe one day the KaTeX parts could be transformed to MathML.

Owner

KaTeX already renders to MathML, so adding that for federating HTML (https://akkoma.dev/FoundKeyGang/FoundKey/commit/f6c3d442655931533d5ec52e6515275a3e10a2b2) was not that hard.

I did not see a fallback level in the FEP though, so I guess formulas will be completely scrubbed by some receiving implementations now. I don't have an idea for how a fallback could be implemented either though. 🤷

Author

Ahoi, I have time to look at this again 🎉

The first thing I want to do is figure out how to do the HTML representation. Basing myself on the SCSS from #338/files and what Akkoma already does, I think the following makes sense:

  • An MFM Function consists of a name, optionally some attributes which may or may not have a value, and a content. It has the form $[name.attribute1,attribute2=value content]
  • We use a span element to represent the MFM in HTML
  • The content of the MFM Function is expressed as the innerHTML of the span element
  • The name is encoded in HTML using a class of the form mfm-<name>
    • For example $[spin tofu 🐈] has the name spin, and thus becomes <span class="mfm-spin">tofu 🐈</span>
  • Attributes are expressed in HTML as data-attributes of the form data-mfm-<attribute-name>.
    • For example $[spin.x,speed=0.5s tofu 🐈] becomes <span class="mfm-spin" data-mfm-x data-mfm-speed="0.5s">tofu 🐈</span>

To this I would add

  • You MAY add a class mfm to a span representing an MFM Function
  • You MAY add a class animated-mfm to a span representing an animated MFM Function
  • You MAY add a variable in the span's style attribute for the MFM attributes that have a value. The variable name should be of the form --mfm-<attribute-name>

For example, this could turn $[spin.x,speed=0.5s tofu 🐈] into <span class="mfm animated-mfm mfm-spin" data-mfm-x data-mfm-speed="0.5s" style="--mfm-speed: 0.5s;">tofu 🐈</span>
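
As a rough illustration of the mapping proposed above, the following Python sketch converts a single, non-nested MFM Function into a span using the mfm-<name> class and data-mfm-<attribute-name> attributes. The function name and regular expressions are hypothetical and only cover the simple form shown in this thread; a real MFM parser has to handle nesting and the rest of the syntax, and the optional MAY classes are left out.

```python
import html
import re

# Hypothetical pattern for one non-nested function: $[name.attr1,attr2=value content]
MFM_FN = re.compile(
    r"^\$\[(?P<name>[a-z0-9_]+)(?:\.(?P<attrs>\S+))? (?P<content>.*)\]$",
    re.DOTALL,
)
ATTR_NAME = re.compile(r"[a-z0-9_]+")


def mfm_function_to_span(source: str) -> str:
    """Render one MFM Function as a span with a class and data attributes."""
    match = MFM_FN.match(source)
    if match is None:
        raise ValueError("not a simple MFM function")
    parts = ['class="mfm-' + match["name"] + '"']
    for attr in (match["attrs"] or "").split(","):
        key, sep, value = attr.partition("=")
        if not ATTR_NAME.fullmatch(key):
            continue  # skip empty or odd attribute names instead of emitting them
        if sep:
            parts.append('data-mfm-' + key + '="' + html.escape(value, quote=True) + '"')
        else:
            parts.append("data-mfm-" + key)
    # The content is escaped as plain text here; nested MFM is not handled.
    return "<span " + " ".join(parts) + ">" + html.escape(match["content"]) + "</span>"
```

Calling it on the two spin examples from the list above yields the spans given there.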

I'm thinking this could become a FEP, but this is all very new to me, so first I would like to hear your idea on this. Does this make sense? Are there things you think should be done differently?

@a If possible, I would like to hear your thoughts on this too. The goal is to express MFM properly in HTML so it can be more properly federated over AP in the content field, without getting too much into trouble with HTML scrubbers and such.

EDIT: I hadn't checked https://akkoma.dev/nbsp/marked-mfm before. From what I gather from the README, it uses _mfm_x2_ instead of mfm-x2. Akkoma also uses this. So it probably makes more sense to say

  • The name is encoded in HTML using a class of the form _mfm_<name>_
Owner

To chime in very briefly:
In the interest of better integration with mfm-supporting systems without necessarily embedding the source into the objects, I would also add a data-mfmsource attribute to show everything but the content.
So $[name.attr1,attr2=val content] would become <span class="mfm mfm-name" data-mfm-attr1 data-mfm-attr2="val" data-mfmsource="name.attr1,attr2=val">content</span>.

This means that mfm-supporting systems wouldn't have to add an extra query to show the mfm source, and could simply extract it from the data attribute.

Author

> In the interest of better integration with mfm-supporting systems without necessarily embedding the source into the objects, I would also add a data-mfmsource attribute to show everything but the content.

Unless things changed in the past year since I made this issue, the source (I assume we are talking about the "source" field in the ActivityPub object) currently MUST be provided for MFM to work on mfm-supporting systems. So it will always make sense to send it, unless maybe some day every Misskey fork decides not to require that any more. Adding support for understanding the data-mfmsource needs work, so they may as well support the HTML representation instead, which should be much easier. Adding the source (or a hint of it) to the content isn't done for any other content type either, so it would be a first. And it would only happen for the MFM Functions, not for the Markdown or KaTeX or HTML or whatever else MFM has.

So I don't really see the value of adding an extra data-mfmsource 🤔 What could be a use case for a remote server to first rebuild the MFM Function representation (while not doing that for the other input formats)? And why would that be preferred over embedding the source, which is already standard practice?


Some comments

  • using data attributes as the canonical parameter value source means postprocessing on the receiving server or scripts in the frontend are required. However turning all data-mfm-* into CSS variables seems rather straightforward — but care must be taken to avoid parameter values injecting unrelated CSS bits. (Is just disallowing semicolons in values enough?)
    On the upside this probably makes it simpler to extract individual parameters compared to using CSS variables directly. (And ofc using CSS variables means Style must be sanitised and (partially) kept, while with data attributes it can be purged and recreated)
    Overall seems like a good decision to me.
  • i’m not sure what the MAY parts are supposed to achieve
    • is adding anything else forbidden or just not expected to be honoured by the receiving end?
    • if the latter, what is the receiving end supposed to do if e.g. data attributes and CSS variable values disagree?
    • if animated-mfm is optional, receiving servers wishing to pause animations by default will need to add it based on the function name anyway. Are they allowed to strip animated-mfm from nodes whose function they don't support any animated effect for?
  • Should receivers keep all MFM attributes or only those they (currently) understand?
  • i’m indifferent to exact naming _mfm_/mfm-/...
  • given source is a standard AP property and by necessity already set by all MFM-federating servers i currently can't see a benefit in duplicating source bits in HTML content
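
The first point above (turning data-mfm-* attributes into CSS variables without letting values inject unrelated CSS) could look roughly like the Python sketch below. It does not answer whether banning semicolons alone is enough; instead it assumes a stricter allow-list of characters, which is a choice made for this illustration and not something agreed in this thread. The function name and the dict input are hypothetical as well.

```python
import re
from typing import Dict, Optional

PREFIX = "data-mfm-"
SAFE_NAME = re.compile(r"[a-z0-9-]+")
# Assumed allow-list: letters, digits, dot, percent, hash, minus. Anything else
# (semicolons, colons, spaces, quotes, parentheses, ...) rejects the value.
SAFE_VALUE = re.compile(r"[A-Za-z0-9.%#-]+")


def data_attrs_to_style(attrs: Dict[str, Optional[str]]) -> str:
    """Build a style string of --mfm-* variables from data-mfm-* attributes.

    Attributes without a value (None) and anything failing the allow-list
    are dropped, so the style can be purged and recreated on the receiver.
    """
    declarations = []
    for name, value in attrs.items():
        if not name.startswith(PREFIX) or value is None:
            continue
        short = name[len(PREFIX):]
        if SAFE_NAME.fullmatch(short) and SAFE_VALUE.fullmatch(value):
            declarations.append("--mfm-" + short + ": " + value + ";")
    return " ".join(declarations)
```

Because the incoming style attribute is never consulted, a disagreement between data attributes and CSS variables cannot arise on the receiving side in this sketch.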

Furthermore, while this defines how to represent MFM to move it across servers in a HTML sanitiser-friendly way, without a standard for interpreting this representation it doesn’t actually solve the issue of diverging display across implementations. I think it would be good to also provide a CSS stylesheet with which federated MFM can be rendered.
When i checked it once, most of Mk12 MFM effects were easily representable with CSS when using attributes and variables (but beware e.g. x4 and x2 scaling differently depending on whether they’re nested or not). E.g. sparkle unfortunately requires more work than just copying Misskey CSS since individual particles pop in and out of existence in Misskey. For a static HTML representation we’d need to add a sufficient, fixed amount of particles and extend their path to a loop, hiding them as necessary. I think it will also require to make individual particle nodes identifiable in some way?
E.g. followmouse (not sure if it’s a Firefish/iceshrimp extension or standard Mk12) might be impossible without frontend scripts unless there’s some CSS trick for it. Tbh though, i’m kinda leaning towards considering effects only achievable via scripts non-portable.

EDIT: oh also, what happens when a MFM function moves an object? Is it allowed to leave normal post display area, get clipped or is the post width/height expanded (and other content if necessary shifted) to always encapsulate everything?

Owner

I think the general syntax @ilja set out is good, since it's the same as I described in a previous post 😉

  1. You MAY add a class mfm to a span representing an MFM Function

    🤔 Does not seem very helpful. What use is it to know that something is MFM if I don't understand the specific mfm thingy? Besides you could probably still tell that it's mfm by having a mfm-* class.

  2. You MAY add a class animated-mfm to a span representing an animated MFM Function

    👎 i agree with oneric:

    if animated-mfm is optional, receiving servers wishing to pause animations by default will need to add it based on the function name anyway.

  3. You MAY add a variable in the span's style attribute for the MFM attributes who have a value. The variable name should be of the form --mfm-<attribute-name>

    👎 As I mentioned previously, I would assume that the style attribute is sanitized away, thus not super sensible to have this. Duplicating the information also allows the possibility of having two different values (might be used for fingerprinting or targeted "attacks"?)
    Admittedly, using CSS variables might be required for rendering, but again being optional means an implementation would have to re-do it anyway.

  4. So I don't really see the value of adding an extra data-mfmsource

    👍 though the same desync problem as (3) still applies even with source

  5. The name is encoded in HTML using a class of the form _mfm_<name>_

    👎 From what I've seen, classes in kebab case, i.e. mfm-* are more natural. That would also be the same naming scheme as required for the data-mfm-* attributes.

  6. using data attributes as the canonical parameter value source means postprocessing on the receiving server or scripts in the frontend are required. However turning all data-mfm-* into CSS variables seems rather straightforward — but care must be taken to avoid parameter values injecting unrelated CSS bits. (Is just disallowing semicolons in values enough?)

    with a proper implementation, potential injection into CSS variables can be prevented

    • setting the content of CSS variables is possible from JavaScript, e.g. https://css-tricks.com/updating-a-css-variable-with-javascript/
    • once the content is in the CSS variable, using var(--mfm-*) is treated as a single CSS token and not verbatim replacement, so injecting a semicolon should not be possible
    • the format of values that can be in each respective attribute are known ahead of time so could even be validated
  7. Furthermore, while this defines how to represent MFM to move it across servers in a HTML sanitiser-friendly way, without a standard for interpreting this representation it doesn’t actually solve the issue of diverging display across implementations.

    I would like to point out #338 again, see https://akkoma.dev/FoundKeyGang/FoundKey/src/branch/markdown/packages/client/src/mfm.scss

    • for that stylesheet to work properly, duplicate data-mfm-* attributes to --mfm-* CSS variables
    • NB: data-mfm-* can be used in CSS selectors (when it's just about the presence of the variable) but CSS variables can not.
    • diverging implementations will probably always exist ("we're just cooler than the other misskey forks 😏" or "why are these JSON-LD schemes all http in 2023? let's upgrade them to https! all the others are dumb and insecure")
    • this also implements the nesting of x2, x3 and x4.
  8. oh also, what happens when a MFM function moves an object? Is it allowed to leave normal post display area

    ABSOLUTELY NOT!
    -- the Misskey dev who implemented that nesting of x2 & co. for reasons of single emojis being 20x screen height

    for similar reasons I would also suggest implementing something like having a separate "show more" button on posts

  9. Should receivers keep all MFM attributes or only those they (currently) understand?

    By the current logic of keeping the whole MFM string: Yes, they should.


I would like to point out #338 again, see https://akkoma.dev/FoundKeyGang/FoundKey/src/branch/markdown/packages/client/src/mfm.scss

ohh thx, i missed this style sheet before, neat.

Afaict it already implements all MFM functions supported by FoundKey except sparkle?
As mentioned before sparkle needs some work to turn its on-the-fly generated particles into a looping animation with a fixed amount of nodes, but it should be possible.

Comparing to iceshrimp-js (since it has a publicly accessible MFM cheatsheet on each instance; e.g.: https://heckin.how/mfm-cheat-sheet), the following are missing but seem easy enough to add:

  • loop=<count> (animation-iteration-count) and delay=<time> parameters for all animated effects
  • crop function (clip-path)
  • border function
  • ruby

I might just be unaware of some nifty CSS feature, but the following seem impossible to implement without frontend/app scripting however and i’m inclined to consider them out of scope:

  • followmouse
  • unixtime (turning a unix timestamp into a human-friendly time in the reader’s timezone)

But well, we can always start with just the easy bits (e.g. current FoundKey without sparkle).

Notably though there already is a probably unresolvable incompatibility here: e.g. scale and position allow content to leave the post area. FoundKey clips anything outside, iceshrimp and iirc other *keys typically allow overflow.
I guess this will have to be left implementation defined, but it rules out “rescale and shift everything to fit the post area” as an approach.

for similar reasons I would also suggest implementing something like having a separate "show more" button on posts

Sorry, i don’t quite follow; can you elaborate?
There already is a "show more" button to expand long posts in most frontends/apps. Was this remark meant to just encourage everyone to have this? Or are you suggesting this same "show more" should also toggle allowing content to overflow the post body?
Or to have an additional, separate "show MFM overflow" button?
Or something else entirely?

with a proper implementation, potential injection into CSS variables can be prevented

yeah, but let’s assume an implementation not written in JS wants to safely map attributes to variable definitions to generate an inlined style. Is it enough to just check no value contains a semicolon or is there more to look out for? I’m not really familiar with the finer details of CSS.
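
One way to sidestep the "which characters are dangerous?" question in a non-JS implementation is to allowlist value formats per parameter instead of blocklisting semicolons. The sketch below is only an illustration: the parameter names (`speed`, `delay`, `deg`, `loop`), their patterns and the helper `mfm_style` are assumptions, not taken from any existing implementation.

```python
import re

# Hypothetical allowlist: parameter name -> pattern the whole value must match.
# Anything not listed here, or not matching, is dropped instead of escaped.
ALLOWED_VALUES = {
    "speed": re.compile(r"[0-9]+(\.[0-9]+)?m?s"),  # e.g. 2s, 0.5s, 300ms
    "delay": re.compile(r"[0-9]+(\.[0-9]+)?m?s"),
    "deg": re.compile(r"-?[0-9]+(\.[0-9]+)?"),     # e.g. 30, -12.5
    "loop": re.compile(r"[0-9]+"),
}


def mfm_style(params):
    """Build an inline style string of --mfm-* variables from parsed parameters.

    `params` maps parameter names (without the data-mfm- prefix) to their
    string values; valueless flags (None) never become CSS variables.
    """
    declarations = []
    for name, value in sorted(params.items()):
        pattern = ALLOWED_VALUES.get(name)
        if pattern is None or value is None or not pattern.fullmatch(value):
            continue  # unknown parameter or unexpected value: skip it
        declarations.append(f"--mfm-{name}: {value};")
    return " ".join(declarations)


print(mfm_style({"speed": "2s", "deg": "30"}))
print(repr(mfm_style({"speed": "1s; background: url(//example.invalid)"})))
```

Because the style is always regenerated from the data attributes, any incoming `style` attribute can be purged entirely, which also avoids the "two different values" problem mentioned earlier.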


While it’s not a regular MFM function, the example in the OP also contained math and suggested federating it how it’s displayed in the frontend, e.g. HTML with screen-reader-only MathML. However, this isn’t suitable for federation as the generated HTML is too complex for scrubbers and depends on specific fonts and vendored stylesheets of the generator (here KaTeX).
Instead it should be federated as Core MathML and afaict FoundKey already does this. See also FEP-dc88, akkoma#642 (https://akkoma.dev/AkkomaGang/akkoma/pulls/642).

In my experience, MFM comes with the expectation of math typesetting, so it might be good to briefly mention this and point to FEP-dc88 in an eventual more formal MFM federation specification.

The example from OP could now federate as something like this:
<math xmlns="http://www.w3.org/1998/Math/MathML">
<semantics>
<mrow><mi>x</mi><mo>=</mo><mfrac><mrow><mo>−</mo><msup><mi>b</mi><mo lspace="0em" rspace="0em" mathvariant="normal">′</mo></msup><mo>±</mo><msqrt><mrow><mo stretchy="false">(</mo><msup><mi>b</mi><mo lspace="0em" rspace="0em" mathvariant="normal">′</mo></msup><msup><mo stretchy="false">)</mo><mn>2</mn></msup><mo>−</mo><mi>a</mi><mi>c</mi></mrow></msqrt></mrow><mi>a</mi></mfrac></mrow>

<annotation encoding="application/x-tex">
x= \frac{-b' \pm \sqrt{(b')^2-ac}}{a}
</annotation>
</semantics>
</math>

<br />

<span class="mfm-flip" data-mfm-h data-mfm-v>
FoundKey expands the world of the Fediverse
</span>

EDIT: fixed the example HTML to move MathML annotation at the end of the semantics block
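
To illustrate how little a receiver needs in order to read this representation, here is a rough sketch that pulls the function name and parameters out of such spans using only Python's standard `html.parser`. The class and function names (`MfmSpanCollector`, `extract_mfm_functions`) are made up for this example, and it assumes the `mfm-*` class / `data-mfm-*` attribute scheme discussed in this thread.

```python
from html.parser import HTMLParser


class MfmSpanCollector(HTMLParser):
    """Collect (function name, parameters) for every span with an mfm-* class."""

    def __init__(self):
        super().__init__()
        self.functions = []

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        attr_map = dict(attrs)
        classes = (attr_map.get("class") or "").split()
        names = [c[len("mfm-"):] for c in classes if c.startswith("mfm-")]
        if not names:
            return
        # Valueless attributes such as data-mfm-h are reported with value None.
        params = {
            key[len("data-mfm-"):]: value
            for key, value in attr_map.items()
            if key.startswith("data-mfm-")
        }
        self.functions.append((names[0], params))


def extract_mfm_functions(html):
    collector = MfmSpanCollector()
    collector.feed(html)
    collector.close()
    return collector.functions


example = (
    '<span class="mfm-flip" data-mfm-h data-mfm-v>'
    "FoundKey expands the world of the Fediverse</span>"
)
print(extract_mfm_functions(example))  # [('flip', {'h': None, 'v': None})]
```

A real receiver would do this as part of its HTML scrubber rather than as a separate pass; the point is only that no MFM parser is involved.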

Author

care must be taken to avoid parameter values injecting unrelated CSS bits

Yes! I consider this covered by the AP spec, https://www.w3.org/TR/activitypub/#security-sanitizing-content (but good to point out for when I start implementing)

i’m not sure what the MAY parts are supposed to achieve

I'll remove those. It was mostly based on what I saw implemented, and thought it could maybe be useful to standardise those a bit. But thinking a bit further, it doesn't actually help anything for the reasons you ( @Oneric ) and @Johann150 provided.

Should receivers keep all MFM attributes or only those they (currently) understand?

I don't think that needs to be specified here. This is just about representation. What a server decides to do with it, is up to them.

But I do think the answer is covered already in https://www.w3.org/TR/activitypub/#security-sanitizing-content "Any activity field being rendered for browsers (or other rich text enabled applications) should take care to sanitize fields containing markup to prevent cross site scripting attacks". In this case: if you know the attribute, you can sanitise it based on the expected values. If you don't know it, you can't properly sanitise it that way, so stripping may be preferable. But that depends on implementation.
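
As a rough sketch of that policy (known attribute: validate against expected values; unknown attribute: strip unless the implementation chooses otherwise), something like the following could sit in a scrubber. The attribute table and the name `sanitise_attributes` are hypothetical and only cover a few attributes for illustration.

```python
import re

TIME = re.compile(r"[0-9]+(\.[0-9]+)?m?s")

# Hypothetical table of attributes this receiver understands.
# None means "boolean flag that must not carry a value".
KNOWN_ATTRIBUTES = {
    "data-mfm-h": None,
    "data-mfm-v": None,
    "data-mfm-speed": TIME,
}


def sanitise_attributes(attrs, keep_unknown=False):
    """Return only the data-mfm-* attributes considered safe to keep."""
    kept = {}
    for name, value in attrs.items():
        if not name.startswith("data-mfm-"):
            continue  # everything else is the general scrubber's job
        if name not in KNOWN_ATTRIBUTES:
            # Unknown attributes cannot be validated; stripping is the default,
            # keeping them unvalidated is an explicit implementation choice.
            if keep_unknown:
                kept[name] = value
            continue
        pattern = KNOWN_ATTRIBUTES[name]
        if pattern is None:
            if value in (None, ""):
                kept[name] = value
        elif value is not None and pattern.fullmatch(value):
            kept[name] = value
    return kept


incoming = {
    "data-mfm-h": None,
    "data-mfm-speed": "2s",
    "data-mfm-foo": "x",
    "onclick": "evil()",
}
print(sanitise_attributes(incoming))
```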

without a standard for interpreting this representation it doesn’t actually solve the issue of diverging display across implementations

While I agree, I don't think that should be in scope here 🤔 Properly federating it, is one thing. Displaying is another. And displaying will need additions each time someone comes up with a new MFM function. If those should be standardised, I think it makes more sense to do that separately.

For Akkoma I planned to use the Foundkey scss and at least the MFM Akkoma currently supports. At the moment of writing I have a simple proof-of-concept demo web page online with those MFM, using the (slightly adapted for this use-case) (s)css from Foundkey https://ilja.space/static/mfm_examples.html.

Is it allowed to leave normal post display area,

Same as above, really. But in practice I'd leave it up to implementations. For Akkoma-fe I think we'd want to not overflow the box.

From what I've seen, classes in kebab case, i.e. mfm-* are more natural. That would also be the same naming scheme as required for the data-mfm-* attributes.

This is not something I have a feel for, so I will trust you on this. Keeping the "old" notation makes some sense, but if there's a better way, then I prefer to use that going forward. So unless anyone has arguments to the contrary, let's indeed use the mfm-* notation.

While it’s not a regular MFM function, the example in the OP also contained math

This should indeed use FEP-dc88, as Foundkey already does (see #343 (comment)) 🤗 I agree mentioning FEP-dc88 in the FEP is a good idea.


I'll let this rest for at least a week to see if more feedback comes, then I'll see if I can make a FEP (https://codeberg.org/fediverse/fep/src/branch/main/fep/a4ed/fep-a4ed.md; first time, woohoo. Still have to read how to do it, though) and maybe start implementing this in MFMParser (https://codeberg.org/ilja/mfm_parser) and Akkoma.

If there's still open remarks/questions I didn't respond to, feel free to point it out. I think I covered all of it.

Author

I just realised there's another thing we need; discovery that this representation is used.

We can't simply assume that we can always use the HTML bc there are other implementations that still assume the receiving end will reparse the MFM, and we can't assume everyone will jump to implement this. So we need to have a way to know if the received HTML can be used or not.

I see two options:

  • When there's a span in the content with a class mfm-*, then we assume we can use the HTML.
    • This is kinda ugly and it will still reparse the source when there's no MFM functions used (which is the majority of posts)
  • Add an extra flag.
    • This seems like the best option to me. Implementations who don't know about this, can simply ignore it. For those who do know it, it's useful. I think this flag should be a "SHOULD" or a "MAY". I'm leaning to a "MAY" to hint to other servers that they can use the content.

The problem with an extra flag is what to use as a "flag"? It could be a completely new value, but then I'd need an extra namespace or something for the context. I don't have an http thing which I consider stable enough for this, and I don't want to force implementations to have their own schema. I remember Christine mentioning that urn: is also an option, so maybe something like

  "@context": [
    "https://www.w3.org/ns/activitystreams", 
    {
      "http-mfm": "urn:sha256:721ef0f4c99e61f0ec32e76916b24e1bb904810be17b60e58cac0d55d131967e"
    }
  ],

with 721ef0f4c99e61f0ec32e76916b24e1bb904810be17b60e58cac0d55d131967e being echo -n 'FEP-xxx: HTTP representation for MFM' | sha256sum. I'm unsure if there's precedent for this.
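
For anyone who wants to reproduce this without a shell, the same construction can be sketched in Python: hash the label string exactly as `echo -n` would emit it (no trailing newline) and put the result into the context. The helper name `urn_for` is made up; the label and the `http-mfm` term are the ones from the snippet above.

```python
import hashlib
import json


def urn_for(label):
    """Return a urn:sha256: identifier for the UTF-8 bytes of `label`."""
    digest = hashlib.sha256(label.encode("utf-8")).hexdigest()
    return f"urn:sha256:{digest}"


label = "FEP-xxx: HTTP representation for MFM"
context = [
    "https://www.w3.org/ns/activitystreams",
    {"http-mfm": urn_for(label)},
]
print(json.dumps({"@context": context}, indent=2))
```

Note that the identifier changes as soon as the label does (e.g. once the FEP gets a real number), so the label would have to be fixed before anyone starts emitting it.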

Or maybe something existing can be used? And what about the value of this flag? Is it just true or something else?
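
If the flag did end up being a plain `true`, the receiving side could be sketched roughly as below. Everything here is an assumption for illustration: the property name `http-mfm` is borrowed from the context snippet above, the value `true` is just one of the options being asked about, and `text/x.misskeymarkdown` is assumed to be the `mediaType` MFM-federating servers put on `source`.

```python
MFM_MEDIA_TYPE = "text/x.misskeymarkdown"  # assumed mediaType of MFM sources


def can_use_html_content(obj):
    """Decide whether `content` can be used as-is or `source` must be reparsed."""
    source = obj.get("source")
    if not isinstance(source, dict) or source.get("mediaType") != MFM_MEDIA_TYPE:
        return True  # not an MFM post: content is used as usual
    # MFM post: only trust the HTML if the sender explicitly says so.
    return obj.get("http-mfm") is True


plain_note = {"content": "<p>hi</p>"}
old_mfm_note = {
    "content": "<i>hi</i>",
    "source": {"content": "$[flip hi]", "mediaType": MFM_MEDIA_TYPE},
}
new_mfm_note = dict(old_mfm_note, **{"http-mfm": True})

for note in (plain_note, old_mfm_note, new_mfm_note):
    print(can_use_html_content(note))
```

Servers that don't know the flag simply never look at it, which is the "can simply ignore it" property described above.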

Any thoughts on that?

Owner

Regarding flags/namespaces I think there is a FEP in that direction (capabilities) already, though I'm not personally familiar with it.

I agree that the rendering is a separate issue that, at least definitely when getting into "new" MFM functions, does not belong here (a Foundkey issue after all).

ilja referenced this issue from a commit 2024-08-10 17:59:12 +00:00
ilja referenced this issue from a commit 2024-08-10 18:14:22 +00:00
Reference: FoundKeyGang/FoundKey#343