Federate MFM in content field using HTML #343
See Akkoma issue AkkomaGang/akkoma#381
Generally, fedi servers use HTML for content, which can then be used directly. For example, if you input `*something*` using Markdown, it will be federated as `<i>something</i>` in the content field. The advantage is that implementations only need to understand HTML, not every input method some server comes up with.

In the case of MFM this isn't always true, however. This causes compatibility problems for other software, which now has to fully implement a new parser and re-parse the incoming `source` if it's MFM.

Example (from https://ilja.space/notice/ASo2tpidQ5yVRUYxG4):
When you post
It now federates as
I expect it to be something more like how the html looks when I check the post through the Foundkey FE
show example
general case
As you have probably noticed, in general we do federate HTML markup for "normal" things.
line breaks
From looking at https://ilja.space/notice/ASo4tEHCpV7nE1ZaEK I see that you are ignoring the line break markup though: in the snippet you gave there is `<span><br></span>` to mark the newline, which is apparently ignored. I'm not sure why the line break is inside of a `span`; if that is causing issues I could remove that.

reparsing MFM
If you are re-parsing MFM, note that every line break in the input will result in a line break in the output.
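The line-break rule above can be sketched in a few lines of Python. This is only an illustration, not Foundkey's or Akkoma's actual code; the function name `mfm_line_breaks_to_html` is made up for the example, and it handles nothing but escaping and line breaks.

```python
import html


def mfm_line_breaks_to_html(text: str) -> str:
    """Escape plain text and turn every input line break into a <br>.

    Hypothetical helper: unlike Markdown, a single newline is already
    a visible line break here, and an empty line yields two breaks.
    """
    # Normalize Windows and old Mac line endings first.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    return "<br>".join(html.escape(line) for line in normalized.split("\n"))
```

For instance, `mfm_line_breaks_to_html("a\nb")` keeps the break between `a` and `b`, where a Markdown renderer would join the two lines into one paragraph.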
Side note because I also learned this while embarking on the MFM parser journey in #338: A blockquote ends at a linebreak as well. However a single empty line between blockquotes is treated as if it didn't exist:
means
KaTeX
The reason we do not federate as in your example is simple: We wouldn't understand it if we received it and thus don't expect others to understand it. As far as I am aware there is no KaTeX in Pleroma (and I think Misskey even removed it?).
Misskey and derivatives do not store HTML. Incoming notes will be converted from HTML to MFM. As you can probably guess from the monstrosity of the generated HTML you cannot transform that back into KaTeX, much less MFM. And HTML tags are not supported in MFM (apart from some exceptions). Thus if you were to federate the last example to Foundkey (and there was no MFM), it would throw out the KaTeX part completely.
(In theory I would like to fix this and store HTML at some point. But probably in the very distant future because it would be a HUGE refactor.)
MFM functions
Regarding `$[flip.h,v example]` and its transformation to `<span style="display: inline-block; transform: scale(-1);">example</span>`: I would be surprised if any HTML sanitizer of fediverse software would allow the `style` attribute to pass through.

I am currently working on using an "actual" Markdown parser. Taking inspiration from sfr's MFM extension for marked.js it would render this example as:
My idea is that this output would eventually also be federated like that. However I'm only beginning to experiment with that for Foundkey Pages, see also #338.
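To illustrate the sanitizer concern, here is a minimal allowlist sanitizer sketch in Python using only the standard library. It is not the sanitizer of any particular fediverse software; the allowed tags and attributes are assumptions picked for the example. It shows why an inline `style` would typically be lost while a `class` could survive.

```python
from html import escape
from html.parser import HTMLParser

# Assumed allowlists, chosen only for this illustration.
ALLOWED_TAGS = {"span", "i", "b", "br", "p"}
ALLOWED_ATTRS = {"class"}


class AllowlistSanitizer(HTMLParser):
    """Keeps allowlisted tags and attributes and drops everything else."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED_TAGS:
            return
        kept = "".join(
            f' {name}="{escape(value or "", quote=True)}"'
            for name, value in attrs
            if name in ALLOWED_ATTRS
        )
        self.parts.append(f"<{tag}{kept}>")

    def handle_endtag(self, tag):
        # <br> is a void element and gets no closing tag.
        if tag in ALLOWED_TAGS and tag != "br":
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        self.parts.append(escape(data))


def sanitize(markup: str) -> str:
    parser = AllowlistSanitizer()
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts)
```

Feeding it the `style`-based span from the flip example leaves only `<span>example</span>`, so the visual effect is gone on the receiving side.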
Indeed, we do ignore this! Akkoma will check if the `source` is of mediaType `text/x.misskeymarkdown`. If it is, then we can't use the `content` (hence this issue), so we reprocess the MFM source. So we don't just ignore the `<span><br></span>`, but actually the entire `content`!

The reason why we don't have a newline is that we interpret this as Markdown, and Markdown doesn't consider `\n` to be a newline. For a newline in Markdown you either need `\n\n` (which is more like a paragraph because it also adds a blank line), or `\n` preceded by two trailing spaces.

I'm sure we can fix this on Akkoma's side for MFM, though.
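The decision described above could be sketched roughly like this in Python. Akkoma itself is written in Elixir, so this is not its actual code; the function name `pick_text_to_render` and the returned labels are invented for the example. Only the `source`, `content` and `mediaType` property names come from the ActivityPub object.

```python
MFM_MEDIA_TYPE = "text/x.misskeymarkdown"


def pick_text_to_render(note: dict) -> tuple:
    """Return (kind, text) for an incoming note.

    If the note carries an MFM source, the federated HTML `content`
    is ignored and the MFM source is re-parsed instead. Otherwise
    the HTML `content` is used directly.
    """
    source = note.get("source")
    if isinstance(source, dict) and source.get("mediaType") == MFM_MEDIA_TYPE:
        return ("mfm", source.get("content", ""))
    return ("html", note.get("content", ""))
```

With this logic any markup in `content`, including `<span><br></span>`, never reaches the renderer when an MFM source is present.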
So I'll do that (Done).

You don't really need to store the HTML, I think; you just process it on sending out (which already happens), but you do make a good point about the HTML sanitizer. I never actually worked with that, so I'm unsure how fine-grained you can tell it to go 🤔
Using custom classes/attributes instead of what we have now does indeed sound like a good solution to me. In practice remote instances still need to add some support, but then it's mostly just some CSS and not a whole parser.
I understand this isn't something that will happen very soon and I understand why, so thank you for a quick response 🤗❤️
Well, the nice thing is, since we would have to do the same, we have to write such a style sheet anyway, which you could of course use. But still there would be some additional support required because some `data-mfm-...` attributes would have to be made accessible to CSS. (Unless CSS `attr()` is suddenly implemented by all major browsers.) But that would be minor compared with having to reparse MFM.

As I said, the eventual goal is to federate proper markup for everything, but as you noted it's probably going to take a long time.
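As a rough idea of what class and attribute based output could look like, here is a hypothetical Python sketch. The `mfm-<name>` class names and the exact `data-mfm-<arg>` attribute layout are assumptions made for illustration, not the markup that Foundkey or sfr's marked.js extension actually emits. The regex handles only simple, non-nested `$[name.args body]` functions.

```python
import html
import re

# Matches simple, non-nested MFM functions such as "$[flip.h,v example]".
MFM_FN = re.compile(r"\$\[([a-z0-9]+)(?:\.([a-z0-9=.,-]+))?\s+([^\]]*)\]")
ARG_KEY = re.compile(r"[a-z0-9]+")


def _render(match) -> str:
    name, args, body = match.group(1), match.group(2), match.group(3)
    attrs = ""
    for arg in (args.split(",") if args else []):
        key, _, value = arg.partition("=")
        if not ARG_KEY.fullmatch(key):
            continue  # skip anything that is not a plain argument name
        if value:
            attrs += f' data-mfm-{key}="{html.escape(value, quote=True)}"'
        else:
            attrs += f" data-mfm-{key}"
    # Hypothetical output shape: a class per function, a data attribute per argument.
    return f'<span class="mfm-{name}"{attrs}>{html.escape(body)}</span>'


def render_mfm_functions(text: str) -> str:
    out, pos = [], 0
    for match in MFM_FN.finditer(text):
        out.append(html.escape(text[pos:match.start()]))
        out.append(_render(match))
        pos = match.end()
    out.append(html.escape(text[pos:]))
    return "".join(out)
```

A receiving server would then only need a style sheet keyed on those classes and attributes, plus a sanitizer that lets them through, rather than a full MFM parser.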
Maybe relevant for the katex part:
In Akkoma chat someone mentioned this https://codeberg.org/fediverse/fep/src/branch/main/fep/dc88/fep-dc88.md
Apparently there's something called MathML, which is basically some extra elements in HTML. And it seems browsers already support it. I don't know if it can go as complex as what MFM currently allows, but if so, maybe one day the KaTeX parts could be transformed to MathML.
KaTeX already renders to MathML, so adding that for federating HTML was not that hard.
I did not see a fallback level in the FEP though, so I guess formulas will be completely scrubbed by some receiving implementations now. I don't have an idea for how a fallback could be implemented either though. 🤷