WIP: FEP-dc88: Formatting Mathematics #642

Draft
pounce wants to merge 1 commit from pounce/akkoma:formatting-mathematics into develop
First-time contributor

See #641
This PR does everything it's supposed to for the first half of that issue, but it is ugly and useless.

Ugly

This PR changes the default HTML sanitizer to optionally allow MathML core tags. Unfortunately, the FastSanitize library is a mess. As it's all comprised of macros around partial function definitions, it's not possible to use very much abstraction (such as for comprehensions or variable definitions) to help define a sanitizer, which leads to very ugly code.

This isn't the worst of it. FastHTML uses atoms to represent 'known tags' (I've done some digging, and it's basically anything in this list: https://github.com/lexbor/lexbor/blob/master/source/lexbor/tag/const.h). About half of the math tags are 'known', and there appears to be no rhyme or reason to which are known and which aren't (annotation is not known, annotation-xml is known). So we have to define half the tags using binaries and the other half using atoms.
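
To make the atom/binary split concrete, here is a minimal sketch. It assumes the parsed tree uses `{tag, attributes, children}` tuples in which 'known' tags arrive as atoms and everything else as binaries. `TagShapeSketch` and `describe/1` are hypothetical names for illustration, not part of FastHTML or FastSanitize.

```elixir
defmodule TagShapeSketch do
  # Hypothetical helper: reports how a tag name is represented in a node tuple.
  def describe({tag, _attrs, _children}) when is_atom(tag),
    do: {:atom, Atom.to_string(tag)}

  def describe({tag, _attrs, _children}) when is_binary(tag),
    do: {:binary, tag}
end

# Assumed shapes: annotation-xml as an atom, annotation as a binary.
known = {:"annotation-xml", [{"encoding", "application/mathml+xml"}], []}
unknown = {"annotation", [{"encoding", "application/x-tex"}], []}

IO.inspect(TagShapeSketch.describe(known))
IO.inspect(TagShapeSketch.describe(unknown))

# A clause written for the wrong representation silently never matches:
IO.inspect(match?({"annotation-xml", _, _}, known))
```

The final `match?/2` call evaluates to `false`, which is why each allowed tag has to be declared in whichever form the parser actually emits.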

Useless

If you create a post with MathML core tags... it won't show up! This is due to a bug in VueJS (https://github.com/vuejs/core/issues/7820) which affects the pleroma frontend. Basically, elements are added to the DOM with the wrong namespace, and they won't render correctly unless all the math DOM nodes are refreshed. Bad stuff.
Here's a MathML post in pleroma-fe:
image
and how it is supposed to look, in mastodon-fe:
image

pounce added 1 commit 2023-09-16 21:46:42 +00:00
1b838627df
Allow MathML core tags in sanitized content
Member

Good news, the Vue bug appears to have been fixed a few weeks ago in vuejs/core@d42b6ba3 (https://github.com/vuejs/core/commit/d42b6ba3f530746eb1221eb7a4be0f44eb56f7d3), and the fix is part of release 3.4.0+.

Contributor

I wonder, is the front-end the only thing blocking this? As I understand, it does work with other front-ends like masto-fe, so I wonder if we should really wait until front-end is fixed. People generally install Akkoma-fe, but as I understand the idea is that Akkoma doesn't force/push one specific front-end.

Member

Btw, re

> As it's all comprised of macros around partial function definitions, it's not possible to use very much abstraction (such as for comprehensions or variable definitions) to help define a sanitizer, which leads to very ugly code.

I’m not too familiar with Elixir’s macro stuff, but some examples I’ve seen suggest that a regular for can at least sometimes loop over macro calls, and the code below at least compiles fine for me (with math enabled). Is something going wrong at runtime later? (Might want to add some basic tests.)

  if Pleroma.Config.get([:markup, :allow_math]) do
    Meta.allow_tag_with_these_attributes("annotation", ["encoding"])
    Meta.allow_tag_with_these_attributes(:"annotation-xml", ["encoding"])

    Meta.allow_tag_with_these_attributes(:math, [
      "display",
      "mathvariant",
      "displaystyle",
      "scriptlevel"
    ])

    basic_math_tags = [
      "maction",
      "merror",
      :mi,
      "mmultiscripts",
      :mn,
      "mphantom",
      "mprescripts",
      "mroot",
      "mrow",
      "ms",
      "msqrt",
      "mstyle",
      "msub",
      "msubsup",
      "msup",
      "mtable",
      "mtext",
      "mtr",
      "semantics"
    ]

    for tag <- basic_math_tags do
      Meta.allow_tag_with_these_attributes(tag, ["mathvariant", "displaystyle", "scriptlevel"])
    end

    for tag <- ["mover", "munder", "munderover"] do
      Meta.allow_tag_with_these_attributes(tag, [
        "accent",
        "accentunder",
        "mathvariant",
        "displaystyle",
        "scriptlevel"
      ])
    end

    Meta.allow_tag_with_these_attributes("mfrac", [
      "linethickness",
      "mathvariant",
      "displaystyle",
      "scriptlevel"
    ])

    Meta.allow_tag_with_these_attributes(:mo, [
      "form",
      "stretchy",
      "symmetric",
      "largeop",
      "movablelimits",
      "lspace",
      "rspace",
      "minsize",
      "mathvariant",
      "displaystyle",
      "scriptlevel"
    ])

    Meta.allow_tag_with_these_attributes("mpadded", [
      "width",
      "height",
      "depth",
      "lspace",
      "voffset",
      "mathvariant",
      "displaystyle",
      "scriptlevel"
    ])

    Meta.allow_tag_with_these_attributes("mspace", [
      "width",
      "height",
      "depth",
      "mathvariant",
      "displaystyle",
      "scriptlevel"
    ])

    Meta.allow_tag_with_these_attributes("mtd", [
      "columnspan",
      "rowspan",
      "mathvariant",
      "displaystyle",
      "scriptlevel"
    ])
  end
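
For reference, here is a self-contained sketch of why a plain `for` can work in a module body. It assumes the sanitizer macros ultimately expand to function clauses (not verified against FastSanitize here). `MathScrubberSketch` and `allowed_attrs/1` are hypothetical stand-ins and do not use the real `Meta` macros.

```elixir
defmodule MathScrubberSketch do
  # Hypothetical stand-in for a scrubber module; not FastSanitize's API.
  @common_attrs ["mathvariant", "displaystyle", "scriptlevel"]

  # Mixed atoms and binaries, mirroring the known/unknown tag split.
  tags = ["mrow", :mi, :mn, "msqrt"]

  for tag <- tags do
    # unquote/1 inside def is an "unquote fragment": the loop runs at
    # compile time and each iteration defines one function clause.
    def allowed_attrs(unquote(tag)), do: @common_attrs
  end

  # Fallback clause: any tag not listed above allows no attributes.
  def allowed_attrs(_tag), do: []
end

IO.inspect(MathScrubberSketch.allowed_attrs(:mi))
IO.inspect(MathScrubberSketch.allowed_attrs("mrow"))
IO.inspect(MathScrubberSketch.allowed_attrs("script"))
```

Note that the clauses are keyed on the exact representation: `allowed_attrs("mi")` falls through to the fallback because only the atom `:mi` was declared. A basic test along these lines could catch a tag listed in the wrong form.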
Author
First-time contributor

> I wonder, is the front-end the only thing blocking this? As I understand, it does work with other front-ends like masto-fe, so I wonder if we should really wait until front-end is fixed. People generally install Akkoma-fe, but as I understand the idea is that Akkoma doesn't force/push one specific front-end.

Yes, agreed. I'll rebase this soon to get it ready to merge.

This pull request is marked as a work in progress.
This branch is out-of-date with the base branch