[bug] MFM parsing seems to break when underscore used in emoji shortcode #588

Open
opened 2023-07-14 21:30:43 +00:00 by Johann150 · 3 comments

Your setup

OTP

Extra details

nixnet.social

Version

3.8.0-0-gccae7ef

PostgreSQL version

No response

What were you trying to do?

Send a post from Foundkey. Content of the post:

@fristi@akkos.fritu.re :akko_mmh: no? `:neofox_flop__w_:`
JSON representation of the post
{
  id: 'https://genau.qwertqwefsday.eu/notes/9h6lit1t0o',
  type: 'Note',
  attributedTo: 'https://genau.qwertqwefsday.eu/users/8oxbqesrd1',
  summary: null,
  content: '<p><span class="h-card"><a href="https://akkos.fritu.re/users/fristi" class="u-url mention">@fristi@akkos.fritu.re</a></span><span> </span>​:akko_mmh:​<span> no? </span><code>:neofox_flop__w_:</code></p>',
  source: {
    content: '@fristi@akkos.fritu.re :akko_mmh: no? `:neofox_flop__w_:`',
    mediaType: 'text/x.misskeymarkdown',
  },
  published: '2023-07-14T20:28:43.025Z',
  to: [
    'https://www.w3.org/ns/activitystreams#Public',
  ],
  cc: [
    'https://genau.qwertqwefsday.eu/users/8oxbqesrd1/followers',
    'https://akkos.fritu.re/users/fristi',
  ],
  inReplyTo: 'https://akkos.fritu.re/objects/86fbcc0f-2ee7-4a59-a6d9-acd8337c7127',
  attachment: [],
  sensitive: false,
  tag: [
    {
      type: 'Mention',
      href: 'https://akkos.fritu.re/users/fristi',
      name: '@fristi@akkos.fritu.re',
    },
    {
      id: 'https://genau.qwertqwefsday.eu/emojis/akko_mmh',
      type: 'Emoji',
      name: ':akko_mmh:',
      updated: '2021-08-03T13:04:48.238Z',
      icon: {
        type: 'Image',
        mediaType: 'image/png',
        url: 'https://genau.qwertqwefsday.eu/files/cb4249e6-ffc3-4374-8943-e9c8c67fc646',
      },
    },
  ],
}

What did you expect to happen?

The post is rendered as

  1. mention
  2. emoji with shortcode :akko_mmh: which is referenced as a tag in the JSON representation
  3. the text " no? "
  4. code block with the content :neofox_flop__w_:

In other words it should be parsed as if it was this pseudo HTML

<mention>@fristi@akkos.fritu.re</mention> <emoji>:akko_mmh:</emoji> no? <code>:neofox_flop__w_:</code>
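The expected parse above can be sketched as a small tokenizer. This is only an illustration in Python, not Akkoma's MFM parser: the `tokenize` function, the `TOKEN` pattern, and the character classes used for mentions and shortcodes are all assumptions made for this example.

```python
import re

# Hypothetical token pattern (assumption, not taken from Akkoma):
# inline code is tried first so that a shortcode inside backticks
# stays literal text inside the code token.
TOKEN = re.compile(
    r"(?P<code>`[^`\n]*`)"
    r"|(?P<mention>@[A-Za-z0-9_.-]+(?:@[A-Za-z0-9_.-]+)?)"
    r"|(?P<emoji>:[A-Za-z0-9_+-]+:)"
)

def tokenize(text):
    """Split text into (kind, value) pairs: mention, emoji, code, or text."""
    tokens = []
    pos = 0
    for match in TOKEN.finditer(text):
        if match.start() > pos:
            # Anything between two recognised tokens is plain text.
            tokens.append(("text", text[pos:match.start()]))
        kind = match.lastgroup
        value = match.group(0)
        if kind == "code":
            value = value[1:-1]  # drop the surrounding backticks
        tokens.append((kind, value))
        pos = match.end()
    if pos < len(text):
        tokens.append(("text", text[pos:]))
    return tokens

post = "@fristi@akkos.fritu.re :akko_mmh: no? `:neofox_flop__w_:`"
for token in tokenize(post):
    print(token)
```

With this sketch the underscores never reach an emphasis rule, because mentions, shortcodes, and code spans are consumed as whole tokens first.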

What actually happened?

The post is rendered as

  1. mention
  2. the text :akko
  3. the italic text mmh: no? `:neofox_flop__w
  4. the text :`

In other words it seems to be parsed as if it was this pseudo HTML

<mention>@fristi@akkos.fritu.re</mention> :akko<i>mmh: no? `:neofox_flop__w</i>:`
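For illustration, a deliberately naive emphasis rule that knows nothing about emoji shortcodes or code spans produces exactly this kind of result. The sketch below is Python and is not Akkoma's actual markdown parser; `NAIVE_ITALIC` and `naive_italics` are hypothetical names, and the greedy pattern is an assumption chosen because it reproduces the reported output for this input.

```python
import re

# Deliberately naive rule (assumption for illustration only): everything
# between the first and the last underscore is treated as italic text.
NAIVE_ITALIC = re.compile(r"_(.+)_")

def naive_italics(text: str) -> str:
    """Wrap the span between the first and last underscore in <i> tags."""
    return NAIVE_ITALIC.sub(r"<i>\1</i>", text)

post = "@fristi@akkos.fritu.re :akko_mmh: no? `:neofox_flop__w_:`"
print(naive_italics(post))
# @fristi@akkos.fritu.re :akko<i>mmh: no? `:neofox_flop__w</i>:`
```

The first underscore of `:akko_mmh:` opens the italic span and the last underscore inside the backticks closes it, so neither the shortcode nor the code span survives.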

Logs

No response

Severity

I can manage

Have you searched for this issue?

  • I have double-checked and have not found this issue mentioned anywhere.
Johann150 added the bug label 2023-07-14 21:30:43 +00:00
Author

Something weird may be going on in the space where the markdown and MFM parsers interact? https://akkoma.dev/AkkomaGang/akkoma/src/branch/develop/lib/pleroma/web/common_api/utils.ex#L292-L294


not sure how easily fixable this is. the underscores would indeed delimit a markdown italics span, and the parser probably treats them as such, since emoji are not something it knows exist

there may be hacks to auto-escape underscores within codes or other such nonsense, though
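A rough sketch of what such an auto-escaping hack could look like, written in Python for illustration. Nothing here is taken from Akkoma: `escape_underscores`, the `PROTECTED` pattern, and the character classes for shortcodes and mentions are assumptions. It also assumes the markdown parser that runs afterwards honours backslash-escaped underscores.

```python
import re

# Hypothetical pre-processing pattern (assumption, not Akkoma code).
PROTECTED = re.compile(
    r"(?P<code>`[^`\n]*`)"             # inline code span: left untouched
    r"|(?P<emoji>:[A-Za-z0-9_+-]+:)"   # emoji shortcode such as :akko_mmh:
    r"|(?P<mention>@[A-Za-z0-9_.-]+(?:@[A-Za-z0-9_.-]+)?)"  # @user or @user@host
)

def escape_underscores(text: str) -> str:
    """Backslash-escape underscores inside shortcodes and mentions only."""
    def fix(match):
        if match.group("code") is not None:
            # Escaping inside a code span would show the backslashes,
            # so code spans are returned unchanged.
            return match.group(0)
        return match.group(0).replace("_", r"\_")
    return PROTECTED.sub(fix, text)

print(escape_underscores("@fristi@akkos.fritu.re :akko_mmh: no? `:neofox_flop__w_:`"))
print(escape_underscores("@t_e_s_t _test_"))
```

Underscores outside shortcodes and mentions, such as the intended emphasis in `_test_`, are left alone so that ordinary italics keep working.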


I can confirm that what looks like the same thing happens when there's an underscore in a tagged username.
(The MFM post @t_e_s_t _test_ being rendered as <p>@t<em>e_s_t _test</em></p>)

{
  "@context": [
    "https://www.w3.org/ns/activitystreams",
    "https://ak.cute.rest/schemas/litepub-0.1.jsonld",
    {
      "@language": "und"
    }
  ],
  "actor": "https://ak.cute.rest/users/t_e_s_t",
  "attachment": [],
  "attributedTo": "https://ak.cute.rest/users/t_e_s_t",
  "cc": [
    "https://ak.cute.rest/users/t_e_s_t/followers"
  ],
  "content": "<p>@t<em>e_s_t _test</em></p>",
  "contentMap": {
    "en": "<p>@t<em>e_s_t _test</em></p>"
  },
  "context": "https://ak.cute.rest/contexts/b5bd9248-0268-44ac-8e6e-83022679dbb3",
  "conversation": "https://ak.cute.rest/contexts/b5bd9248-0268-44ac-8e6e-83022679dbb3",
  "id": "https://ak.cute.rest/objects/a7a510f3-b607-4bf9-adbf-380dbc52521f",
  "published": "2023-09-30T15:18:49.709432Z",
  "sensitive": null,
  "source": {
    "content": "@t_e_s_t _test_",
    "mediaType": "text/x.misskeymarkdown"
  },
  "summary": "",
  "tag": [],
  "to": [
    "https://www.w3.org/ns/activitystreams#Public"
  ],
  "type": "Note"
}
Reference: AkkomaGang/akkoma#588