[feat] Expose original content encoding in RSS feeds #392

Open
opened 2022-12-22 02:27:57 +00:00 by pzingg · 0 comments

The idea

As a follow-on to #391, RSS feeds should expose the original source content and content type of each post.

  1. Expose a source:encoded sub-element for each item element, containing the original content (i.e. activity["source"]["content"]) and with a type attribute having the value of activity["source"]["mediaType"].

  2. If the title of the item is parsed from the original source content with the :parse_source configuration of #391, expose an optional descriptionOffset attribute on the source:encoded element. Its value is the grapheme index into the source:encoded value at which the source description (the content that follows the parsed title) begins. Per the title parsing rules in #391, this is after the two newline separators, or after the h1 or h2 element in the case of a text/html encoded post.
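
To make the two points above concrete, here is a rough sketch in Python (not Akkoma's Elixir, and not a proposed implementation). It assumes the namespace URI is the same as the 'source' namespace page, that the title rules of #391 reduce to "first blank line" for text formats and "first closing h1/h2 tag" for text/html, and it counts Unicode code points as a stand-in for graphemes, which differs for combining sequences and many emoji. The helper names description_offset and source_encoded are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET
from typing import Optional

# Assumption: the namespace URI matches the 'source' namespace page.
SOURCE_NS = "http://source.scripting.com/"
ET.register_namespace("source", SOURCE_NS)


def description_offset(content: str, media_type: str) -> Optional[int]:
    """Return the index where the description starts, or None if no title is found.

    Simplified stand-in for the #391 title rules. Offsets are code point
    indices here, not true grapheme indices.
    """
    if media_type == "text/html":
        # Description starts right after the first closing h1 or h2 tag.
        match = re.search(r"</h[12]\s*>", content, flags=re.IGNORECASE)
        return match.end() if match else None
    # Other formats: description starts after the first two-newline separator.
    index = content.find("\n\n")
    return index + 2 if index != -1 else None


def source_encoded(activity: dict) -> ET.Element:
    """Build a source:encoded element from activity["source"]."""
    source = activity["source"]
    element = ET.Element(f"{{{SOURCE_NS}}}encoded", {"type": source["mediaType"]})
    element.text = source["content"]
    offset = description_offset(source["content"], source["mediaType"])
    if offset is not None:
        # Optional attribute: only present when a title was parsed.
        element.set("descriptionOffset", str(offset))
    return element


if __name__ == "__main__":
    activity = {"source": {"content": "# Title\n\nBody text", "mediaType": "text/markdown"}}
    print(ET.tostring(source_encoded(activity), encoding="unicode"))
```

A feed consumer could then split the original content at descriptionOffset to recover the title part and the description part without re-parsing the markup.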

The reasoning

To enable full interop between systems without loss of intention, RSS feeds should expose the content of items as originally authored, not just as algorithmically re-encoded into HTML by the different ActivityPub implementations (glitch-soc, Pleroma, etc.) that support a variety of content types.

The description element in RSS is expressly re-encoded in HTML, and cannot be re-processed back to its original content type, so additional elements are needed to preserve the original intention and encoding. We cannot use content:encoded because the RSS Best Practices Profile (https://www.rssboard.org/rss-profile#namespace-elements-content-encoded) says, "The content MUST be suitable for presentation as HTML and be encoded as character data in the same manner as the description element." We want to preserve the other formats, not just HTML.

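Dave Winer, creator of RSS 2.0, has written recently about the importance of this capability, although he limits his discussion to the text/markdown content type:

  • To MastoDevs re RSS + Markdown: http://scripting.com/2022/11/27/203645.html
  • Dev notes for Markdown in RSS: http://scripting.com/2022/07/19/152235.html
  • The 'source' namespace: http://source.scripting.com/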

The proposal in this issue extends his ideas to the other post content types that Pleroma supports. If the content type is text/markdown, we could also expose the source:markdown element that he suggests.
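
As a rough illustration of that last point, the sketch below (Python again, purely for illustration) adds source:encoded to every item and additionally adds source:markdown only when the original content type is text/markdown. The function name add_source_elements is hypothetical, and the namespace URI is assumed to match the 'source' namespace page.

```python
import xml.etree.ElementTree as ET

# Assumption: the namespace URI matches the 'source' namespace page.
SOURCE_NS = "http://source.scripting.com/"
ET.register_namespace("source", SOURCE_NS)


def add_source_elements(item: ET.Element, activity: dict) -> None:
    """Append source:encoded, plus source:markdown for Markdown posts, to an item."""
    source = activity["source"]

    # Always expose the original content together with its media type.
    encoded = ET.SubElement(item, f"{{{SOURCE_NS}}}encoded", {"type": source["mediaType"]})
    encoded.text = source["content"]

    # For Markdown posts, also expose the element Dave Winer suggests.
    if source["mediaType"] == "text/markdown":
        markdown = ET.SubElement(item, f"{{{SOURCE_NS}}}markdown")
        markdown.text = source["content"]


if __name__ == "__main__":
    item = ET.Element("item")
    add_source_elements(item, {"source": {"content": "*hello*", "mediaType": "text/markdown"}})
    print(ET.tostring(item, encoding="unicode"))
```

Readers that only understand source:markdown would keep working, while readers that understand source:encoded get the same treatment for every content type.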

Have you searched for this feature request?

  • I have double-checked and have not found this feature request mentioned anywhere.
  • This feature is related to the Akkoma backend specifically, and not pleroma-fe.
pzingg added the feature request label 2022-12-22 02:27:57 +00:00
floatingghost added the extremely low priority label 2022-12-22 05:01:21 +00:00
Reference: AkkomaGang/akkoma#392