[feat] Allow configuration of title-less items in RSS feeds #391

Open
opened 2022-12-22 02:01:48 +00:00 by pzingg · 0 comments

The idea

  1. Add a configuration key for a boolean value :parse_source with a default value of false under config :pleroma, :feed, :post_title. An example description for the key is shown below.

  2. Modify the function Pleroma.Formatter.truncate/3 to accept zero as the value for Pleroma.Config.get([:feed, :post_title, :max_length]). Currently, this function will raise an exception if max_length is less than String.length(omission). If max_length is less than the length of the omission string, just use String.slice(text, 0, max_length) (without adding the omission) as the truncated text.

  3. Modify the logic in Pleroma.Web.Feed.FeedView.activity_title/2 to test whether the opts parameter has %{parse_source: true} and the activity has non-nil values at both activity.data["source"]["mediaType"] and activity.data["source"]["content"]. If both conditions hold, do not use the HTML content of the activity to produce the title. Instead, parse the title from the "source" "content", according to the rules for each content type, ignoring the :max_length configuration value (see below for the rules).
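
To illustrate step 2, here is a rough sketch of the truncation rule described there. TruncateSketch is a hypothetical stand-in module, not the actual Pleroma.Formatter code, and the default arguments are assumptions made for the example.

```elixir
defmodule TruncateSketch do
  # Hypothetical stand-in for Pleroma.Formatter.truncate/3 (not the real code).
  # The defaults below are assumptions for illustration only.
  def truncate(text, max_length \\ 200, omission \\ "...") do
    text = String.trim_trailing(text)

    cond do
      # The text already fits: return it unchanged.
      String.length(text) <= max_length ->
        text

      # No room for the omission string: slice without appending it.
      # With max_length == 0 this yields an empty string instead of raising.
      max_length < String.length(omission) ->
        String.slice(text, 0, max_length)

      # Usual case: leave room for the omission string, then append it.
      true ->
        String.slice(text, 0, max_length - String.length(omission)) <> omission
    end
  end
end
```

With this rule, a max_length of zero produces an empty title, which the feed view could then treat as "no title".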

Notes

If the :parse_source option is set, and the rules set forth below fail to parse out an authored title, the :max_length option will be used as before to render a plaintext title sub-element and a description sub-element that contains HTML encoded from the original full activity.data["source"]["content"].

If the :parse_source option is set, no title is parsed, and :max_length is zero, the feed item/entry will be rendered without a title sub-element.

If the :parse_source option is set, and the rules below parse out an authored title, the description sub-element of the feed item/entry will contain HTML encoded from a slice of the activity.data["source"]["content"] beginning at an offset just beyond the end of the parsed title. In this way, the description sub-element will not repeat the title that was parsed out.

In all cases, including under the current behavior, the title sub-element's value should be a single line of plain text, trimmed of leading and trailing whitespace, without links or emojis. If no title is produced, the feed item/entry will be rendered without a title sub-element.

Example :parse_source in description.exs

          %{
            key: :parse_source,
            type: :boolean,
            description: "Use content type-specific parsers to extract title (ignores max_length)",
            suggestions: [true]
          }
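
For context, an admin's configuration might then look roughly like the fragment below. This is only a sketch: it assumes :post_title remains a map holding the existing :max_length and :omission options, and the values shown are illustrative.

```elixir
import Config

# Assumed shape: :max_length and :omission are the existing :post_title
# options; :parse_source is the key proposed in this issue.
config :pleroma, :feed,
  post_title: %{
    max_length: 100,
    omission: "...",
    parse_source: true
  }
```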

Parsing title from "text/html" content

The title is the content of an initial h1 or h2 element in the content, or nil if this is not satisfied.

Parsing title from "text/plain" content

The title is a leading single line of text, separated from the description by two newlines, or nil if this is not satisfied. This is the same logic that separates the headers of a plaintext email from its body.
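
A minimal sketch of the plain-text rule, assuming "\n" line endings only. PlainTitleSketch and the {title, rest} return shape are hypothetical names chosen for the example; rest is the slice that would feed the description.

```elixir
defmodule PlainTitleSketch do
  # Returns {title, rest} when the text starts with a single line followed
  # by a blank line, or {nil, text} when the rule is not satisfied.
  def parse(text) do
    case String.split(text, "\n\n", parts: 2) do
      [first, rest] ->
        title = String.trim(first)

        # The leading block must be exactly one non-empty line.
        if title != "" and not String.contains?(title, "\n") do
          {title, rest}
        else
          {nil, text}
        end

      _ ->
        {nil, text}
    end
  end
end
```

Returning the remainder alongside the title keeps the description from repeating the parsed title.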

Parsing title from "text/bbcode" content

The title is the content of a leading [b] element, separated from the description by two newlines, or nil if this is not satisfied.

Parsing title from "text/markdown" content

The title is the content of a leading # or ## (h1 or h2) element, separated from the description by two newlines, or nil if this is not satisfied.
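
The markdown rule could be sketched with a single regular expression, as below. MarkdownTitleSketch is a hypothetical module, and the sketch only handles ATX-style headings ("# " or "## ") with "\n" line endings; a real implementation might prefer to reuse the markdown parser already in use.

```elixir
defmodule MarkdownTitleSketch do
  # Returns {title, rest} for a leading "# " or "## " heading line followed
  # by a blank line, or {nil, text} otherwise. Deeper headings ("###") and
  # headings not followed by a blank line are not treated as titles.
  def parse(text) do
    case Regex.run(~r/\A##?[ \t]+([^\n]+)\n\n(.*)\z/s, text) do
      [_full, title, rest] -> {String.trim(title), rest}
      nil -> {nil, text}
    end
  end
end
```

The bbcode and x.misskeymarkdown rules would follow the same pattern with a different leading delimiter.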

Parsing title from "text/x.misskeymarkdown" content

The title is the content of a leading ** (bold) element, separated from the description by two newlines, or nil if this is not satisfied. (Not sure if the Pleroma implementation of x.misskeymarkdown supports a title element, which could be used in place of **.)

The reasoning

Admins configuring a server should be able to choose between:

  1. Truncating titles for all items in RSS feeds from the HTML-encoded description (i.e. from activity.data["content"]). This is the current behavior (:max_length greater than zero and :parse_source false).

  2. Making all items in RSS feeds title-less (:max_length equal to zero and :parse_source false). Rationales for this are explored in Why Mastodon should have title-less feeds (http://scripting.com/2022/12/10.html) and Common features that a "document" should support (http://this.how/whatIsADocument/).

  3. Letting post authors specify a title for individual posts by following the rules for each content type listed above, and parsing (not truncating) titles from activity.data["source"]["content"] (:parse_source equal to true). This makes authors' posts more expressive and better reflects their intentions.

Titles in other ActivityPub aware server applications:

  • The main Mastodon application and its most popular forks, Hometown and glitch-soc, do not expose title elements for RSS feed items.
  • Other microblogging software also omits titles, for example Manton Reece's micro.blog feed (https://www.manton.org/feed.xml).
  • Apparently, Matt Mullenweg has indicated that title-less items will be supported in upcoming versions of WordPress and Tumblr.

Have you searched for this feature request?

  • I have double-checked and have not found this feature request mentioned anywhere.
  • This feature is related to the Akkoma backend specifically, and not pleroma-fe.
pzingg added the feature request label 2022-12-22 02:01:48 +00:00
pzingg changed title from [feat] Allow configuration of title-less feeds to [feat] Allow configuration of title-less items in RSS feeds 2022-12-22 02:03:45 +00:00
floatingghost added the extremely low priority label 2022-12-22 05:01:29 +00:00
Reference: AkkomaGang/akkoma#391