[bug] RSS feeds do not pass W3C validation #387

Open
opened 2022-12-21 23:57:00 +00:00 by pzingg · 0 comments

Your setup

From source

Extra details

Ubuntu 22.04, Elixir 1.13.4 (compiled with Erlang/OTP 25)

Version

8e5a88ed

PostgreSQL version

12

What were you trying to do?

I ran the user and tag feeds produced at these URLs through the official W3C Feed Validation Service (https://validator.w3.org/feed/) to check their compliance with the RSS 2.0 and Atom 1.0 specifications.

  1. /users/:nickname/feed.rss
  2. /users/:nickname/feed.atom
  3. /tags/:tag.rss
  4. /tags/:tag.atom

What did you expect to happen?

The user and tag feeds should be as compliant as possible with the official standards.

It may be the case that non-compliant elements and attributes are included in feed output in order to support legacy feed readers or other automated tools. If so, these non-compliant elements should be identified and noted in the source code accordingly.

Examples include undefined or repeated link elements such as rel="avatar", rel="emoji", rel="header", rel="mentioned", or rel="ostatus:conversation", and the managingEditor sub-elements in the RSS 2.0 feed.

What actually happened?

The validator service reported many errors. Some of these should be corrected without discussion. Others might need to be reviewed to make sure that legacy non-compliant readers will continue to parse these feeds correctly.

Errors in RSS 2.0 (".rss") feeds:

  1. Date-times must be in RFC-822 format, with a time zone (must include a trailing "Z", "GMT", or "+0000" time zone indication)
  2. The "isPermalink" attribute of item.guid element is misspelled. It should be named "isPermaLink", with a capital "L".
  3. The channel.image element has a URL as its text value instead of the required sub-elements url, title, and link, plus the optional sub-elements description, width, and height. Reformat it accordingly.
  4. The channel element contains an undefined guid sub-element. It is redundant, since a link sub-element is present. Remove it or use a namespaced atom:id element instead?
  5. The channel element contains an undefined updated sub-element, which is redundant, since a lastBuildDate sub-element is present. Remove it or use a namespaced atom:updated element instead?
  6. The channel.managingEditor element contains many undefined elements. Per the spec, the value should be an email address and name, e.g. "geo@herald.com (George Matesky)". I am not sure whether users' email addresses are public in Pleroma. If these undefined elements are to be used, they should be namespaced, and the multiple link sub-elements should be changed to atom:link elements.
  7. The channel element should have only one link sub-element. Change channel.link[rel="next"] to an atom:link element?
  8. An item element should have only one link sub-element. Change item.link elements with rel attribute values of "emoji", "ostatus:conversation", and "mentioned" to atom:link elements?
  9. All namespace prefixes must be declared. Add "xmlns:activity", etc. declarations as necessary.
  10. If a post in the feed has a summary, the item element will have two description sub-elements, where only one is permitted. Change the second one to a namespaced atom:summary element.
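
As a rough illustration of items 1 to 3 above, here is a minimal Python sketch of what compliant output could look like. It is not Akkoma code (Akkoma is written in Elixir); the helper names `rfc822`, `build_item`, and `build_image` and their parameters are hypothetical and only show the target shape of the elements.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
import xml.etree.ElementTree as ET


def rfc822(dt: datetime) -> str:
    """Format an aware datetime as an RFC-822 style date with an explicit GMT zone."""
    if dt.tzinfo is None:
        raise ValueError("naive datetime: RSS dates need a time zone")
    # Normalize to UTC so the zone can be written as "GMT".
    return format_datetime(dt.astimezone(timezone.utc), usegmt=True)


def build_item(url: str, title: str, published: datetime) -> ET.Element:
    """Build an RSS item with a single link and a correctly spelled isPermaLink."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = url
    # Attribute name has a capital "L": isPermaLink, not isPermalink.
    ET.SubElement(item, "guid", {"isPermaLink": "true"}).text = url
    ET.SubElement(item, "pubDate").text = rfc822(published)
    return item


def build_image(image_url: str, title: str, link: str) -> ET.Element:
    """Build channel.image with the required url, title, and link sub-elements."""
    image = ET.Element("image")
    ET.SubElement(image, "url").text = image_url
    ET.SubElement(image, "title").text = title
    ET.SubElement(image, "link").text = link
    return image
```

For instance, `rfc822(datetime(2022, 12, 21, 23, 57, tzinfo=timezone.utc))` yields `Wed, 21 Dec 2022 23:57:00 GMT`.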

Warnings in RSS 2.0 (".rss") feeds:

  1. Add a channel.atom:link element with rel="self", and remove the non-standard rel attribute from channel.link element?
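
To make the atom:link suggestion concrete, here is a small Python sketch (again illustrative only, not Akkoma's Elixir templates). The function `build_channel` and its parameters are hypothetical. It shows a channel with exactly one plain link element, with the self and next relations moved to atom:link elements whose prefix is declared on the root.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
# Ask ElementTree to serialize this namespace with the conventional "atom" prefix.
ET.register_namespace("atom", ATOM_NS)


def build_channel(title, description, site_url, feed_url, next_url=None):
    """Build an RSS 2.0 skeleton whose extra links are namespaced atom:link elements."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "description").text = description
    # The single, un-namespaced channel link carries no rel attribute.
    ET.SubElement(channel, "link").text = site_url
    ET.SubElement(
        channel,
        f"{{{ATOM_NS}}}link",
        {"rel": "self", "type": "application/rss+xml", "href": feed_url},
    )
    if next_url:
        ET.SubElement(channel, f"{{{ATOM_NS}}}link", {"rel": "next", "href": next_url})
    return rss


def to_xml(rss):
    """Serialize to a string; the xmlns:atom declaration is emitted on the root."""
    return ET.tostring(rss, encoding="unicode")
```

The same pattern would apply to item-level links such as "mentioned", provided each prefix used in the output has a matching xmlns declaration.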

Errors in Atom 1.0 (".atom") feeds:

  1. Date-times must be in RFC-3339 format, with a time zone (must include a trailing "Z" or "+00:00" time zone indication)
  2. Sub-elements of an entry.author element other than name, uri, or email must be namespaced as extensions (e.g. id, link, summary, and ap_enabled).
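
For the Atom date-time error, a minimal Python sketch of an RFC 3339 formatter follows. The name `rfc3339` is hypothetical and the code only illustrates the expected output format, not how Akkoma formats dates.

```python
from datetime import datetime, timezone


def rfc3339(dt: datetime) -> str:
    """Format an aware datetime as an RFC 3339 timestamp in UTC with a trailing "Z"."""
    if dt.tzinfo is None:
        raise ValueError("naive datetime: Atom dates need a time zone")
    text = dt.astimezone(timezone.utc).isoformat(timespec="seconds")
    # isoformat() writes UTC as "+00:00"; "Z" is the equivalent short form.
    return text.replace("+00:00", "Z")
```

With this helper, `rfc3339(datetime(2022, 12, 21, 23, 57, tzinfo=timezone.utc))` gives `2022-12-21T23:57:00Z`.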

Warnings in Atom 1.0 (".atom") feeds:

  1. The entry.summary element should not be blank (omit entire element if there is no summary)
  2. The entry.title element should not contain HTML unless declared in the type attribute. Best practices indicate that titles should be plain text only (and probably without any newlines).
  3. Several link elements use unregistered relations, such as "avatar", "header", "emoji", "mentioned", and "ostatus:conversation". These might need to be left as non-compliant, or replaced with a URI; see RFC-5988 Section 4.2 (https://www.rfc-editor.org/rfc/rfc5988#section-4.2) and the IANA registry of link relations (http://www.iana.org/assignments/link-relations/link-relations.xhtml).
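
The first two Atom warnings could be addressed along these lines. This is a hedged Python sketch, not Akkoma code: `plain_text` and `build_entry` are hypothetical names, and stripping tags with the standard-library HTML parser is just one possible way to get a plain-text title.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


class _TextExtractor(HTMLParser):
    """Collect only the text content of an HTML fragment, dropping all tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def plain_text(html_fragment: str) -> str:
    """Strip tags and collapse whitespace (including newlines) to single spaces."""
    parser = _TextExtractor()
    parser.feed(html_fragment)
    parser.close()
    return " ".join("".join(parser.parts).split())


def build_entry(title_html: str, summary: str = "") -> ET.Element:
    """Build an Atom entry with a plain-text title and no blank summary element."""
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    title = ET.SubElement(entry, f"{{{ATOM_NS}}}title", {"type": "text"})
    title.text = plain_text(title_html)
    # Omit the summary element entirely when there is nothing to put in it.
    if summary and summary.strip():
        ET.SubElement(entry, f"{{{ATOM_NS}}}summary").text = summary.strip()
    return entry
```

An alternative for the title would be to keep the HTML and declare it with type="html", which the warning also allows.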

Logs

No response

Severity

I can manage

Have you searched for this issue?

  • I have double-checked and have not found this issue mentioned anywhere.
pzingg added the bug label 2022-12-21 23:57:00 +00:00
Reference: AkkomaGang/akkoma#387