Improve discord embeds #890
No reviewers
Labels
No labels
approved, awaiting change
bug
configuration
documentation
duplicate
enhancement
extremely low priority
feature request
Fix it yourself
help wanted
invalid
mastodon_api
needs docs
needs tests
not a bug
planned
pleroma_api
privacy
question
static_fe
triage
wontfix
No milestone
No project
No assignees
3 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference: AkkomaGang/akkoma#890
Loading…
Add table
Reference in a new issue
No description provided.
Delete branch "darkkirb/akkoma:improve-discord-embeds"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Recently, discord has introduced improved embedding of fedi posts on their platform. This works by fetching the post either via activitypub (unsigned) fetching or via the mastodon API
To make this work, a
<link rel="alternate" type="application/activity+json">
needs to be present for the embed fetcher to do anything special.It also has support for pleroma/akkoma-fe frontend URLs, however the static-fe only uses these in some, but not all places.
In particular, the current activity uses the internal activtiy ID as URL, which discord can only handle if authorized fetch is disabled.
Adding the alternate link makes sense to me independent of what Discord’s doing to clearly link it to the AP representation and identify it as being an AP object. If there need to be changes on our end anyway, generally I’d prefer if Discord just explained what’s needed so it can be exposed in regular OPenGraph metadata saving on extra requests and not depending on auth fetch being disabled.
Note if you want this to work in general (when the frontend isn’t locked down), you’ll also have to tweak akkoma-fe since not the static frontend isn’t used by everyone and disabled by default
I don’t understand what the second commit is trying to do or why
@ -0,0 +16,4 @@
[
rel: "alternate",
type: "application/activity+json",
href: url
While using the
Create
activity ID (what we pass asurl
in static_fe) here is consistent with the identifiers used in other metadata providers, semantically theNote
orQuestion
object is imho a better equivalent here for "alternate representation of this post in ActivityPub form"That is true, I was surprised that I received an /activity/ url back, especially since i have not yet seen those before
@ -0,0 +28,4 @@
[
rel: "alternate",
type: "application/activity+json",
href: user.uri || user.ap_id
Since this claims to be the AP representation of the user, this should only ever use
ap_id
@ -27,0 +30,4 @@
# Workaround 2: Due to a Discord bug, this cannot be a relative URL
true -> url(~p[/notice/#{activity}])
_ -> activity.data["url"] || activity.data["external_url"] || activity.data["id"]
end
This whole block appears to do nothing since the defined variable is never used
I’m also not sure what it’s trying to do though. Why does Discord want a display URL?
For fetching the AP representation the canonical AP ID is always preferable.
Either it already got a display URL, in which case it will get an AP ID from parsing the new alternate link.
Or it got a AP ID and will either
oh oops, yeah that is a mistake on my end. the following code block was meant to use uri nstead of
activity.data["id"]
The display URL is for fetching via open mastodon api when authorized fetch is enabled. The canonical URL ought to work if authorized fetch is disabled, although I have not tested this.
Discord does two things with a
<link rel="alternate" type="application/activity+json" href="…">
:/notice/[alphanumeric ID]
; if this check passes, it parses the ID from there, and attempts to fetch/api/v1/statuses/[id]
, which works regardless of authorized_fetch/api/v1/statuses/[id]
doesn't return a valid status JSON object), it attempts to request the link's href directly withAccept: application/activity+json
, expecting to receive a valid ActivityPub JSON object in responseSo if the link is to
/notice/[alphanumeric ID]
, then embeds will work regardless ofauthorized_fetch
, since we'll take the first code path and hit the Mastodon API. If it's to/objects/[uuid]
, we'll attempt to fetch the AP JSON, which will fail ifauthorized_fetch
is enabled, at which point we'll be out of things to fall back on and just end up with the opengraph embed from the original HTML.The service doesn't attempt to parse a Mastodon-compatible status ID from the HTML page URL, because in practice that's a much weaker signal of Mastodon API compatibility than a corresponding
<link>
is.Just trying to parse any URL for a MaStoAPI status id doesn’t make sense, agreed. But if you detect an AP-type alternate link on the page and the alternate link itself doesn't match an MastoAPI pattern you could then fall back to parsing the original page URL for a MastoAPI id
Advertising a frontend URL as the AP representation doesn't make sense semantically. Furthermore, now anyone who actually wants to use this link as its intended, retrieving the AP representation, will end up making two requests (the first being answered with a redirect) also bringing the redirect-mishandling signature issue into this.
@ -0,0 +18,4 @@
"content" => "Me when i’m in the write test data competition but my opponent is akkoma"
})
assert ApUrl.build_tags(%{object: note, url: note.data["id"], user: user}) == [
This test is flawed; the static frontend actually passes the activity ID as
url
(consider adding a full static_fe test too)
https://social.treehouse.systems/@rcombs/114196825932804138
essentially it is for improved embeds specifically for activitypub-based websites, not a replacement or an addition to the existing ograph metadata. it includes rich text support for example
akkoma-fe already returns the frontend link, which is a weird divergence between akkoma-fe and static-fe I discovered when writing this
Sure, but it would make more sense for everyone if this was just exposed in regular metadata, so there’s no need for special logic in clients, making additional requests or to relying on authorized fetch being disabled.
But as I wrote before, adding the
rel="alternate"
link makes sense on its own. If Discord ever wants to improve previews further such that they work more reliably and without all these extra steps in the future, that’s the ideal way.@ -27,0 +27,4 @@
url = case user.local do
# Workaround 1: Discord expects a /notice/#{activity} URL for fetching posts with AP signing on
# Workaround 2: Due to a Discord bug, this cannot be a relative URL
This bug has now been fixed, so you should be able to switch to a relative URL now.
Some of the data we fetch via ActivityPub (or the Mastodon API) doesn't exist in the opengraph standard (eg quotes, replies, names/avatars/URLs of involved users), or is possible to represent but frequently is left out in existing implementations (eg full content of a post with a subject line), or doesn't really fit into opengraph well at a conceptual level (eg HTML rich-text representations of post content).
We considered sending an
Accept: application/activity+json,text/html,[…]
header, but unfortunately many sites return a sparse 401 ifapplication/activity+json
comes first in theAccept
header rather than falling back on giving the HTML response, which would mean affected URLs would fail to embed at all, which makes that a non-starter. Even if we were to implement the complex signing behavior those sites want, the possibility of a bug anywhere in that stack resulting in complete failures to embed would be unacceptable.Sites that implement the Mastodon JSON API will receive fewer requests than sites that only support standard ActivityPub (partly because we try the Mastodon API first when a site's ActivityPub paths are detected to likely be Mastodon-compatible, and partly because the Mastodon API includes author/quote/etc information into a single reply).
If you'd like to cut down on latency and bandwidth to Discord by a bit, note that some sites opt to return stripped-down HTML when an incoming request comes from a known embed service user-agent (like
Discordbot/2.0
), consisting solely of meta and link tags. This wouldn't actually reduce request count, but it'd at least shave a good several kB off of each embed pull.In the future, we might add support for sites to see the
Discordbot/2.0
and return ActivityPub JSON directly, but that's not on the roadmap at this time. It's a shame that the unreliable behavior of some instances preclude use of theAccept
header, but that's the current state of things, so further request-minimization would unfortunately require nonstandard weirdness.This should all — perhaps with the exception of rich HTML — apply equally to other micro/mini blogging services too though. Surely no one wants to implement a specific pattern for each of them and they already expose this somehow?
Note Akkoma also provide twitter-style metadata, so if anything doesn't exist in OpenGraph, using existing
twitter:*
tags or tags from some other standard are fine too.Heck, if really none exists yet creating a FEP for it or even
discord:*
tags might be a good idea.Hmmm, yeah.. i know Phoenix parses the incoming Accept headers and indicates the requested type to us but i have actually no idea how or if it allows handling falling back between multiple supported types. Thus I suspect *oma ends up rejecting this too
Note though, this can’t always work anyway. There exist several standalone frontends (and backends without any own frontend) which run on a different domain than the backend and are unlikely to preferentially or at all handle AP requests.
Furthermore, note compliant AP servers are not actually required to handle
application/activity+json
at all. The canonical and only required type isapplication/ld+json; profile="https://www.w3.org/ns/activitystreams"
and subsets of it. The latter is accepted by all implementations I know of.Mastodon using the former for outgoing requests is a known (and pointless) Mastodon-ism deviation.
I consider anything relying on keeping a list of all such agents a non-solution. It’s both cumbersome to maintain and most likely will never be complete. Well-behaved agents should do their best to keep load on the remote as small as possible on their own; if like here no mechanism exists to request just the meta tags yet this ofc can't be done.
Does some convention for this exists already, like for example
Accept: text/html; subset="head"
or so?Anyway, I wrote this twice before but just to make it clear again: i think adding an alternate link for the AP representation makes sense regardless of all of this. Meaning, i have no objection to merging this after all review comments are properly addressed. (note though that ultimately actually merging is up to floati not me)
I don’t mind discussing future improvements here further, but it’s not a blocker.
It using the non-canonical type is unfortunate, but just another Mastodon-ism i guess. (Perhaps an alternate link for both
activity+json
and the canonical form of the AP type should be added)As long as Discord is the only one doing this multi-request thing, the added load should also be negligible, but if this were to spread into every local URL-preview client it may be quite cumbersome and ultimately unnecessary and wasteful on both ends as with meta tags there’s already an establish channel to convey everything needed for preview in the initial request.
Plus, it not working with auth_fetch makes it somewhat unreliable (and given Discord users which weren’t using Akkoma filed issues about link previews before, this might also happen again if someone notices instance A running Akkoma (with auth_fetch) doesn't get the rich preview while instance B running glitchsoc (without auth_Fetch) does).
The embed service should work fine with Akkoma even if auth_fetch is enabled once this PR lands, since it'll recognize the path in the as "potentially Mastodon-compatible" and attempt to hit
/api/v1/statuses/[id]
(with the actual ActivityPub URL as a fallback), andGET /api/v1/statuses/[id]
isn't gated by auth_fetch.To clarify: does this apply to any ActivityPub alternate link being added to the header, or only if this alternate link is of the
/notice/xxx
form as the comment in the current, defunct second commit suggests?The latter I’d object to since it’s not actually the canonical AP address (which this alternate link claims to be), HTTP signing in the presence of redirects is frequently mishandled (causing pain if it gets used as it declares itself) and it should not be necessary even in the current state of things as I explained here
Due to inactivity and unresolved issues superseded by: #905
View command line instructions
Checkout
From your project repository, check out a new branch and test the changes.