[bug] Akkoma unable to parse non-ASCII usernames from other instances #787
Reference: AkkomaGang/akkoma#787
Your setup
From source
Extra details
Fedora 40
Version
8afc3bee7a
PostgreSQL version
16
What were you trying to do?
Fetching a post made by a user with a non-ASCII username (example: https://mastodon.vierkantor.com/@%CE%B5%E1%BD%90%CE%B4%CE%B1%CE%B9%CE%BC%CE%BF%CE%BD%CE%AF%CE%B1/112525442573810213)
What did you expect to happen?
Post gets fetched and Akkoma runs happily.
What actually happened?
Akkoma tries to fetch the post and its associated user, but is unable to. Instead of giving up, it keeps retrying until it runs out of memory and the OOM killer terminates beam.smp.
This doesn't seem to hardlock the instance though once it gets restarted by systemd.
Logs
Severity
I cannot use it as easily as I'd like
Have you searched for this issue?
Seems like the server OOMing when the queue gets stuck is a bigger issue (see #736), but I think I'll leave this open as I do think we should do something about non-ASCII usernames.
by all accounts the actual rejection is correct, we [follow masto in that regard](https://github.com/mastodon/mastodon/blob/main/app/models/account.rb#L70)
the question is, why doesn't it just throw the fetch away
The OOM part of this is fixed by #788
Curiously the actor from AkkomaGang/akkoma-fe#372 (@你好@i18n.viii.fi / https://i18n.viii.fi/%E4%BD%A0%E5%A5%BD) with a CJK username and also URL-encoded sequences in the AP ID can successfully be fetched on current develop, while the one mentioned here cannot (@εὐδαιμονία@mastodon.vierkantor.com / https://mastodon.vierkantor.com/@%CE%B5%E1%BD%90%CE%B4%CE%B1%CE%B9%CE%BC%CE%BF%CE%BD%CE%AF%CE%B1). Both use non-ASCII letters in both name and preferredUsername.
The name of our validation regex makes it sound like it is only supposed to apply to local actors, which would explain how one of these passes at all. I’m not completely sure how to read Ruby model definitions, but the linked Mastodon regex might also mostly only apply to local actors:
d326ad0ed9/app/models/account.rb (L100)
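To make the comparison above concrete, here is a minimal Python sketch (not Akkoma or Mastodon code) that percent-decodes the last path segment of the two AP IDs quoted above and checks the result against an ASCII-only nickname pattern. The pattern `ASCII_NICKNAME` and the helpers `username_from_ap_id` and `is_ascii_nickname` are hypothetical and only approximate the kind of validation being discussed. The real regexes may differ, and real servers take the username from `preferredUsername` rather than from the URL.

```python
import re
from urllib.parse import unquote, urlsplit

# Hypothetical ASCII-only nickname pattern, for illustration only.
# It is NOT copied from Akkoma or Mastodon; the real regexes may differ.
ASCII_NICKNAME = re.compile(r"[A-Za-z0-9_]+(?:[A-Za-z0-9_.-]+[A-Za-z0-9_]+)?")

# The two AP IDs quoted in this thread.
AP_IDS = [
    "https://i18n.viii.fi/%E4%BD%A0%E5%A5%BD",
    "https://mastodon.vierkantor.com/@%CE%B5%E1%BD%90%CE%B4%CE%B1%CE%B9%CE%BC%CE%BF%CE%BD%CE%AF%CE%B1",
]


def username_from_ap_id(ap_id: str) -> str:
    """Percent-decode the last path segment and drop a leading '@'.

    Assumption: the username is the last path segment, which holds for
    the two URLs above but is not guaranteed by ActivityPub.
    """
    last_segment = urlsplit(ap_id).path.rsplit("/", 1)[-1]
    return unquote(last_segment).lstrip("@")


def is_ascii_nickname(name: str) -> bool:
    """Return True only if the whole name matches the ASCII-only pattern."""
    return ASCII_NICKNAME.fullmatch(name) is not None


for ap_id in AP_IDS:
    name = username_from_ap_id(ap_id)
    print(name, is_ascii_nickname(name))
```

Under this assumed pattern both decoded usernames fail the check, which is what makes it surprising that one of the two actors can be fetched and the other cannot. That points to something other than a single ASCII-only check being applied uniformly to remote actors.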
Note how some of the error logs dropped the percentage signs from the URL: