Inlined reblog / announce / boost, like, reply and poll vote info desyncs from activities #956
Reference
AkkomaGang/akkoma#956
Your setup
From source
Extra details
Fedora (i believe)
Version
basically 16d7d612ff (exact: this fork)
PostgreSQL version
17
Current Status (updated)
- `mix` tasks available to retroactively fix desyncs for stuff inlined into `objects`: #964
- [...] `Like` and `Announce`s, just with the authoritative reference being `Answer` objects rather than an activity type
  AFFECTED: I can now confirm poll votes are affected too. From incoming votes on a multi-choice poll, both vote `Answer` objects from a particular unlucky single voter were accepted and inserted into the database, but only one showed up in the inlined count. This effect manifested almost immediately while the poll was still running, ruling out any late-rot theories and thus further cementing the dubious elixir-side caches as the main suspect
- [...] `follow` activities (and via a custom field the follow state is also duplicated in those activities...).
  Unlike the others, here no elixir-side caching is directly involved in either side though
What’s wrong?
I noticed that some, but not all, old posts I once boosted still showed up in my user profile (as expected), but the post details showed them as if I hadn’t boosted them (in some cases the post details claimed nobody had boosted the post despite it showing up in my profile).
Attempting to re-boost the post via akko-fe changed neither the visual status nor the logical state. Notably, the `api/v1/statuses/:id/reblog` API did return a success though, albeit the copy of the post details in its response body also indicated `"reblogged": "false"`.
This means affected posts can be neither re-boosted nor un-boosted via the frontend.
Manually calling `POST api/v1/statuses/:id/unreblog` still worked though, restoring things to a consistent state.
As it turns out, I somehow got lost from the inlined `announcements` list inside the post’s `data` column. This is an inlined JSON array of the AP ids of all users who boosted a post. The `Announce` activities still existed in the database however, leading to the post showing up in the profile (and preventing deletion during db prunes) and likely preventing `/reblog` attempts from doing anything at all.
I have no idea how this happened or whether it’s a still existing bug or something already fixed.
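To make the symptom concrete, here is a minimal Python sketch of the two sources of truth. The dictionary layout, the helper names and the example AP ids are illustrative assumptions, not Akkoma's actual schema or code. It only assumes, as described above, that the rendered `reblogged` flag is read from the inlined list while the profile is built from `Announce` activities.

```python
# Hypothetical in-memory stand-ins for the two tables; not Akkoma's schema.
POST_ID = "https://example.org/objects/1"
ME = "https://example.org/users/me"

# Object row: the inlined list has lost our AP id.
obj = {"id": POST_ID, "data": {"announcements": [], "announcement_count": 0}}

# Activity rows: the Announce itself still exists.
activities = [{"type": "Announce", "actor": ME, "object": POST_ID}]


def is_reblogged(obj, user):
    """Status rendering (assumed): only consults the inlined copy."""
    return user in obj["data"].get("announcements", [])


def profile_boosts(activities, user):
    """Profile timeline (assumed): built from Announce activities."""
    return [a["object"] for a in activities
            if a["type"] == "Announce" and a["actor"] == user]


reblogged = is_reblogged(obj, ME)
in_profile = POST_ID in profile_boosts(activities, ME)
print(reblogged, in_profile)
```

With the inlined entry gone, the two views disagree about the same boost, which matches what the frontend showed.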
Ideally such info shouldn’t be duplicated at all, but I imagine the db will collapse if every render has to join `objects` and `activities` via AP IDs to get the list of boosters.
Even if the underlying issue is already fixed or we figure out what the issue is¹, there might be more affected posts and instances than the few I’ve seen myself. If this issue is more widespread, adding a `mix` task to fix up the db state might make sense.
[1]: Afairc though, it used to be normal and was only corrupted at some unknown later point. Else I’d have guessed it’s due to elixir-side caches not being in sync with transaction failures / success and a concurrent change to the same post ended up poisoning the cache and thus undoing the initial change from the boost.
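The cache theory from the footnote can be illustrated with a small Python sketch of a lost update. This is purely a model of the hypothesised failure mode. The `db` and `cache` dictionaries and the `load` / `save` helpers are made up for illustration and say nothing about how Akkoma's caches are actually implemented.

```python
import copy

# Hypothetical stand-ins: `db` is the authoritative store, `cache` an
# application-side object cache keyed by object id.
db = {"post": {"announcements": [], "likes": []}}
cache = {}


def load(obj_id):
    """Read through the cache, falling back to the database."""
    if obj_id not in cache:
        cache[obj_id] = copy.deepcopy(db[obj_id])
    return copy.deepcopy(cache[obj_id])


def save(obj_id, data, refresh_cache=True):
    """Write the whole object back, as a read-modify-write update would."""
    db[obj_id] = copy.deepcopy(data)
    if refresh_cache:
        cache[obj_id] = copy.deepcopy(data)


# Writer A warms the cache before any change happened.
load("post")

# Writer B records a boost, but the cache is not refreshed
# (standing in for a cache that got out of sync with the transaction).
boosted = copy.deepcopy(db["post"])
boosted["announcements"].append("alice")
save("post", boosted, refresh_cache=False)

# Writer A adds a like on top of the stale cached copy and writes it back.
liked = load("post")
liked["likes"].append("bob")
save("post", liked)

print(db["post"])
```

After the second write the boost has vanished from the stored object even though both operations reported success, which is the kind of silent undo the footnote speculates about.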
Severity
I can manage
Have you searched for this issue?
After checking the database with an (unoptimised) query, I can confirm this also affects 24 objects on my own instance; the oldest from shortly after instance creation and the newest from end of May (so about two months ago).
For a single affected post, this query fixes the issue:
Besides identifying and fixing the underlying issue ofc, it remains to optimise both queries, convert them into `Ecto` format and create a `mix` task which both detects and fixes this where necessary.
Checking the opposite direction too, i.e. users being listed in `announcements` etc without an `Announce` activity existing, seems like a good idea too.
We might also want to check whether anything else is implemented similarly and could be affected; namely replies, likes and emoji reactions
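Checking both directions boils down to two set differences per post. The following Python sketch is only an illustration of that logic, not the SQL used on the instance. The `reconcile` helper and the sample actor names are hypothetical.

```python
def reconcile(inlined, activity_actors):
    """Compare an inlined actor list with the actors of existing activities.

    Returns (fixed, missing, extra):
      missing: an activity exists, but the actor is absent from the inlined list
      extra:   the actor is in the inlined list, but no activity exists
      fixed:   the inlined list after dropping `extra` and appending `missing`
    """
    inlined_set = set(inlined)
    actor_set = set(activity_actors)
    missing = sorted(actor_set - inlined_set)
    extra = sorted(inlined_set - actor_set)
    # Keep the original order of still-valid entries, without duplicates.
    fixed = [a for a in dict.fromkeys(inlined) if a in actor_set] + missing
    return fixed, missing, extra


inlined = ["alice", "mallory"]   # as stored in the object's data
actors = ["alice", "bob"]        # actors of the Announce activities
fixed, missing, extra = reconcile(inlined, actors)
print(fixed, missing, extra)
```

A matching counter such as the boost count would then simply be `len(fixed)`.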
UPDATE (without checking the other direction still):
- `replies` is indeed also heavily affected (some ~4700 occurrences; if excluding non-public replies it’s still ~4600); since akko-fe uses `context` to fetch threads this doesn’t show up much though
- Desynced `repliesCount`s "only" occurred 57 times
- `likes` too is affected (29 instances on my instance); this should be similarly visible in the frontend like the originally reported boost issue
- `reactions` use a different format which is more effort to fully check. There are no cases of all inlined emoji reactions having vanished while associated `EmojiReact` activities remain, but it seems plausible this one suffers too
(Still unoptimised) queries which identify and fix both missing (activity exists, but not in inlined data) and zombie (still in inlined data, but no activity) in one go for likes and announcements. Since they only differ by trivial replacements, just the one for likes is shown below:
Since they don't seem to have much value, I’m inclined to just drop inlined replies now already rather than resyncing them. The reply count will be kept though since it’s used in regular status responses. While it too desyncs, this is only a cosmetic issue.
UPDATE turns out the inlined `replies` array already is unused anyway. Them being in the database is just an artifact of it temporarily holding the AP IDs of existing replies to fetch when a post is initially received.
Query to fix replies count using only publicly-visible replies (note: only one of two instances in the code of increasing this counter currently restricts this to publicly-visible content)
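As a rough illustration of what such a recount does, here is a Python sketch that recomputes reply counts from reply objects and compares them with the stored counter. It is not the actual query. The object layout is simplified, and treating a reply as publicly visible when the ActivityStreams public collection appears in `to` or `cc` is an assumption made for this example.

```python
AS_PUBLIC = "https://www.w3.org/ns/activitystreams#Public"


def is_public(obj):
    """Assumed visibility rule: as:Public is addressed in `to` or `cc`."""
    return AS_PUBLIC in obj.get("to", []) or AS_PUBLIC in obj.get("cc", [])


def public_reply_counts(objects):
    """Count publicly-visible replies per parent object id."""
    counts = {}
    for obj in objects:
        parent = obj.get("inReplyTo")
        if parent and is_public(obj):
            counts[parent] = counts.get(parent, 0) + 1
    return counts


objects = [
    {"id": "p1", "to": [AS_PUBLIC]},
    {"id": "r1", "inReplyTo": "p1", "to": [AS_PUBLIC]},
    {"id": "r2", "inReplyTo": "p1", "to": ["https://example.org/users/a"], "cc": [AS_PUBLIC]},
    {"id": "r3", "inReplyTo": "p1", "to": ["https://example.org/users/a"]},
]
stored = {"p1": 1}  # hypothetical desynced repliesCount values

counts = public_reply_counts(objects)
# Map each desynced post id to (stored value, recomputed value).
desynced = {
    pid: (stored.get(pid, 0), counts.get(pid, 0))
    for pid in set(stored) | set(counts)
    if stored.get(pid, 0) != counts.get(pid, 0)
}
print(counts, desynced)
```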
(The format of inlined reactions remains a pain to deal with, so I still haven’t looked at it properly.
In fact it’s even worse, `Pleroma.Web.PleromaAPI.EmojiReactionController::filter_allowed_users/3` shows there are three different allowed formats for array elements)
Alright, I can now confirm `reactions` were indeed also affected, but apart from the expected desync as observed in other inline copies, there were also oddities resulting from Iceshrimp.NET federating (from its POV) remote emoji reactions with a `@remote.domain` indicator as part of the emoji shortcode. This sometimes led to it being treated as a separate emoji and sometimes correctly being folded into the original emoji (perhaps depending on the order in which reactions were processed).
This might also be why sometimes remote emoji reactions just don’t seem to work (the API will split the name at the `@` and look for a match for the initial part)
I’ll let the validator drop these remote indicators and add a migration to fix up any existing activities.
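Dropping the remote indicator and folding the affected entries could look roughly like the Python sketch below. It is a sketch under assumptions: it handles only one entry shape, `[name, [users]]`, with names stored without surrounding colons, whereas the real data allows several formats. Both helper names are hypothetical.

```python
def strip_remote_indicator(name):
    """Drop an `@remote.domain` suffix from an emoji shortcode."""
    return name.split("@", 1)[0]


def fold_reactions(reactions):
    """Merge entries that only differ by a remote indicator.

    `reactions` is assumed to be a list of [name, [user_ap_ids]] pairs.
    """
    folded = {}
    for name, users in reactions:
        bucket = folded.setdefault(strip_remote_indicator(name), [])
        for user in users:
            if user not in bucket:  # never count a user twice per emoji
                bucket.append(user)
    return [[name, users] for name, users in folded.items()]


reactions = [
    ["blobcat", ["alice"]],
    ["blobcat@remote.domain", ["bob", "alice"]],
    ["🔥", ["carol"]],
]
folded = fold_reactions(reactions)
print(folded)
```

Unicode emoji contain no `@`, so they pass through unchanged.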
I had 5 such Iceshrimp.NET-flavour remote emoji reactions, 3 of which counted as separate emoji entries in `reaction`, and 1 desync unrelated to such remote indicators.
Since pure SQL is often more readable than `fragment`-infested ecto, here’s a pure SQL query to check for `reactions` desync, assuming each custom emoji reaction has exactly one element in the `tag` array:
Just noticed most of the inlined copies are updated only in side effects, i.e. after the main transaction has already ended, and we know some funny things occasionally happen during side-effect processing (#888). Though the odd crashes from #888 don’t seem to always cause a mismatch like observed here, they might still account for a portion of the missing entries
I can confirm this also affects poll votes.
From incoming votes on a multi-choice poll, both vote `Answer` objects from a particular unlucky single voter were accepted and inserted into the database, but only one showed up in the inlined count. This effect manifested almost immediately while the poll was still running, ruling out any late-rot theories and thus further cementing the dubious elixir-side caches as the main suspect
The mix task should be extended to also resync local(!) polls, though if we delete `Answer` objects (from remote users) after a poll closed, this might only be applicable to still open polls. (And if we don’t delete them yet, we perhaps should since they’re no longer relevant (other than to retroactively fix this bug))
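A resync for a still open local poll could recount votes from the stored `Answer` objects, roughly as in the Python sketch below. The field names (`actor`, `name`) and the shape of the inlined counts are simplified assumptions for illustration, and `recount_poll` is a hypothetical helper rather than an existing function.

```python
from collections import Counter


def recount_poll(option_names, answers):
    """Recompute per-option vote counts and the voter list from Answer objects."""
    votes = Counter()
    voters = []
    for ans in answers:
        if ans["name"] in option_names:
            votes[ans["name"]] += 1
            if ans["actor"] not in voters:
                voters.append(ans["actor"])
    return {name: votes[name] for name in option_names}, voters


options = ["red", "green", "blue"]
answers = [
    {"type": "Answer", "actor": "alice", "name": "red"},
    {"type": "Answer", "actor": "alice", "name": "blue"},  # second choice of a multi-choice vote
    {"type": "Answer", "actor": "bob", "name": "red"},
]
# Inlined counts as in the observed case: one of alice's two votes is missing.
inlined_counts = {"red": 2, "green": 0, "blue": 0}

recounted, voters = recount_poll(options, answers)
# Map each desynced option to (inlined value, recounted value).
desynced = {
    name: (inlined_counts[name], recounted[name])
    for name in options
    if inlined_counts[name] != recounted[name]
}
print(recounted, voters, desynced)
```

This only works while the `Answer` objects still exist, which is why deleting them after a poll closes would limit the resync to open polls.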