Tweak search #1113
AkkomaGang/akkoma!1113
Both post and user search. Hashtag "search" remains unchanged.
Most of this should be clear, objective improvements, but there are also some more opinionated/subjective changes.
The bulk of the changes concerns user search which, in addition to its poor result quality, was shown in #1112 to perform atrociously. It should now be better (a planner confusion keeps it from being as efficient as it could be on my instance, but it is still clearly better on average than before; see the commit message for perf measurements. This should improve when/if we overhaul user deletion and actually drop deleted users from the database).
There are also some changes not directly related to search but a consequence of changes made for search.
As a short summary of the more noteworthy and opinionated points:
- `simple` rather than `english` FTS config
- Ranking by match quality is (presumably much, though I haven’t tested it on a real-world db tbh) more expensive and can only ever work with `offset` pagination rather than `id` pagination.
- […] @); to be used by mention autocomplete (see also AkkomaGang/akkoma-fe#507)
- `follow_only` API toggle, it doesn’t make too much sense and degraded query performance (at least the way it was implemented before)
- `restrict_unauthenticated` toggles for search. Previously it was: everything, local-only or nothing, with the latter being tied to a full instance lockdown (private instance)
- `nickname` index on the users table is replaced by an explicitly case-folded index. All queries (I hope, if I didn’t miss any) have been changed to match (this was necessary for fast and case-insensitive `starts_with` queries)
- […] `text` for a minor efficiency increase. Apparently `citext` is also considered “legacy” by postgres devs (the more powerful replacement is non-deterministic ICU collations, but they are not a good fit for prefix lookups as needed in the new user search; our other citext column `email` though could in principle migrate to that)

good thing CI uses an older postgres; apparently newer versions like 18 autoconvert `citext` to `text` for `CASEFOLD`, but the version in CI (15?) errors out on `CASEFOLD` with a `citext` argument.

Now every casefold call should explicitly convert to `text`.

oh, actually the issue is `CASEFOLD` itself not yet existing

hmm...
Apparently it was added only very recently in PostgreSQL 18; too new to just bump our minimal required version. So I guess I’ll change this to use `LOWER` instead later. (The advantage of `CASEFOLD` is that it can work better on glyphs with ambiguous lower/upper forms or when only the lower xor upper form exists but not both.)

and the `"pg_c_utf8"` collation is also relatively recent; seemingly introduced in PostgreSQL 17.

Since an explicit collation is only used for `starts_with` here, I think just using the much older `"C"`/`"POSIX"` there should give equivalent results. (For lowercasing or casefolding though they differ; the latter cannot map e.g. accented characters like `À → à`.)

EDIT: with this it’s now all good even on CI’s postgres 15, it seems.
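
For illustration only, here is a small Python sketch of the two ideas discussed in the comments above: lowercasing versus case folding, and why a plain byte-wise ordering (as with the `"C"` collation) suits prefix lookups. It is not the code from this PR. The helper names `normalize` and `starts_with` and the sample nicknames are made up for the example, and Python's `str.lower()`/`str.casefold()` are only an analogy for the PostgreSQL functions, whose exact behaviour depends on version and collation.

```python
import bisect

def normalize(nickname):
    # Hypothetical stand-in for an explicitly lowercased index key.
    # Note: Python's lower() is Unicode-aware, unlike lowercasing
    # under a "C"-style collation.
    return nickname.lower().encode("utf-8")

def starts_with(sorted_keys, prefix):
    # With byte-wise ordering, every key sharing a prefix sits in one
    # contiguous range of the sorted index, so two binary searches
    # are enough to find all matches.
    lo_key = normalize(prefix)
    # 0xff never occurs in valid UTF-8, so prefix + 0xff sorts after
    # every valid key that begins with the prefix.
    hi_key = lo_key + b"\xff"
    lo = bisect.bisect_left(sorted_keys, lo_key)
    hi = bisect.bisect_left(sorted_keys, hi_key)
    return [key.decode("utf-8") for key in sorted_keys[lo:hi]]

# Made-up sample data, sorted by raw bytes like a "C"-collated index.
nicknames = ["Alice", "alina", "ALBERT", "bob", "Àlex", "àlvaro"]
index = sorted(normalize(n) for n in nicknames)

matches = starts_with(index, "AL")

# Lowercasing and case folding can disagree: "ß" has no single
# uppercase counterpart, so only case folding equates these two.
lower_equal = "Straße".lower() == "STRASSE".lower()
fold_equal = "Straße".casefold() == "STRASSE".casefold()
```

Here `matches` holds the three nicknames beginning with "al" regardless of their original case, while `lower_equal` is false and `fold_equal` is true. That gap is the kind of case where folding does better than lowercasing.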
Seems all good on my instance and didn’t break ihba either, so seems good to go.