Rich Media parsing was previously handled on-demand with a 2 second HTTP request timeout and retained only in Cachex. Every time a Pleroma instance is restarted it will have to request and parse the data for each status with a URL detected. When fetching a batch of statuses they were processed in parallel to attempt to keep the maximum latency at 2 seconds, but often resulted in a timeline appearing to hang during loading due to a URL that could not be successfully reached. URLs which had images links that expire (Amazon AWS) were parsed and inserted with a TTL to ensure the image link would not break.
Rich Media data is now cached in the database and fetched asynchronously. Cachex is used as a read-through cache. When the data becomes available we stream an update to the clients. If the result is returned quickly the experience is almost seamless. Activities were already processed for their Rich Media data during ingestion to warm the cache, so users should not normally encounter the asynchronous loading of the Rich Media data.
Implementation notes:
- The async worker is a Task with a globally unique process name to prevent duplicate processing of the same URL
- The Task will attempt to fetch the data 3 times with increasing sleep time between attempts
- The HTTP request obeys the default HTTP request timeout value instead of 2 seconds
- URLs that cannot be successfully parsed due to an unexpected error receives a negative cache entry for 15 minutes
- URLs that fail with an expected error will receive a negative cache with no TTL
- Activities that have no detected URLs insert a nil value in the Cachex :scrubber_cache so we do not repeat parsing the object content with Floki every time the activity is rendered
- Expiring image URLs are handled with an Oban job
- There is no automatic cleanup of the Rich Media data in the database, but it is safe to delete at any time
- The post draft/preview feature makes the URL processing synchronous so the rendered post preview will have an accurate rendering
Overall performance of timelines and creating new posts which contain URLs is greatly improved.
Applying works fine with a 20220220135625 version, but it won’t be
rolled back in the right order. Fortunately this action is idempotent
so we can just rename and reapply it with a new id.
To also not break large-scale rollbacks past 2022 for anyone
who already applied it with the old id, keep a stub migration.
This promotes and expands our existing optional migration.
Based on usage statistics from several instances, see:
#764
activities_hosts is now retained after all since it’s essential
for the "instance" query parameter of *oma’s public timeline to
reliably work in a reasonable amount of time. (Although akkoma-fe has
no support for this feature and apparently barely anyone uses it.)
activities_actor_index was already dropped before in
20221211234352_remove_unused_indices; no need to drop it again.
Birthday indices were introduced in pleroma starting with
20220116183110_add_birthday_to_users which is past the
last common migration 20210416051708.
OTP builds to 1.15
Changelog entry
Ensure policies are fully loaded
Fix :warn
use main branch for linkify
Fix warn in tests
Migrations for phoenix 1.17
Revert "Migrations for phoenix 1.17"
This reverts commit 6a3b2f15b7.
Oban upgrade
Add default empty whitelist
mix format
limit test to amd64
OTP 26 tests for 1.15
use OTP_VERSION tag
baka
just 1.15
Massive deps update
Update locale, deps
Mix format
shell????
multiline???
?
max cases 1
use assert_recieve
don't put_env in async tests
don't async conn/fs tests
mix format
FIx some uploader issues
Fix tests
This is useful for people who want to migrate back to Pleroma.
It's also added in the docs, but also noted that this is barely tested and to be used at their own risk.
By default Postgresql first restores the data and then the indexes when dumping and restoring the database.
Restoring index activities_visibility_index took a very long time.
users_ap_id_COALESCE_follower_address_index was later added because having this could speed up the restoration tremendously.
The problem now is that restoration apparently happens in alphabetical order, so this new index wasn't created yet
by the time activities_visibility_index needed it.
There were several work-arounds which included more complex steps during backup/restore.
By renaming this index, it should be restored first and thus activities_visibility_index can make use of it.
This speeds up restoration significantly without requiring more complex or unexpected steps from people.
Close#304.
Notes:
- This patch was made on top of Pleroma develop, so I created a separate cachex worker for request signature actions, instead of Akkoma's instance cache. If that is a merge blocker, I can attempt to move logic around for that.
- Regarding the `has_request_signatures: true -> false` state transition: I think that is a higher level thing (resetting instance state on new instance actor key) which is separate from the changes relevant to this one.
Co-authored-by: Luna <git@l4.pm>
Reviewed-on: #312
Co-authored-by: @luna@f.l4.pm <akkoma@l4.pm>
Co-committed-by: @luna@f.l4.pm <akkoma@l4.pm>
During attachment upload Pleroma returns a "description" field.
* This MR allows Pleroma to read the EXIF data during upload and return the description to the FE using this field.
* If a description is already present (e.g. because a previous module added it), it will use that
* Otherwise it will read from the EXIF data. First it will check -ImageDescription, if that's empty, it will check -iptc:Caption-Abstract
* If no description is found, it will simply return nil, which is the default value
* When people set up a new instance, they will be asked if they want to read metadata and this module will be activated if so
There was an Exiftool module, which has now been renamed to Exiftool.StripLocation
User keys are now generated on user creation instead of "when needed",
to prevent race conditions in federation and a few other issues. This
migration will generate keys missing for local users.