Read image description from EXIF data #744
No reviewers
Labels
No labels
approved, awaiting change
bug
configuration
documentation
duplicate
enhancement
extremely low priority
feature request
Fix it yourself
help wanted
invalid
mastodon_api
needs docs
needs tests
not a bug
planned
pleroma_api
privacy
question
static_fe
triage
wontfix
No milestone
No project
No assignees
3 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference: AkkomaGang/akkoma#744
Loading…
Reference in a new issue
No description provided.
Delete branch "timorl/akkoma:elseinspe"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
This is just #241 with the current develop merged in. I think it would be a pretty rad feature to have and I'll gladly go through review with this PR.
After this is done I might do the next part, with writing EXIF data as well, as @ilja intended, but no guarantees. <_<
fyi, if #745 gets merged the renaming of the existing Exiftool filter here as well as the order of filters generated by the
instance
mix tasks will need a little tweaking(i assume if #745 gets merged at all, it will be before this, as renaming the filter twice in short succession doesn’t make much sense)
Looking ahead a bit:
hmm, a single image file might be attached to multiple posts by different people. I guess
Dedupe
being the final upload filter will ensure different alt texts mean different file hashes and thus no conflict.But now editing the alt text will change file content, hash and thus the upload URL which previously never happened; this should be fine MastoAPI-wise, but probably needs some additional tweaks in our. Unless Mastodon also changes media ids on update, I’m assuming changing the id might cause issues with some clients. But if we just update the existing attachment objects, this will leave orphaned upload files without an associated attachment object. Cleaning old files during updates on the other hand entails querying all attachments to check if anything uses the same file; might be costly.
this fails lint so CI can't run tests over it - can you run
mix format
on this branch? thankiesRan the formatter (sorry!) although it touched a couple of files I'm pretty sure don't actually fall under this PR.
Renamed
StripLocation
toStripMetadata
, technically true already and definitely true after #745. As for the order of operations – wouldn't it be better to make the stripping more configurable and just make it skip overImageDescription
if metadata reading/writing is enabled? I'm not sure about exposing the configuration to users – on the one hand I also don't see why anyone would want to strip only GPS metadata (as you rightly point out in the other PR), on the other the metadata tags are not all for tracking purposes and I imagine there might be some e.g. photographers who would like to preserve at least a subset of them when uploading. Maybe having a way do define "don't strip tags from this list" would be a good idea?As for the writing part – I'll create an issue to discuss this to avoid creating chaos here. Definitely thank you for the points, I've been wondering what's so much harder about that part than the reading part and now I'm starting to get an understanding.
If the alt text get’s edited or the content of the different description metadata fields differs, skipping stripping can leave unintended data exposed
As for metadata stripping being more configurable in general (but independent of other loaded filters) to allow preserving/deleting specific tag (groups) this is a good point. It would probably be easiest to expose a
purge
andpreserve
string list which will just get mapped directly to exiftool commandline parameters. The drawback being misspelling something here will result in all uploads failingrunning mix format on this PR’s HEAD reverts some of the formatting changes for me:
And running it on the commit just before formatting was applied only touches stuff from this PR for me:
For reference this is with elixir/mix 1.15 and OTP26 (though since CI tests multiple elixir versions, this alone shouldn’t be the cause).
I’m not sure what stuff
mix format
picks up, but do you have anything which might influence it like a.editorconfig
file etc?the lint step in particular runs under 1.15/otp26 on ci (I should bump to 1.16 but I don't think formatting changes between them)
No need to shame me so hard for being only on 1.14. ;-;
Anyway, I updated to 1.15 and the formatting seems better now. Although I get a lot of tests failing, both on my branch as well as on
develop
(this was already the case for 1.14, so no change there). No idea what that is about, but I'm pretty sure none of these are related to the changes in this PR.I also changed the order of the filter application (I think?). I'm pretty sure the
purge
andpreserve
lists will still eventually be necessary, when we add writing, but it seems that won't happen for a while.Silly question – considering that the config seems to just be code, and running tests seems relatively simple, perhaps we could just add tests that check passing a single image through the configured filters and immediately fail on startup? Also, maybe this discussion should actually be happening in the other PR. <_<"
huh, so the Elixir version can affect formating results `^^
A few tests are flaky (mostly emoji stuff and "only federate to reachable instances") and occasionally fail eventhough nothing is wrong. Rerunning just those on their own will show if somethings wrong; i.e.
mix test <path to test source files>...
But if you get consistent and many failures that’s probably not it. Do you have all optional dependencies installed (e.g. ffmpeg)? Tests include all optional paths and if you’re missing stuff it won’t work
fwiw running tests on you’re branch (and after excluding one flaky fail) there’s only one real failure:
test/mix/tasks/pleroma/instance_test.exs:32 :: test running gen (Mix.Tasks.Pleroma.InstanceTest)
still checks for eadDescriptioncoming _after_
StripMetadata` and now fails.Quite a couple of these tests seem flaky and the logs being thrown around during test execution don't help with clarity. ._. Anyway, I fixed the test that was actually wrong (thanks for the pointer!) and after a few tries even got a "0 failures", so should now be fine.
I had a feeling i fixed something while porting the rename to #745 but forgot what. Now I (too late, sorry) remember:
The migration id is still
20220220135625
; this works fine while applying, but messes up the order when rolling migrations backSince rename action is idempotent I think we can still just rename it to
20240425120000_
. I guess to not break rolling back for everyone who already applied this we’ll also need a noop migration under the 2022 id