Exiftool: Strip all non-essential metadata tags #745
No reviewers
Labels
No labels
approved, awaiting change
bug
configuration
documentation
duplicate
enhancement
extremely low priority
feature request
Fix it yourself
help wanted
invalid
mastodon_api
needs docs
needs tests
not a bug
planned
pleroma_api
privacy
question
static_fe
triage
wontfix
No milestone
No project
No assignees
3 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference: AkkomaGang/akkoma#745
Loading…
Reference in a new issue
No description provided.
Delete branch "Oneric/akkoma:exiftool-strip-all"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Documentation was already clear on this only stripping GPS tags. But there are more metadata tags which can potentially be sensitive (e.g. author and possibly description) and the filter’s name alone suggests a broader effect than it currently has.
Thus change the filter to strip all metadata except for colourspace info and orientation.
Additionally fix a doc typo and enable stripping of HEIC and JXL (confirmed to work with exiftool 12.70 and 12.57, released on Feb. 23, 2023)
Judging from release logs, write support for JPEG XL was added in 12.25 (2021-04-22) and for HEIC in 11.33 (2019-03-28).
As a point of reference, Debian stable (bookworm) has a new enough version for both, oldstable (bullseye) has exiftool 12.16 lacking JXL (however, release logs mentioned making JXL support "official" in 12.25, potentially it already worked to some degree earlier)
HEIC and JXL samples for testing can be found here:
If desired i can make “strip all” vs “strip only GPS” a config option though personally i currently don’t see why one would want to strip one but not the other
.
There’s some relation to description extraction from #744 / #241:
StripLocation
no longer makes sensePleroma.Upload.Filter.Exiftool.ReadDescription
must be placed before metadata stripping to wrok correctlyIf the image description contained sensitive data, uploaders will now get a chance to review and edit it before publishing a post with it (assuming metadata stripping is active).
TODO note for me, make this more configurable as suggested by timorl
Quoting my comment over at the “read description“ PR
5a2fd2908f
tof01ace481a
Exact metadata tags to be stripped are now configurable, plus more tests.
For new config to show up in admin-fe AkkomaGang/admin-fe#25 is required
cc @timorl: to save you the trouble of now also having to migrate the filter’s config option, adjust all their docs and the relevant admin-fe code, i pulled in the renaming bits into this PR. You can just drop everything related to renaming
Aaa, I should've checked here before writing the other comment! :D
Looks nice, what do you think about checking the config on startup as I suggested in #744? Is that doable and a good idea even?
f01ace481a
to09798f5121
In theory this might be nice, but we’d need to ship unremovable fixed-location sample files and e.g.
Mogrify
i think already suffers from the same problem and if we add too much it will slow down startup.More problematic however is something i just recalled from 3.12.2’s
base_url
check: those deprecation checks happen before the database is loaded (since in theory deprecated options might also affect how to connect to the db?). And once the db is up the whole server is running and accepting requests.Meaning, we can’t know which filters and which options are selected here if in-database config is used. Ofc this also means, this deprecation warning and gracefull fallback as well as all other deprecation checks don’t work for in-db config... but given the renames comes with a migration for in-db config this shouldn’t actually be an issue here (but it would be for the startup test).
bit of a merge conflict here after the other exiftool PR i think
09798f5121
to3430aaec91
rebased and resolved conflicts dropping the duplicate rename
should be good again now
noticed this leaves the temporary stripped files around after running tests.
I’m not sure how this was cleaned up prevuiously, the officially sanctioned methods seem to be an
on_exit
callback or@(module)tag :tmp_dir
(the latter since elixir 1.11)Will update shortly
fb7c172d5e
to5bc64c5753
done (using
tmp_dir
)yeppers looks cool, i'll wait for tests to pass :nod:
@ -15,3 +15,3 @@
It is required for the following Akkoma features:
* `Pleroma.Upload.Filters.Mogrify`, `Pleroma.Upload.Filters.Mogrifun` upload filters (related config: `Plaroma.Upload/filters` in `config/config.exs`)
* `Pleroma.Upload.Filters.Mogrify`, `Pleroma.Upload.Filters.Mogrifun` upload filters (related config: `Pleroma.Upload/filters` in `config/config.exs`)
hehe plaroma
@ -32,0 +41,4 @@
assert String.match?(exif_filtered, ~r/Color Space/)
end
# this is a nonsensical configuration, but it shouldn't explode
good call, we all know the first thing someone will do is supply a nonsensical config :hehe: