Feature request - OCR #370

Closed
opened 2023-03-29 18:44:32 +00:00 by jeder · 4 comments
Contributor

I just got reminded of this and thought it would be nice to have ~~and I know that it won't be added any time soon but whatever lmao~~

Anyway: OCR for alt-text, yes.
I am lazy and can't be arsed to write a wall of text.
Johann150 added the feature label 2023-03-30 15:43:58 +00:00
Owner

Mastodon [seems to use](https://github.com/mastodon/mastodon/pull/11566) client side Tesseract, but that apparently requires adding [a 10MiB GZip-ed blob](https://github.com/mastodon/mastodon/blob/main/public/ocr/lang-data/eng.traineddata.gz) to your web app. Not sure I'm a fan of that.

Are there some API providers maybe?
Author
Contributor

all of the stuff i've found so far just seems scummy 🥴


Feel free to use/reference https://codeberg.org/calckey/calckey/commits/branch/main/search?q=ocr for our implementation. It uses Tesseract, but on the backend.

Contributor

If you’re gonna run OCR on the backend, you might also consider [EasyOCR](https://github.com/JaidedAI/EasyOCR) on especially beefy systems. It’s much better in my experience, except it doesn’t support vertical Chinese/Japanese text.

Sadly there’s no library, only a Python module.
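
For anyone wondering what the backend route might look like, here is a minimal sketch, assuming EasyOCR is installed and called from a small Python helper. The function names `build_alt_text` and `ocr_alt_text` and the `max_length` limit are made up for illustration; nothing here is taken from FoundKey, Calckey, or Mastodon.

```python
from typing import Iterable, Sequence


def build_alt_text(fragments: Iterable[str], max_length: int = 1500) -> str:
    """Join OCR text fragments into one alt-text string.

    max_length is a hypothetical limit, not a value from any project.
    """
    # Drop empty fragments and surrounding whitespace, then join with spaces.
    text = " ".join(part.strip() for part in fragments if part.strip())
    if len(text) > max_length:
        # Trim and mark the cut with an ellipsis so the result fits the limit.
        text = text[: max_length - 1].rstrip() + "…"
    return text


def ocr_alt_text(image_path: str, languages: Sequence[str] = ("en",)) -> str:
    """Run EasyOCR on an image file and return a suggested alt text."""
    # Imported lazily so the helper above works without EasyOCR installed.
    import easyocr

    reader = easyocr.Reader(list(languages))
    # detail=0 makes readtext return only the recognised strings.
    return build_alt_text(reader.readtext(image_path, detail=0))
```

The result would only be a suggestion to prefill the alt-text field; the user should still be able to edit it, since OCR output is often noisy.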
Reference: FoundKeyGang/FoundKey#370