1. ocrmypdf --force-ocr --output-type pdf "Blandford-1982-Hydromagnetic flows from accretion discs and the production of radio jets.pdf" "Blandford-1982-Hydromagnetic flows from accretion discs and the production of radio jets-ocr.pdf"
2. Open "Blandford-1982-Hydromagnetic flows from accretion discs and the production of radio jets-ocr.pdf"
3. Only the vertical text in the margin remains; it was already present as text before OCR.
Resolved. We were mistakenly suppressing an error message from Ghostscript, and also Ghostscript unexpectedly generates an output image of the whole page minus the offending image instead of exiting with an error. Between the two, ocrmypdf didn't notice anything was wrong.
Your original command will now exit with an error and suggest using --continue-on-soft-render-error to proceed even though the input PDF contains invalid or ambiguous content. That's expected behavior.
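For anyone scripting this, here is a minimal sketch of how the opt-in could be wired up from Python. The helper names `build_ocrmypdf_command` and `run_ocr` are hypothetical and not part of OCRmyPDF; the only things taken from this thread are the original flags and `--continue-on-soft-render-error`. It assumes the `ocrmypdf` executable is on `PATH`.

```python
import shutil
import subprocess


def build_ocrmypdf_command(input_pdf, output_pdf, continue_on_soft_render_error=False):
    """Build the argument list for the invocation used in this report.

    Hypothetical helper: it only assembles arguments and runs nothing.
    """
    cmd = ["ocrmypdf", "--force-ocr", "--output-type", "pdf"]
    if continue_on_soft_render_error:
        # Opt in to proceeding despite invalid/ambiguous content in the input PDF.
        cmd.append("--continue-on-soft-render-error")
    cmd += [input_pdf, output_pdf]
    return cmd


def run_ocr(input_pdf, output_pdf, continue_on_soft_render_error=False):
    """Run ocrmypdf and return its exit code (non-zero means it reported an error)."""
    if shutil.which("ocrmypdf") is None:
        raise FileNotFoundError("ocrmypdf executable not found on PATH")
    cmd = build_ocrmypdf_command(input_pdf, output_pdf, continue_on_soft_render_error)
    # check=False so the caller can inspect the exit code instead of catching an exception.
    return subprocess.run(cmd, check=False).returncode
```

Keeping the flag off by default mirrors the behavior described above: the strict run fails loudly, and passing `continue_on_soft_render_error=True` is a deliberate choice to accept output that may be missing the content Ghostscript could not render.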
Describe the bug
Empty pages are produced and no text is recognized. When only rasterizing, I get just the rasterized text that was previously OCRed; the text that had not been OCRed is gone.
The file before OCR
Blandford-1982-Hydromagnetic flows from accretion discs and the production of radio jets.pdf
The file after
ocrmypdf --force-ocr --output-type pdf "Blandford-1982-Hydromagnetic flows from accretion discs and the production of radio jets.pdf" "Blandford-1982-Hydromagnetic flows from accretion discs and the production of radio jets-ocr.pdf"
Blandford-1982-Hydromagnetic flows from accretion discs and the production of radio jets-ocr.pdf
Steps to reproduce
Files
Blandford-1982-Hydromagnetic flows from accretion discs and the production of radio jets-ocr.pdf
How did you download and install the software?
PyPI (pip, poetry, pipx, etc.)
OCRmyPDF version
16.8.0
Relevant log output
No response