paperless-gpt OCR results into selectable PDF text layer? #135
The OCR text does not currently get embedded in the PDF; it simply replaces the "Content" field of the document. I have been trying to customize the prompt so that the model outputs valid hOCR, a format that carries the OCR text together with its position on the page, but I'm getting mixed results. If someone figures out a reliable way to get valid hOCR output, then merging it with the PDF to create a selectable text layer would be trivial.
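Since the sticking point is whether the model's output is valid hOCR, here is a rough sanity check one could run before attempting any merge. It is only a sketch and not part of paperless-gpt: the names `HocrWordParser` and `parse_hocr_words` are made up for illustration. It assumes the usual hOCR convention that each word is an element with class `ocrx_word` whose `title` attribute contains `bbox x0 y0 x1 y1`, and it uses Python's tolerant `html.parser` because LLM output may not be well-formed XML.

```python
import re
from html.parser import HTMLParser

# hOCR stores coordinates in the title attribute, e.g. title="bbox 10 10 40 30".
BBOX_RE = re.compile(r"bbox\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)")


class HocrWordParser(HTMLParser):
    """Hypothetical helper: collects (text, bbox) pairs from ocrx_word elements."""

    def __init__(self):
        super().__init__()
        self.words = []    # list of (text, (x0, y0, x1, y1))
        self.errors = []   # human-readable problems found along the way
        self._in_word = False
        self._word_tag = None
        self._bbox = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if "ocrx_word" not in classes:
            return
        match = BBOX_RE.search(attrs.get("title") or "")
        if match is None:
            self.errors.append("ocrx_word element without a bbox")
            return
        x0, y0, x1, y1 = map(int, match.groups())
        if x0 >= x1 or y0 >= y1:
            self.errors.append(f"degenerate bbox: {match.group(0)}")
            return
        self._in_word = True
        self._word_tag = tag
        self._bbox = (x0, y0, x1, y1)
        self._text = []

    def handle_data(self, data):
        if self._in_word:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Simplification: assumes a word element does not nest another
        # element with the same tag name.
        if self._in_word and tag == self._word_tag:
            text = "".join(self._text).strip()
            if text:
                self.words.append((text, self._bbox))
            else:
                self.errors.append("ocrx_word element with no text")
            self._in_word = False


def parse_hocr_words(hocr):
    """Return (words, errors) for a string that is supposed to be hOCR."""
    parser = HocrWordParser()
    parser.feed(hocr)
    parser.close()
    return parser.words, parser.errors
```

If `errors` comes back empty and `words` is non-empty, the output is at least structurally usable as input for a text-layer merge; otherwise the prompt probably needs another round. The check does not verify that the boxes actually match the page image.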
I looked at the video linked from the paperless-gpt README, and it seems like the OCR results are simply shown onscreen, right?
However, they aren't actually merged in as a selectable text layer on the PDF, right?
Or has it changed since the video? =)