Replies: 2 comments
-
No, not with PyMuPDF. After that conversion the pages have no structure left: each one consists of a single image. To get a text layer back you must run an OCR tool such as ocrmypdf. Nothing beyond that is possible, as a matter of principle. In your list, step 4 makes no sense at that point; it should be done before taking the page image. Step 5, as it is worded, is impossible. You can recreate a text layer (with an OCR tool), but you cannot recover the original structure (annotations, links, the individual images that were on the original page, etc.). That is lost for good.
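As a rough illustration of the OCR step mentioned above, here is a minimal sketch that calls the ocrmypdf command-line tool from Python. The function names `build_ocr_command` and `add_text_layer`, the file names, and the default language are assumptions made for this example; it also assumes ocrmypdf and the matching Tesseract language pack are installed.

```python
import shutil
import subprocess


def build_ocr_command(src_pdf, dst_pdf, language="eng"):
    """Return the ocrmypdf command line as an argument list."""
    # -l selects the Tesseract language used for recognition.
    return ["ocrmypdf", "-l", language, src_pdf, dst_pdf]


def add_text_layer(src_pdf, dst_pdf, language="eng"):
    """Run ocrmypdf on an image-only PDF and write a searchable copy."""
    if shutil.which("ocrmypdf") is None:
        raise RuntimeError("ocrmypdf is not installed or not on PATH")
    # check=True raises CalledProcessError if ocrmypdf reports a failure.
    subprocess.run(build_ocr_command(src_pdf, dst_pdf, language), check=True)
```

Calling `add_text_layer("scan.pdf", "searchable.pdf")` would then produce a copy with an invisible text layer over each page image. It would not bring back links, annotations, or the separate images of the original pages.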
-
Hi, thanks a lot for the reply. It would be great if you had a code sample for step 3 in PyMuPDF: not copying the whole PDF, but copying only the images (including transparent ones), then removing the old images and putting the copies back into the same PDF with increased clarity.
-
Hi, I went through the following discussion:
#1183
It is good for converting a whole PDF into an image-only format that is not editable.
Is it possible to convert an image-only PDF back into an editable PDF?
For example, I need something like the following:
1. Get a copy of the whole page as an image.
2. Remove the old contents (both text and images) from the page.
3. Place the whole-page image on the same page.
4. Call apply_redaction() based on my own coordinates.
5. Convert back to the original PDF (editable, searchable, etc.).
It would be a great help if there is any solution.
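Steps 1 to 4 of the list above can be sketched with PyMuPDF roughly as follows, with the redaction moved ahead of rendering, as the reply in this thread recommends. This is only a sketch under assumptions: `redact_then_flatten`, `zoom_for_dpi`, and the shape of `redact_rects` are names invented for this example, and the result is written to a new file instead of being changed in place. Note that the PyMuPDF method is `Page.apply_redactions()`. Step 5 is not covered, because the original structure cannot be restored from an image.

```python
def zoom_for_dpi(dpi):
    """PDF user space is 72 points per inch, so the zoom factor is dpi / 72."""
    return dpi / 72.0


def redact_then_flatten(src_path, dst_path, redact_rects, dpi=150):
    """Redact first, then replace each page by an image of itself.

    redact_rects is a hypothetical mapping for this example:
    {page_number: [(x0, y0, x1, y1), ...]} in page coordinates.
    """
    import fitz  # PyMuPDF; imported here so zoom_for_dpi works without it

    zoom = zoom_for_dpi(dpi)
    src = fitz.open(src_path)
    out = fitz.open()  # new, empty PDF
    for page in src:
        # Step 4, done first: redact while the page still has real content.
        for rect in redact_rects.get(page.number, []):
            page.add_redact_annot(fitz.Rect(rect), fill=(0, 0, 0))
        page.apply_redactions()
        # Step 1: render the redacted page to an image.
        pix = page.get_pixmap(matrix=fitz.Matrix(zoom, zoom))
        # Steps 2 and 3: a fresh page of the same size holds only that image.
        new_page = out.new_page(width=page.rect.width, height=page.rect.height)
        new_page.insert_image(new_page.rect, pixmap=pix)
    out.save(dst_path)
    out.close()
    src.close()
```

A call such as `redact_then_flatten("input.pdf", "flat.pdf", {0: [(72, 72, 300, 100)]})` would black out one rectangle on the first page and write an image-only copy. A higher `dpi` gives sharper page images at the cost of a larger file.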