Using plain text OCR and Markdown for EPUB PDF export

hendrack edited this page Dec 28, 2022 · 20 revisions

I have seen a feature request on the issues page for exporting to EPUB. This can be done without much hassle by going through Markdown. Markdown is easy to learn and does not involve much coding; here is a cheat sheet.

I use these workflows in Linux, so Windows users may have to adapt a bit.

One option for PDF/EPUB conversion is pandoc. Here are some examples:

https://jdhao.github.io/2019/05/30/markdown2pdf_pandoc/
https://learnbyexample.github.io/customizing-pandoc/
https://keleshev.com/my-book-writing-setup/
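
As a rough illustration of the pandoc route, here is a minimal Python sketch that builds and runs a pandoc command for either output format. The function names (`pandoc_command`, `convert`) and the file names in the usage note are hypothetical, and the choice of `xelatex` as the PDF engine is only an assumption; pandoc itself picks the output format from the target file extension.

```python
import shutil
import subprocess
from pathlib import Path


def pandoc_command(source: str, target: str, title: str = "") -> list[str]:
    """Build the pandoc argument list (hypothetical helper, not part of pandoc)."""
    # pandoc infers the output format (.epub, .pdf, ...) from the target name.
    cmd = ["pandoc", source, "-o", target, "--toc"]
    if title:
        # EPUB readers show this title; pandoc warns if it is missing.
        cmd += ["--metadata", f"title={title}"]
    if Path(target).suffix.lower() == ".pdf":
        # PDF output needs a LaTeX engine; xelatex is an assumption here.
        cmd.append("--pdf-engine=xelatex")
    return cmd


def convert(source: str, target: str, title: str = "") -> None:
    """Run pandoc, failing early if it is not installed."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on PATH")
    subprocess.run(pandoc_command(source, target, title), check=True)
```

Calling `convert("ocr_text.md", "book.epub", title="My OCR Book")` would then produce an EPUB, and the same call with `"book.pdf"` a PDF, provided pandoc and a LaTeX distribution are installed.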

Another option for PDF output is to use LaTeX or ConTeXt together with the markdown package from CTAN.
With this, you can either embed Markdown syntax directly in your LaTeX files or include the .md files themselves. I don't recommend LaTeX or ConTeXt for EPUB export.
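
To give an idea of the second variant (including an .md file), here is a small Python sketch that writes out a LaTeX wrapper document. The helper name `latex_wrapper` and the file name are hypothetical; `\markdownInput` is the markdown package's command for including an external file. Depending on your TeX engine and package version, you may need LuaLaTeX or shell escape enabled for this to compile, so treat it as a starting point only.

```python
def latex_wrapper(markdown_file: str) -> str:
    """Return a minimal LaTeX document that pulls in a Markdown file.

    Hypothetical helper; it only generates text and does not run LaTeX.
    """
    return "\n".join([
        r"\documentclass{article}",
        r"\usepackage{markdown}",  # the markdown package from CTAN
        r"\begin{document}",
        # \markdownInput includes an external .md file
        rf"\markdownInput{{{markdown_file}}}",
        r"\end{document}",
        "",
    ])
```

Writing the result of `latex_wrapper("ocr_text.md")` to a file such as `book.tex` gives you something to compile with your usual LaTeX toolchain.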

Yet another option is to use RStudio Desktop. This IDE works nicely with the R package bookdown, which lets you export Markdown and R Markdown to PDF/EPUB.

An EPUB is mostly HTML/CSS code, so once you have exported your OCR Markdown text to EPUB, you can also use Sigil to fine-tune the EPUB or edit its CSS layout.
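
If you want to see this for yourself before opening Sigil, the following Python sketch lists what is inside an EPUB. It relies on the fact that an EPUB is a ZIP archive; the function name `list_epub_content` and the grouping into three buckets are my own illustration.

```python
import zipfile


def list_epub_content(epub_path: str) -> dict[str, list[str]]:
    """Group the files inside an EPUB by rough type (hypothetical helper)."""
    groups: dict[str, list[str]] = {"markup": [], "styles": [], "other": []}
    # An EPUB is a ZIP container, so the standard zipfile module can read it.
    with zipfile.ZipFile(epub_path) as archive:
        for name in archive.namelist():
            lower = name.lower()
            if lower.endswith((".xhtml", ".html", ".htm")):
                groups["markup"].append(name)
            elif lower.endswith(".css"):
                groups["styles"].append(name)
            else:
                groups["other"].append(name)
    return groups
```

The entries under `"styles"` are the CSS files you would edit to change the layout.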