Commit 7edbece (1 parent: 0179337). Showing 3 changed files with 48 additions and 36 deletions.
@@ -0,0 +1,29 @@
## Linux

- Run `apt-get install ocrmypdf`
- Install Ghostscript newer than 9.55 by following [these instructions](https://ghostscript.readthedocs.io/en/latest/Install.html) or by running `scripts/install/ghostscript_install.sh`.
- Run `pip install ocrmypdf`
- Install any tesseract language packages that you want (for example, `apt-get install tesseract-ocr-eng`)
- Set the tesseract data folder path
  - Find the tesseract data folder `tessdata` with `find / -name tessdata`. If several turn up, use the one that belongs to the latest tesseract version.
  - Create a `local.env` file in the root `marker` folder with `TESSDATA_PREFIX=/path/to/tessdata` inside it
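
The last two steps can also be scripted. The following is a minimal sketch, not part of `marker`: the helper names `find_tessdata_dirs` and `write_local_env` are hypothetical, and the search roots you pass in are your own guess at where the package manager placed tesseract. It walks the given roots for folders named `tessdata` and writes the chosen one to `local.env`.

```python
import os
from pathlib import Path


def find_tessdata_dirs(roots):
    """Return every directory named 'tessdata' found under the given roots."""
    found = []
    for root in roots:
        if not Path(root).is_dir():
            continue  # skip roots that do not exist on this machine
        # os.walk silently skips unreadable directories by default
        for dirpath, dirnames, _ in os.walk(root):
            if "tessdata" in dirnames:
                found.append(Path(dirpath) / "tessdata")
    return sorted(found)


def write_local_env(marker_root, tessdata_dir):
    """Write TESSDATA_PREFIX=<tessdata_dir> to <marker_root>/local.env.

    Note: this overwrites any existing local.env in that folder.
    """
    env_file = Path(marker_root) / "local.env"
    env_file.write_text(f"TESSDATA_PREFIX={tessdata_dir}\n")
    return env_file
```

A call such as `find_tessdata_dirs(["/usr/share", "/usr/local/share"])` narrows the search compared with `find /`; those two roots are an assumption, so widen them if nothing is found. When several folders come back, pick the one for the latest tesseract version by hand before calling `write_local_env`.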

## Mac

Only needed if using `ocrmypdf` as the OCR backend.

- Run `brew install ocrmypdf`
- Run `brew install tesseract-lang` to add language support
- Run `pip install ocrmypdf`
- Set the tesseract data folder path
  - Find the tesseract data folder `tessdata` with `brew list tesseract`
  - Create a `local.env` file in the root `marker` folder with `TESSDATA_PREFIX=/path/to/tessdata` inside it
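
`brew list tesseract` prints many paths, and only one of them is the folder you need. As a rough sketch, the `tessdata` path can be picked out of that listing programmatically. The helper names `tessdata_from_listing` and `brew_tessdata_dir` are hypothetical, and the parsing assumes the listing has one absolute path per line with no spaces inside the path.

```python
import re
import subprocess


def tessdata_from_listing(listing):
    """Return the first path ending in a 'tessdata' component, or None."""
    for line in listing.splitlines():
        # Capture up to the first '/tessdata' that is followed by '/', a space, or end of line
        match = re.match(r"\s*(/.*?/tessdata)(?:/|\s|$)", line)
        if match:
            return match.group(1)
    return None


def brew_tessdata_dir():
    """Run `brew list tesseract` and extract the tessdata folder from its output."""
    result = subprocess.run(
        ["brew", "list", "tesseract"],
        capture_output=True,
        text=True,
        check=True,  # raise if brew exits with an error
    )
    return tessdata_from_listing(result.stdout)
```

The value returned by `brew_tessdata_dir()` is what goes after `TESSDATA_PREFIX=` in `local.env`. If it returns `None`, read the `brew list tesseract` output by hand instead.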

## Windows

- Install `ocrmypdf` and Ghostscript by following [these instructions](https://ocrmypdf.readthedocs.io/en/latest/installation.html#installing-on-windows)
- Run `pip install ocrmypdf`
- Install any tesseract language packages you want
- Set the tesseract data folder path
  - Find the tesseract data folder `tessdata`, which is normally located inside the Tesseract installation directory
  - Create a `local.env` file in the root `marker` folder with `TESSDATA_PREFIX=/path/to/tessdata` inside it
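
Once `local.env` exists, it is worth checking that the folder it names really holds language data, since a wrong `TESSDATA_PREFIX` is easy to miss. The sketch below is illustrative only: `read_tessdata_prefix` and `installed_languages` are hypothetical helpers, the parser assumes plain unquoted `KEY=VALUE` lines, and it relies on tesseract storing each language as a `<code>.traineddata` file.

```python
from pathlib import Path


def read_tessdata_prefix(env_file):
    """Return the TESSDATA_PREFIX value from a KEY=VALUE env file, or None."""
    for line in Path(env_file).read_text().splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "TESSDATA_PREFIX":
            return value.strip()
    return None


def installed_languages(tessdata_dir):
    """List the language codes that have a <code>.traineddata file in tessdata_dir."""
    return sorted(p.stem for p in Path(tessdata_dir).glob("*.traineddata"))
```

For example, `installed_languages(read_tessdata_prefix("local.env"))` returns an empty list when the path points at the wrong folder or no language packages are installed.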