gawati/pdf2xml-service

PDF File to text

Dependencies

  • Python 2.7

Install

Set up and activate a Python 2.7 virtual environment, then run:

$ pip install -e .

Additionally, install lc_pdfminer.

Run

$ export FLASK_APP=pdf2xml
$ flask run --port=5000

To turn on development features, set the FLASK_ENV environment variable before running.

$ export FLASK_ENV=development

Build & Distribution

Version is maintained in setup.py.
python setup.py sdist will create a development package with “.dev” and the current date appended.
python setup.py release sdist will create a release package with only the version.
To learn more about the deploy process referenced, read this
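
The naming rule for the two package types can be illustrated with a short sketch. The function name `package_version`, the sample version `1.0.0`, and the `YYYYMMDD` date format are assumptions for illustration. The actual string is produced by setuptools from the project's own configuration.

```python
from datetime import date


def package_version(base, release=False, today=None):
    """Return the version string a source package would carry.

    Assumption: development builds append ".dev" plus the build date
    as YYYYMMDD; release builds keep the bare version from setup.py.
    """
    if release:
        return base
    today = today or date.today()
    return "{0}.dev{1}".format(base, today.strftime("%Y%m%d"))


# Hypothetical version number and build date, for illustration only
print(package_version("1.0.0", today=date(2018, 6, 1)))
print(package_version("1.0.0", release=True))
```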

Deploy

Activate the Python2 virtual environment.

  1. Install gunicorn
$ pip install gunicorn
  2. Set the log paths in gunicorn.conf

  3. Configure apache (apache.conf)

<Location "/path/to/pdf2xml-service">
    ProxyPass "http://127.0.0.1:5000/"
    ProxyPassReverse "http://127.0.0.1:5000/"
</Location>
  4. Run gunicorn
$ gunicorn -c gunicorn.conf -b 0.0.0.0:5000 pdf2xml:app

Check the app in the browser at http://your-ip-here/path/to/pdf2xml-service
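
The log paths that the Deploy steps ask for in gunicorn.conf might look like the sketch below. Gunicorn config files are plain Python, and `accesslog`, `errorlog`, and `loglevel` are standard gunicorn settings. The `/var/log/pdf2xml/` directory is a placeholder, not a path taken from this repository.

```python
# gunicorn.conf: a minimal sketch, not the repository's actual file.
# The directory below is a placeholder; it must exist and be writable
# by the user that runs gunicorn.
accesslog = "/var/log/pdf2xml/access.log"  # one line per HTTP request
errorlog = "/var/log/pdf2xml/error.log"    # startup messages and errors
loglevel = "info"                          # verbosity of the error log
```

The bind address is left out here because the run command already passes it with -b.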

About

Service that converts PDF to XML
