Python 3.9
pip install streamlit
Set up Google Cloud Vision by following the instructions at https://cloud.google.com/vision/docs/ocr
Set up Google Document AI by following the instructions at https://cloud.google.com/document-ai
- Read the PDF file and create the .csv files
export GOOGLE_APPLICATION_CREDENTIALS="path to security credentials json file"
python doc_ai_table.py --pdf <path to pdf file> --folder <output folder>
Check the CSV file produced for each table detected in the PDF.
Also check the header.json file, which is produced from the form fields (key-value pairs) detected on the first page of the PDF.
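
A quick way to do this check is a small script that summarizes what landed in the output folder. The sketch below is not part of this repo. It assumes that header.json and the table CSV files are all written directly into the folder passed via --folder, and that header.json holds a JSON object of key-value pairs. The function name `summarize_outputs` is hypothetical.

```python
import csv
import json
from pathlib import Path


def summarize_outputs(folder):
    """Return (header_fields, {csv_name: (row_count, column_count)}).

    Assumption: header.json and the per-table CSV files sit directly
    inside `folder` (the --folder argument of doc_ai_table.py).
    """
    folder = Path(folder)

    # Load the form fields, if header.json was produced.
    header = {}
    header_path = folder / "header.json"
    if header_path.exists():
        with header_path.open(encoding="utf-8") as f:
            header = json.load(f)

    # Record the shape of every CSV file found in the folder.
    tables = {}
    for csv_path in sorted(folder.glob("*.csv")):
        with csv_path.open(newline="", encoding="utf-8") as f:
            rows = list(csv.reader(f))
        column_count = max((len(row) for row in rows), default=0)
        tables[csv_path.name] = (len(rows), column_count)

    return header, tables
```

Calling `summarize_outputs("<output folder>")` and printing the result gives a quick view of which form fields were detected and how large each extracted table is, which helps spot tables that were split or merged incorrectly.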
Customize the post-processing logic as needed to write Invoice.csv.
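
As a starting point for that customization, here is a minimal sketch of one possible post-processing step. It is not the logic shipped in this repo. It assumes that header.json is a flat JSON object, that the chosen table CSV has its column names in the first row, and that you want the selected header fields repeated on every line item. The function name `write_invoice` and its parameters are hypothetical.

```python
import csv
import json
from pathlib import Path


def write_invoice(folder, table_name, header_keys, out_name="Invoice.csv"):
    """Combine header.json fields with one table CSV into a single CSV.

    Assumptions: header.json is a flat JSON object, and the first row
    of the table CSV contains the column names.
    """
    folder = Path(folder)

    with (folder / "header.json").open(encoding="utf-8") as f:
        header = json.load(f)

    with (folder / table_name).open(newline="", encoding="utf-8") as f:
        rows = list(csv.reader(f))
    if not rows:
        raise ValueError(f"{table_name} is empty")

    columns, items = rows[0], rows[1:]
    # Missing header fields become empty strings rather than errors.
    header_values = [str(header.get(key, "")) for key in header_keys]

    out_path = folder / out_name
    with out_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(list(header_keys) + columns)
        for item in items:
            # Skip rows that contain only blank cells.
            if not any(cell.strip() for cell in item):
                continue
            writer.writerow(header_values + item)
    return out_path
```

Which header keys to keep, and which of the detected tables actually holds the line items, depends on the invoice layout, so both are passed in as arguments rather than hard-coded.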
- Run the UI demo
streamlit run app.py