Anugupta5102/ocr-demo

OCR Web Application with Hindi and English Text Extraction

This web application allows users to upload images containing text in both Hindi and English. The app extracts text using OCR and provides a keyword search functionality to search within the extracted text.

Features:

  • Upload an image (JPEG, PNG).
  • Extract text from images using Tesseract OCR.
  • Search for specific keywords in the extracted text.
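
The extraction and search features listed above can be sketched roughly as follows. This is a minimal illustration, not the code in this repository: the function names `extract_text` and `search_keyword` are hypothetical, and it assumes Tesseract is installed together with its `eng` and `hin` language data. Since the dependency list also includes torch and transformers, the actual app may handle OCR differently.

```python
import re


def extract_text(image_path, lang="hin+eng"):
    """Run Tesseract on an image file and return the recognized text.

    Assumption: the Tesseract binary and the `hin` and `eng` language data
    are installed; "hin+eng" asks Tesseract to use both languages.
    """
    # Imported lazily so the search helper below works without the OCR stack.
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as image:
        return pytesseract.image_to_string(image, lang=lang)


def search_keyword(text, keyword):
    """Return the lines of `text` that contain `keyword` (case-insensitive)."""
    keyword = keyword.strip()
    if not keyword:
        return []
    # re.escape treats the keyword as literal text, not as a regex pattern.
    pattern = re.compile(re.escape(keyword), re.IGNORECASE)
    return [line for line in text.splitlines() if pattern.search(line)]
```

Because Python strings are Unicode, the same `search_keyword` helper works for Devanagari keywords as well as English ones. Matching by line is one design choice among several; returning character offsets instead would make it easier to highlight hits in the interface.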

How to Run Locally:

  1. Clone this repository: git clone <repository-url>

  2. Install the required Python packages: pip install -r requirements.txt

  3. Install Tesseract OCR:

  • On Ubuntu:
    sudo apt-get install tesseract-ocr
    
  • On Windows, download and install Tesseract.

  4. Install the required dependencies and libraries:

    pip install pytesseract
    pip install Pillow
    pip install streamlit
    pip install torch
    pip install transformers

  5. Run the application: streamlit run anu.py
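
If the app starts but returns little or no Hindi text, one thing worth checking is whether Tesseract has the Hindi language data available (on Ubuntu this usually ships as the separate `tesseract-ocr-hin` package). The sketch below is not part of this repository and its helper names are hypothetical. It asks the `tesseract` binary for its language list with `--list-langs` and reports which of the required codes are missing.

```python
import shutil
import subprocess


def available_tesseract_languages():
    """Return the language codes Tesseract reports, or [] if it is not on PATH."""
    binary = shutil.which("tesseract")
    if binary is None:
        return []
    result = subprocess.run(
        [binary, "--list-langs"],
        capture_output=True,
        text=True,
        check=False,
    )
    # Some Tesseract releases write this list to stderr rather than stdout,
    # so both streams are read.
    lines = (result.stdout + "\n" + result.stderr).splitlines()
    # Drop blank lines and the "List of available languages ..." header.
    return sorted(
        line.strip()
        for line in lines
        if line.strip() and not line.lower().startswith("list of available")
    )


def missing_languages(required=("eng", "hin"), available=None):
    """Return the required language codes that Tesseract does not report."""
    if available is None:
        available = available_tesseract_languages()
    return [code for code in required if code not in available]


if __name__ == "__main__":
    missing = missing_languages()
    if missing:
        print("Missing Tesseract language data:", ", ".join(missing))
    else:
        print("Tesseract has both English and Hindi language data.")
```

An empty result from `available_tesseract_languages()` means the binary was not found on `PATH`, in which case both `eng` and `hin` are reported as missing.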

Screenshots

[Images: English Text; Hindi Text; Extracted Text; Hindi Keyword; OCR App; Search Results; Words]

License:

This project is licensed under the MIT License.
