This is a graphical tool for performing Optical Character Recognition (OCR) on images and converting PDF files to images. Additionally, it allows for merging text files within a selected folder. The tool is built using CustomTkinter for the GUI, EasyOCR for OCR, pypdfium2 for PDF manipulation, and Pillow for image handling.
- PDF to Image Conversion: Convert PDF files into images, with adjustable DPI settings for image quality.
- OCR on Images: Perform OCR on images in a selected folder to extract text and save it as `.txt` files.
- Merge Text Files: Merge all text files in a folder into a single text file.
- User-friendly GUI: Built with CustomTkinter, making it easy to navigate.
To run this project, you need to have Python installed. Follow these steps to set it up:
Clone the repository:
```bash
git clone https://github.com/yourusername/ocr-pdf-helper.git
cd ocr-pdf-helper
```
Install the required dependencies:
```bash
pip install customtkinter pypdfium2 Pillow easyocr
```
Depending on your system, you may need to install additional libraries, such as PyTorch, for EasyOCR.
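To confirm that the dependencies installed correctly, a small check like the following can help. It is only a sketch: the list of import names is an assumption based on the packages named above (Pillow is imported as `PIL`, PyTorch as `torch`), and `missing_modules` is a hypothetical helper, not part of this project.

```python
from importlib.util import find_spec

# Import names assumed for the packages listed above.
# Pillow is imported as "PIL" and PyTorch as "torch".
REQUIRED_MODULES = ["customtkinter", "pypdfium2", "PIL", "easyocr", "torch"]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are available.")
```

The check only looks the modules up without importing them, so it runs quickly even though EasyOCR and PyTorch are slow to load.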
Once installed, you can run the program directly using Python. The interface provides buttons and options for performing the tasks mentioned below.
- File Selector: Choose a PDF file that you want to convert into images.
- Set DPI: Adjust the DPI (dots per inch) to control image quality (default is 100).
- Convert: Convert the PDF into images. The images will be saved in a new folder named after the PDF.
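The conversion step can be sketched with pypdfium2 roughly as follows. This is a minimal illustration, not the tool's actual code: the function names, the `page_001.png` naming scheme, and the choice to place the output folder next to the PDF are all assumptions. PDF page sizes are measured in points (1/72 inch), so the render scale is the DPI divided by 72.

```python
from pathlib import Path


def dpi_to_scale(dpi):
    """Convert a DPI value to a render scale (PDF units are 1/72 inch)."""
    return dpi / 72.0


def output_folder_for(pdf_path):
    """Folder next to the PDF, named after the PDF without its extension (assumed layout)."""
    pdf_path = Path(pdf_path)
    return pdf_path.parent / pdf_path.stem


def pdf_to_images(pdf_path, dpi=100):
    """Render every page of the PDF to a PNG file and return the saved paths."""
    # Imported here so the helpers above can be used without pypdfium2 installed.
    import pypdfium2 as pdfium

    out_dir = output_folder_for(pdf_path)
    out_dir.mkdir(parents=True, exist_ok=True)

    pdf = pdfium.PdfDocument(str(pdf_path))
    saved = []
    for index in range(len(pdf)):
        page = pdf[index]
        # Render the page to a bitmap, then convert it to a Pillow image.
        image = page.render(scale=dpi_to_scale(dpi)).to_pil()
        target = out_dir / f"page_{index + 1:03d}.png"  # hypothetical naming scheme
        image.save(target)
        saved.append(target)
    return saved
```

Zero-padded page numbers keep the images in page order when they are later sorted by file name.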
- Folder Selector: Select a folder containing images on which OCR should be performed.
- Set OCR Language: Input the languages for OCR in a comma-separated format (e.g., `en,bn` for English and Bengali).
- Perform OCR: The tool will scan each image, extract text, and save it as a `.txt` file in the same folder.
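The OCR step might look something like the sketch below, which uses EasyOCR's `Reader` and `readtext`. The helper names and the set of image extensions are assumptions made for illustration. The language codes are EasyOCR's own (for example `en` for English and `bn` for Bengali).

```python
from pathlib import Path

# Assumed set of image types to process; the actual tool may accept others.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".bmp", ".tif", ".tiff"}


def parse_languages(text):
    """Turn a comma-separated string such as 'en, bn' into ['en', 'bn']."""
    return [code.strip() for code in text.split(",") if code.strip()]


def find_images(folder):
    """Return the image files in the folder, sorted by name."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


def ocr_folder(folder, languages="en"):
    """Run OCR on each image and write the text to a .txt file beside it."""
    # Imported here because loading EasyOCR (and PyTorch) is slow.
    import easyocr

    reader = easyocr.Reader(parse_languages(languages))
    written = []
    for image_path in find_images(folder):
        # detail=0 returns only the recognized strings, without boxes or scores.
        lines = reader.readtext(str(image_path), detail=0)
        text_path = image_path.with_suffix(".txt")
        text_path.write_text("\n".join(lines), encoding="utf-8")
        written.append(text_path)
    return written
```

Creating the `Reader` once and reusing it for every image avoids reloading the recognition models for each file.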
- Folder Selector: Select a folder that contains multiple `.txt` files.
- Merge All Text Files: Click the "Merge All the Text Files" button to combine all the `.txt` files in the folder into one single file.
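Merging needs only the standard library. The sketch below assumes the files are joined in file-name order and written to a file called `merged.txt`; both the function name and the output name are hypothetical, and the tool itself may order or name things differently.

```python
from pathlib import Path


def merge_text_files(folder, output_name="merged.txt"):
    """Concatenate all .txt files in the folder into one file and return its path."""
    folder = Path(folder)
    output_path = folder / output_name

    # Sort by name and skip the output file so a repeated run does not merge it into itself.
    sources = sorted(p for p in folder.glob("*.txt") if p.name != output_name)

    parts = [p.read_text(encoding="utf-8") for p in sources]
    output_path.write_text("\n".join(parts), encoding="utf-8")
    return output_path
```

Because the order is alphabetical, names such as `page_2.txt` and `page_10.txt` would sort out of sequence; zero-padded names (`page_002.txt`) avoid that.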
- PDF Path: Displays the selected PDF file path.
- Image Preview: After PDF to image conversion, the preview of the first image will be displayed.
- OCR and Merge Options: Available after selecting a folder for OCR and text merging.
Contributions are welcome! Feel free to fork this repository, make changes, and submit a pull request.
- Fork the repository.
- Create a new branch (`git checkout -b feature/your-feature-name`).
- Commit your changes (`git commit -m 'Add some feature'`).
- Push to the branch (`git push origin feature/your-feature-name`).
- Open a pull request.
This project is licensed under the MIT License. See the LICENSE file for details.
Developed by Shaon An Nafi.
Feel free to reach out for any questions or suggestions.