Gemma-3 OCR

Overview

Gemma-3 OCR is a Streamlit web application that leverages the Gemma-3 Vision model to extract and structure text from images. This tool makes it easy to convert text in images to well-formatted, editable content with a simple and intuitive user interface.

Features

Simple Image Upload: Easily upload images containing text using the file picker
Powerful Text Extraction: Uses Gemma-3 Vision model to extract text with high accuracy
Structured Output: Returns extracted text in a well-organized Markdown format
Clean Interface: Intuitive design with clear separation of input and results
Instant Processing: Get results within seconds of uploading an image

Requirements

Python 3.7+
Streamlit
Ollama
PIL/Pillow
An Ollama-compatible system with the Gemma-3 model installed

Installation

Clone this repository:

git clone https://github.com/yourusername/gemma3-ocr.git
cd gemma3-ocr

Install the required dependencies:
```
pip install streamlit ollama pillow
```
Make sure Ollama is installed and running with the Gemma-3 model:
```
ollama pull gemma3:12b
```

Usage

Start the application:
```
streamlit run app.py
```
Open your browser and navigate to the provided URL (typically http://localhost:8501)
Upload an image containing text using the sidebar upload button
Click "Extract Text" to process the image
View the extracted text in the main panel
Use the "Clear" button to reset results when needed

Example Use Cases

Digitizing printed documents
Extracting text from screenshots
Converting handwritten notes to digital text
Capturing text from presentation slides
Extracting content from diagrams and infographics

Acknowledgments

This project uses the Gemma-3 Vision model, developed by Google DeepMind and made available through Ollama.

Name	Name	Last commit message	Last commit date
Latest commit sushantdhumak Update README.md Mar 22, 2025 c7a9735 · Mar 22, 2025 History 7 Commits
README.md	README.md	Update README.md	Mar 22, 2025
gemma3.jpg	gemma3.jpg	Add files via upload	Mar 20, 2025
ocr_gemma3.py	ocr_gemma3.py	Add files via upload	Mar 21, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Gemma-3 OCR

Overview

Features

Requirements

Installation

Usage

Example Use Cases

Acknowledgments

Output

About

Releases

Packages

Languages

sushantdhumak/Gemma-3-OCR

Folders and files

Latest commit

History

Repository files navigation

Gemma-3 OCR

Overview

Features

Requirements

Installation

Usage

Example Use Cases

Acknowledgments

Output

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages