Flask application for building RAG (Retrieval Augmented Generation) systems using Google Vertex AI. Process PDFs with layout-aware parsing, chat with documents in multiple languages, and build AI-powered document analysis systems.
- Clone the repository:
git clone https://github.com/cherninlab/vertex-rag-flask.git
cd vertex-rag-flask
- Set up Python environment:
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
pip install --upgrade pip
- Install the project:
pip install -e ".[dev]"
- Configure environment:
cp .env.example .env
# Edit .env with your configuration:
# - Add your Google Cloud Project ID
# - Set path to service account credentials
- Run the application:
flask run
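The configuration step above can fail silently if a value in `.env` is left blank. The following is a minimal sketch of an early check, not the project's actual loader. The variable names `GOOGLE_CLOUD_PROJECT` and `GOOGLE_APPLICATION_CREDENTIALS` are assumptions (check `.env.example` for the real names), and `Settings` and `load_settings` are hypothetical helpers introduced for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    project_id: str
    credentials_path: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read the required settings, failing early with a clear message."""
    env = os.environ if env is None else env
    # Assumed variable names; the project's .env.example may use others.
    required = {
        "project_id": "GOOGLE_CLOUD_PROJECT",
        "credentials_path": "GOOGLE_APPLICATION_CREDENTIALS",
    }
    # Treat unset and empty values the same way.
    missing = [name for name in required.values() if not env.get(name)]
    if missing:
        raise RuntimeError("Missing settings: " + ", ".join(missing))
    return Settings(**{field: env[name] for field, name in required.items()})
```

Calling `load_settings()` at startup would surface a missing project ID or credentials path before the first request rather than on the first upload.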
- Create a new project in Google Cloud Console
- Enable required APIs:
# Enable Vertex AI API
gcloud services enable aiplatform.googleapis.com
# Enable Cloud Storage API
gcloud services enable storage.googleapis.com
# Enable IAM API
gcloud services enable iam.googleapis.com
- Create and configure service account:
# Create service account
gcloud iam service-accounts create vertex-rag-sa --display-name="Vertex RAG Service Account"
# Grant Vertex AI user role
gcloud projects add-iam-policy-binding YOUR_PROJECT_ID --member="serviceAccount:vertex-rag-sa@YOUR_PROJECT_ID.iam.gserviceaccount.com" --role="roles/aiplatform.user"
# Grant Storage Admin role (for bucket and file management)
gcloud projects add-iam-policy-binding YOUR_PROJECT_ID --member="serviceAccount:vertex-rag-sa@YOUR_PROJECT_ID.iam.gserviceaccount.com" --role="roles/storage.admin"
# Download credentials
gcloud iam service-accounts keys create credentials.json --iam-account=vertex-rag-sa@YOUR_PROJECT_ID.iam.gserviceaccount.com
Important Notes:
- Keep your credentials.json secure and never commit it to version control
- The service account needs both Vertex AI and Storage permissions to function properly
- To follow least privilege, you can replace storage.admin with a narrower role (for example, roles/storage.objectAdmin if the app does not need to create or delete buckets)
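A quick way to confirm that the downloaded key is intact is to inspect its structure before starting the app. The sketch below uses only the standard library and does not contact Google Cloud. The helper name `check_service_account_key` is hypothetical, and the field list covers only the fields a service account key file normally contains.

```python
import json
from pathlib import Path

# Fields normally present in a service account key file.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(path: str) -> list[str]:
    """Return a list of problems; an empty list means the file looks usable."""
    key_file = Path(path)
    if not key_file.is_file():
        return [f"{path} does not exist"]
    try:
        data = json.loads(key_file.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return [f"{path} is not valid JSON: {exc}"]
    if not isinstance(data, dict):
        return [f"{path} does not contain a JSON object"]
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not data.get(name)]
    # A key for another credential type (e.g. a user account) will not work here.
    if data.get("type") not in (None, "service_account"):
        problems.append(f"unexpected type: {data['type']}")
    return problems
```

An empty result only shows that the file is well formed. It does not prove that the key is still valid or that the roles above were granted.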
- Start the application:
flask run
# For development with the debugger and auto-reload: FLASK_APP=src/app flask run --debug
- Open http://localhost:5000 in your browser
- Upload a document
- After processing, you'll be redirected to the chat interface
vertex-rag-flask/
├── src/
│ ├── app/
│ │ ├── routes/ # API and web routes
│ │ ├── services/ # Business logic
│ │ ├── templates/ # HTML templates
│ │ └── utils/ # Helper functions
│ └── config/ # Configuration
├── tests/ # Test files
├── uploads/ # Temporary upload directory
├── credentials.json # GCP service account key
├── pyproject.toml # Project dependencies
└── README.md
- Python 3.11+
- Google Cloud Project with enabled APIs
- Service account with appropriate permissions
# Run pytest
pytest
# Run type checking
mypy .
# Run linting
flake8 .
The project uses several tools to maintain code quality:
- black: Code formatting
- isort: Import sorting
- flake8: Style guide enforcement
- mypy: Static type checking
These are automatically run as pre-commit hooks when you commit changes.
If you use VS Code:
- Install the "Dev Containers" extension (formerly "Remote - Containers")
- Open the project
- Click "Reopen in Container" when prompted
- VS Code will set up the development environment automatically
- Permission Denied Errors
  - Verify that your service account has all required roles
  - Check if credentials.json is properly configured
  - Ensure APIs are enabled in your project
- Upload Failures
  - Verify the file format is supported
  - Check if the selected bucket exists
  - Ensure your service account has storage permissions
- Chat Not Working
  - Verify Vertex AI API is enabled
  - Check if the model has access to your document
  - Ensure proper network connectivity
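For the "Verify the file format is supported" check under Upload Failures, a local test can rule out a mislabeled file before you look at bucket or permission problems. This is a sketch under stated assumptions, not the application's own validation: `SUPPORTED_EXTENSIONS` lists only PDF because that is the one format this README names, and `looks_like_supported_upload` is a hypothetical helper.

```python
from pathlib import Path

# Assumption: only PDF is listed because it is the format this README mentions.
SUPPORTED_EXTENSIONS = {".pdf"}
# PDF files start with this header, followed by the version number.
PDF_MAGIC = b"%PDF-"


def looks_like_supported_upload(filename: str, head: bytes) -> bool:
    """Check the extension and the first bytes of a file before uploading it."""
    suffix = Path(filename).suffix.lower()
    if suffix not in SUPPORTED_EXTENSIONS:
        return False
    # A renamed non-PDF file passes the extension test but fails this one.
    return head.startswith(PDF_MAGIC)
```

To use it, read the first few bytes of the file (for example `open(path, "rb").read(8)`) and pass them as `head`.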
Contributions are welcome!
Distributed under the MIT License. See LICENSE for more information.
⭐ If you find this project useful, please consider giving it a star! It helps make the project more visible and encourages development.