This project showcases AI/ML expertise through a custom-trained image generation model focused on Indian tabla drums. Using diffusion models and GAN architectures, it aims to correct the inaccuracies found in existing AI-generated tabla images and to produce high-quality, culturally accurate representations.
- Curate & preprocess a high-quality dataset of tabla drum images.
- Fine-tune a custom model using Stable Diffusion and GAN architectures.
- Evaluate & optimize image fidelity using FID and perceptual loss metrics.
- Deploy the model via a web-based API with a front-end for image generation.
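
For reference, the FID mentioned in the evaluation objective is commonly defined as the Fréchet distance between two Gaussians fitted to Inception-v3 feature embeddings of the real and generated image sets. The symbols below are notation chosen for this sketch, not names from the project code: $\mu_r, \Sigma_r$ are the feature mean and covariance for real images, and $\mu_g, \Sigma_g$ are those for generated images.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\left( \Sigma_r \Sigma_g \right)^{1/2} \right)
```

Lower values indicate that the generated distribution is closer to the real one; identical feature statistics give a value of 0.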
- Frameworks: PyTorch, TensorFlow
- Model Architectures: Stable Diffusion, DreamBooth, StyleGAN, GAN-based approaches
- Data Augmentation: OpenCV, Albumentations
- Web Scraping: Selenium, Playwright, Requests
- Annotation: Label Studio
- Storage: Cloud-based dataset hosting (AWS S3, Hugging Face Datasets)
- API Hosting: FastAPI, Flask
- Web UI: Streamlit, Gradio, Hugging Face Spaces
- Infrastructure: Docker, Kubernetes (planned for production-scale deployment)
- CI/CD: GitHub Actions (automation for training jobs & model updates)
- Experiment Tracking: Weights & Biases (planned integration)
- Collaboration Tools: Notion, Trello
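
To illustrate the kind of transformation the data augmentation tools in this stack perform, here is a minimal sketch of a random crop plus brightness shift. It uses plain NumPy rather than OpenCV or Albumentations to stay dependency-light, and the function name `augment` and its parameters are hypothetical, not taken from the project's `augmentation.py`.

```python
import numpy as np


def augment(image: np.ndarray, rng: np.random.Generator,
            max_brightness_shift: int = 30,
            crop_fraction: float = 0.9) -> np.ndarray:
    """Return a randomly cropped, brightness-shifted copy of an HxWxC uint8 image.

    Hypothetical helper for illustration; parameter defaults are assumptions.
    """
    height, width = image.shape[:2]

    # Random crop: keep `crop_fraction` of each side at a random offset.
    crop_h = max(1, int(height * crop_fraction))
    crop_w = max(1, int(width * crop_fraction))
    top = int(rng.integers(0, height - crop_h + 1))
    left = int(rng.integers(0, width - crop_w + 1))
    cropped = image[top:top + crop_h, left:left + crop_w]

    # Brightness shift: add one random offset to every pixel.
    # Work in int16 so the addition cannot wrap around, then clip to 0..255.
    shift = int(rng.integers(-max_brightness_shift, max_brightness_shift + 1))
    shifted = cropped.astype(np.int16) + shift
    return np.clip(shifted, 0, 255).astype(np.uint8)
```

Passing an explicit `np.random.Generator` (for example `np.random.default_rng(0)`) keeps augmentation runs reproducible, which can help when comparing training experiments.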
tabla-image-gen-ai/
├── data_collection/          # Web scraping & dataset collection
│   ├── scraper/              # Selenium/Playwright-based image scraper
│   ├── images/               # Raw and processed dataset images
│   └── metadata/             # Image metadata & annotations
│
├── data_processing/          # Dataset cleaning & augmentation pipeline
│   ├── preprocessing.py      # Image resizing, enhancement, noise reduction
│   └── augmentation.py       # Data augmentation transformations
│
├── model_training/           # ML model training pipeline
│   ├── train.py              # Fine-tunes diffusion model or GAN on dataset
│   └── evaluation.py         # Calculates FID, PSNR, SSIM
│
├── deployment/               # Web-based model serving
│   ├── api/                  # FastAPI/Flask server for generating images
│   └── ui/                   # Streamlit/Gradio front-end
│
├── notebooks/                # Jupyter notebooks for data exploration
├── app/requirements.txt      # Python dependencies
└── README.md                 # This document
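
Of the metrics listed for `evaluation.py`, PSNR is the simplest to show in full. The sketch below is one possible implementation under the standard definition, 10 · log10(MAX² / MSE); the function name `psnr` and its signature are assumptions and may differ from the project's actual code. Note that PSNR (like SSIM) compares a candidate image against a paired reference image, whereas FID compares whole image distributions.

```python
import numpy as np


def psnr(reference: np.ndarray, candidate: np.ndarray,
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two same-shaped images.

    Hypothetical helper; `max_value` is the largest possible pixel value
    (255 for 8-bit images).
    """
    if reference.shape != candidate.shape:
        raise ValueError("images must have the same shape")

    # Compute in float64 so that uint8 subtraction cannot wrap around.
    diff = reference.astype(np.float64) - candidate.astype(np.float64)
    mse = float(np.mean(diff ** 2))

    if mse == 0.0:
        return float("inf")  # identical images have no noise
    return float(10.0 * np.log10(max_value ** 2 / mse))
```

Higher values mean the candidate is closer to the reference; identical images return infinity.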
git clone https://github.com/tyler-pritchard/tabla-image-gen-ai.git
cd tabla-image-gen-ai
pip install -r app/requirements.txt
python data_collection/scraper/tabla_image_scraper.py
python data_processing/preprocessing.py
python model_training/train.py
python deployment/api/main.py
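
The data and training steps above can also be chained from a single script. The following is a minimal sketch, not part of the repository: `PIPELINE_STEPS` and `run_pipeline` are hypothetical names, and the API server step is left out on the assumption that it is a long-running process that should be started separately.

```python
import subprocess
import sys
from typing import Callable, List, Sequence

# Script paths copied from the commands above, in the same order.
# The API server is omitted (assumed to be long-running).
PIPELINE_STEPS: List[str] = [
    "data_collection/scraper/tabla_image_scraper.py",
    "data_processing/preprocessing.py",
    "model_training/train.py",
]


def run_pipeline(steps: Sequence[str] = PIPELINE_STEPS,
                 runner: Callable[[List[str]], int] = subprocess.call
                 ) -> List[str]:
    """Run each script in order and stop at the first non-zero exit code.

    Returns the scripts that completed successfully. `runner` is injectable
    so the control flow can be exercised without launching real processes.
    """
    completed: List[str] = []
    for script in steps:
        # Use the current interpreter so the active virtualenv is respected.
        exit_code = runner([sys.executable, script])
        if exit_code != 0:
            print(f"Step failed: {script} (exit code {exit_code})",
                  file=sys.stderr)
            break
        completed.append(script)
    return completed
```

Calling `run_pipeline()` from the repository root runs the three scripts in sequence; stopping on the first failure avoids training on a partially collected or unprocessed dataset.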
✅ Web scraping functional (collecting 100+ high-res tabla images)
✅ Dataset preprocessing implemented (image cleaning, augmentation)
🚧 Model fine-tuning in progress (Stable Diffusion adaptation)
🚧 Deployment infrastructure in planning
- Addresses a real-world AI failure: Existing generative AI models struggle with authentic tabla representations.
- Demonstrates ML proficiency: Covers data collection → model training → deployment.
- Production-oriented deployment: Architected for real-world applications with scalable APIs.
- Extensible for other domains: Can be adapted for any niche image dataset.
- Author: Tyler Pritchard
- GitHub: github.com/tyler-pritchard
- LinkedIn: linkedin.com/in/tyler-pritchard