🪘 Tabla Drum Image Generator – AI-powered tabla drum image generation using Stable Diffusion & GANs. Features custom dataset curation, ML training pipeline, and scalable API deployment.

Tyler-Pritchard/tabla-image-gen-ai

Tabla Drum Image Generator 🎡🎨

A Machine Learning Approach to Realistic Tabla Drum Image Generation

This project showcases AI/ML expertise through the development of a custom-trained image generation model focused on Indian tabla drums. By leveraging diffusion models and GAN architectures, the goal is to correct inaccuracies in existing AI-generated tabla drum images and produce high-quality, culturally accurate representations.


🚀 Project Goals

  • Curate & preprocess a high-quality dataset of tabla drum images.
  • Train a custom fine-tuned model using Stable Diffusion and GAN architectures.
  • Evaluate & optimize image fidelity using FID and perceptual loss metrics.
  • Deploy the model via a web-based API with a front-end for image generation.
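For reference, the FID mentioned in the goals is usually defined as the Fréchet distance between two Gaussians fitted to Inception-network features of real and generated images. The symbols below follow the standard notation and are not taken from this repository: \mu_r, \Sigma_r are the feature mean and covariance for real images, and \mu_g, \Sigma_g are the same statistics for generated images.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

Lower values indicate that the generated feature distribution is closer to the real one. FID estimates are known to be biased for small sample sizes, so scores computed on a dataset of a few hundred images are probably best read as relative comparisons between checkpoints rather than absolute quality numbers.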

🛠 Tools & Technologies

Machine Learning & AI

  • Frameworks: PyTorch, TensorFlow
  • Model Architectures: Stable Diffusion, DreamBooth, StyleGAN, GAN-based approaches
  • Data Augmentation: OpenCV, Albumentations

Data Collection & Processing

  • Web Scraping: Selenium, Playwright, Requests
  • Annotation: Label Studio
  • Storage: Cloud-based dataset hosting (AWS S3, Hugging Face Datasets)
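Scraped image sets often contain the same file downloaded from several pages. As a minimal sketch of one curation step, the snippet below groups files by SHA-256 content hash using only the standard library. The helper names `file_digest` and `find_exact_duplicates` are hypothetical and not part of this repository, and the approach only catches byte-identical files; resized or re-encoded copies would need a perceptual hash instead.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(image_dir: Path) -> dict[str, list[Path]]:
    """Group files under image_dir by content hash.

    Only groups with more than one file are returned, so an empty
    result means no byte-identical duplicates were found.
    """
    groups: dict[str, list[Path]] = {}
    for path in sorted(image_dir.rglob("*")):
        if path.is_file():
            groups.setdefault(file_digest(path), []).append(path)
    return {h: paths for h, paths in groups.items() if len(paths) > 1}
```

Calling `find_exact_duplicates(Path("data_collection/images"))` would report duplicate groups in the raw image folder, assuming the images are stored there as described in the project structure.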

Deployment & Serving

  • API Hosting: FastAPI, Flask
  • Web UI: Streamlit, Gradio, Hugging Face Spaces
  • Infrastructure: Docker, Kubernetes (planned for production-scale deployment)

Project Management & Version Control

  • GitHub Actions: CI/CD automation for training jobs & model updates
  • Experiment Tracking: Weights & Biases (planned integration)
  • Collaboration Tools: Notion, Trello

📂 Project Structure

tabla-image-gen-ai/
│── data_collection/        # Web scraping & dataset collection
│   ├── scraper/            # Selenium/Playwright-based image scraper
│   ├── images/             # Raw and processed dataset images
│   ├── metadata/           # Image metadata & annotations
│
│── data_processing/        # Dataset cleaning & augmentation pipeline
│   ├── preprocessing.py    # Image resizing, enhancement, noise reduction
│   ├── augmentation.py     # Data augmentation transformations
│
│── model_training/         # ML model training pipeline
│   ├── train.py            # Fine-tunes diffusion model or GAN on dataset
│   ├── evaluation.py       # Calculates FID, PSNR, SSIM
│
│── deployment/             # Web-based model serving
│   ├── api/                # FastAPI/Flask server for generating images
│   ├── ui/                 # Streamlit/Gradio front-end
│
│── notebooks/              # Jupyter notebooks for data exploration
│── app/requirements.txt    # Python dependencies
│── README.md               # This document
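Of the metrics listed for `evaluation.py`, PSNR is the simplest to state. The sketch below is not the repository's implementation; it is a dependency-free illustration that treats each image as a flat sequence of pixel intensities and assumes 8-bit values (`max_value=255`). The function name `psnr` and its parameters are chosen here for illustration. PSNR compares a candidate against a specific reference image, so it applies to paired comparisons such as reconstructions, not to unpaired generated samples.

```python
import math
from typing import Sequence


def psnr(
    reference: Sequence[float],
    candidate: Sequence[float],
    max_value: float = 255.0,
) -> float:
    """Peak signal-to-noise ratio in decibels between two equal-length pixel sequences."""
    if len(reference) != len(candidate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixel positions.
    mse = sum((r - c) ** 2 for r, c in zip(reference, candidate)) / len(reference)
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)
```

In practice a library routine operating on image arrays would likely be used; the pure-Python form is shown only to make the formula explicit.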

⚑ Setup Instructions

1️⃣ Clone this repository

git clone https://github.com/tyler-pritchard/tabla-image-gen-ai.git
cd tabla-image-gen-ai

2️⃣ Install dependencies

pip install -r app/requirements.txt

3️⃣ Run the web scraper (to collect tabla drum images)

python data_collection/scraper/tabla_image_scraper.py

4️⃣ Preprocess the images for model training

python data_processing/preprocessing.py

5️⃣ Train the AI model

python model_training/train.py

6️⃣ Deploy the image generation model

python deployment/api/main.py
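Steps 3 to 5 above run one after another, so they can be chained from a small driver script. The following is a sketch under assumptions: the script paths are copied from the setup steps, the names `PIPELINE_STEPS`, `build_commands`, and `run_pipeline` are hypothetical, and it assumes each script runs without required arguments, as the commands above suggest. The deployment step is left out because it presumably starts a long-running server.

```python
import subprocess
import sys

# Script paths as listed in the setup steps (scrape, preprocess, train).
PIPELINE_STEPS = [
    "data_collection/scraper/tabla_image_scraper.py",
    "data_processing/preprocessing.py",
    "model_training/train.py",
]


def build_commands(steps=PIPELINE_STEPS, python=sys.executable):
    """Return one argument list per pipeline step."""
    return [[python, script] for script in steps]


def run_pipeline(dry_run=True):
    """Print each command and, unless dry_run is set, execute it.

    check=True makes a failing step raise CalledProcessError,
    which stops the pipeline before later steps run.
    """
    for command in build_commands():
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)
```

Calling `run_pipeline()` only prints the commands; `run_pipeline(dry_run=False)` executes them from the repository root.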

📈 Current Progress

✅ Web scraping functional (collecting 100+ high-res tabla images)
✅ Dataset preprocessing implemented (image cleaning, augmentation)
🚧 Model fine-tuning in progress (Stable Diffusion adaptation)
🚧 Deployment infrastructure in planning


πŸ‘¨β€πŸ’» Why This Project Matters

  1. Addresses a real-world AI failure: Existing generative AI models struggle with authentic tabla representations.
  2. Demonstrates ML proficiency: Covers data collection → model training → deployment.
  3. Designed for production deployment: Architected for real-world applications with scalable APIs (deployment infrastructure is still in planning).
  4. Extensible for other domains: Can be adapted for any niche image dataset.

📩 Connect & Collaborate
