deepcast is an AI-powered CLI tool that generates engaging podcast-style conversations with realistic text-to-speech capabilities. Perfect for creating educational content, practice conversations, or exploring topics in a dialogue format.
- 🤖 AI-Powered Conversations: Uses the Deepseek-V3 model to generate natural, educational dialogues
- 🎧 Interactive Format: Generates engaging podcast-style conversations between two speakers
- 📚 Educational Content: Creates deep, insightful discussions on any given topic
- 🗣️ Text-to-Speech: Integrates PlayHT for converting conversations into realistic audio
- 🚀 Background Music: Add ambient music with adjustable volume
- 😊 Voice Emotions: Control speaker emotions (happy, serious, excited, etc.)
- 📄 Rich File Support: Generate from TXT, PDF, DOCX, EPUB, Markdown, HTML files
- 🌐 Web Content: Generate from web articles, YouTube transcripts, and URLs
- 🔄 Content Combination: Combine multiple sources into one podcast
- 🌍 Multiple Languages: Support for English, Spanish, French, German, Italian, and Portuguese
- 🎭 Podcast Styles: Different conversation styles (interview, debate, storytelling, etc.)
- 📊 Complexity Levels: Adjust content for beginner, intermediate, or expert audiences
- 🚀 Easy to Use: Simple CLI interface with rich terminal output
- Clone the repository:

  ```bash
  git clone https://github.com/byigitt/deepcast.git
  cd deepcast
  ```

- Install dependencies using uv:

  ```bash
  uv venv
  uv pip install -e .
  ```
- Create a `.env` file from the example:

  ```bash
  cp .env.example .env
  ```

- Add your API keys to the `.env` file (a sample is shown after these steps):
  - Get an OpenRouter API key from OpenRouter
  - Get a FAL API key from FAL.ai
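For reference, a minimal `.env` might look like the following; the values below are placeholders, not real credentials:

```
OPENROUTER_API_KEY=your-openrouter-key
FAL_KEY=your-fal-key
LOG_LEVEL=INFO
```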
List available podcast styles:

```bash
deepcast styles
```

List available background music:

```bash
deepcast music
```

List available voice emotions:

```bash
deepcast emotions
```
Create a podcast about any topic with custom settings:

```bash
# Basic usage
deepcast generate "Quantum Computing"

# With custom style
deepcast generate "Quantum Computing" --style debate

# With background music
deepcast generate "Quantum Computing" --music ambient --volume 0.2

# With voice emotions
deepcast generate "Quantum Computing" \
  --speaker1-emotion professional \
  --speaker2-emotion friendly

# Full customization
deepcast generate "Quantum Computing" \
  --style educational \
  --complexity expert \
  --language french \
  --exchanges 7 \
  --music soft_piano \
  --volume 0.15 \
  --speaker1-emotion serious \
  --speaker2-emotion excited \
  --save-audio \
  --format mp3
```
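These flags can also be scripted. As a rough sketch (the topics and flag values here are arbitrary examples, not part of the official docs), a shell loop can batch-generate several episodes:

```bash
# Generate one podcast per topic, reusing the flags documented above.
for topic in "Quantum Computing" "Dark Matter" "CRISPR"; do
  deepcast generate "$topic" \
    --style educational \
    --complexity intermediate \
    --save-audio \
    --format mp3 \
    --output "${topic// /_}.txt"
done
```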
Create a podcast from various file types:

```bash
# From a single file with music
deepcast generate "Research Paper" \
  --file paper.pdf \
  --music ambient

# From multiple files with emotions
deepcast generate "Research Summary" \
  --file paper1.pdf \
  --file paper2.pdf \
  --speaker1-emotion professional \
  --speaker2-emotion friendly

# From different file types with full audio
deepcast generate "Documentation" \
  --file intro.md \
  --file chapter1.docx \
  --file appendix.pdf \
  --music soft_piano \
  --volume 0.2 \
  --save-audio
```
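Because `--file` can be repeated, it can also be filled in from a directory listing. A hypothetical sketch, assuming a `./papers` folder of PDFs (the folder name and topic are illustrative):

```bash
# Hypothetical example: build a single podcast from every PDF in ./papers
files=()
for f in ./papers/*.pdf; do
  files+=(--file "$f")
done
deepcast generate "Literature Review" "${files[@]}" --save-audio
```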
Create a podcast from web content:

```bash
# From a web article with music
deepcast generate "News Article" \
  --url "https://example.com/article" \
  --music cinematic

# From a YouTube video with emotions
deepcast generate "Video Summary" \
  --youtube "https://youtube.com/watch?v=..." \
  --speaker1-emotion excited \
  --speaker2-emotion professional

# Combine web and file content with full audio
deepcast generate "Research Review" \
  --file paper.pdf \
  --url "https://example.com/article" \
  --youtube "https://youtube.com/watch?v=..." \
  --music jazz \
  --volume 0.1 \
  --save-audio
```
Save the transcript to a file:

```bash
deepcast generate "Artificial Intelligence" --output transcript.txt
```

Only get the audio URL:

```bash
deepcast generate "Space Exploration" --audio-only
```

Save audio locally:

```bash
deepcast generate "Nature Documentary" \
  --music nature \
  --save-audio \
  --format mp3
```
Combine all features:

```bash
deepcast generate "Advanced Physics" \
  --file research.pdf \
  --file notes.md \
  --url "https://example.com/article" \
  --youtube "https://youtube.com/watch?v=..." \
  --style educational \
  --complexity expert \
  --language french \
  --exchanges 7 \
  --music cinematic \
  --volume 0.15 \
  --speaker1-emotion professional \
  --speaker2-emotion excited \
  --save-audio \
  --format mp3 \
  --output transcript.txt
```
```
src/
├── models/    # Data models (Podcast, Config, Audio)
├── services/  # Core services (LLM, Audio, File, Content)
├── utils/     # Utility functions (Config)
└── cli.py     # CLI interface
```
The following environment variables can be configured in `.env`:

- `OPENROUTER_API_KEY`: Your OpenRouter API key for accessing the Deepseek model
- `FAL_KEY`: Your FAL.ai API key for text-to-speech conversion
- `LOG_LEVEL`: Optional logging level (default: `INFO`)
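If you prefer not to keep a `.env` file, the same variables can presumably be exported in your shell before running deepcast; this assumes the tool also reads the process environment (typical for dotenv-based setups) and is not something the docs state explicitly:

```bash
# Assumption: deepcast also picks these up from the environment.
# Values are placeholders.
export OPENROUTER_API_KEY="your-openrouter-key"
export FAL_KEY="your-fal-key"
export LOG_LEVEL=DEBUG
```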
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
- Fork the repository
- Create your feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- OpenRouter for providing access to the Deepseek model
- FAL.ai for the text-to-speech capabilities
- PlayHT for voice synthesis
- Pixabay for background music
- All our contributors and users