This demo showcases an AI RAG agent that uses Text-To-Speech (TTS) and Speech-To-Text (STT) for LLM interactions using Deepgram and Groq LPUs.
A BERT model builds vector embeddings for the user message and the uploaded documents. Cosine similarity between the message embedding and each document embedding selects the most relevant documents to pass to the LLM as context.
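The ranking step can be sketched as follows. This is a minimal illustration, not the demo's actual code: the function names `cosine_similarity` and `rank_documents`, the document ids, and the toy 3-dimensional vectors are all assumptions, and real BERT embeddings would have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs, top_k=2):
    """Return the top_k (doc_id, score) pairs, highest similarity first."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy vectors standing in for real BERT embeddings (hypothetical data).
query = [1.0, 0.0, 1.0]
documents = {
    "weather_faq": [0.9, 0.1, 0.8],
    "billing_policy": [0.0, 1.0, 0.0],
    "travel_notes": [0.5, 0.5, 0.5],
}
top = rank_documents(query, documents)
for doc_id, score in top:
    print(f"{doc_id}: {score:.3f}")
```

Only the top-ranked documents would then be added to the prompt, which keeps the LLM context small.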
The database is accessed through SQLAlchemy and stores documents, transcription sessions, user registrations, and the vector embeddings of the uploaded documents.
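A rough idea of what such a schema might look like is sketched below. The table names, column names, the in-memory SQLite URL, and the choice to store each embedding as a JSON-encoded list are all assumptions made for illustration; the demo's real models may differ, and the transcription session table is left out for brevity.

```python
import json

from sqlalchemy import Column, ForeignKey, Integer, String, Text, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class User(Base):
    __tablename__ = "users"  # hypothetical table name
    id = Column(Integer, primary_key=True)
    email = Column(String(255), unique=True, nullable=False)


class Document(Base):
    __tablename__ = "documents"  # hypothetical table name
    id = Column(Integer, primary_key=True)
    user_id = Column(Integer, ForeignKey("users.id"), nullable=False)
    filename = Column(String(255), nullable=False)
    content = Column(Text, nullable=False)
    # Assumption: the embedding is serialized as a JSON list of floats.
    embedding_json = Column(Text, nullable=False)


# In-memory SQLite keeps the sketch self-contained.
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    user = User(email="demo@example.com")
    session.add(user)
    session.flush()  # assigns user.id before it is referenced below
    session.add(
        Document(
            user_id=user.id,
            filename="notes.txt",
            content="hello",
            embedding_json=json.dumps([0.1, 0.2, 0.3]),
        )
    )
    session.commit()

    stored = session.query(Document).filter_by(filename="notes.txt").one()
    restored_embedding = json.loads(stored.embedding_json)
```

Storing the embedding next to the document row means it only has to be computed once, at upload time.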
The demo streams both STT and TTS to reduce response latency.
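One common way streaming cuts latency is to hand text to the TTS engine sentence by sentence instead of waiting for the full LLM reply. The sketch below shows that idea only; it is not taken from the demo, the function name `sentences_from_tokens` and the token list are hypothetical, and no Deepgram or Groq calls are made.

```python
import re

# Sentence-ending punctuation followed by whitespace marks a flush point.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def sentences_from_tokens(tokens):
    """Yield complete sentences as soon as they appear in a token stream."""
    buffer = ""
    for token in tokens:
        buffer += token
        parts = _SENTENCE_END.split(buffer)
        # Every part except the last one is a finished sentence.
        for sentence in parts[:-1]:
            if sentence:
                yield sentence
        buffer = parts[-1]
    if buffer.strip():
        yield buffer.strip()  # flush whatever remains at end of stream


# Hypothetical token stream, as an LLM might emit it piece by piece.
llm_tokens = ["Hello", " there.", " The", " weather", " is", " sunny",
              " today!", " Enjoy."]
chunks = list(sentences_from_tokens(llm_tokens))
for chunk in chunks:
    print(chunk)  # in a real pipeline each chunk would go to the TTS engine
```

With this approach, speech for the first sentence can start while the model is still generating the rest.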
INSTALLATION
macos:
- brew install ffmpeg portaudio
- pip install -r requirements.txt
windows powershell:
- cd C:
- curl -L -o ffmpeg-release-essentials.zip https://www.gyan.dev/ffmpeg/builds/ffmpeg-release-essentials.zip
- Extract the FFmpeg package: powershell -command "Expand-Archive -Path .\ffmpeg-release-essentials.zip -DestinationPath C:\ffmpeg"
- Add FFmpeg to the system PATH: setx /M PATH "%PATH%;C:\ffmpeg\ffmpeg-\bin" (replace ffmpeg- with the actual version directory inside C:\ffmpeg, e.g. ffmpeg-5.1-essentials_build)
LAUNCH FLASK WEB APP: python3 app2.py
Toggle the sidebar to open the AI RAG Agent.
CLI: python3 Quickagent.py
Create a .env file with:
GROQ_API_KEY = ""
DEEPGRAM_API_KEY = ""
MAIL_USERNAME = ""
MAIL_PASSWORD = ""
MAIL_DEFAULT_SENDER = ""
OPENWEATHER_API_KEY = ""
X-Api-Key =
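A startup check along these lines can catch an incomplete .env file before the app makes its first API call. It is a sketch under assumptions: the helper `missing_keys` is hypothetical, and it presumes the .env values have already been loaded into the process environment by whatever loader the app uses.

```python
import os

# Key names taken from the .env list above.
REQUIRED_KEYS = [
    "GROQ_API_KEY",
    "DEEPGRAM_API_KEY",
    "MAIL_USERNAME",
    "MAIL_PASSWORD",
    "MAIL_DEFAULT_SENDER",
    "OPENWEATHER_API_KEY",
    "X-Api-Key",
]


def missing_keys(env, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty in env."""
    return [key for key in required if not env.get(key)]


# Example with a partially filled environment (placeholder values only).
example_env = {"GROQ_API_KEY": "placeholder", "DEEPGRAM_API_KEY": ""}
missing = missing_keys(example_env)
print("Missing keys:", ", ".join(missing))

# Against the real process environment, the same check would be:
# missing = missing_keys(os.environ)
```

Empty strings count as missing here, since the template above ships every key with an empty value.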