- Python 3.11+
git clone https://github.com/ThomasJanssen-tech/Agentic-RAG-with-LangChain.git
cd Agentic-RAG-with-LangChain
python -m venv venv
venv\Scripts\Activate
(or on macOS/Linux): source venv/bin/activate
pip install -r requirements.txt
- Create a free account on Supabase: https://supabase.com/
- Create an API key for OpenAI: https://platform.openai.com/api-keys
Execute the following SQL query in Supabase:
-- Enable the pgvector extension to work with embedding vectors
create extension if not exists vector;
-- Create a table to store your documents
create table
documents (
id uuid primary key,
content text, -- corresponds to Document.page_content
metadata jsonb, -- corresponds to Document.metadata
embedding vector (1536) -- 1536 works for OpenAI embeddings, change if needed
);
-- Create a function to search for documents
create function match_documents (
query_embedding vector (1536),
filter jsonb default '{}'
) returns table (
id uuid,
content text,
metadata jsonb,
similarity float
) language plpgsql as $$
#variable_conflict use_column
begin
return query
select
id,
content,
metadata,
1 - (documents.embedding <=> query_embedding) as similarity
from documents
where metadata @> filter
order by documents.embedding <=> query_embedding;
end;
$$;
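The `similarity` column returned by `match_documents` is one minus pgvector's cosine distance (`<=>`), which is the cosine similarity between the stored embedding and the query embedding. As a rough illustration of what that ranking does, here is a minimal plain-Python sketch. The helper names `cosine_similarity` and `rank_by_similarity` and the toy 3-dimensional vectors are made up for this example; real OpenAI embeddings in this table have 1536 dimensions, and the actual search runs inside Postgres.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (illustrative helper)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, rows):
    """Mimic the ordering of match_documents: most similar rows first."""
    scored = [(cosine_similarity(query, emb), content) for content, emb in rows]
    # Ascending cosine distance is the same as descending cosine similarity.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Toy (content, embedding) rows standing in for the documents table.
rows = [
    ("about cats", [1.0, 0.0, 0.0]),
    ("about dogs", [0.6, 0.8, 0.0]),
    ("about cars", [0.0, 0.0, 1.0]),
]
query = [1.0, 0.0, 0.0]

ranked = rank_by_similarity(query, rows)
for score, content in ranked:
    print(f"{score:.2f}  {content}")
```

A similarity near 1 means the two vectors point in almost the same direction, while a value near 0 means they are unrelated.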
- Rename .env.example to .env
- Add the API keys for Supabase and OpenAI to the .env file
- Open a terminal in VS Code
- Execute the following commands:
python ingest_in_db.py
python agentic_rag.py
streamlit run agentic_rag_streamlit.py
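Since the setup asks you to put the Supabase and OpenAI keys in `.env`, a quick check like the sketch below can confirm that the file is filled in before you run the scripts. The variable names in `REQUIRED_KEYS` are assumptions (check `.env.example` for the names the project actually uses), and the parser only handles simple `KEY=VALUE` lines.

```python
from pathlib import Path

# Assumed names, not confirmed by this README; adjust to match .env.example.
REQUIRED_KEYS = ("SUPABASE_URL", "SUPABASE_SERVICE_KEY", "OPENAI_API_KEY")


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    text = env_path.read_text() if env_path.exists() else ""
    missing = missing_keys(parse_env(text))
    print("Missing keys:", ", ".join(missing) if missing else "none")
```

Run it from the project root, next to the `.env` file.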
While making this video, I used the following sources:
- https://python.langchain.com/docs/integrations/vectorstores/supabase/
- https://python.langchain.com/docs/integrations/text_embedding/openai/
- https://platform.openai.com/docs/guides/embeddings
- https://www.kaggle.com/code/youssef19/documents-splitting-with-langchain
- https://openai.com/index/new-embedding-models-and-api-updates/
- https://zilliz.com/ai-models/text-embedding-3-small