This repository implements contextual retrieval, a technique introduced by Anthropic that improves retrieval systems by adding chunk-specific explanatory context. Prepending this context to each chunk before embedding and indexing improves the relevance and accuracy of retrieved results.
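The core transformation can be sketched in a few lines of Python. In the actual pipeline the context string is generated by an LLM (here, gemma2:2b via Ollama) that reads the full document; the `contextualize_chunk` helper and its template below are simplified stand-ins, not code from this repo:

```python
def contextualize_chunk(chunk: str, doc_summary: str) -> str:
    """Prepend chunk-specific explanatory context before embedding/indexing.

    In the real pipeline the context comes from an LLM that situates the
    chunk within the whole document; this template is a stand-in.
    """
    context = f"This chunk is from a document about {doc_summary}."
    return f"{context}\n\n{chunk}"

augmented = contextualize_chunk(
    "Simmer the sauce for 20 minutes.",
    "Italian cooking techniques",
)
```

The augmented text, not the raw chunk, is what gets embedded into ChromaDB and tokenized for the BM25 index.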
- Llama-Index: A framework for building retrieval-augmented generation (RAG) and semantic search applications.
- Ollama: A local LLM serving solution; this project uses the gemma2:2b model.
- Streamlit: A Python framework for building interactive web applications (the chat UI).
- FastAPI: A high-performance Python framework for building APIs (the backend server).
- ChromaDB: A vector database for efficient storage and retrieval of high-dimensional embeddings.
Clone the GitHub repo
git clone https://github.com/RionDsilvaCS/contextual-retrieval-by-anthropic.git
cd contextual-retrieval-by-anthropic
Create a Python environment and install the dependencies from the requirements.txt file
pip install -r requirements.txt
Create a data directory and add all your PDFs to it
mkdir data
Create a .env file and add the variables below.
DATA_DIR="./data"
SAVE_DIR="./src/db"
VECTOR_DB_PATH="./src/db/cook_book_db_vectordb"
BM25_DB_PATH="./src/db/cook_book_db_bm25"
COLLECTION_NAME="add_collection_name"
API_URL="http://127.0.0.1:8000/rag-chat"
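These variables are presumably read into the process environment at startup (for example with python-dotenv). For illustration, a minimal standard-library loader with the same effect — the `load_env` helper is hypothetical, not part of this repo:

```python
import os

def load_env(path: str = ".env") -> None:
    """Minimal .env loader (stand-in for python-dotenv's load_dotenv)."""
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Existing environment variables take precedence.
            os.environ.setdefault(key.strip(), value.strip().strip('"'))
```

After loading, the rest of the code can read `os.environ["VECTOR_DB_PATH"]`, `os.environ["API_URL"]`, and so on.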
Run create_save_db.py to build the ChromaDB and BM25 databases
python create_save_db.py
Start the Ollama server in a separate terminal
ollama serve
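With the server up, you can smoke-test the gemma2:2b model against Ollama's HTTP API (served on port 11434 by default) using only the standard library. The `ollama_generate` helper is illustrative, not part of this repo; the endpoint and payload follow Ollama's documented `/api/generate` route:

```python
import json
import urllib.request

OLLAMA_URL = "http://127.0.0.1:11434/api/generate"

def ollama_generate(prompt: str, model: str = "gemma2:2b") -> str:
    """Send a single non-streaming generation request to the local Ollama server."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

In the repo itself, Llama-Index's Ollama integration would typically handle this call.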
Run app.py to start the FastAPI server
python app.py
Run main.py to start the Streamlit app
streamlit run main.py
- Contextual Embedding: The process of prepending chunk-specific explanatory context to each chunk before embedding.
- Contextual BM25: A modified version of BM25 that incorporates contextual information for improved relevance scoring.
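To make the BM25 half concrete, here is a self-contained, standard-library BM25 scorer applied to contextualized chunks. The repo itself presumably uses a library implementation; the toy cookbook chunks and the prepended "pasta recipe" context tokens are illustrative only:

```python
import math
from collections import Counter

def bm25_scores(query_tokens, docs, k1=1.5, b=0.75):
    """Score each tokenized document against a tokenized query with BM25."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    df = Counter(t for d in docs for t in set(d))  # document frequency per term
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for t in query_tokens:
            if t not in tf:
                continue
            idf = math.log((n - df[t] + 0.5) / (df[t] + 0.5) + 1)
            norm = tf[t] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[t] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Two chunks from a hypothetical cookbook; the prepended tokens stand in
# for LLM-generated context ("pasta recipe").
plain_chunks = ["simmer the sauce".split(), "dice the onions".split()]
contextual_chunks = [["pasta", "recipe"] + c for c in plain_chunks]
scores = bm25_scores("pasta sauce".split(), contextual_chunks)
```

The query "pasta sauce" now matches the first chunk on both its context token and its own text, so it outranks the second chunk even though neither raw chunk mentions "pasta".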
GitHub @RionDsilvaCS · LinkedIn @Rion Dsilva · Twitter @Rion_Dsilva_CS