Perform RAG (Retrieval-Augmented Generation) from your PDFs using this Colab notebook!
Powered by Llama 2
- Free, no API or Token required
- Fast inference on Colab's free T4 GPU
- Powered by Hugging Face quantized LLMs (llama-cpp-python)
- Powered by Hugging Face local text embedding models
- Set custom prompt templates
- Prepared Chat mode (not QA)
- Open in colab
- Make sure the Colab's Runtime Type is set to T4 GPU (at least)
- Edit preferences in Block 4
- Upload your PDF into Files (Default name:
rag_data.pdf
) - Runtime > Run all