The most common way to add external memory to LLMs today is through vector databases.
This pattern has been dubbed Retrieval-Augmented Generation (RAG).
To build a vector store, you will need to do the following (sketched in code below):
- Ingest Documents
- Split the Documents into Chunks
- Embed the Chunks with an Embedding Model
- Load the Embeddings into a Vector Database
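A rough sketch of these four steps using LangChain (import paths differ across LangChain versions; the file path, chunk sizes, and the choice of OpenAI embeddings with a FAISS index are placeholder assumptions, and an `OPENAI_API_KEY` plus the `faiss-cpu` package are assumed to be available):

```python
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

# 1. Ingest: load raw documents ("data/notes.txt" is a placeholder path)
docs = TextLoader("data/notes.txt").load()

# 2. Split: chunk the documents so each piece fits the embedding model
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(docs)

# 3. Embed: any embedding model works; OpenAI's is used here for brevity
embeddings = OpenAIEmbeddings()

# 4. Load: build a FAISS index over the embedded chunks and persist it
vectorstore = FAISS.from_documents(chunks, embeddings)
vectorstore.save_local("faiss_index")
```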
An example of building this with LangChain is given here: Langchain RAG
Orchestration frameworks like LangChain and LlamaIndex can make these steps easier.
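Once the vector store exists, retrieval and generation can be wired together. A minimal sketch continuing from the `vectorstore` built above (again, class names and import paths vary across LangChain versions, and the `k` value and question string are placeholders):

```python
from langchain.chat_models import ChatOpenAI
from langchain.chains import RetrievalQA

# Turn the vector store into a retriever returning the top-k most similar chunks
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})

# RetrievalQA stuffs the retrieved chunks into the prompt before the LLM answers
qa = RetrievalQA.from_chain_type(llm=ChatOpenAI(), retriever=retriever)
print(qa.run("What do the ingested documents say about X?"))
```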
This project uses code from the following source:
- **[name of original source]**: Available at: [URL to the original source]