Chat with your data by uploading a PDF file and using a local LLM.
- PDF File Structure Support
- Language Support
- Key Dependencies
- Setup Guidelines
- System Support
- Credits
## PDF File Structure Support
- Upcoming: files with well-organized tables, i.e., a single row/column is not split across multiple rows/columns
- Supports the usual research-paper structure:
  - Abstract
  - Introduction
  - Background Works
  - Dataset
  - Methodology
  - Result Analysis
  - Discussion
  - Future Works
  - Conclusion
- No image support for now
- Upcoming: metadata support
## Language Support
- English
- Others are loading...
## Key Dependencies
- Ollama with or without GPU
- Sentence-transformers
- Langchain
The models in use:
- Sentence-embedding models attempted, chosen mainly from the MTEB leaderboard and personal experience:
  - multi-qa-distilbert-cos-v1
  - multi-qa-mpnet-base-dot-v1
  - e5-base-v2
  - RobBERT [currently in use]
- LLMs attempted, chosen based on Mistral-7B's acceptable performance on low-resource devices:
  - Mistral-7b: instruct-v0.2-q2_K
  - Mistral-7b: instruct-v0.2-q5_K_M
  - Mistral-7b: instruct-v0.2-q6_K [currently in use]
To store the models, create a sub-directory inside the `api` directory, for example `lang_models` (see the sketch below).
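As a rough illustration (not the project's own tooling), the snippet below saves one of the embedding models listed above into `api/lang_models` and pulls the quantized Mistral build through the Ollama Python client. The model names come from this README; the `ollama` client call and the target path are assumptions, and it requires an Ollama server already running (Docker or native):

```python
# Sketch only: download the models referenced in this README.
# Paths are assumptions; adjust to your machine.
from sentence_transformers import SentenceTransformer

import ollama  # pip install ollama; needs a running Ollama server

# Save an embedding model under api/lang_models as described above.
embedder = SentenceTransformer("multi-qa-mpnet-base-dot-v1")
embedder.save("api/lang_models/multi-qa-mpnet-base-dot-v1")

# Pull the quantized Mistral build through the Ollama API.
ollama.pull("mistral:7b-instruct-v0.2-q6_K")
```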
## Setup Guidelines
- OS tested: `Ubuntu>=20.04 LTS`
- Create a `Python>=3.11` environment using conda or virtualenv.
- Use the requirements file to install the dependencies:
  ```bash
  pip install -r requirements.txt
  ```
- Use the Ollama Docker image and Hugging Face to pull/download all the models; refer to the Key Dependencies section for details and for where to store the models on your machine.
- Set the `.env` file according to the `.env.example` structure. Note: for CPU inference, set `USE_GPU=0` (a minimal sketch of reading this flag follows this list).
- From the parent directory, run the system by executing the command below in the terminal:
  ```bash
  streamlit run api/app.py
  ```
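As a minimal sketch (assuming python-dotenv; the project's actual loading code may differ), the `USE_GPU` flag from `.env` could be consumed like this:

```python
# Minimal sketch: read the USE_GPU flag from .env with python-dotenv.
# Only the variable name USE_GPU comes from this README; the rest is illustrative.
import os

from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv()  # loads variables from the .env file in the working directory
use_gpu = os.getenv("USE_GPU", "0") == "1"
device = "cuda" if use_gpu else "cpu"
print(f"Inference device: {device}")
```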
## System Support
- Integrated frontend with Streamlit (a minimal sketch follows this list)
- Upcoming: separated backend support
- Upcoming: Docker support
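For orientation only, here is a minimal Streamlit chat skeleton in the spirit of `api/app.py`; the widget layout and the LangChain `Ollama` wrapper usage are assumptions, not the project's actual code:

```python
# Illustrative skeleton, not the real api/app.py: upload a PDF and
# chat with a local Ollama-served model through LangChain.
import streamlit as st
from langchain_community.llms import Ollama

st.title("Chat with your PDF")
pdf = st.file_uploader("Upload a PDF", type="pdf")

if pdf is not None:
    # Model tag taken from the Key Dependencies section above.
    llm = Ollama(model="mistral:7b-instruct-v0.2-q6_K")
    question = st.text_input("Ask a question about the document")
    if question:
        # In the real app the PDF is chunked, embedded, and retrieved;
        # here the question is simply forwarded to the local LLM.
        st.write(llm.invoke(question))
```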
## Credits
- Sharif Ahamed, MSc. in AI, University of Bradford, Bradford, United Kingdom, Email:
- For advising me throughout the project
- Soroush Yaghoubi, BSc. in Informatics, Technical University Dortmund, Dortmund, Germany:
- For the frontend idea and more work in the future