A Simple Discord Bot powered by locally run LLM using llama.cpp & llama.cpp python binding
Following instruction from llama-cpp-python
OR 👇
llama-cpp-python
offers a web server which aims to act as a drop-in replacement for the OpenAI API.
This allows you to use llama.cpp compatible models with any OpenAI compatible client (language libraries, services, etc).
To install the server package and get started:
pip install llama-cpp-python[server]
python3 -m llama_cpp.server --model models/7B/llama-model.gguf
Similar to Hardware Acceleration section above, you can also install with GPU (cuBLAS) support like this:
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python[server]
python3 -m llama_cpp.server --model models/7B/llama-model.gguf --n_gpu_layers 35
Navigate to http://localhost:8000/docs to see the OpenAPI documentation.
aiohttp
asyncio
json
discord
dotenv
- Plan to implement them as telegram bots instead.