Hugging Face transformers already supports GGUF, but only for a handful of model architectures. So we will run some tests first; if they pass, we can support CPU acceleration smoothly. For more detail, see our discussion.
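As a rough sketch of what loading a GGUF checkpoint through transformers looks like (assuming transformers >= 4.41, which added the `gguf_file` argument, plus the `gguf` package installed; the repo and file names below are illustrative):

```python
# Sketch: dequantize a GGUF file into a regular transformers model on CPU.
# Assumes transformers >= 4.41 and the `gguf` extra are installed.
from transformers import AutoModelForCausalLM, AutoTokenizer


def load_gguf_model(repo_id: str, gguf_file: str):
    """Load a tokenizer and model from a GGUF file hosted on the Hub.

    transformers dequantizes the GGUF weights back to full precision,
    so the result behaves like an ordinary PyTorch model.
    """
    tokenizer = AutoTokenizer.from_pretrained(repo_id, gguf_file=gguf_file)
    model = AutoModelForCausalLM.from_pretrained(repo_id, gguf_file=gguf_file)
    return tokenizer, model


# Example usage (downloads the checkpoint, so not run here):
# tok, model = load_gguf_model(
#     "microsoft/Phi-3-mini-4k-instruct-gguf",
#     "Phi-3-mini-4k-instruct-q4.gguf",  # illustrative file name
# )
```

Only architectures with a GGUF-to-transformers mapping (e.g. Llama-family models) can be loaded this way, which is why the test pass above is needed first.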
Currently, we support accelerated CPU inference using llama.cpp. However, we will keep working on the kimchima repo; we still need to implement CPT and fine-tuning in kimchima.
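A minimal sketch of the llama.cpp CPU-inference path, using the `llama-cpp-python` bindings (the model path, thread count, and prompt below are assumptions, not kimchima's actual API):

```python
# Sketch: CPU inference over a local GGUF file via llama-cpp-python.
# The dependency is imported lazily so it is only required at call time.


def run_cpu_inference(model_path: str, prompt: str, max_tokens: int = 64) -> str:
    """Generate a completion on CPU from a local GGUF model file."""
    from llama_cpp import Llama  # optional dependency, deferred import

    llm = Llama(
        model_path=model_path,  # e.g. a local Phi-3 *.gguf file
        n_threads=4,            # CPU threads; tune for the host machine
        verbose=False,
    )
    out = llm(prompt, max_tokens=max_tokens)
    return out["choices"][0]["text"]


# Example usage (requires a downloaded GGUF file):
# text = run_cpu_inference("Phi-3-mini-4k-instruct-q4.gguf", "Hello, ")
```

This keeps inference fully on CPU with quantized weights, whereas the transformers GGUF path dequantizes to full precision.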
https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf