Whisper

VRCWizard edited this page Aug 18, 2024 · 18 revisions

Implemented using WhisperNet, a C# wrapper for whisper.cpp.

Setup instructions are no longer required; Whisper models can be downloaded directly in the app.

STT method "Whisper"

  1. To get started with Whisper, download one of the models below or one from the official whisper.cpp model list.
    • Keep in mind that this implementation of Whisper uses your GPU.
    • The medium model may cause stuttering in a GPU-intensive game like VRChat while in VR.
Recommended Model    Download Size    Memory
ggml-medium.bin      1.5 GB           ~2.6 GB
ggml-small.bin       466 MB           ~1.0 GB
ggml-base.bin        142 MB           ~500 MB
ggml-tiny.bin        75 MB            ~390 MB
  2. Add the model under Speech Provider > Local > Whisper.cpp Model (BIN file).
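
If you are unsure which model to download, the memory column in the table can serve as a rough guide. The following is a minimal Python sketch, not part of TTS Voice Wizard, that picks the largest listed model fitting a given memory budget. The function name `pick_model` and the `headroom_mb` parameter are hypothetical, and treating the table's memory figures as the amount that must be free on your GPU is an assumption.

```python
from typing import Optional

# (file name, approximate memory needed in MB), largest model first.
# Values are copied from the table above; "~2.6 GB" is taken as 2600 MB.
MODELS = [
    ("ggml-medium.bin", 2600),
    ("ggml-small.bin", 1000),
    ("ggml-base.bin", 500),
    ("ggml-tiny.bin", 390),
]


def pick_model(free_memory_mb: float, headroom_mb: float = 0) -> Optional[str]:
    """Return the largest model whose approximate memory need fits the budget.

    headroom_mb is memory you want to keep free for other work (for example
    a game running at the same time). Returns None if no listed model fits.
    """
    budget = free_memory_mb - headroom_mb
    for name, needed_mb in MODELS:
        if needed_mb <= budget:
            return name
    return None


if __name__ == "__main__":
    # Example: 3000 MB free, but 1500 MB should stay available for a game.
    print(pick_model(3000, headroom_mb=1500))
```

Reserving headroom reflects the note above that the medium model may cause stuttering in a GPU-intensive game; how much to reserve depends on your hardware and is not something this page specifies.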

Notes

  • Noises that Whisper recognizes, such as music, keyboard typing, and mouse clicks, are filtered out by default.
  • If talking for a long time causes two TTS messages to play at the same time, increase the Max Duration setting, found below the model selection. The default in TTS Voice Wizard is 8.0 (8 seconds); a message is cut off once it reaches that length. (This issue has since been solved by the TTS Message Queue System.)
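
The two notes above can be illustrated with a short Python sketch. It is not the app's actual code. It assumes that recognized noises arrive as bracketed or parenthesized annotations such as `[Music]` or `(keyboard clicking)`, and that a Max Duration of 8 seconds simply cuts a long recording into consecutive segments. The names `is_noise_only` and `split_by_max_duration` are hypothetical.

```python
import re
from typing import List, Tuple

# Matches text made up only of bracketed or parenthesized annotations,
# e.g. "[Music]" or "(keyboard clicking)". The annotation format is an
# assumption for illustration, not confirmed by TTS Voice Wizard.
NOISE_ONLY = re.compile(r"^\s*(?:[\[\(][^\]\)]*[\]\)]\s*)+$")


def is_noise_only(text: str) -> bool:
    """Return True if the transcript contains nothing but noise annotations."""
    return bool(NOISE_ONLY.match(text))


def split_by_max_duration(
    total_seconds: float, max_duration: float = 8.0
) -> List[Tuple[float, float]]:
    """Split a recording into consecutive (start, end) segments,
    each at most max_duration seconds long."""
    if max_duration <= 0:
        raise ValueError("max_duration must be positive")
    segments = []
    start = 0.0
    while start < total_seconds:
        end = min(start + max_duration, total_seconds)
        segments.append((start, end))
        start = end
    return segments


if __name__ == "__main__":
    print(is_noise_only("[Music]"))     # annotation only
    print(is_noise_only("hello there"))  # real speech
    print(split_by_max_duration(20.0))   # 20 s of speech with the 8 s default
```

Under these assumptions, 20 seconds of continuous speech with the 8 second default becomes three separate messages, which shows why raising Max Duration reduces the number of messages produced by one long utterance.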

Need Help / Have Questions / Wanna make suggestions?

Donate

  • Leave me a GitHub Star ⭐ (it's free) or buy me a coffee at ko-fi.com