Where to find datasets for wake word detection training? #9
The problem is that I have no idea where to find those. I was recommended to use these ones https://github.com/Picovoice/wake-word-benchmark/tree/master/audio but you'll still need to collect recordings that do not contain the wake word. What I did for the wake word I built was to collect a few minutes of recordings from a podcast through the microphone by running a for loop in a bash script.

I also noticed that the audio quality matters. I collected all the initial recordings using my MacBook microphone, which captures very clean sound. When I switched to using the model on my Jabra speaker with a Raspberry Pi, I had to take some recordings there and add them to the dataset to achieve similar performance, because it captures the audio with some minor echo and background noise. That is another thing that stopped me from trying to collect and share a dataset; in the end it seems like a task that requires a group of people with several devices to be involved.

Currently I'm using a medium size model with threshold 0.93, min counter 15, and the gain normalizer filter, and I'm having a pretty good experience: the detection works most of the time, even when I'm watching TV.

In case you are interested in my setup, I'm using it with OpenHAB and a whisper.cpp add-on I'm working on (for voice generation I'm still using a cloud service), and it gives me an acceptable experience. My server is running on an Orange Pi 5. I will get a Raspberry Pi 5 in a couple of weeks, which can be overclocked to 3.0 GHz I think, and I hope it works a little faster there. I'm using a small fine-tuned Whisper model for Spanish I found on HF. As the speaker I'm using a Jabra Speaker2 40 connected to a Raspberry Pi Zero 2 W (previously I was using an older Jabra speaker, but the sound was not too good, as mentioned above).

Video: video_2023-11-03_12-55-24-2.mp4

The setup is summarized here https://community.openhab.org/t/dialog-processing-with-the-pulseaudiobinding/148191, although I don't think anyone has tried it and succeeded.
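For anyone who would rather record one long non-wake-word session and cut it up afterwards, here is a minimal sketch of that idea in Python. This is not the bash loop described above, only an alternative under stated assumptions: the input is an uncompressed PCM WAV file, and the function name `split_wav`, the 2-second clip length, and the output naming scheme are all hypothetical choices, not anything the project requires.

```python
import wave


def split_wav(src_path, out_prefix, clip_seconds=2.0):
    """Split a PCM WAV file into consecutive fixed-length clips.

    Returns the list of written file paths. A trailing piece shorter
    than clip_seconds is dropped. All names here are illustrative.
    """
    paths = []
    with wave.open(src_path, "rb") as src:
        channels = src.getnchannels()
        sample_width = src.getsampwidth()
        frame_rate = src.getframerate()
        frames_per_clip = int(frame_rate * clip_seconds)
        bytes_per_clip = frames_per_clip * sample_width * channels

        index = 0
        while True:
            frames = src.readframes(frames_per_clip)
            # Stop at end of file or when the remainder is too short.
            if len(frames) < bytes_per_clip:
                break
            path = f"{out_prefix}_{index:04d}.wav"
            with wave.open(path, "wb") as dst:
                # Keep the source format so clips match the recording device.
                dst.setnchannels(channels)
                dst.setsampwidth(sample_width)
                dst.setframerate(frame_rate)
                dst.writeframes(frames)
            paths.append(path)
            index += 1
    return paths
```

Calling something like `split_wav("podcast_recording.wav", "noise")` (file names assumed) would write `noise_0000.wav`, `noise_0001.wav`, and so on next to the script. Recording the source file on the same device that will run the detector should matter here for the same reason given above about microphone quality.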
There is Mozilla Common Voice, and single-word clips cut from it are provided by MSWC (the Multilingual Spoken Words Corpus). I could open a PR.
Hello. I think it would be nice to include some links or hints in the README about this.