[Bug] Bark examples not working out of the box? #2781
Comments
Do you have write access to the folder? Seems like you don't.
Should I even need to? Anyway, even with chmod 777 it still fails.
This is your error
I used the jfk.zip example from this other post: #2745 and it worked fine. I think it has to do with the folder structure or the file types. I've tried to make them match, but I still get that .local permission error with my wav files.
Hi, I also encountered the same issue yesterday. I found out that the HuBERT custom tokenizer download path is not set in the current implementation. This is the model.config.LOCAL_MODEL_PATHS at https://github.com/coqui-ai/TTS/blob/dev/TTS/tts/layers/bark/inference_funcs.py#L134
I think the other model paths are set at https://github.com/coqui-ai/TTS/blob/dev/TTS/tts/models/bark.py#L270, but the hubert and tokenizer paths are not, so they default to /root, which is read-only. You can fix it either by hard-coding the hubert_tokenizer model path to somewhere other than /root, or by downloading the hubert_tokenizer manually to /root/.local/share/tts/suno/bark_v0/ (this path may be different in your setup). I fixed the issue by adding a line at https://github.com/coqui-ai/TTS/blob/dev/TTS/tts/models/bark.py#L270, as sketched below.
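Not the commenter's exact line, but a minimal sketch of that kind of addition, assuming the LOCAL_MODEL_PATHS keys are named "hubert" and "hubert_tokenizer" and that the checkpoint files follow the usual naming:

```python
import os


def set_hubert_paths(config, checkpoint_dir):
    """Hypothetical helper mirroring what bark.py already does for the text,
    coarse and fine checkpoints: make the HuBERT model and custom tokenizer
    resolve inside the downloaded checkpoint directory instead of falling back
    to the read-only /root default. Key names and file names are assumptions."""
    config.LOCAL_MODEL_PATHS["hubert"] = os.path.join(checkpoint_dir, "hubert.pt")
    config.LOCAL_MODEL_PATHS["hubert_tokenizer"] = os.path.join(checkpoint_dir, "tokenizer.pth")
```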
I'm unsure if it helps your situation, but I'm just sharing my approach.
Any update on this? Just ran into this issue out of the box myself. It seems that it's trying to download something to
I encountered this problem too. After digging into the code, I found that the problem arises from the Bark config file.
You can modify this config file to resolve the problem; a sketch of that kind of change is below:
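Not the exact change from that comment, but a sketch of patching the cached config from Python; the field names (CACHE_DIR, LOCAL_MODEL_PATHS) and the cache locations are assumptions that may differ on your system:

```python
import json
import os

# Hypothetical sketch: rewrite the /root-based paths in the downloaded Bark
# config so they point at the current user's writable data directory instead.
config_path = os.path.expanduser(
    "~/.local/share/tts/tts_models--multilingual--multi-dataset--bark/config.json"
)
data_dir = os.path.expanduser("~/.local/share/tts/suno/bark_v0")

with open(config_path) as f:
    config = json.load(f)

config["CACHE_DIR"] = data_dir  # assumed field name
for key, path in config.get("LOCAL_MODEL_PATHS", {}).items():
    # keep the file name, swap the directory
    config["LOCAL_MODEL_PATHS"][key] = os.path.join(data_dir, os.path.basename(path))

with open(config_path, "w") as f:
    json.dump(config, f, indent=4)
```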
I have opened a pull request against the Hugging Face model card.
Should be fixed by #2894
Same bug encountered as of v0.22.0 for the GitHub version of TTS.
Same bug with 0.22.0.
@erogol not fixed, you should reopen this.
@reopio there is a small but significant error in the code you fixed! Basically, what you did was replace the hard-coded /root prefix in the paths with ~. So you correctly identified the problem, but you didn't consider that Python, not being bash, does NOT automatically expand the ~ character. Running "tts" after your change will cause the code to create a literal subdirectory named "~". The correct solution would be using os.path.expanduser(), as illustrated below.
I hope you can add this correction to your erogol/bark PR :)
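For illustration, a tiny sketch of the difference (the path value is made up):

```python
import os

raw = "~/.local/share/tts/suno/bark_v0"   # hypothetical value as read from the JSON config
print(os.path.exists(raw))                # False: Python treats "~" as a literal directory name
print(os.path.expanduser(raw))            # e.g. /home/you/.local/share/tts/suno/bark_v0

# os.makedirs(raw) would therefore create a directory literally called "~"
# under the current working directory, which is the bug described above.
```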
Ouch, I just forgot that was a JSON file! 🤦‍♂️😅 Well, then the change has to be made in the model-loading functions instead.
Same problem here. What is the recommended fix? I already applied the edits recommended above.
@arthurwolf: redownloading the models may fix it (as advised in #3567), but keep in mind you are likely to encounter other bugs, as this codebase is no longer officially maintained. If Bark is what you were after, you can install it from its official repository, which features updated code that works out of the box. On the other hand, if you were looking for XTTS, you could try AllTalk. It's a newer XTTSv2 implementation that comes with an API, DeepSpeed support, and other interesting additional features.
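For reference, a minimal sketch of using Bark straight from the upstream suno-ai/bark package (a separate install; the prompt text and output file name are arbitrary):

```python
# Assumes: pip install git+https://github.com/suno-ai/bark.git (plus scipy)
from bark import SAMPLE_RATE, generate_audio, preload_models
from scipy.io.wavfile import write as write_wav

preload_models()                                         # download/load the Bark checkpoints
audio = generate_audio("Hello, this is upstream Bark.")  # returns a numpy array at SAMPLE_RATE
write_wav("bark_out.wav", SAMPLE_RATE, audio)
```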
Is this something that should be mentioned in the README.md? Until you said this I was not aware of this fact 🙂 |
I've been trying to get Bark to work for weeks (generating speech, then trying to use other methods to change the voice to match a sample). I came to coqui-ai looking for an alternative (as it seemed to be able to do both TTS and voice conversion at the same time), and then the coqui-ai docs say "hey, if you want to do that, use our version of Bark"...
You might actually know how to do what I'm looking for. I need to either do text-to-speech with a custom voice (from a sample), or even just convert existing speech to have a different voice (from a custom sample). What would you recommend as the best way to get there, currently?
I'll look at https://github.com/erew123/alltalk_tts/, thanks a lot for that.
Bark is an amazing open-source TTS model from Suno, but the version released by Suno is not incredibly practical, as it will only let you generate about 16 words per run. Also, I think that recreating voices with it is a bit more convoluted than with XTTS, at least with the original Suno code, and I haven't researched other third-party implementations enough to be able to suggest one.
However, the good news is that what you are trying to do is exactly what Coqui XTTS excels at! In fact, it only needs a 7-10 second audio file to learn to speak approximately with the same voice.
There is only one little problem: after Coqui's shutdown and the release of the model as open source, the codebase necessary to run it (this repository) became unmaintained, with some parts becoming broken... and this is exactly where third-party re-implementations like AllTalk come into play, essentially providing an updated, refined and enhanced version of it. See the sketch below for what the basic XTTS voice-cloning flow looks like.
By the way, if you need an even easier-to-use alternative than AllTalk, take a look at github.com/daswer123/xtts-webui; you will be able to run all the steps I described above entirely from the browser UI it comes with!
That's all! I hope this clarifies your doubts and helps you get on track.
PS: for an added bonus I'll just leave this here: dozens of free voices ready to be downloaded and used in XTTS, enjoy!
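A rough sketch of that XTTS voice-cloning flow through the TTS Python API, as a starting point; the model name and file paths here are examples, and tools like AllTalk or xtts-webui wrap the same idea behind their own interfaces:

```python
from TTS.api import TTS

# Load the multilingual XTTSv2 model (downloaded on first use).
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")

# Clone the voice from a short (~7-10 s) reference clip and speak new text with it.
tts.tts_to_file(
    text="This sentence is spoken with the voice from the reference clip.",
    speaker_wav="my_voice_sample.wav",  # path to your reference audio (example)
    language="en",
    file_path="xtts_output.wav",
)
```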
Not another word on this please! It brings back... mixed memories ;) (#3569 (comment)) |
@illtellyoulater thank you so much for the help. I was stuck for a long time trying to get projects to work that I now realize were completely outdated/abandoned. I got AllTalk running and went from 20% to 90% of the way to what I want, absolutely amazing. Thank you again.
Do you know if there's any way to get it to generate whispering or shouting? Some kind of keyword or prompting trick? Or some other project that'd be able to do that? I searched a lot and had not much luck. Bark is able to do it a little bit, some of the time, but not with a custom voice...
This is now fixed in our fork, available via
Describe the bug
I have been following this tutorial: https://tts.readthedocs.io/en/dev/models/bark.html#example-use
To Reproduce
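For context, the linked tutorial's example boils down to something like the following sketch; the exact text, speaker subfolder name, and voice_dir layout are assumptions:

```python
from TTS.api import TTS

# Load the Coqui-packaged Bark model.
tts = TTS("tts_models/multilingual/multi-dataset/bark")

# Synthesize with a voice cloned from audio placed under bark_voices/<speaker>/.
tts.tts_to_file(
    text="Hello, this is a test.",
    voice_dir="bark_voices/",  # folder containing one subfolder per speaker
    speaker="speaker",         # example name of that subfolder
    file_path="output.wav",
)
```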
But this is the result I got:
Expected behavior
For it to produce output.wav using the voice from the bark_voices folder.
Logs
No response
Environment
Additional context
No response