Tensor size problem of se_features.pt extracted with openvoice #367

Open
pes0427 opened this issue Mar 1, 2025 · 1 comment
Labels
bug Something isn't working

Comments

pes0427 commented Mar 1, 2025

Hello, I'm testing whether I can extract se_features.pt with OpenVoice and use it as the speaker_embedding of another TTS model. However, the resulting audio is too noisy.
The SE tensor extracted with OpenVoice has shape (1, 256, 1), but the TTS model requires (1, 512, 1), so I padded the tensor to match. Suspecting the padding was the cause, I also tried adjusting the tensor size in a different way, but the noise was still there. Is there a way to extract the SE file from OpenVoice with shape (1, 512, 1)?

dosubot bot commented Mar 1, 2025

Hey @pes0427! 👋 I'm here to assist you with any bugs, questions, or contributions you have. Let's squash those bugs together!

The tensor size of the speaker embeddings extracted with OpenVoice is determined by the architecture of the ref_enc module, which outputs a tensor of size (1, 256, 1) [1]. Unfortunately, there is no direct configuration or parameter in OpenVoice to specify a different tensor size, such as (1, 512, 1) [2].

To address the size mismatch, consider mapping the 256-dimensional embedding to 512 dimensions with a learned transformation, such as a linear layer or a small neural network, rather than padding it. A learned mapping may preserve more of the speaker information while producing the size your TTS model requires.
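
The linear-transformation idea can be sketched in PyTorch as below. This is a minimal illustration, not part of OpenVoice or of any particular TTS model: the class name `SpeakerEmbeddingProjector` is hypothetical, and a random tensor stands in for the contents of se_features.pt. It only shows the shape handling; the layer is randomly initialized here and would need to be trained (for example, jointly with the target TTS model) before its output carries meaningful speaker information.

```python
import torch
import torch.nn as nn


class SpeakerEmbeddingProjector(nn.Module):
    """Hypothetical adapter mapping a (B, 256, 1) embedding to (B, 512, 1)."""

    def __init__(self, in_dim: int = 256, out_dim: int = 512):
        super().__init__()
        self.proj = nn.Linear(in_dim, out_dim)

    def forward(self, se: torch.Tensor) -> torch.Tensor:
        x = se.squeeze(-1)       # (B, in_dim, 1) -> (B, in_dim)
        x = self.proj(x)         # (B, in_dim)    -> (B, out_dim)
        return x.unsqueeze(-1)   # (B, out_dim)   -> (B, out_dim, 1)


# Stand-in for the tensor loaded from se_features.pt (assumed shape (1, 256, 1)).
se = torch.randn(1, 256, 1)

projector = SpeakerEmbeddingProjector()  # untrained: weights are random
projected = projector(se)
print(projected.shape)
```

Unlike padding, which leaves half of the 512 dimensions carrying no information, a trained projection can learn how the two embedding spaces relate. Whether that is enough to remove the noise depends on how the target TTS model was trained on its own speaker embeddings.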

@dosubot dosubot bot added the bug Something isn't working label Mar 1, 2025