Some Chinese text pronunciations are weird. #101

lyris · 2025-02-10T10:31:15Z

What happened?

[A bug happened!]
The .onnx version pronounces some words unclearly, for example "质量" (zhìliàng, "quality"), while the .pt version pronounces them correctly.
text: '图的质量总体来说比之前好，会有些图的质量非常高，但不稳定，并且画风会变来变去。'
phonemes: 'tʰu↗ tɤ ꭧɨ↘lja↘ŋ ʦʊ↓ŋtʰi↓ lai↗ʂwo→ pi↓ ꭧɨ→ʨʰjɛ↗n xau↓, xwei↘ jou↓ɕje→ tʰu↗ tɤ ꭧɨ↘lja↘ŋ fei→ꭧʰa↗ŋ kau→, ta↘n pu↘ wə↓nti↘ŋ. pi↘ŋʨʰje↓ xwa↘fə→ŋ xwei↘ pjɛ↘nlai↗pjɛ↘nʨʰy↘.'

Example audio:
.onnx
.pt

Steps to reproduce

With the latest ONNX release, kokoro-onnx 0.4.2:

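import os

import soundfile as sf
from kokoro_onnx import Kokoro
from misaki.zh import ZHG2P

kokoro = Kokoro("kokoro-v1.0.onnx", "voices-v1.0.bin")
text = '图的质量总体来说比之前好，会有些图的质量非常高，但不稳定，并且画风会变来变去。'
lang = 'cmn'
g2p = ZHG2P()
# The phonemes depend only on the text, so compute them once for all voices.
phonemes = g2p(text)
os.makedirs("output", exist_ok=True)  # sf.write does not create missing directories
for voice in kokoro.get_voices():
    if voice.startswith('z'):  # Chinese voices only (zf_*, zm_*)
        print(f'{voice} speak {text} {phonemes}')
        # is_phonemes=True: the input is treated as phonemes rather than raw text
        samples, sample_rate = kokoro.create(phonemes, voice=voice, lang=lang, is_phonemes=True, trim=False)
        sf.write(f"output/onnx_version_{voice}.wav", samples, sample_rate)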

The native PyTorch package, kokoro==0.7.12 with misaki[zh]==0.7.12 (https://github.com/hexgrad/kokoro), has no problem:

import os

import soundfile as sf
from kokoro import KPipeline

pipeline = KPipeline(lang_code='z', device='cpu')
text = '图的质量总体来说比之前好，会有些图的质量非常高，但不稳定，并且画风会变来变去。'
zh_voices = ['zf_xiaobei', 'zf_xiaoni', 'zf_xiaoxiao', 'zf_xiaoyi',
             'zm_yunjian', 'zm_yunxi', 'zm_yunxia', 'zm_yunyang']
os.makedirs('output', exist_ok=True)  # sf.write does not create missing directories
for voice in zh_voices:
    # The pipeline yields one (graphemes, phonemes, audio) result per text chunk.
    for graphemes, phonemes, audio in pipeline(text, voice=voice):
        samples = audio.shape[0] if audio is not None else 0
        assert samples > 0, "No audio generated"
        print(f'{voice} speak {text} {phonemes}')
        # Note: if the text were split into several chunks, each chunk would overwrite this file.
        sf.write(f'output/pt_version_{voice}.wav', audio, 24000)

The example audio files are attached above.

### What OS are you seeing the problem on?

Windows

### Package version

0.4.2

### Relevant log output

```shell
```

thewh1teagle · 2025-02-10T20:12:59Z

Try the language.py example.

lyris · 2025-02-11T03:21:17Z

English has never been a problem; the issue is with the Chinese example above. I based my script on language.py, which uses an English example, but the bug I reported only shows up with the Chinese text.

fastfading · 2025-02-14T01:52:30Z

Same here on an M1 Mac.

fastfading · 2025-02-14T02:00:45Z

@lyris
This is from DeepSeek.

I'm not an expert, but I hope it helps.

If you fix it, could you send me the fix? Thanks.

fastfading · 2025-02-14T02:15:22Z

Maybe there is a bug in
https://github.com/hexgrad/misaki/blob/main/misaki/zh.py
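
One way to test that guess is to compare the phoneme strings that the two scripts above print for the same text. If they are identical, the G2P step produces the same input in both cases and the difference more likely comes from a later stage. Below is a minimal sketch using only the standard library. `diff_phonemes` is a hypothetical helper, not part of kokoro, kokoro-onnx, or misaki, and the two sample strings are placeholders to be replaced with the actual printed output.

```python
import difflib


def diff_phonemes(a: str, b: str) -> list[tuple[str, str, str]]:
    """Return (operation, segment_of_a, segment_of_b) for each span where a and b differ."""
    # autojunk=False keeps the matcher from ignoring very frequent characters
    # (such as spaces or tone arrows) in long strings.
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Placeholders: paste the phonemes printed by the .onnx and .pt scripts here.
onnx_phonemes = "tʰu↗ tɤ ꭧɨ↘lja↘ŋ"
pt_phonemes = "tʰu↗ tɤ ꭧɨ↘lja↘ŋ"

differences = diff_phonemes(onnx_phonemes, pt_phonemes)
print("identical" if not differences else differences)
```

The strings are compared as-is, without Unicode normalization, so that any difference in the exact code points handed to the model stays visible.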

lyris added the bug label (Something isn't working) on Feb 10, 2025
