doc: update information
syq163 committed Dec 15, 2023
1 parent d6ed14f commit d32305f
Showing 4 changed files with 18 additions and 17 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -122,7 +122,7 @@ You may find more information from our [wiki](https://github.com/netease-youdao/

## Training

To be released.
[Voice Cloning with your personal data](https://github.com/netease-youdao/EmotiVoice/wiki/Voice-Cloning-with-your-personal-data) was released on December 13, 2023.


## Roadmap & Future work
2 changes: 1 addition & 1 deletion README.zh.md
@@ -123,7 +123,7 @@ uvicorn openaiapi:app --reload

## Training

To be released

[Voice Cloning with your personal data](https://github.com/netease-youdao/EmotiVoice/wiki/Voice-Cloning-with-your-personal-data) was released on December 13, 2023.

## Roadmap & Future work

13 changes: 7 additions & 6 deletions ROADMAP.md
@@ -10,15 +10,16 @@ The plan is to finish 0.2 to 0.4 in Q4 2023.
## EmotiVoice 0.4

- [ ] Updated model with potentially improved quality.
- [ ] If time allows, release training code to support fine-tuning using your own data.

## EmotiVoice 0.3

- [ ] First version of desktop application.
- [ ] Support longer text.
- [ ] Documentation: wiki page for hardware requirements. [#30](../../issues/30)

## EmotiVoice 0.2
## EmotiVoice 0.3 (2023.12.13)

- [x] Release [The EmotiVoice HTTP API](https://github.com/netease-youdao/EmotiVoice/wiki/HTTP-API) provided by [Zhiyun](https://mp.weixin.qq.com/s/_Fbj4TI4ifC6N7NFOUrqKQ).
- [x] Release [Voice Cloning with your personal data](https://github.com/netease-youdao/EmotiVoice/wiki/Voice-Cloning-with-your-personal-data) along with [DataBaker Recipe](https://github.com/netease-youdao/EmotiVoice/tree/main/data/DataBaker) and [LJSpeech Recipe](https://github.com/netease-youdao/EmotiVoice/tree/main/data/LJspeech).
- [x] Documentation: wiki page for hardware requirements. [#30](../../issues/30)

## EmotiVoice 0.2 (2023.11.17)

- [x] Support mixed Chinese and English input text. [#28](../../issues/28)
- [x] Resolve bugs related to certain modal particles, to make it more robust. [#18](../../issues/18)
18 changes: 9 additions & 9 deletions data/DataBaker/README.md
@@ -40,6 +40,8 @@ mkdir data/DataBaker/raw

### Step1 Preprocess Data

For this recipe, DataBaker already provides phoneme labels, so we simply use that information (see the parsing sketch after the commands below).

```bash
# format data
python data/DataBaker/src/step1_clean_raw_data.py \
@@ -50,16 +52,14 @@ python data/DataBaker/src/step2_get_phoneme.py \
--data_dir data/DataBaker
```
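For orientation, here is a minimal Python sketch of reading such pre-labeled data. It assumes the standard BZNSYP layout, where a tab-separated `ID<TAB>text` line (with `#1`–`#4` prosody marks) is followed by a pinyin line, and a file path under the `data/DataBaker/raw` directory created earlier; `step1_clean_raw_data.py` is the authoritative implementation.

```python
# Sketch only: parse DataBaker (BZNSYP) prosody labels, assuming the standard
# layout of alternating text/pinyin lines. The recipe's own
# data/DataBaker/src/step1_clean_raw_data.py is the authoritative version.
import re
from pathlib import Path

# Assumed location under the raw/ directory created earlier in this recipe.
label_file = Path("data/DataBaker/raw/ProsodyLabeling/000001-010000.txt")
lines = label_file.read_text(encoding="utf-8").splitlines()

samples = []
for text_line, pinyin_line in zip(lines[0::2], lines[1::2]):
    utt_id, marked_text = text_line.split("\t", 1)
    text = re.sub(r"#\d", "", marked_text)  # drop prosody marks #1-#4
    pinyin = pinyin_line.strip().split()    # tone-numbered syllables
    samples.append((utt_id, text, pinyin))

print(samples[0])
```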

If you want to get phonemes from the TTS frontend, then run:
```bash
# get phoneme
python data/DataBaker/src/step2_get_phoneme.py \
--data_dir data/DataBaker \
--generate_phoneme True
```
If you have prepared your own data with only text labels, you can obtain phonemes from the text-to-speech (TTS) frontend, for example: `python data/DataBaker/src/step2_get_phoneme.py --data_dir data/DataBaker --generate_phoneme True`. Note, however, that in this DataBaker recipe you should omit this command.
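To illustrate the idea behind `--generate_phoneme` (grapheme-to-phoneme conversion), here is a rough sketch using the third-party `pypinyin` package. This is not the frontend this repository uses, just an approximation of the concept:

```python
# Sketch only: convert Chinese text to tone-numbered pinyin syllables with
# pypinyin. The repository's own TTS frontend may use different rules.
from pypinyin import lazy_pinyin, Style

text = "向世界问好"  # sample sentence (assumed, not from the corpus)
syllables = lazy_pinyin(text, style=Style.TONE3, neutral_tone_with_five=True)
print(syllables)  # e.g. ['xiang4', 'shi4', 'jie4', 'wen4', 'hao3']
```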



### Step2 Run MFA (Optional, since we already have labeled prosody)

Please be aware that in this DataBaker recipe, **you should skip this step**, since the prosody is already labeled. If you have prepared your own data with only text labels, however, the following commands may help:

```bash
# MFA environment install
conda install -c conda-forge kaldi sox librosa biopython praatio tqdm requests colorama pyyaml pynini openfst baumwelch ngram postgresql -y
@@ -174,7 +174,7 @@ Training tips:
tensorboard --logdir=exp/DataBaker
```
- The model checkpoints are saved at `exp/DataBaker/ckpt`.
- The bert features are extracted in the first epoch and saved in `tmp/` folder, you can change the path in `exp/DataBaker/config/config.py`.
- The BERT features are extracted in the first epoch and saved in the `exp/DataBaker/tmp/` folder; you can change the path in `exp/DataBaker/config/config.py` (see the quick check after this list).
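As a quick sanity check (a sketch, not part of the recipe), you can confirm that checkpoints and cached BERT features are landing in the folders mentioned above:

```python
# Sketch only: list recent entries in the checkpoint and BERT-feature folders.
from pathlib import Path

exp = Path("exp/DataBaker")
for sub in ("ckpt", "tmp"):
    files = sorted((exp / sub).glob("*"))
    print(f"{exp / sub}: {len(files)} entries")
    for f in files[-3:]:  # show the most recent few
        print("  ", f.name)
```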


### Step5 Inference
@@ -187,6 +187,6 @@ python inference_am_vocoder_exp.py \
--checkpoint g_00010000 \
--test_file $TEXT
```
__Please change the speaker name in the `data/inference/text`__
__Please change the speaker names in `data/inference/text`.__

The synthesized speech is under `exp/DataBaker/test_audio`.
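If you prefer to edit the speaker field programmatically, the sketch below assumes (this is not confirmed by the diff) that each line of `data/inference/text` is `|`-separated with the speaker name as its first field; adjust it to the file's actual format:

```python
# Sketch only: rewrite the first '|'-separated field of each line with a
# hypothetical speaker name. The real format of data/inference/text may differ.
from pathlib import Path

path = Path("data/inference/text")
lines = path.read_text(encoding="utf-8").splitlines()
updated = ["|".join(["DataBaker"] + line.split("|")[1:]) for line in lines]
path.write_text("\n".join(updated) + "\n", encoding="utf-8")
```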
