ManaTTS-Persian-Tacotron2-Model

This repository provides a Persian Text-to-Speech (TTS) model trained on the ManaTTS dataset, the largest publicly accessible single-speaker Persian corpus. The dataset comprises over 100 hours of high-quality audio (44.1 kHz) sourced from the Nasl-e-Mana magazine. The model is based on the Tacotron2 architecture and is designed to generate natural, high-quality Persian speech.

Model Weights: The trained model weights are hosted on Hugging Face. You can access them here: Persian-Tacotron2-on-ManaTTS.
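
As a minimal sketch, the weights could be fetched programmatically with the huggingface_hub package instead of through the browser. The repository owner (MahtaFetrat) is an assumption inferred from the GitHub account name, and the helper names and the target folder are hypothetical; check the Hugging Face page for the exact repository id before relying on it.

```python
from pathlib import Path

# Assumption: the owner is inferred from the GitHub account name.
# Verify the exact repository id on Hugging Face before use.
HF_OWNER = "MahtaFetrat"
HF_MODEL = "Persian-Tacotron2-on-ManaTTS"


def repo_id(owner: str = HF_OWNER, model: str = HF_MODEL) -> str:
    """Build the '<owner>/<model>' identifier expected by the Hub."""
    return f"{owner}/{model}"


def download_weights(target_dir: str = "weights") -> Path:
    """Download every file of the model repository into target_dir.

    Requires network access and `pip install huggingface_hub`.
    """
    # Imported lazily so the module loads even without the package installed.
    from huggingface_hub import snapshot_download

    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    local_path = snapshot_download(repo_id=repo_id(), local_dir=str(target))
    return Path(local_path)
```

Calling download_weights() returns the local folder holding the downloaded files; which of those files the inference notebook expects, and where, is defined by the notebook itself.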

Inference

You can use the provided inference notebook to generate speech from text.

Inference Notebook:

GitHub Notebook: inference.ipynb
Google Colab: Open in Colab

Output Samples

The output_samples directory contains utterances synthesized by the trained model, alongside the same utterances generated by two baseline models, the natural recordings, and gold-spectrogram versions whose waveforms were generated by the vocoder used in the study.
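
For side-by-side listening, the samples can be grouped by utterance across systems. The sketch below assumes a hypothetical layout with one subfolder per system (for example the trained model, each baseline, natural, gold spectrogram) and the same file name for the same utterance in every subfolder; the actual layout of the directory may differ, so adjust the grouping accordingly.

```python
from collections import defaultdict
from pathlib import Path, PurePosixPath

# Assumption: samples are stored in one of these audio formats.
AUDIO_SUFFIXES = {".wav", ".mp3", ".flac"}


def group_by_utterance(relative_paths):
    """Map utterance name (file stem) to {system folder: relative path}.

    Paths are POSIX-style and relative to the samples root. Files that are
    not audio, or that sit directly in the root, are skipped.
    """
    groups = defaultdict(dict)
    for rel in relative_paths:
        path = PurePosixPath(rel)
        if path.suffix.lower() not in AUDIO_SUFFIXES or len(path.parts) < 2:
            continue
        system = path.parts[0]  # hypothetical: top-level folder names the system
        groups[path.stem][system] = str(path)
    return dict(groups)


def collect_samples(root="output_samples"):
    """Scan the samples directory on disk and group its audio files."""
    root = Path(root)
    rels = [f.relative_to(root).as_posix() for f in root.rglob("*") if f.is_file()]
    return group_by_utterance(rels)
```

Each entry of the returned dictionary then lists every available version of one utterance, which makes it easy to spot utterances that are missing for a given system.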

Ethical Use

The ManaTTS dataset and model are provided exclusively for research and development purposes. We emphasize the critical importance of ethical conduct in utilizing this dataset. Please refrain from any misuse, including but not limited to voice impersonation, identity theft, or fraudulent activities.

By accessing and using the ManaTTS dataset and model, you are obligated to uphold the highest standards of integrity and respect for user privacy. Any violation of these principles may have severe legal and ethical consequences.

Acknowledgments

We would like to express our sincere gratitude to Nasl-e-Mana, the monthly magazine of the blind community of Iran, for their generosity. Their commitment to openness and collaboration has been instrumental in advancing research and development in speech synthesis. We are especially thankful for their choice to release the data under the Creative Commons CC0 license, allowing for unrestricted use and distribution.

Collaboration and Community Impact

We encourage researchers, developers, and the broader community to utilize the resources provided in this project, particularly in the development of high-quality screen readers and other assistive technologies to support the Iranian blind community. By fostering open-source collaboration, we aim to drive innovation and improve accessibility for all.

References

ManaTTS Dataset: Hugging Face Dataset | GitHub Repository
Tacotron2 Implementation: GitHub Repository
Model Weights: Hugging Face Model Repository

License

The model weights are licensed under CC0-1.0, the same license as the ManaTTS dataset.

The model implementation is based on Real-Time-Voice-Cloning, which is licensed under the MIT License. Below is the copyright statement for the original and modified works:

Modified & original work Copyright (c) 2019 Corentin Jemine (https://github.com/CorentinJ)  
Original work Copyright (c) 2018 Rayhane Mama (https://github.com/Rayhane-mamah)  
Original work Copyright (c) 2019 fatchord (https://github.com/fatchord)  
Original work Copyright (c) 2015 braindead (https://github.com/braindead)  
Modified work Copyright (c) 2025 Majid Adibian (https://github.com/Adibian)  
Modified work Copyright (c) 2025 Mahta Fetrat (https://github.com/MahtaFetrat)

Citation

If you use the ManaTTS dataset or this model in your research, please cite the following paper:

@article{fetrat2024manatts,
      title={ManaTTS Persian: A Recipe for Creating TTS Datasets for Lower-Resource Languages}, 
      author={Mahta Fetrat Qharabagh and Zahra Dehghanian and Hamid R. Rabiee},
      journal={arXiv preprint arXiv:2409.07259},
      year={2024},
}
