Transcribe .ogg speech files with the Microsoft Speech Java SDK

This project demonstrates how to use ffmpeg to convert .ogg files (Vorbis and Opus) to the right format for Speech-to-Text transcription with the Microsoft Cognitive Services Speech Service. It can be used to transcribe voice messages encoded with the Opus (https://en.wikipedia.org/wiki/Opus_(audio_format)) codec or with other codecs that use the .ogg container format.

One use for this project is the transcription of WhatsApp voice messages received through the WhatsApp Business API.

To make this sample work, you need the Cognitive Services Speech Service Java SDK, which has already been added to the pom file. Set your Speech key, region, and recognition language in the source:

public final static String MS_SPEECH_KEY = "your-microsoft-speech-key";
public final static String MS_SPEECH_REGION = "westeurope";
public final static String MS_SPEECH_RECOGNITION_LANG = "de-de";

You also need to download ffmpeg, which is used for transcoding, and set the path to it in the source. An audio file can be read from disk or passed as a byte array. It is then transcoded in memory to WAV/PCM format for transcription by the Cognitive Services Speech Service.
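
The in-memory transcoding step can be sketched roughly as follows. This is a minimal illustration, not the project's actual source: the class name `OggTranscoder`, both method names, and the choice of 16 kHz, 16-bit, mono PCM as the target format are assumptions made for the example. The .ogg bytes are piped to ffmpeg's stdin and the WAV output is read from its stdout, so nothing is written to disk.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.List;

// Hypothetical helper; the names here are illustrative and not taken from the project.
class OggTranscoder {

    // Builds the ffmpeg command line: read from stdin (pipe:0), write WAV to stdout (pipe:1).
    // Assumed target format: 16-bit little-endian PCM, 16 kHz, mono.
    static List<String> buildFfmpegCommand(String ffmpegPath) {
        return List.of(
            ffmpegPath,
            "-hide_banner", "-loglevel", "error",
            "-i", "pipe:0",
            "-f", "wav",
            "-acodec", "pcm_s16le",
            "-ar", "16000",
            "-ac", "1",
            "pipe:1");
    }

    // Pipes the .ogg bytes through ffmpeg and returns the WAV bytes it produces.
    static byte[] transcodeToWav(byte[] oggBytes, String ffmpegPath)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildFfmpegCommand(ffmpegPath))
            .redirectError(ProcessBuilder.Redirect.INHERIT) // show ffmpeg errors on our stderr
            .start();

        // Feed stdin on a separate thread so a full pipe buffer cannot
        // block us while ffmpeg is waiting for its stdout to be drained.
        Thread writer = new Thread(() -> {
            try (OutputStream stdin = process.getOutputStream()) {
                stdin.write(oggBytes);
            } catch (IOException e) {
                // ffmpeg may close stdin early on bad input; the exit code check below reports it.
            }
        });
        writer.start();

        byte[] wav;
        try (InputStream stdout = process.getInputStream()) {
            wav = stdout.readAllBytes();
        }
        writer.join();

        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("ffmpeg exited with code " + exitCode);
        }
        return wav;
    }
}
```

A call such as `OggTranscoder.transcodeToWav(Files.readAllBytes(Path.of("message.ogg")), "/path/to/ffmpeg")` would return the WAV bytes, where both the file name and the ffmpeg path are placeholders. Because ffmpeg cannot seek on a pipe, the length fields in the WAV header may hold placeholder values, so a consumer that reads the stream until it ends is likely the safer choice.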

Also check out the Microsoft Speech SDK Sample Repository to learn more and to use more of its functionality.

Thank you @chgeuer for your contributions.



malantin/java-ogg-to-ms-speech
