- 11020CS573100 Music Information Retrieval Final Project @NTHU
- Contributors: 翁玉芯、石郁琳、倪又晞

We developed a word-level automatic lyrics alignment system to reduce the time spent manually labeling lyric timestamps. We first separated the vocals from the accompaniment in the input audio with Spleeter, then ran speech recognition on the vocal track using a wav2vec model with CTC. Finally, we force-aligned the recognition output with the ground-truth lyrics to obtain a timestamp for each word. We also built a website that visualizes how well our system aligns the lyrics with the audio.
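
To illustrate the forced-alignment step, here is a minimal sketch of CTC Viterbi alignment in plain Python. It is not the project's actual code: the function names `ctc_forced_align` and `token_spans`, the three-symbol vocabulary, and the toy emission matrix are all assumptions made up for this example. In the real pipeline, the per-frame log-probabilities would come from the wav2vec CTC model run on the Spleeter vocal track.

```python
import math

def ctc_forced_align(log_probs, tokens, blank=0):
    """Viterbi alignment of a known token sequence to CTC frame scores.

    log_probs: T x V list of per-frame log-probabilities (hypothetical input).
    tokens:    target token ids taken from the ground-truth lyrics.
    Returns one state index per frame into [blank, tok1, blank, tok2, ..., blank].
    """
    # Interleave blanks with the target tokens (standard CTC state expansion).
    states = [blank]
    for tok in tokens:
        states += [tok, blank]
    T, S = len(log_probs), len(states)
    neg_inf = -math.inf
    score = [[neg_inf] * S for _ in range(T)]
    back = [[0] * S for _ in range(T)]

    # A path may start in the leading blank or in the first token.
    score[0][0] = log_probs[0][states[0]]
    if S > 1:
        score[0][1] = log_probs[0][states[1]]

    for t in range(1, T):
        for s in range(S):
            # Option 1: stay in the same state.
            best, arg = score[t - 1][s], s
            # Option 2: advance from the previous state.
            if s >= 1 and score[t - 1][s - 1] > best:
                best, arg = score[t - 1][s - 1], s - 1
            # Option 3: skip the blank between two different tokens.
            if (s >= 2 and states[s] != blank and states[s] != states[s - 2]
                    and score[t - 1][s - 2] > best):
                best, arg = score[t - 1][s - 2], s - 2
            if best > neg_inf:
                score[t][s] = best + log_probs[t][states[s]]
                back[t][s] = arg

    # A path must end in the last token or the trailing blank.
    s = S - 1
    if S > 1 and score[T - 1][S - 2] > score[T - 1][S - 1]:
        s = S - 2
    if score[T - 1][s] == neg_inf:
        raise ValueError("not enough frames to align all tokens")

    # Backtrack to recover the best state for every frame.
    path = [0] * T
    for t in range(T - 1, -1, -1):
        path[t] = s
        s = back[t][s]
    return path

def token_spans(path, n_tokens):
    """Convert a state path into (start_frame, end_frame_exclusive) per token."""
    spans = []
    for i in range(n_tokens):
        frames = [t for t, s in enumerate(path) if s == 2 * i + 1]
        spans.append((frames[0], frames[-1] + 1))
    return spans

# Toy example (fabricated numbers): vocabulary is 0 = blank, 1 = "a", 2 = "b".
# Each frame strongly favors one symbol: blank, a, a, blank, b.
favored = [0, 1, 1, 0, 2]
toy_log_probs = [[math.log(0.8 if v == f else 0.1) for v in range(3)]
                 for f in favored]
toy_path = ctc_forced_align(toy_log_probs, tokens=[1, 2])
toy_spans = token_spans(toy_path, n_tokens=2)
print(toy_path)   # [0, 1, 1, 2, 3]
print(toy_spans)  # [(1, 3), (4, 5)]
```

Multiplying the frame indices by the model's frame duration gives timestamps in seconds, and merging the spans of the characters that make up each word gives word-level timestamps. The frame duration depends on the model; wav2vec 2.0 at 16 kHz is commonly cited as producing one frame every 20 ms, but treat that as an assumption to verify against the checkpoint actually used.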