Compare lyrics between languages by just providing the title
This repository integrates with the Genius API client or OpenAI to fetch lyrics, and uses LaBSE [1] and AI4Bharat transliteration [2] for translating and calculating similarity.
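LaBSE maps sentences from different languages into a shared embedding space, and such embeddings are commonly compared with cosine similarity. This README does not state which measure the repository actually uses, so the following is only a minimal sketch under that assumption. The function name and the toy vectors are hypothetical stand-ins, not part of the repository.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy stand-ins for sentence embeddings (assumption: real LaBSE vectors
# are much higher-dimensional and come from the model, not typed by hand).
tamil_line = [0.2, 0.7, 0.1]
telugu_line = [0.25, 0.65, 0.05]
print(round(cosine_similarity(tamil_line, telugu_line), 3))
```

A score close to 1 would suggest the two lines are close in meaning, while a score near 0 would suggest they are unrelated.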
Just run
python main.py --mode [global, local] <--source [genius, openai]>
Values in [ ] indicate the possible options, and arguments in < > are optional.
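For readers who want to see how the usage line maps onto a command-line parser, here is a hedged sketch using Python's standard argparse module. It is not taken from main.py. The function name build_parser, the choice to make --mode required, and the None default for --source are all assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the flags shown in the usage line."""
    parser = argparse.ArgumentParser(description="Compare lyrics between languages")
    # --mode is shown without angle brackets, so it is treated as required here.
    parser.add_argument("--mode", choices=["global", "local"], required=True)
    # --source is shown in angle brackets, so it is optional; no default is assumed.
    parser.add_argument("--source", choices=["genius", "openai"], default=None)
    return parser


# Example: parse an explicit argument list instead of sys.argv.
args = build_parser().parse_args(["--mode", "local"])
print(args.mode, args.source)
```

Passing a value outside the listed choices makes argparse exit with a usage error, which matches the idea that the bracketed values are the only valid options.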
To utilize lyrics that are available locally, please create a lyrics_text folder with subfolders named tamil and telugu. You can then add files with song names in lowercase and spaces replaced by _. For example, in lyrics_text/tamil, you can have a file named pachai_nirame.txt.
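The naming convention above can be sketched as a small helper that turns a song title into the expected file path. The function name lyrics_path is hypothetical, and collapsing repeated spaces into a single underscore is an assumption, since the convention only says that spaces are replaced by _.

```python
from pathlib import Path


def lyrics_path(title: str, language: str, root: str = "lyrics_text") -> Path:
    """Map a song title to its expected local lyrics file."""
    # Lowercase the title and replace each run of whitespace with one underscore.
    file_name = "_".join(title.lower().split()) + ".txt"
    return Path(root) / language / file_name


print(lyrics_path("Pachai Nirame", "tamil").as_posix())
# prints lyrics_text/tamil/pachai_nirame.txt
```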
This repository includes an example where the files have already been downloaded. We leave it up to the user to obtain lyrics from an appropriate source.
Please direct any questions to gpavanb1.
[1] Feng, F., Yang, Y., Cer, D., Arivazhagan, N., & Wang, W. (2020). Language-agnostic BERT sentence embedding. arXiv preprint arXiv:2007.01852.
[2] Madhani, Y., Parthan, S., Bedekar, P., Khapra, R., Seshadri, V., Kunchukuttan, A., ... & Khapra, M. M. (2022). Aksharantar: Towards building open transliteration tools for the next billion users. arXiv preprint arXiv:2205.03018.
Many thanks to Prafulla Chandra A and Sankeerth Rao K for the discussions that helped bring this to fruition.