Skip to content

Music Embedding Clustering Using a Pretrained Speaker Verification Model. Can a model trained for speaker verification separate songs from different bands?

License

Notifications You must be signed in to change notification settings

gabrielziegler3/music-clustering

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

18 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Music Embedding Clustering

Music Embedding Clustering Using a Pretrained Speaker Verification Model. Can a model trained for speaker verification separate songs from different bands?

Method

Given a dataset with songs from some artists, I have extracted 15s excerpts from these songs and generated an embedding with ECAPA-TDNN pretrained for speaker-verification task on the VoxCeleb2 dataset.

Once we have the embeddings, we can visualize them on a TSNE plot:

The artists where the vocal components are the most predominant, like pop and rap, are the ones that the model is capable to separate the best. Interestingly, the techno genre represented by Boris Brejcha is also nicely separated and is closer to the metal and rock bands than to rap and pop

Future work

I intend to come back at this task to finetune the model for genre/artist/album identification.

About

Music Embedding Clustering Using a Pretrained Speaker Verification Model. Can a model trained for speaker verification separate songs from different bands?

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published