GitHub - daxeda/unsupervised_ml_moosic: In this project, we will apply different unsupervised machine learning clustering techniques on Spotify data to classify 5000 songs into 25 playlists.

Unsupervised ML Case study: Moosic

Moosic, a small emerging company, builds playlists curated by music experts who follow both classic and contemporary trends. Given a dataset of approximately 5000 songs with audio features such as tempo, energy, and danceability, Moosic has asked us to apply clustering algorithms such as K-Means. The objective is to group the songs into clusters that can serve as a diverse set of playlists to recommend to users.

With this, we should be able to answer the following two questions:

Are Spotify’s audio features able to identify “similar songs”, as defined by humanly detectable criteria?
Is K-Means a good method to create playlists?

As a deliverable, we will present our findings to Moosic's CEO.

Data Cleaning and Scaling

We first clean our data by dropping rows with missing values and duplicate rows. For columns containing strings (song and artist names), we strip redundant whitespace. We then scale the numeric audio features to the [0, 1] range using MinMaxScaler().
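The cleaning and scaling steps could look roughly like the sketch below. It is a minimal illustration, not the project's notebook code: the helper name clean_and_scale, the column names, and the toy data are all assumptions. The sketch also normalizes whitespace before dropping duplicates so that rows differing only in spacing are treated as the same song, which is a design choice of this example.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler


def clean_and_scale(df, text_cols, feature_cols):
    """Hypothetical helper: drop missing rows, tidy strings, dedupe, scale."""
    cleaned = df.dropna().copy()
    for col in text_cols:
        # Trim leading/trailing spaces and collapse repeated inner spaces.
        cleaned[col] = (
            cleaned[col].str.strip().str.replace(r"\s+", " ", regex=True)
        )
    cleaned = cleaned.drop_duplicates().reset_index(drop=True)
    # MinMaxScaler maps each feature column to the [0, 1] range.
    scaled = pd.DataFrame(
        MinMaxScaler().fit_transform(cleaned[feature_cols]),
        columns=feature_cols,
        index=cleaned.index,
    )
    return cleaned, scaled


# Toy data standing in for the real song table (column names are assumed).
songs = pd.DataFrame(
    {
        "name": ["  Song A ", "Song A", "Song  B", "Song C", "Song D"],
        "artist": ["X", "X", "Y ", "Z", "W"],
        "tempo": [120.0, 120.0, 90.0, 150.0, None],
        "energy": [0.8, 0.8, 0.4, 0.6, 0.5],
    }
)
cleaned, scaled = clean_and_scale(songs, ["name", "artist"], ["tempo", "energy"])
print(cleaned)
print(scaled)
```

Min-max scaling keeps every feature on a comparable scale, which matters for K-Means because it relies on Euclidean distances.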

Layered Approach to Clustering

We perform clustering by following a layered approach described below.

Main Clusters

To obtain our main clusters, we first identify relevant features using Principal Component Analysis (PCA). We then choose the number of main clusters from the elbow and silhouette charts, which led us to 5 clusters.
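One way to sketch this step is shown below, using synthetic data in place of the scaled song features. Several things here are assumptions rather than the project's actual settings: the 95% explained-variance threshold for PCA, the candidate range of 2 to 10 clusters, and the use of the PCA projection itself as clustering input (the project may instead have used PCA only to decide which original features to keep).

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the scaled audio features.
X, _ = make_blobs(
    n_samples=500, centers=5, n_features=8, cluster_std=0.5, random_state=0
)
X = MinMaxScaler().fit_transform(X)

# Keep enough principal components to explain 95% of the variance (assumed).
pca = PCA(n_components=0.95, svd_solver="full")
X_pca = pca.fit_transform(X)

# Collect the values behind the elbow chart (inertia) and silhouette chart.
inertias, silhouettes = {}, {}
for k in range(2, 11):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X_pca)
    inertias[k] = km.inertia_
    silhouettes[k] = silhouette_score(X_pca, km.labels_)

# The silhouette score peaks at the best-separated k; the elbow is read
# visually from where the inertia curve flattens.
best_k = max(silhouettes, key=silhouettes.get)
print("components kept:", pca.n_components_)
print("k with highest silhouette:", best_k)
```

In practice the two charts do not always agree, so the final choice of k is a judgment call informed by both.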

Subclusters

We repeat the above procedure within each main cluster to create 5 subclusters per main cluster. In total, we obtain 25 subclusters, which we then analyze.
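The layered scheme can be sketched as follows. This is a simplified illustration on synthetic data: the function name layered_kmeans is hypothetical, and the sketch fixes 5 subclusters per main cluster instead of rerunning PCA and the elbow and silhouette checks inside each main cluster.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs


def layered_kmeans(X, n_main=5, n_sub=5, random_state=0):
    """Hypothetical helper: cluster X, then cluster again inside each cluster."""
    main = KMeans(
        n_clusters=n_main, n_init=10, random_state=random_state
    ).fit_predict(X)
    playlist = np.empty(len(X), dtype=int)
    for c in range(n_main):
        mask = main == c
        sub = KMeans(
            n_clusters=n_sub, n_init=10, random_state=random_state
        ).fit_predict(X[mask])
        # Give every (main cluster, subcluster) pair a unique playlist id.
        playlist[mask] = c * n_sub + sub
    return main, playlist


# Synthetic stand-in for the scaled song features.
X, _ = make_blobs(n_samples=1000, centers=5, n_features=4, random_state=0)
main, playlist = layered_kmeans(X)
print("number of playlists:", len(np.unique(playlist)))
```

Each song ends up with a main cluster label and a playlist id, so a playlist can always be traced back to the main cluster it came from via integer division by n_sub.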

The presentation slides discuss the efficacy of these subclusters and give our take on the two questions above.

Note: This project is a collaboration between Philip, Sumit, and me.

