Classifier that can accurately predict if a song is either "Rock", "Hip Hop", or "Pop" based on the lyrics.

nickyp17/LyricsClassificationChallenge

LyricsClassificationChallenge

The goal of this classification challenge is to build a classifier that can accurately predict if a song is either “Rock”, “Hip Hop”, or “Pop” based on the lyrics.

Dataset:

The data is provided as a single dataframe, 'data.csv', containing 55,000 observations. The first 50,000 observations, the training set, include the lyrics and the Genre ("Rock", "Hip Hop", or "Pop"). The last 5,000 observations (id starting with 'TEST_x') form the testing set: the lyrics are provided, but the 'Genre' is missing (value is 'unkown'). Using this, I build a machine learning model that learns from the training set to predict the 'Genre' of the testing set. The goal is to be as accurate as possible while also estimating that accuracy well.
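
The workflow described above can be sketched roughly as follows. This is a minimal illustration, not the exact code used for the challenge: the column names 'id' and 'Lyrics', the 'TEST_' prefix, and the choice of TF-IDF features with logistic regression are all assumptions, and the small in-memory dataframe only stands in for 'data.csv'.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline


def split_train_test(df, id_col="id", test_prefix="TEST_"):
    """Separate labelled training rows from unlabelled test rows.

    The column name and prefix are assumptions about data.csv.
    """
    is_test = df[id_col].astype(str).str.startswith(test_prefix)
    return df[~is_test].copy(), df[is_test].copy()


def fit_genre_model(train, text_col="Lyrics", label_col="Genre"):
    """Fit a TF-IDF + logistic regression pipeline (an illustrative choice)."""
    model = make_pipeline(
        TfidfVectorizer(lowercase=True, stop_words="english"),
        LogisticRegression(max_iter=1000),
    )
    model.fit(train[text_col], train[label_col])
    return model


# Hypothetical stand-in for pd.read_csv("data.csv") so the sketch runs alone.
demo = pd.DataFrame(
    {
        "id": ["1", "2", "3", "4", "5", "6", "TEST_1", "TEST_2"],
        "Lyrics": [
            "loud guitar riff and crashing drums",
            "electric guitar solo on the highway",
            "rhymes on the beat with a heavy flow",
            "spitting rhymes over the beat tonight",
            "dance with me baby all night",
            "baby love me on the dance floor",
            "guitar and drums",
            "baby dance",
        ],
        "Genre": ["Rock", "Rock", "Hip Hop", "Hip Hop", "Pop", "Pop",
                  "unkown", "unkown"],
    }
)

train, test = split_train_test(demo)
model = fit_genre_model(train)

# Overwrite the placeholder label with the predicted genre.
test["Genre"] = model.predict(test["Lyrics"])
```

Splitting on the id prefix rather than on the placeholder 'Genre' value keeps the split independent of how the missing label happens to be spelled in the file.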

What I learned

  • Working with scikit-learn in Python
  • Selecting an appropriate classifier/machine learning model
  • Using pandas for machine learning classification
  • Estimating the accuracy of the classifier I built
  • Adjusting the model to improve its performance
  • Gaining a better understanding of different models for distribution
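
One common way to estimate a classifier's accuracy before the true test labels are known is k-fold cross-validation on the training set. The sketch below is only an assumption about how such an estimate could be produced; the helper name `estimate_accuracy`, the pipeline, and the toy lyrics are hypothetical and not taken from the project.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline


def estimate_accuracy(texts, labels, n_splits=5, seed=0):
    """Return the mean and standard deviation of cross-validated accuracy."""
    model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    # Stratified folds keep the genre proportions similar in every split.
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = cross_val_score(model, texts, labels, cv=cv, scoring="accuracy")
    return scores.mean(), scores.std()


# Toy stand-in for the 50,000 labelled training lyrics.
texts = [
    "loud guitar riff and crashing drums",
    "electric guitar solo on the highway",
    "heavy drums and a screaming guitar",
    "rock and roll guitar all night",
    "rhymes on the beat with a heavy flow",
    "spitting rhymes over the beat tonight",
    "the flow and the beat and the rhymes",
    "mic check beat drop rhymes flow",
    "dance with me baby all night",
    "baby love me on the dance floor",
    "love you baby dance tonight",
    "baby baby dance with my love",
]
labels = ["Rock"] * 4 + ["Hip Hop"] * 4 + ["Pop"] * 4

# Two folds only because the toy data is tiny; 5 or 10 is more usual.
mean_acc, std_acc = estimate_accuracy(texts, labels, n_splits=2)
print(f"Estimated accuracy: {mean_acc:.3f} (+/- {std_acc:.3f})")
```

The mean across folds serves as the accuracy estimate, and the spread across folds gives a rough sense of how much that estimate might vary.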

ea.csv

The 'ea.csv' file includes the estimated accuracy of the model used for the assignment.
