A basic machine learning model, built in a Python Jupyter notebook, that classifies tweets into two categories:
- racist/sexist
- non-racist/sexist

Sentiment analysis (also known as opinion mining) is one of the many applications of Natural Language Processing. It is a set of methods and techniques used for extracting subjective information from text or speech, such as opinions or attitudes. In simple terms, it involves classifying a piece of text as positive, negative, or neutral.
- Understand the Problem Statement
- Tweets Preprocessing and Cleaning
  - Data Inspection
  - Data Cleaning
- Story Generation and Visualization from Tweets
- Extracting Features from Cleaned Tweets
  - Bag-of-Words
  - TF-IDF
  - Word Embeddings
- Model Building: Sentiment Analysis
  - Logistic Regression
  - Support Vector Machine
  - Random Forest
  - XGBoost
- Model Fine-tuning
- Summary

Tools and libraries used:
- Anaconda
- Jupyter Notebook
- Python libraries (pandas, NumPy, seaborn, matplotlib, re, etc.)
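
The cleaning and feature-extraction steps named in the outline can be illustrated with a minimal sketch. It is not the notebook's actual code: the function names (`clean_tweet`, `bag_of_words`, `tf_idf`), the regular expressions, and the sample tweets are all assumptions made for illustration, and the notebook may well rely on library vectorizers instead. Plain Python is used here only to keep the example self-contained. The TF-IDF shown is the unsmoothed textbook form (term frequency times `log(N / df)`); libraries such as scikit-learn apply a smoothed variant by default, so their numbers will differ.

```python
import math
import re
from collections import Counter


def clean_tweet(text):
    """Return a list of lowercase tokens from a raw tweet (illustrative rules)."""
    text = re.sub(r"@\w+", " ", text)          # drop @user handles
    text = re.sub(r"https?://\S+", " ", text)  # drop links
    text = re.sub(r"[^a-zA-Z#]", " ", text)    # keep only letters and '#'
    # Drop very short tokens; the length cutoff of 2 is an assumption.
    return [word for word in text.lower().split() if len(word) > 2]


def bag_of_words(docs):
    """Map each tokenized document to its raw term counts."""
    return [Counter(doc) for doc in docs]


def tf_idf(docs):
    """Compute unsmoothed TF-IDF weights for a list of tokenized documents."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical sample tweets, not taken from the project's dataset.
tweets = [
    "@user thanks for the #lyft credit!!",
    "loving it http://example.com #happy",
]
tokens = [clean_tweet(t) for t in tweets]
print(tokens)
print(bag_of_words(tokens))
print(tf_idf(tokens))
```

The resulting count or TF-IDF dictionaries are the kind of feature representation that a classifier such as logistic regression would then be trained on, once converted to a fixed-length numeric matrix.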