A spam classification model trained on the SMS Spam Collection dataset from the UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/datasets/sms+spam+collection
The model demonstrates NLP techniques such as Porter stemming, WordNet lemmatization, tokenization, stopword removal, the Bag of Words model, and the TF-IDF model.
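
As a rough illustration of the tokenization, stopword removal, Bag of Words, and TF-IDF steps, here is a minimal standard-library sketch. It is not the project's own code: the stopword list, the sample messages, and the helper names (`tokenize`, `preprocess`, `bag_of_words`, `tf_idf`) are all assumptions made for the example, and stemming and lemmatization are left out because they need an NLP library.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"a", "an", "the", "to", "you", "is", "for", "at", "now"}

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def preprocess(text):
    """Tokenize the text and drop stopwords."""
    return [t for t in tokenize(text) if t not in STOPWORDS]

def bag_of_words(docs):
    """Return the sorted vocabulary and one count vector per document."""
    vocab = sorted({t for d in docs for t in d})
    counts = []
    for d in docs:
        c = Counter(d)
        counts.append([c[t] for t in vocab])
    return vocab, counts

def tf_idf(docs):
    """Weight each term by (count / doc length) * log(N / document frequency)."""
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    idf = [
        math.log(n_docs / sum(1 for row in counts if row[j] > 0))
        for j in range(len(vocab))
    ]
    weights = []
    for doc, row in zip(docs, counts):
        length = max(len(doc), 1)  # guard against documents with no tokens left
        weights.append([(c / length) * idf[j] for j, c in enumerate(row)])
    return vocab, weights

# Hypothetical sample messages, not taken from the dataset.
messages = [
    "Win a free prize now",
    "Call me at the office",
    "Free entry, call now to win",
]
docs = [preprocess(m) for m in messages]
vocab, counts = bag_of_words(docs)
_, weights = tf_idf(docs)
```

Here a term that appears in only one message, such as "prize", gets a higher weight than a term shared across messages, such as "free". Libraries such as scikit-learn use a smoothed variant of the IDF formula, so their numbers will differ from this plain version.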
The model reaches good accuracy, but the dataset is imbalanced (ham messages far outnumber spam messages), so some spam may still be misclassified. A future revision could address this by upsampling the minority class or downsampling the majority class.
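
One possible way to upsample is to resample the smaller class with replacement until every class has as many examples as the largest one. The sketch below assumes plain Python lists of messages and labels; the `upsample` helper and the sample data are hypothetical and not part of the project.

```python
import random
from collections import Counter

def upsample(samples, labels, seed=0):
    """Duplicate random minority-class samples until all classes are equal in size."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    by_label = {}
    for sample, label in zip(samples, labels):
        by_label.setdefault(label, []).append(sample)

    target = max(len(group) for group in by_label.values())
    out_samples, out_labels = [], []
    for label, group in by_label.items():
        # Draw the missing examples with replacement from the same class.
        extra = rng.choices(group, k=target - len(group))
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels

# Hypothetical imbalanced data: four ham messages, two spam messages.
messages = ["see you soon", "lunch today?", "call me later", "on my way",
            "win a free prize", "free entry now"]
labels = ["ham", "ham", "ham", "ham", "spam", "spam"]

balanced_messages, balanced_labels = upsample(messages, labels)
print(Counter(balanced_labels))
```

Resampling should be applied to the training split only. If duplicated messages end up in the test split, the reported accuracy will be too optimistic.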
The model uses a Multinomial Naive Bayes classifier for making predictions.
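
To show how a Multinomial Naive Bayes classifier scores a message, here is a small standard-library sketch with Laplace (add-one) smoothing. It is only an illustration of the technique: the class name `TinyMultinomialNB`, the `alpha` parameter, and the training messages are assumptions, and the project itself may rely on a library implementation instead.

```python
import math
from collections import Counter

class TinyMultinomialNB:
    """Minimal Multinomial Naive Bayes over tokenized documents (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing strength; 1.0 is Laplace smoothing

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.vocab = {t for d in docs for t in d}
        self.n_docs = len(labels)
        self.class_counts = Counter(labels)
        self.token_counts = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            self.token_counts[label].update(doc)
        self.total_tokens = {
            c: sum(self.token_counts[c].values()) for c in self.classes
        }
        return self

    def log_scores(self, doc):
        """Return log P(class) + sum of log P(token | class) for each class."""
        scores = {}
        vocab_size = len(self.vocab)
        for c in self.classes:
            score = math.log(self.class_counts[c] / self.n_docs)
            denom = self.total_tokens[c] + self.alpha * vocab_size
            for token in doc:
                if token in self.vocab:  # tokens never seen in training are ignored
                    score += math.log((self.token_counts[c][token] + self.alpha) / denom)
            scores[c] = score
        return scores

    def predict(self, doc):
        scores = self.log_scores(doc)
        return max(scores, key=scores.get)

# Hypothetical, already tokenized training messages.
train_docs = [
    ["win", "free", "prize"],
    ["free", "entry", "win", "cash"],
    ["call", "me", "later"],
    ["see", "you", "at", "lunch"],
    ["are", "you", "free", "for", "lunch"],
]
train_labels = ["spam", "spam", "ham", "ham", "ham"]

model = TinyMultinomialNB().fit(train_docs, train_labels)
spam_guess = model.predict(["free", "cash", "prize"])
ham_guess = model.predict(["see", "you", "lunch"])
```

Working in log space avoids numeric underflow when many small probabilities are multiplied, and smoothing keeps a token that never appeared in one class from forcing that class's probability to zero.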
Thanks.