Spam ham detection

Objective:

  • Classify SMS text messages as spam or ham (non-spam).

Challenges:

  • SMS messages are short, so the number of features available for classification is small.
  • Text messages often include abbreviations, informal language, text-speak, and other languages transliterated into English script.
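
One common way to soften the second challenge is to normalize each message before extracting features. The sketch below is illustrative only: the `ABBREVIATIONS` map and the `normalize` function are hypothetical and are not taken from 'spam.py'.

```python
import re

# Hypothetical abbreviation map, for illustration only; a real one would be
# much larger and tuned to the dataset.
ABBREVIATIONS = {"u": "you", "ur": "your", "txt": "text", "pls": "please"}

def normalize(message):
    """Lowercase a message, split it into alphanumeric tokens,
    and expand any token found in the abbreviation map."""
    tokens = re.findall(r"[a-z0-9]+", message.lower())
    return [ABBREVIATIONS.get(tok, tok) for tok in tokens]

print(normalize("Pls txt me if u can"))
```

Because messages are short, merging variants such as "u" and "you" into one token may help the classifier get more evidence out of the few words available.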

Dataset:

  • The UCI Machine Learning Repository hosts a collection of SMS messages, the SMS Spam Collection dataset.
  • It contains 4827 ham and 747 spam messages, 5574 in total.
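
As a rough sketch of how the class counts can be checked, the snippet below parses messages in a tab-separated `label<TAB>message` layout, which is how the UCI file is distributed as far as we know. The `SAMPLE` text and the `load_messages` helper are made up for illustration; with the real file you would pass an open file handle instead.

```python
import csv
import io
from collections import Counter

# Made-up sample in the assumed "label<TAB>message" layout.
SAMPLE = (
    "ham\tOk, see you at lunch\n"
    "spam\tWINNER!! Claim your free prize now\n"
    "ham\tI'll call you later\n"
)

def load_messages(handle):
    """Return a list of (label, text) pairs from a tab-separated file object."""
    # QUOTE_NONE keeps quote characters inside messages as plain text.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [(row[0], row[1]) for row in reader if len(row) == 2]

messages = load_messages(io.StringIO(SAMPLE))
counts = Counter(label for label, _ in messages)
print(counts["ham"], counts["spam"], len(messages))
```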

Classification technique

  • A Naive Bayes classifier is used.
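
To make the idea concrete, here is a minimal multinomial Naive Bayes sketch with add-one (Laplace) smoothing, written with the standard library only. It is not the code in 'spam.py'; the `NaiveBayes` class, the `tokenize` helper, and the four training messages are assumptions made for illustration.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

class NaiveBayes:
    """Illustrative multinomial Naive Bayes with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in tokenize(text):
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Toy training data, invented for this example.
texts = [
    "win a free prize now",
    "free entry claim your prize",
    "are you coming home now",
    "see you at lunch tomorrow",
]
labels = ["spam", "spam", "ham", "ham"]

model = NaiveBayes().fit(texts, labels)
print(model.predict("claim your free prize"))
print(model.predict("see you at home"))
```

Working in log space avoids numeric underflow when many small probabilities are multiplied, and the smoothing keeps a word that appears in only one class from driving the other class's probability to zero.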

File Description

  • 'spam.py' contains the main program. The GUI is built with tkinter.
  • Run it from the terminal with: 'python spam.py'
  • 'spamDetection.ipynb' contains the analysis of the dataset.