The dataset I used is a ham/spam email dataset for email classification. The goal is to classify each email as ham (legitimate) or spam.
The working steps are as follows:
- Import the dataset
- Do some visualization
- Convert each email to a word counter
- Convert each word counter to a vector
- Build a custom pipeline to transform the data
- Train different Machine Learning algorithms on the transformed data
- Finally, compute the accuracy score
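
The two conversion steps in the list above (email to word counter, word counter to vector) could be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual code: the function names, the tokenizing regex, the `vocabulary_size` parameter, and the two sample emails are all assumptions made for the example.

```python
from collections import Counter
import re


def email_to_word_counter(text):
    """Lowercase the email text and count how often each word occurs."""
    # Assumed tokenization: runs of letters, digits, and apostrophes.
    words = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(words)


def build_vocabulary(counters, vocabulary_size=1000):
    """Map the most frequent words across all emails to column indices.

    Indices start at 1 so that column 0 can hold out-of-vocabulary words.
    """
    total = Counter()
    for counter in counters:
        total.update(counter)
    most_common = total.most_common(vocabulary_size)
    return {word: index + 1 for index, (word, _) in enumerate(most_common)}


def word_counter_to_vector(counter, vocabulary):
    """Turn one word counter into a fixed-length list of counts."""
    vector = [0] * (len(vocabulary) + 1)
    for word, count in counter.items():
        # Words missing from the vocabulary are accumulated in column 0.
        vector[vocabulary.get(word, 0)] += count
    return vector


# Hypothetical sample emails, only for illustration.
emails = [
    "Win a FREE prize now, click now",
    "Meeting notes attached, see you at noon",
]
counters = [email_to_word_counter(email) for email in emails]
vocabulary = build_vocabulary(counters, vocabulary_size=5)
vectors = [word_counter_to_vector(counter, vocabulary) for counter in counters]
```

Each email ends up as a vector of the same length (vocabulary size plus one), which is the shape a classifier expects as input. Wrapping these functions in a pipeline and training a model are not shown here.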