Email-Spam-Filtering-Using-Decision-Trees-and-KNN

This repository contains code for Email Spam Filtering using two machine learning algorithms, Decision Trees and K-Nearest Neighbors (KNN). The code includes data preprocessing, model creation, hyperparameter tuning, and model evaluation. The models are trained on a dataset of emails labeled as spam or non-spam. Spam Detection using Machine Learning

In this project, we have developed a machine learning model to classify emails as spam or not spam. The dataset used for this task is the SpamBase dataset, which contains features extracted from 4,601 emails, with a total of 57 features, and a binary target variable indicating whether an email is spam or not. Requirements

Python 3.6+
Pandas
Numpy
Matplotlib
Seaborn
Scikit-learn
Imblearn

Steps

Importing Required Libraries
Loading Dataset
Data Exploration And Preprocessing
    Checking Missing Values
    Check the class distribution
    Distribution of each feature
    Checking correlation between the features
    Visualize the distribution of each feature
    Finding Outliers
    Removing Duplicate Records
    Data Oversampling
Feature Scaling
Dimensionality Reduction
Model Development
Model Evaluation

Usage

To run this project, clone the repository and open the Jupyter notebook spam_detection.ipynb. The notebook contains all the code for the project, and you can run each cell sequentially to reproduce the results.

Results

The best performing model was a KNN without PCA and the Qt, with an accuracy of 95% and F1-score of 0.94. Conclusion

In this project, we have developed a machine learning model to classify emails as spam or not spam. We explored the dataset, preprocessed the data, applied feature scaling and dimensionality reduction, and oversampled the minority class. We trained and evaluated various models and identified a KNN as the best performing model. The model achieved an accuracy of 95% and F1-score of 0.94.

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
README.md		README.md
spamDetection.ipynb		spamDetection.ipynb
spam_final.csv		spam_final.csv
spam_final_test.csv		spam_final_test.csv
spam_final_train.csv		spam_final_train.csv
spambase_remake.csv		spambase_remake.csv

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Email-Spam-Filtering-Using-Decision-Trees-and-KNN

About

Releases

Packages

Languages

KasunAbeyweera/Email-Spam-Filtering-Using-Decision-Trees-and-KNN

Folders and files

Latest commit

History

Repository files navigation

Email-Spam-Filtering-Using-Decision-Trees-and-KNN

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages