Email-Spam-Detector

This project implements a machine learning model that classifies emails as spam or ham (non-spam). It uses a Multinomial Naive Bayes classifier trained on TF-IDF-vectorized email text.
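As a rough illustration of that approach, the sketch below chains scikit-learn's TfidfVectorizer and MultinomialNB and fits them on a handful of made-up messages. The toy corpus and the use of make_pipeline are assumptions for the example; this repository keeps the model and the vectorizer as two separate objects, and its real data lives in data/email.csv.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy corpus invented for illustration only (not taken from data/email.csv).
texts = [
    "win a free prize now",
    "free money claim your prize",
    "claim free cash now",
    "meeting at noon tomorrow",
    "please review the attached report",
    "lunch with the team tomorrow",
]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

# TfidfVectorizer turns each email into a vector of weighted term scores;
# MultinomialNB then learns per-class term distributions from those vectors.
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(texts, labels)

# Classify two unseen messages.
predictions = model.predict(["free prize money", "team meeting tomorrow"])
print(list(predictions))
```

On this toy data the first message shares words only with the spam examples and the second only with the ham examples, so the two predictions should differ accordingly. A real model needs far more training data than this.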

Features

Trains a Multinomial Naive Bayes model.
Uses TF-IDF (Term Frequency-Inverse Document Frequency) for feature extraction.
Includes a script for interactive spam/ham prediction.
Saves the trained model and TF-IDF vectorizer for later use.
Handles potential errors during data loading, model training, and file saving/loading.
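The saving and error-handling features above could look something like the following minimal sketch. The helper names save_artifacts and load_artifacts are hypothetical and not taken from the repository; the file names match the ones listed under models/ in this README. Any picklable object works here, so the functions make no assumption about the model class.

```python
import pickle
from pathlib import Path

MODEL_FILE = "spam_classifier.pkl"   # file name as listed in this README
VECTORIZER_FILE = "tfidf.pkl"        # file name as listed in this README


def save_artifacts(model, vectorizer, model_dir="models"):
    """Pickle the model and vectorizer into model_dir (hypothetical helper)."""
    out = Path(model_dir)
    try:
        out.mkdir(parents=True, exist_ok=True)
        with open(out / MODEL_FILE, "wb") as f:
            pickle.dump(model, f)
        with open(out / VECTORIZER_FILE, "wb") as f:
            pickle.dump(vectorizer, f)
    except (OSError, pickle.PicklingError) as exc:
        raise RuntimeError(f"Could not save artifacts to {out}: {exc}") from exc


def load_artifacts(model_dir="models"):
    """Load the pickled model and vectorizer back (hypothetical helper)."""
    src = Path(model_dir)
    try:
        with open(src / MODEL_FILE, "rb") as f:
            model = pickle.load(f)
        with open(src / VECTORIZER_FILE, "rb") as f:
            vectorizer = pickle.load(f)
    except FileNotFoundError as exc:
        # Most likely the training script has not been run yet.
        raise RuntimeError(f"No saved artifacts found in {src}") from exc
    except (OSError, pickle.UnpicklingError) as exc:
        raise RuntimeError(f"Could not load artifacts from {src}: {exc}") from exc
    return model, vectorizer
```

Both artifacts are saved because the vectorizer fitted during training must be reused at prediction time; a freshly fitted one would produce a different vocabulary. Only unpickle files you trust, since loading a pickle can execute arbitrary code.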

Requirements

Python 3.x
pandas
scikit-learn
matplotlib (optional - for confusion matrix visualization)
seaborn (optional - for confusion matrix visualization)
nltk
pickle (part of the Python standard library; no separate installation needed)

You can install these using:

pip install -r requirements.txt

Project Structure

Email-Spam-Detection/
├── data/
│   └── email.csv                # Your email data (CSV format)
├── models/
│   ├── spam_classifier.pkl      # Saved trained model
│   └── tfidf.pkl                # Saved TF-IDF vectorizer
├── notebooks/
│   └── model_training.ipynb     # Notebook for initial exploration (optional)
├── src/
│   └── preprocessing.py         # Data preprocessing functions
├── model_training.py            # Script to train and save the model
├── prediction.py                # Script for interactive prediction
├── requirements.txt             # Project dependencies
└── README.md                    # This file

Code Explanation

src/preprocessing.py: Defines load_and_preprocess_data() and preprocess_text(), which load the data from the CSV file and clean the email text by removing non-alphanumeric characters, converting to lowercase, and removing stop words.
model_training.py: Trains the model, evaluates it, and saves the trained model and TF-IDF vectorizer.
prediction.py: Loads the saved model and vectorizer and provides an interactive interface for making predictions on new emails.
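The cleaning and prediction steps described above might be sketched as follows. The function name preprocess_text comes from this README, but its body here is a guess at the behavior described; the small STOP_WORDS set is a stand-in for a full stop-word list such as NLTK's, and classify_email is a hypothetical helper that accepts any vectorizer with a transform() method and any model with a predict() method.

```python
import re

# Small stand-in stop-word list (assumption); the project lists nltk as a
# dependency, so the real code may use NLTK's English stop words instead.
STOP_WORDS = frozenset(
    {"a", "an", "and", "are", "for", "in", "is", "it", "of", "the", "to", "you", "your"}
)


def preprocess_text(text: str) -> str:
    """Strip non-alphanumeric characters, lowercase, and drop stop words."""
    # Replace anything that is not a letter, digit, or whitespace with a space.
    text = re.sub(r"[^a-zA-Z0-9\s]", " ", text)
    tokens = text.lower().split()
    return " ".join(tok for tok in tokens if tok not in STOP_WORDS)


def classify_email(text: str, model, vectorizer) -> str:
    """Clean one email, vectorize it, and return the predicted label.

    Hypothetical helper: `vectorizer` must provide transform() and `model`
    must provide predict(), as the fitted scikit-learn objects do.
    """
    cleaned = preprocess_text(text)
    features = vectorizer.transform([cleaned])
    return str(model.predict(features)[0])


print(preprocess_text("WIN a FREE prize!!! Claim it now."))
# prints: win free prize claim now
```

An interactive script could call classify_email() in a loop on text read with input(), after loading the saved model and vectorizer once at startup. The same preprocess_text() must be applied at training and prediction time so that both see identically cleaned text.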

Contributing

Contributions are welcome! Please feel free to open issues or submit pull requests.

