SpamFilter

A Bayesian spam filter that sorts email messages into spam and ham.

spam.py

This script builds two dictionaries from the two provided learning datasets, “learning_ham” and “learning_spam”. It then writes two files, “outputHam.txt” and “outputSpam.txt”. Each file contains the total number of words matched for all datasets, followed by a list of all the words with their frequencies, P(word|spam or ham), and P(spam or ham|word).
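The statistics listed above can be sketched as follows. This is a minimal illustration, not the code in spam.py: the function names count_words and word_probabilities, the regular-expression tokenizer, the add-one smoothing, and the equal prior p_spam=0.5 are all assumptions made for the example.

```python
import re
from collections import Counter


def count_words(messages):
    """Count word occurrences across an iterable of message strings."""
    counts = Counter()
    for text in messages:
        # Assumed tokenizer: lowercase runs of letters and apostrophes
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts


def word_probabilities(spam_counts, ham_counts, p_spam=0.5):
    """Return {word: (P(word|spam), P(word|ham), P(spam|word))}."""
    total_spam = sum(spam_counts.values())
    total_ham = sum(ham_counts.values())
    vocab = set(spam_counts) | set(ham_counts)
    stats = {}
    for word in vocab:
        # Add-one smoothing (an assumption) keeps unseen words above zero
        p_word_spam = (spam_counts[word] + 1) / (total_spam + len(vocab))
        p_word_ham = (ham_counts[word] + 1) / (total_ham + len(vocab))
        # Bayes' rule: P(spam|word) = P(word|spam) P(spam) / P(word)
        evidence = p_word_spam * p_spam + p_word_ham * (1 - p_spam)
        stats[word] = (p_word_spam, p_word_ham, p_word_spam * p_spam / evidence)
    return stats


if __name__ == "__main__":
    spam = count_words(["buy cheap pills", "cheap pills now"])
    ham = count_words(["meeting at noon", "lunch at noon"])
    for word, values in sorted(word_probabilities(spam, ham).items()):
        print(word, values)
```

P(ham|word) is simply 1 - P(spam|word), so one table is enough to fill both output files.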

To use this script, make sure the learning datasets are in the current directory, then run the following on the command line: python spam.py

This will produce the two files stated above.

In addition to learning from the two provided datasets, spam.py can also classify a folder of email messages as spam or ham. Set the variable fileTest inside the program to the path of the test folder, and the script will list each file in that folder as either SPAM or HAM, depending on the set confidence level.
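A folder classifier of this kind might look like the sketch below. It is not taken from spam.py: the names spam_probability and classify_folder, the confidence parameter with its default of 0.9, the tokenizer, and the log-odds combination (which assumes independent words and equal priors) are all hypothetical choices for illustration.

```python
import math
import os
import re


def spam_probability(text, p_spam_given_word):
    """Combine per-word P(spam|word) values into one score for a message."""
    log_odds = 0.0
    for word in set(re.findall(r"[a-z']+", text.lower())):
        p = p_spam_given_word.get(word)
        if p is None:
            continue  # words never seen during learning carry no evidence
        p = min(max(p, 1e-6), 1 - 1e-6)  # keep the logarithms finite
        log_odds += math.log(p) - math.log(1 - p)
    # Clamp so math.exp cannot overflow on very long messages
    log_odds = max(-700.0, min(700.0, log_odds))
    return 1 / (1 + math.exp(-log_odds))


def classify_folder(folder, p_spam_given_word, confidence=0.9):
    """Label each file in folder as SPAM or HAM using the given threshold."""
    results = {}
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if not os.path.isfile(path):
            continue
        # latin-1 decodes any byte sequence, so odd encodings do not fail
        with open(path, encoding="latin-1") as handle:
            score = spam_probability(handle.read(), p_spam_given_word)
        results[name] = "SPAM" if score >= confidence else "HAM"
    return results
```

Raising the confidence value makes the classifier more conservative: fewer messages are labelled SPAM, which lowers the risk of flagging legitimate mail.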

IrenaeusChan/SpamFilter
