Text Classification with Naive Bayes

This repository contains a Jupyter notebook that demonstrates text classification with the Naive Bayes algorithm. The notebook is written in Python and uses popular libraries such as Pandas, NumPy, and Scikit-Learn.

The dataset used in this project is the 20 Newsgroups dataset, which contains approximately 20,000 newsgroup posts, partitioned into 20 different categories. The goal of this project is to build a machine learning model that can accurately classify these newsgroup posts into their respective categories.
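To make the classification task concrete, here is a minimal sketch of a bag-of-words Naive Bayes classifier built with Scikit-Learn. It is not the notebook's code: the helper name `build_classifier` and the four-sentence toy corpus are assumptions made for illustration, and the choice of `CountVectorizer` with `MultinomialNB` is one common setup rather than a confirmed detail of this project. The real data could be loaded with `sklearn.datasets.fetch_20newsgroups`, which downloads the dataset on first use.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline


def build_classifier():
    """Hypothetical helper: word-count features followed by multinomial Naive Bayes."""
    return make_pipeline(CountVectorizer(), MultinomialNB())


# Toy stand-in corpus (made up for this sketch, not taken from the dataset).
# The labels reuse two real 20 Newsgroups category names.
train_texts = [
    "the team won the hockey game last night",
    "the goalie made a great save in the hockey match",
    "the rocket launch put a satellite into orbit",
    "nasa plans a new mission to explore the moon",
]
train_labels = ["rec.sport.hockey", "rec.sport.hockey", "sci.space", "sci.space"]

model = build_classifier()
model.fit(train_texts, train_labels)

# Classify two unseen snippets.
predictions = model.predict(["hockey game tonight", "satellite orbit mission"]).tolist()
print(predictions)  # ['rec.sport.hockey', 'sci.space']
```

With the full dataset, the same pipeline would be fitted on the training split and evaluated on the held-out test split.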

Getting Started

To run the notebook, you will need Python 3 installed on your machine. You will also need to install the following libraries:

  • NumPy
  • Pandas
  • Matplotlib
  • Scikit-Learn
  • NLTK

You can install these libraries using pip. For example, to install NLTK, run the following command:

pip install nltk
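After installing, you may want to confirm that every library can be found before opening the notebook. The following is a small sketch using only the standard library; the helper name `missing_packages` is hypothetical. Note that Scikit-Learn is installed from pip as `scikit-learn` but imported as `sklearn`.

```python
import importlib.util

# Import names for the libraries listed above.
# scikit-learn is the one case where the import name ("sklearn") differs from the pip name.
REQUIRED = ["numpy", "pandas", "matplotlib", "sklearn", "nltk"]


def missing_packages(names):
    """Return the import names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All required libraries are installed.")
```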

Once you have installed the required libraries, you can clone this repository to your local machine using Git. To do this, run the following command:

git clone https://github.com/reeba212/Text-Classification-Naive-Bayes

To run the notebook, navigate to the project directory in your terminal and run the following command:

jupyter notebook

This will open the Jupyter Notebook interface in your web browser. From here, you can open the notebook and run the cells to train and test the model.

Result

After training the model on the 20 Newsgroups dataset, I achieved an accuracy of over 87% on the test set. This shows that the model assigns most newsgroup posts to the correct category.
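For reference, accuracy here means the fraction of test posts whose predicted category matches the true one. The sketch below illustrates the calculation in plain Python; the function name `accuracy` and the sample labels are made up for illustration and are not results from the notebook. Scikit-Learn provides the same metric as `sklearn.metrics.accuracy_score`.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true labels."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    if not true_labels:
        raise ValueError("cannot compute accuracy on an empty sequence")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Illustrative labels only: three of the four predictions are correct.
y_true = ["sci.space", "rec.autos", "sci.med", "sci.space"]
y_pred = ["sci.space", "rec.autos", "sci.space", "sci.space"]
print(accuracy(y_true, y_pred))  # 0.75
```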

Conclusion

This project provides a practical example of text classification using the Naive Bayes algorithm. By studying the notebook, you can gain a deeper understanding of how the algorithm works and how it can be applied to real-world problems. With this knowledge, you can extend the implementation or use it as a starting point for your own projects.