Tweets-preprocessing

Preprocessing for a tweets dataset using NLTK.

As we all know, we are in the era of data, and most of that data is unstructured. According to an article from MongoDB:

From 80 to 90 percent of data generated and collected by organizations is unstructured, and its volumes are growing rapidly, many times faster than the rate of growth for structured databases.

So part of our work is to handle and clean this data so that it becomes useful and meaningful.

Here is my work, done as part of an assignment on natural language preprocessing.

I'm a beginner, so any suggestions for improvement, even small ones, will be appreciated.

Dataset: https://www.kaggle.com/manchunhui/us-election-2020-tweets

Article: https://www.mongodb.com/unstructured-data
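
To give a rough idea of what tweet preprocessing with NLTK can look like, here is a minimal sketch. It is not the code from the notebook: the function name `preprocess_tweet`, the sample tweet, and the choice of cleaning steps (lowercasing, dropping @handles, URLs and punctuation, and stripping the `#` from hashtags) are all assumptions made for illustration. It uses NLTK's `TweetTokenizer`, which is rule-based and should not need any `nltk.download()` call.

```python
import string

from nltk.tokenize import TweetTokenizer

# Rule-based tokenizer aimed at tweets.
# preserve_case=False lowercases tokens, strip_handles=True drops @mentions,
# reduce_len=True shortens long character runs (e.g. "sooooo" -> "sooo").
_tokenizer = TweetTokenizer(preserve_case=False, strip_handles=True, reduce_len=True)


def preprocess_tweet(text):
    """Hypothetical helper: turn one raw tweet into a list of cleaned tokens."""
    cleaned = []
    for token in _tokenizer.tokenize(text):
        # Skip links; they rarely carry useful text.
        if token.startswith(("http://", "https://", "www.")):
            continue
        # Skip tokens made only of punctuation, such as "!" or "...".
        if all(ch in string.punctuation for ch in token):
            continue
        # Keep the hashtag word but drop the leading "#".
        cleaned.append(token.lstrip("#"))
    return cleaned


# Invented sample tweet, not taken from the dataset.
sample = "@someone Go VOTE today!!! #Election2020 https://example.com/poll"
tokens = preprocess_tweet(sample)
print(tokens)
```

To apply something like this to Trump.csv or joebiden.csv, the function could be mapped over the column that holds the tweet text; the column name depends on the dataset and is not assumed here.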
