A project where we use the Tweepy Python library to scrape tweets from Twitter, then carry out LDA to model tweet topics and predict likely topics for any given tweet.

MrBernoulli/Tweet-Topic-Modeling

Tweet-Topic-Modeling

In this unsupervised learning project, we utilize latent Dirichlet allocation (LDA) to model likely topics for a given tweet.

We read tweets from Twitter using the Tweepy Python library, then carry out the necessary preprocessing steps, such as punctuation removal, tokenization, stemming, and lemmatization.
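As a rough illustration of the preprocessing stage, here is a minimal standard-library sketch. The function name `preprocess_tweet`, the small `STOP_WORDS` set, the URL and mention stripping, and the short-token filter are all assumptions made for this example and are not taken from the project code. Stemming or lemmatization is left as a pluggable `normalize` callable (an identity function by default), so a stemmer or lemmatizer from a library such as NLTK could be passed in.

```python
import re
import string
from typing import Callable, List

# Tiny illustrative stop-word list (assumption; a real pipeline would use a fuller one).
STOP_WORDS = {"a", "an", "the", "and", "or", "to", "of", "in", "is", "it", "for", "on", "rt"}

# Translation table that maps every ASCII punctuation character to a space.
PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def preprocess_tweet(text: str, normalize: Callable[[str], str] = lambda t: t) -> List[str]:
    """Lowercase, strip URLs/mentions/punctuation, tokenize, then normalize each token."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs before punctuation removal
    text = re.sub(r"@\w+", " ", text)          # drop @mentions
    text = text.translate(PUNCT_TABLE)         # punctuation removal
    tokens = text.split()                      # whitespace tokenization
    # Keep informative tokens and apply the stemmer/lemmatizer hook.
    return [normalize(t) for t in tokens if t not in STOP_WORDS and len(t) > 2]


print(preprocess_tweet("RT @user: Loving the new #Python release!!! https://t.co/abc"))
# ['loving', 'new', 'python', 'release']
```

Keeping the normalization step as a parameter makes it easy to compare stemming against lemmatization without touching the rest of the pipeline.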

We then use the Gensim library to carry out the LDA under both bag-of-words and TF-IDF tweet representations, while varying the key hyperparameter in topic modeling: the number of topics.
