Twitter Data Analysis

Team 7

  • Dinesh Kumar MR - CB.EN.U4AIE20011
  • Jangala Gouthami - CB.EN.U4AIE20032
  • Paval KS - CB.EN.U4AIE20047
  • Shreya Sanghamitra - CB.EN.U4AIE20066
  • Shrish Surya NT - CB.EN.U4AIE20067

About

This project uses the Tweepy library and the Twitter API to stream tweets and store them in a MongoDB cluster. The stored tweets are then analyzed using PyMongo and two machine learning pipelines: HashingTF + IDF + Logistic Regression and CountVectorizer + IDF + Logistic Regression. The results are visualized using MongoDB Charts.
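
The two pipelines differ only in their first stage, so a small illustration of that difference may help. The sketch below is plain Python and is not the project's code or any library's implementation: the sample tweets, function names, the 16-column hash space, and the smoothed IDF formula are all assumptions chosen for illustration. It shows how a hashing-based term-frequency step maps tokens to columns without storing a vocabulary, while a count-vectorizer step first learns one column per distinct token; both are then reweighted by IDF.

```python
import math
import zlib
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def hashing_tf(tokens, num_features=16):
    """Hashing trick: a token's hash picks its column; no vocabulary is stored."""
    vec = [0] * num_features
    for tok in tokens:
        # crc32 is deterministic across runs, unlike Python's built-in hash().
        vec[zlib.crc32(tok.encode("utf-8")) % num_features] += 1
    return vec


def fit_vocabulary(docs):
    """Count-vectorizer style: give every distinct token its own column."""
    vocab = sorted({tok for doc in docs for tok in doc})
    return {tok: i for i, tok in enumerate(vocab)}


def count_tf(tokens, vocab):
    """Term counts laid out according to a previously fitted vocabulary."""
    vec = [0] * len(vocab)
    for tok, n in Counter(tokens).items():
        if tok in vocab:
            vec[vocab[tok]] = n
    return vec


def tf_idf(tf_vectors):
    """Scale each column by a smoothed IDF, log((N + 1) / (df + 1))."""
    n_docs = len(tf_vectors)
    n_cols = len(tf_vectors[0])
    dfs = [sum(1 for v in tf_vectors if v[j] > 0) for j in range(n_cols)]
    weights = [math.log((n_docs + 1) / (df + 1)) for df in dfs]
    return [[tf * w for tf, w in zip(v, weights)] for v in tf_vectors]


# Hypothetical sample tweets, not taken from the project's dataset.
tweets = ["great phone great camera", "bad battery", "great battery life"]
docs = [tokenize(t) for t in tweets]

vocab = fit_vocabulary(docs)
count_vectors = [count_tf(d, vocab) for d in docs]
hashed_vectors = [hashing_tf(d) for d in docs]

count_features = tf_idf(count_vectors)
hashed_features = tf_idf(hashed_vectors)
```

In either pipeline, the resulting feature vectors would then be passed to the logistic regression stage. The hashing variant keeps a fixed vector width and needs no fitting pass, at the cost of possible collisions between tokens; the vocabulary variant keeps columns interpretable but grows with the number of distinct tokens.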

Requirements

  • Python v3.9.0
  • Tweepy v4.4.0
  • PyMongo v4.0.1
  • PySpark v3.3.1
  • Pandas v1.3.4
  • Jupyter Notebook v6.4.5
  • MongoDB