HackerNews ⭐

This is a data analysis project consisting of:

  • Web scraping using Scrapy and Postgres.
  • Exploratory data analysis.

Web Scraper:

Post items and usernames were scraped from the publicly available Hacker News API and stored in a Postgres database running in the cloud. More information about the scraper can be found in the README in the hackernews_scrapper folder.

In total, 3 months of posts were scraped from the API, comprising:

  • 1.2 million posts
  • 77k users
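
For readers unfamiliar with the Hacker News API, the sketch below shows roughly what one scraped record looks like. It is not the project's Scrapy spider (see the hackernews_scrapper folder for that); it uses only the standard library. The endpoint layout and the item fields `by`, `time`, `score` and `descendants` come from the public API, while the helper names and the row keys (`author`, `created_at`, `comments`) are hypothetical and chosen for illustration only.

```python
import json
from urllib.request import urlopen

# Root of the public Hacker News API.
API_ROOT = "https://hacker-news.firebaseio.com/v0"


def item_url(item_id: int) -> str:
    """URL of a single item (story, comment, job, poll)."""
    return f"{API_ROOT}/item/{item_id}.json"


def user_url(username: str) -> str:
    """URL of a single user profile."""
    return f"{API_ROOT}/user/{username}.json"


def to_row(item: dict) -> dict:
    """Flatten an API item into a row for a hypothetical posts table.

    The row keys are assumptions, not the project's actual schema.
    """
    return {
        "id": item["id"],
        "type": item.get("type"),
        "author": item.get("by"),
        "created_at": item.get("time"),      # Unix time in seconds
        "title": item.get("title"),
        "score": item.get("score"),
        "comments": item.get("descendants"),  # total comment count
    }


def fetch_item(item_id: int) -> dict:
    """Download one item and return it as a dict (needs network access)."""
    with urlopen(item_url(item_id), timeout=10) as resp:
        return json.load(resp)
```

Calling `to_row(fetch_item(some_id))` would produce one row ready to insert into a database; fields missing from an item are left as `None`.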

Data Analysis:

  • Data cleaning was performed to tidy up the data.
  • Exploratory analysis and Named Entity Recognition were run on the data to answer questions such as:
    • What are the best day and hour to post?
    • Which topics had the highest engagement during the period?
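
As a rough illustration of the "best day and hour" question, the sketch below groups posts by weekday and hour (UTC) and picks the slot with the highest mean score. It assumes each post is a dict with the API's `time` and `score` fields and that mean score is a reasonable proxy for engagement; the function name and that choice of metric are assumptions, not necessarily what the project's analysis does.

```python
from collections import defaultdict
from datetime import datetime, timezone

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def best_posting_slot(posts):
    """Return ((weekday, hour), mean_score) for the best-scoring slot.

    `posts` is an iterable of dicts with a Unix `time` (seconds) and an
    optional `score`. Hours are in UTC.
    """
    totals = defaultdict(lambda: [0, 0])  # slot -> [score_sum, post_count]
    for post in posts:
        ts = datetime.fromtimestamp(post["time"], tz=timezone.utc)
        slot = (WEEKDAYS[ts.weekday()], ts.hour)
        totals[slot][0] += post.get("score", 0)
        totals[slot][1] += 1

    # Pick the slot whose mean score is highest.
    slot, (score_sum, count) = max(
        totals.items(), key=lambda kv: kv[1][0] / kv[1][1]
    )
    return slot, score_sum / count
```

With only a few posts per slot the mean is noisy, so on real data it may be worth requiring a minimum post count per slot or comparing medians instead.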