eitrheim/Big-Data-Platforms

Big Data Platforms

This project uses Hadoop, PySpark, and the University of Chicago's high-performance computing cluster to run machine learning algorithms and sentiment analysis on a dataset of more than 68 GB of Amazon product reviews, stored as nested JSON.
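To illustrate what a nested JSON structure means in practice, here is a minimal sketch in plain Python that flattens one nested record into dotted keys. The field names (`reviewer`, `rating`, `text`) and the `flatten` helper are hypothetical and chosen only for illustration; they are not taken from the dataset's actual schema or from this repository's code.

```python
import json


def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into a single-level dict with dotted keys."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the dotted prefix along.
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# Hypothetical review record; the field names are illustrative assumptions.
raw_line = '{"reviewer": {"id": "A1", "name": "Pat"}, "rating": 5.0, "text": "Works well"}'
flat_review = flatten(json.loads(raw_line))
print(flat_review)
```

The dotted keys mirror how PySpark addresses nested struct fields by dotted column path, so the same idea carries over when selecting columns from a DataFrame loaded from JSON.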

About

Big Data Platforms Final Project and Helpful Notes
