SPARK DEMO

A small demo of how to set up and use Spark.

First, we set up a small Spark cluster locally.

Then, we run a small data transformation task in a Python notebook, using the cluster that was just set up.

The data is a subset of the NYC taxi trips dataset, downloaded from the Maven Analytics Data Playground.

To start the system up, run:

docker compose up --attach jupyter

Then, copy the URL that Jupyter prints and paste it into your browser.
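
Once the notebook is open, the transformation task described above could look roughly like the sketch below. It is illustrative only and is not the notebook shipped in this repo: the `SPARK_MASTER_URL` variable, the default master address `spark://spark-master:7077`, the CSV path `/data/taxi_trips.csv`, and the column names `pickup_datetime` and `trip_distance` are all assumptions, so adjust them to match your compose file and data.

```python
# Sketch only: the master URL, file path, env variable, and column names
# are assumptions for illustration, not taken from this repository.
import os


def master_url_from_env(default: str = "spark://spark-master:7077") -> str:
    """Return the Spark master URL, read from a hypothetical env variable."""
    return os.environ.get("SPARK_MASTER_URL", default)


def build_session(master_url: str, app_name: str = "spark-demo"):
    """Create (or reuse) a SparkSession connected to the given master."""
    # Imported lazily so this file can be loaded even where pyspark is absent.
    from pyspark.sql import SparkSession

    return SparkSession.builder.master(master_url).appName(app_name).getOrCreate()


def trips_per_day(trips, pickup_col: str = "pickup_datetime",
                  distance_col: str = "trip_distance"):
    """Count trips and average the trip distance for each pickup date."""
    from pyspark.sql import functions as F

    return (
        trips
        .withColumn("pickup_date", F.to_date(F.col(pickup_col)))
        .groupBy("pickup_date")
        .agg(
            F.count("*").alias("trips"),
            F.avg(distance_col).alias("avg_distance"),
        )
        .orderBy("pickup_date")
    )


def run_demo(csv_path: str = "/data/taxi_trips.csv") -> None:
    """Read the CSV on the cluster, aggregate it, and print the first rows."""
    spark = build_session(master_url_from_env())
    trips = spark.read.csv(csv_path, header=True, inferSchema=True)
    trips_per_day(trips).show(10)
    spark.stop()
```

Calling `run_demo()` in a notebook cell would submit the job to the cluster. Keeping the aggregation in its own function makes it easy to try on a small local DataFrame before pointing it at the full dataset.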
