Georgia Tech's Spring 2020 CSE 6040 Computing for Data Analysis Class with Dr. Richard Vuduc

rmelikov/computing_for_data_analysis

Folder Information

This repository contains all of the class materials for Georgia Tech's Spring 2020 CSE 6040, Computing for Data Analysis, taught by Dr. Richard Vuduc. (As a side note, this was the best class I ever took.) All of the content is in the project_files folder, which is sequenced for easier navigation. Inside project_files you will find the Getting Started folder, so start there.

This class is based on Python 3.7 and Jupyter notebooks. I've included a virtual environment that you can use with it. However, I created the virtual environment after the fact, so you might need to install missing packages as you run into them throughout the course.
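If you want to see which packages are missing before a notebook fails on import, something like the following might help. This is only a rough sketch: the `missing_packages` helper and the package names passed to it are my own placeholders, not a list taken from the course, so swap in whatever the notebook you are opening actually imports.

```python
import importlib.util
import sys


def missing_packages(names):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Hypothetical list for illustration; replace with the imports you actually need.
    wanted = ["numpy", "pandas", "matplotlib"]
    missing = missing_packages(wanted)
    if missing:
        # Only print the command; running it is left to you.
        print("Try:", sys.executable, "-m", "pip", "install", " ".join(missing))
    else:
        print("All listed packages were found.")
```

The check only looks at top-level import names, which sometimes differ from the name you install with pip, so treat the printed command as a starting point.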

You can ignore the Scratch Pad folder. You don't have to look at it, but you might find some interesting code there, which is why I kept it: I want to be able to find that code myself if I ever go looking for it.

Something else to note: some datasets are larger than 100 MB, so they have to be stored as parts. As a result, you might see a dataset being combined from multiple files into a single file, and the combined file then being deleted again. This is because of GitHub's file size limitations.
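To give an idea of what combining the parts looks like, here is a minimal sketch. It assumes the parts are plain byte-for-byte slices of the original file; the `combine_parts` helper and the file names in the demo are made up for illustration and are not necessarily how the notebooks in this repository do it.

```python
import shutil
import tempfile
from pathlib import Path


def combine_parts(part_paths, combined_path):
    """Concatenate the part files, in the order given, into one combined file."""
    combined_path = Path(combined_path)
    with combined_path.open("wb") as out:
        for part in part_paths:
            with Path(part).open("rb") as src:
                # Stream the bytes across so a large part is never fully in memory.
                shutil.copyfileobj(src, out)
    return combined_path


if __name__ == "__main__":
    # Demo with throwaway files; the names here are hypothetical.
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        parts = [tmp / "data.csv.part0", tmp / "data.csv.part1"]
        parts[0].write_bytes(b"a,b\n1,2\n")
        parts[1].write_bytes(b"3,4\n")

        combined = combine_parts(parts, tmp / "data.csv")
        print(combined.read_bytes())

        # Delete the combined file when done so it is never committed.
        combined.unlink()
```

The order of `part_paths` matters, so if you collect the parts with a glob, sort them in a way that matches how they were split.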