mebauer/pystarburst-data-analysis
PyStarburst for Data Analysis

Course author: Starburst Academy Team
Notebook author: Mark Bauer

1. Introduction

This notebook accompanies the course Use PyStarburst for Data Analysis, which is part of the overall course Getting Started with PyStarburst on Starburst Academy.

To reproduce the results and follow along, you'll need to set up your environment in Starburst. You can prepare your environment by visiting: https://academy.starburst.io/getting-started-with-pystarburst/192527.
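Once the environment is ready, a PyStarburst session can be opened roughly as follows. This is a minimal sketch, not the notebook's own code: the helper names `connection_params` and `create_session` and the example host are placeholders, and it assumes the `pystarburst` and `trino` packages are installed and that your cluster accepts basic authentication.

```python
def connection_params(host: str, port: int = 443) -> dict:
    """Build the non-secret connection settings (hypothetical helper)."""
    return {"host": host, "port": port, "http_scheme": "https"}


def create_session(host: str, user: str, password: str):
    """Open a PyStarburst session; requires pystarburst and trino."""
    # Imported lazily so connection_params works without the packages installed.
    import trino.auth
    from pystarburst import Session

    params = connection_params(host)
    params["auth"] = trino.auth.BasicAuthentication(user, password)
    return Session.builder.configs(params).create()


# Example usage (placeholder values, substitute your own cluster details):
# session = create_session("example.galaxy.starburst.io", "<user>", "<password>")
# session.sql("SELECT 1").collect()
```

Keeping the credentials out of the settings helper makes it easier to load them from environment variables or a secrets store rather than hard-coding them in the notebook.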

2. Notebook

Explore the notebook: aviation.ipynb.

3. Data

From the course:

Background
In one of the prerequisites to this tutorial, you created a catalog called tmp_cat which you connected to an S3 bucket owned by Starburst. The S3 bucket that you connected to also contains some flight data, which you will use in this tutorial. Before you can analyze the data, you'll need to complete a few tasks to prepare your environment.

First, you'll have to add a location privilege to the accountadmin role in Starburst Galaxy to ensure that you can write to the folder within the S3 bucket that contains the flight data. After that, you'll use the Query editor to execute some SQL to create the necessary schema and tables.

The data you'll be working with comprises four CSV files: flights.csv, airports.csv, carriers.csv, and plane-data.csv. You're going to make one table for each file, beginning with flights.csv.

Source: Use PyStarburst for data analysis
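The one-table-per-file setup described above can be sketched as a small helper that derives a table name from each file and builds a `CREATE TABLE` statement. This is an illustration under stated assumptions: the schema name, column names, and S3 location in the usage example are hypothetical, and the exact table properties used in the course are not reproduced here. The sketch uses the Trino Hive connector's `format` and `external_location` properties, and all columns are `varchar` because the Hive CSV format requires it.

```python
from pathlib import PurePosixPath

# The four files named in the course text.
CSV_FILES = ["flights.csv", "airports.csv", "carriers.csv", "plane-data.csv"]


def table_name_for(csv_file: str) -> str:
    """Derive a SQL-friendly table name, e.g. plane-data.csv -> plane_data."""
    return PurePosixPath(csv_file).stem.replace("-", "_")


def create_table_sql(catalog, schema, csv_file, columns, location):
    """Build a CREATE TABLE statement for one CSV file (hypothetical helper)."""
    cols = ",\n    ".join(f"{c} varchar" for c in columns)
    name = f"{catalog}.{schema}.{table_name_for(csv_file)}"
    return (
        f"CREATE TABLE {name} (\n    {cols}\n)\n"
        f"WITH (\n    format = 'CSV',\n    external_location = '{location}'\n)"
    )


# Example usage with placeholder schema, columns, and location:
# print(create_table_sql("tmp_cat", "my_schema", "carriers.csv",
#                        ["code", "description"], "s3://example-bucket/carriers/"))
```

The generated statement can be pasted into the Query editor or passed to a session's `sql` method. Replacing the hyphen matters because `plane-data` is not a valid unquoted SQL identifier.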

4. Additional Resources

5. Say Hello!

Feel free to reach out for further discussions.
