ETL-Project

U.S. Quality of life by State

Team Members

Harry Feldman
Jessica Pardo
Andrey Tokarev
Raven Washington

Introduction

This project carries out an ETL process: extracting, transforming, and loading data on the quality of life in different U.S. states. The purpose is to create a database for potential future country-wide analysis of housing, healthcare, and other measures of quality of life.

Data Extraction

In this project, CSV datasets were extracted from the following sources:

To complete the data extraction and transformation, we prepared a Jupyter notebook, ETL Notebook (ETL_Notebook.ipynb).

Data Engineering

After extracting the data, we made an Entity-Relationship Diagram (ERD) by using an open-source toolkit called Quick Database Diagrams. The model looks as follows:

Data Transformation

The transformation of the data included the following workflow:

Used Pandas functions in Jupyter Notebook to transform all of the CSV files.
Read the CSV files into DataFrames.
Applied Python transformation functions for data cleaning, joining, filtering, and removing null values.
Removed several columns.
Removed duplicate rows.

More detail on the transformation of the data can be found in the ETL Notebook.
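
The workflow above can be sketched roughly as follows. This is a minimal illustration and not the code from the notebook: the table contents, the column names (`state`, `median_home_value`, `notes`, `healthcare_rank`), and the `transform` helper are all hypothetical, and the values are made-up placeholders rather than real statistics.

```python
import io

import pandas as pd

# Hypothetical stand-ins for two of the project's CSV files.
# All values are placeholders, not real data.
HOUSING_CSV = """state,median_home_value,notes
Alabama,100,a
Alaska,200,b
Alaska,200,b
Arizona,,c
"""

HEALTHCARE_CSV = """state,healthcare_rank
Alabama,1
Alaska,2
Arizona,3
"""


def transform(housing_source, healthcare_source):
    """Clean the housing table and join it to the healthcare table."""
    # Read each CSV into a DataFrame.
    housing = pd.read_csv(housing_source)
    healthcare = pd.read_csv(healthcare_source)

    # Remove a column that is not needed downstream.
    housing = housing.drop(columns=["notes"])

    # Remove rows with null values, then remove duplicate rows.
    housing = housing.dropna().drop_duplicates()

    # Join the two tables on the shared state column.
    merged = housing.merge(healthcare, on="state", how="inner")
    return merged.reset_index(drop=True)


result = transform(io.StringIO(HOUSING_CSV), io.StringIO(HEALTHCARE_CSV))
print(result)
```

In this sketch the Arizona row is dropped because its housing value is missing, and the repeated Alaska row is collapsed into one, so two rows remain after the join.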

Data Loading

After extracting and transforming the data, we created a SQL database to hold it. First, we wrote a SQL table schema (schema.sql) with a table for each of the CSV files saved in the Resources directory.

Using Python and SQLAlchemy, we then populated the PostgreSQL tables with the transformed data.
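
A loading step of this kind might look like the sketch below. It is an assumption-laden illustration, not the project's code: it uses an in-memory SQLite database in place of PostgreSQL so that it runs without a server, and the `healthcare` table, its columns, and the placeholder records are hypothetical. With PostgreSQL, only the connection URL passed to `create_engine` would change (for example `postgresql://user:password@localhost:5432/dbname`, with your own credentials).

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, create_engine

# Assumption: in-memory SQLite stands in for the PostgreSQL database.
engine = create_engine("sqlite:///:memory:")

# Hypothetical table definition; the real schema lives in schema.sql.
metadata = MetaData()
healthcare = Table(
    "healthcare",
    metadata,
    Column("state", String, primary_key=True),
    Column("healthcare_rank", Integer),
)
metadata.create_all(engine)

# Placeholder records, e.g. what DataFrame.to_dict("records") would produce.
records = [
    {"state": "Alabama", "healthcare_rank": 1},
    {"state": "Alaska", "healthcare_rank": 2},
]

# Insert all rows in one transaction; it commits when the block exits cleanly.
with engine.begin() as conn:
    conn.execute(healthcare.insert(), records)

# Read the rows back to confirm the table was populated.
with engine.connect() as conn:
    rows = conn.execute(
        healthcare.select().order_by(healthcare.c.state)
    ).fetchall()

loaded = [tuple(row) for row in rows]
print(loaded)
```

Creating the tables before inserting mirrors the order described above: schema first, then population.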
