This project collects daily temperature data for Los Angeles from the Open-Meteo API for April through May 2024 and ingests it into an AWS data pipeline. The raw data is stored in AWS S3, transformed and cleaned with AWS Glue, and made available for querying in AWS Athena. A Grafana dashboard visualizes the results.
## Table of Contents

- Prerequisites
- Architecture
- Data Flow
- Setup
- AWS Services Used
- Steps
- Visualization
- Troubleshooting
- Acknowledgements
## Prerequisites

- AWS account: Sign up for AWS
- Grafana Lab account: Sign up for Grafana
## Data Flow

- Data Ingestion: A Lambda function ingests weather data from the Open-Meteo API and sends it to a Kinesis Data Firehose stream.
- Data Storage: Kinesis Data Firehose delivers the data to an S3 bucket.
- Data Crawling: AWS Glue crawls the data in S3 to create a table in the AWS Glue Data Catalog.
- Data Transformation: AWS Glue jobs transform the data, perform data quality checks, and save the cleaned data as Parquet files in S3.
- Data Querying: The transformed data is available for querying in AWS Athena.
- Data Visualization: Grafana is used to build a dashboard for visualizing the data.
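The ingestion step above can be sketched as follows. This is a minimal illustration rather than the project's actual Lambda code: the field names follow Open-Meteo's columnar `daily` response format, and the commented stream name is a placeholder.

```python
import json

# Shape of Open-Meteo's "daily" response: parallel arrays, one entry per day.
sample_response = {
    "daily": {
        "time": ["2024-04-01", "2024-04-02"],
        "temperature_2m_max": [21.4, 19.8],
        "temperature_2m_min": [12.1, 11.5],
    }
}

def to_firehose_records(response: dict) -> list[dict]:
    """Flatten the columnar 'daily' arrays into one JSON record per day,
    in the {"Data": bytes} shape expected by Firehose put_record_batch."""
    daily = response["daily"]
    records = []
    for i, day in enumerate(daily["time"]):
        row = {
            "date": day,
            "temp_max": daily["temperature_2m_max"][i],
            "temp_min": daily["temperature_2m_min"][i],
        }
        # Newline-delimited JSON so each S3 object is readable line by line.
        records.append({"Data": (json.dumps(row) + "\n").encode("utf-8")})
    return records

records = to_firehose_records(sample_response)
# Inside the Lambda, these would then be sent with something like:
#   boto3.client("firehose").put_record_batch(
#       DeliveryStreamName="your-stream-name", Records=records)
```
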
## AWS Services Used

- AWS Lambda: To run the function that ingests data from the Open-Meteo API.
- AWS Kinesis Data Firehose: To deliver the ingested data to S3.
- AWS S3: To store raw and transformed data.
- AWS Glue: To crawl, transform, and clean the data.
- AWS Athena: To query the transformed data.
- Grafana: To visualize the data.
## Setup

- AWS Lambda: Deploy the `LA_weather_lambda_put_record_batch.py` Lambda function in the `lambda/` directory using the AWS Lambda Console or CLI.
- AWS Kinesis Data Firehose: Create a Kinesis Data Firehose delivery stream to deliver data to your S3 bucket.
  - Example configuration:
    - Source: Direct PUT or other sources
    - Destination: S3 bucket
    - S3 bucket ARN: `arn:aws:s3:::your-bucket-name`
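A delivery stream matching this example configuration could be created from the CLI roughly as below; the stream name, IAM role ARN, and bucket name are placeholders you would replace with your own.

```shell
aws firehose create-delivery-stream \
  --delivery-stream-name la-weather-stream \
  --delivery-stream-type DirectPut \
  --s3-destination-configuration \
      RoleARN=arn:aws:iam::123456789012:role/firehose-s3-role,BucketARN=arn:aws:s3:::your-bucket-name
```
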
- AWS Glue:
  - Create a Glue Crawler to crawl the data in your S3 bucket and create a Glue Data Catalog table.
  - Create and run Glue jobs using the scripts in the `glue/` directory to transform data and perform data quality checks.
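As a rough sketch of the kind of cleaning the Glue jobs perform (the actual scripts in `glue/` use the Glue/Spark APIs; the field names and thresholds here are assumptions for illustration):

```python
def clean_rows(rows: list[dict]) -> list[dict]:
    """Basic data-quality pass: drop rows with missing temperatures,
    cast temperatures to float, and reject implausible readings."""
    cleaned = []
    for row in rows:
        if row.get("temp_max") is None or row.get("temp_min") is None:
            continue  # drop incomplete rows
        t_max, t_min = float(row["temp_max"]), float(row["temp_min"])
        if t_min > t_max or not (-50.0 <= t_min <= 60.0):
            continue  # drop inconsistent or out-of-range readings
        cleaned.append({"date": row["date"], "temp_max": t_max, "temp_min": t_min})
    return cleaned

raw = [
    {"date": "2024-04-01", "temp_max": 21.4, "temp_min": 12.1},
    {"date": "2024-04-02", "temp_max": None, "temp_min": 11.5},   # incomplete
    {"date": "2024-04-03", "temp_max": "19.8", "temp_min": "10.2"},  # string-typed
]
print(clean_rows(raw))
```

In the real job, the cleaned rows are then written back to S3 as Parquet rather than printed.
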
- AWS Athena: Configure Athena to query the data stored in your S3 bucket.
- Grafana: Set up Grafana to visualize the data.
## Steps

- Trigger the Lambda function to start data ingestion.
- Verify that the data is being delivered to your S3 bucket via Kinesis Data Firehose.
- Run the Glue crawler to update the Glue Data Catalog.
- Execute Glue jobs to transform and clean the data.
- Query the transformed data in Athena to verify the data quality and structure.
- Use Grafana to visualize the data.
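The verification step can be mirrored by a small sanity check like the one below, run against rows returned from Athena. The column names and the April–May 2024 window come from the project description; the function itself is a hypothetical helper.

```python
from datetime import date

def verify_rows(rows: list[dict]) -> bool:
    """Check that every row falls inside the April-May 2024 collection
    window and has non-null temperatures with min <= max."""
    start, end = date(2024, 4, 1), date(2024, 5, 31)
    for row in rows:
        d = date.fromisoformat(row["date"])
        if not (start <= d <= end):
            return False
        if row["temp_min"] is None or row["temp_max"] is None:
            return False
        if row["temp_min"] > row["temp_max"]:
            return False
    return True

sample = [
    {"date": "2024-04-15", "temp_min": 12.0, "temp_max": 22.5},
    {"date": "2024-05-30", "temp_min": 14.2, "temp_max": 25.1},
]
print(verify_rows(sample))  # → True
```
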

## Troubleshooting

- **Lambda Function Errors:**
  - Check CloudWatch logs for detailed error messages.
  - Verify the IAM role has the necessary permissions.
- **Kinesis Data Firehose Issues:**
  - Ensure the Firehose stream is properly configured with the correct S3 bucket.
  - Check Firehose monitoring metrics for delivery failures.
- **Glue Job Failures:**
  - Review Glue job logs for errors.
  - Ensure the Glue job script paths and S3 bucket permissions are correct.
- **Athena Query Problems:**
  - Verify the Glue Data Catalog table is correctly configured.
  - Check for syntax errors in your SQL queries.
## Acknowledgements

- Special thanks to David Freitag for his Maven course, Build Your First Serverless Data Engineering Project.
- Data source: Open-Meteo weather API