This repository contains the analytics pipeline for the Kwanza Tukule case study project. The pipeline ingests, cleans, transforms, analyzes, and visualizes data from Google Sheets using Google Colab and Looker Studio. It is automated with GitHub Actions, which runs it every 5 hours.
The Kwanza Tukule Analytics Pipeline automates the process of:
- Data Ingestion: Pulling data from Google Sheets using the Google Sheets API.
- Data Cleaning, Transformation, and Analysis: Processing the data in Google Colab.
- Data Visualization: Visualizing the transformed data in Looker Studio.
The pipeline is scheduled to run every 5 hours using GitHub Actions.
Below is the architecture of the analytics pipeline:
- Data Source: Google Sheets.
- Ingestion: Data is pulled using the Google Sheets API.
- Cleaning, Transformation, and Analysis: Performed in Google Colab.
- Visualization: Data is visualized in Looker Studio.
- Automation: GitHub Actions triggers the pipeline every 5 hours.
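The stages listed above can be sketched as plain Python functions chained together. This is a minimal illustration, not the notebook's actual code: the column names `category` and `quantity`, and all function names, are hypothetical placeholders, and the ingestion step is passed in as a callable so the example runs without any Google credentials.

```python
from typing import Callable, Iterable

Row = dict[str, str]


def clean(rows: Iterable[Row]) -> list[Row]:
    """Strip surrounding whitespace and drop rows whose fields are all empty."""
    cleaned = []
    for row in rows:
        stripped = {key.strip(): value.strip() for key, value in row.items()}
        if any(stripped.values()):
            cleaned.append(stripped)
    return cleaned


def transform(rows: list[Row]) -> list[dict]:
    """Convert the (hypothetical) 'quantity' column from text to int."""
    return [
        {"category": row["category"], "quantity": int(row["quantity"])}
        for row in rows
    ]


def analyze(rows: list[dict]) -> dict[str, int]:
    """Sum quantity per category."""
    totals: dict[str, int] = {}
    for row in rows:
        totals[row["category"]] = totals.get(row["category"], 0) + row["quantity"]
    return totals


def run_pipeline(ingest: Callable[[], list[Row]]) -> dict[str, int]:
    """Chain the stages: ingest -> clean -> transform -> analyze."""
    return analyze(transform(clean(ingest())))


# Sample rows standing in for data pulled from Google Sheets.
sample = [
    {"category": " Rice ", "quantity": "3"},
    {"category": "", "quantity": ""},
    {"category": "Rice", "quantity": "2"},
    {"category": "Sugar", "quantity": "5"},
]
result = run_pipeline(lambda: sample)
```

Keeping each stage a separate function makes it possible to test cleaning and analysis on in-memory rows before wiring in the real Google Sheets source.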
The core logic of the analysis is implemented in the following Jupyter Notebook:
📒 Kwanza Tukule Case Study Notebook
This notebook contains the code for data ingestion, cleaning, transformation, analysis, and preparation for visualization.
https://lookerstudio.google.com/s/ou_fip2m4aY
The pipeline is automated using GitHub Actions. The workflow is defined in the following YAML file:
The workflow runs every 5 hours and executes the notebook.
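As a rough sketch of what "executes the notebook" can look like, the snippet below builds a `jupyter nbconvert --execute` command for the notebook file named in the setup steps. The workflow file itself is not reproduced here, so this is an assumption about one way to run it, not the workflow's actual contents. The helper `cron_step_hours` is also hypothetical: it shows which hours a cron hour field of `*/5` would fire at, in case the schedule is written that way.

```python
import subprocess

# Notebook filename taken from the setup steps in this README.
NOTEBOOK = "KWANZA_TUKULE_CASE_STUDY.ipynb"


def build_execute_command(notebook: str, output: str) -> list[str]:
    """Build an nbconvert command that runs every cell and saves the result."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute", notebook,
        "--output", output,
    ]


def run_notebook(notebook: str = NOTEBOOK, output: str = "executed.ipynb") -> None:
    """Execute the notebook; raises CalledProcessError if any cell fails."""
    subprocess.run(build_execute_command(notebook, output), check=True)


def cron_step_hours(step: int) -> list[int]:
    """Hours of the day at which a cron hour field '*/step' fires."""
    return list(range(0, 24, step))


hours = cron_step_hours(5)
```

If the schedule does use `*/5` in the hour field, runs land at hours 0, 5, 10, 15, and 20 (GitHub Actions schedules are evaluated in UTC), so the gap between the 20:00 run and the next 00:00 run is four hours rather than five.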
To set up this project locally, follow these steps:
- Clone the repository:
git clone https://github.com/24jmwangi/KwanzaTukule.git
- Navigate to the project directory:
cd KwanzaTukule
- Install dependencies (if any):
pip install -r requirements.txt
- Open the notebook in Google Colab or Jupyter:
jupyter notebook KWANZA_TUKULE_CASE_STUDY.ipynb
To use the pipeline:
- Ensure your Google Sheets API credentials are set up.
- Update the notebook with your Google Sheet ID and range.
- Run the notebook to ingest, clean, transform, and analyze the data.
- Visualize the data in Looker Studio.
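For the Sheet ID and range step above, the following is a minimal sketch of reading a range with the `google-api-python-client` library and turning it into records. It assumes that library is installed and that a credentials object has already been created; `fetch_values` and `rows_to_records` are illustrative names, not functions from the notebook. The Sheets API omits trailing empty cells from each row, so the helper pads short rows to the header length.

```python
def fetch_values(spreadsheet_id: str, range_name: str, credentials) -> list[list[str]]:
    """Read a range from a Google Sheet and return its rows as lists of strings."""
    # Imported lazily so rows_to_records works without the client library installed.
    from googleapiclient.discovery import build

    service = build("sheets", "v4", credentials=credentials)
    response = (
        service.spreadsheets()
        .values()
        .get(spreadsheetId=spreadsheet_id, range=range_name)
        .execute()
    )
    # The "values" key is absent when the range is empty.
    return response.get("values", [])


def rows_to_records(values: list[list[str]]) -> list[dict[str, str]]:
    """Treat the first row as a header and map each later row to a dict."""
    if not values:
        return []
    header, *data = values
    records = []
    for row in data:
        # Pad rows that are shorter than the header with empty strings.
        padded = row + [""] * (len(header) - len(row))
        records.append(dict(zip(header, padded)))
    return records


# Example with in-memory values shaped like a Sheets API response.
records = rows_to_records([
    ["product", "qty", "note"],
    ["Rice", "3"],
    ["Sugar", "5", "promo"],
])
```

A call such as `fetch_values("your-sheet-id", "Sheet1!A1:C", creds)` would supply the `values` argument, where the ID, range, and `creds` are placeholders for your own settings.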
For automation, the GitHub Actions workflow will handle the execution every 5 hours.
Contributions are welcome! If you'd like to contribute, please follow these steps:
- Fork the repository.
- Create a new branch:
git checkout -b feature/your-feature-name
- Commit your changes:
git commit -m "Add your commit message here"
- Push to the branch:
git push origin feature/your-feature-name
- Open a pull request.
This project is licensed under the MIT License. See the LICENSE file for details.