Seattle Building Energy Prediction

Project Overview

This project analyzes and preprocesses Seattle building energy benchmarking data from 2015 and 2016. The dataset includes property details, energy usage, and emissions. The notebooks in the repository cover data cleaning, exploratory data analysis, and a machine learning pipeline for predicting energy usage.

Notebooks

  1. Data Cleaning and Exploration: Loads and explores the dataset, handles missing values, and produces basic visualizations.

  2. Feature Engineering: Engineers features to prepare the data for the machine learning models.

  3. Machine Learning Pipeline:

    • Preprocessing: Standardization, scaling, and handling of categorical variables.
    • Models: Several machine learning models, including Decision Trees, Random Forest, and Gradient Boosting.
    • Hyperparameter Tuning: Grid search to optimize model parameters.
    • Evaluation Metrics: Metrics such as Mean Absolute Error (MAE) and R-squared for model assessment.
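
The pipeline stages listed above can be sketched roughly as follows. This is a minimal illustration with scikit-learn, not the notebook's actual code: the data is synthetic, the column names (`floor_area`, `year_built`, `building_type`, `energy_use`) are hypothetical, and the choice of Random Forest, the parameter grid, and the use of one-hot encoding for categorical variables are assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in data; column names are hypothetical, not the dataset's.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "floor_area": rng.uniform(1_000, 50_000, n),
    "year_built": rng.integers(1900, 2016, n),
    "building_type": rng.choice(["Office", "Hotel", "Retail"], n),
})
df["energy_use"] = 50 * df["floor_area"] + rng.normal(0, 10_000, n)

X = df.drop(columns="energy_use")
y = df["energy_use"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Preprocessing: standardize numeric columns, one-hot encode categorical ones.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["floor_area", "year_built"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["building_type"]),
])

# Model: one of the tree-based regressors named above.
pipeline = Pipeline([
    ("preprocess", preprocess),
    ("model", RandomForestRegressor(random_state=0)),
])

# Hyperparameter tuning: a small, assumed grid searched with cross-validation.
param_grid = {
    "model__n_estimators": [50, 100],
    "model__max_depth": [None, 10],
}
search = GridSearchCV(
    pipeline, param_grid, cv=3, scoring="neg_mean_absolute_error"
)
search.fit(X_train, y_train)

# Evaluation: MAE and R-squared on the held-out split.
pred = search.predict(X_test)
mae = mean_absolute_error(y_test, pred)
r2 = r2_score(y_test, pred)
print(f"best params: {search.best_params_}")
print(f"MAE: {mae:.1f}  R^2: {r2:.3f}")
```

Keeping the preprocessing inside the `Pipeline` means the scaler and encoder are refit on each cross-validation fold, so no information from the validation fold leaks into the grid search.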

Data

The dataset is derived from the Seattle Building Energy Benchmarking project, with separate files for 2015 and 2016. The cleaned dataset (data_p3.csv) is available in the data/cleaned directory.
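
As a starting point, the cleaned file could be loaded and inspected along these lines. This is a sketch that assumes the file is a standard comma-separated CSV readable by pandas and that the script runs from the repository root; `load_cleaned` and `summarize` are hypothetical helper names, not functions from the notebooks.

```python
from pathlib import Path

import pandas as pd

# Path as stated in this README, relative to the repository root (assumed).
CLEANED_PATH = Path("data/cleaned/data_p3.csv")


def load_cleaned(path=CLEANED_PATH):
    """Read the cleaned dataset into a DataFrame."""
    return pd.read_csv(path)


def summarize(df):
    """Return row count, column count, and the share of missing values per column."""
    return {
        "rows": len(df),
        "columns": df.shape[1],
        "missing_share": df.isna().mean().round(3).to_dict(),
    }


# Only attempt the load when the file is actually present.
if CLEANED_PATH.exists():
    print(summarize(load_cleaned()))
```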

Acknowledgments

Data sourced from the Seattle Building Energy Benchmarking project.

Author

This project was created by Fouad Maherzi.