Predicting the temperature in Seoul using relevant data from the LDAPS model operated by the Korea Meteorological Administration

Sambonic/seoul-temprature-prediction

Seoul Temperature Prediction Documentation


This project uses machine learning to predict next-day minimum air temperatures in Seoul.

Last Updated: January 13th, 2025

Table of Contents

  1. Installation
  2. Usage
  3. Features

Installation

Make sure Python is installed, then follow these steps to set up the environment and run the application:

  1. Clone the Repository:

    git clone https://github.com/Sambonic/seoul-temprature-prediction
    cd seoul-temprature-prediction

  2. Create a Python Virtual Environment:

    python -m venv env

  3. Activate the Virtual Environment:

  • On Windows:

    env\Scripts\activate

  • On macOS and Linux:

    source env/bin/activate

  4. Ensure Pip is Up-to-Date:

    python -m pip install --upgrade pip

  5. Install Dependencies:

    pip install -r requirements.txt

  6. Run the project as described in the Usage section.

Usage

  1. Run the notebook: Execute the air_temprature_prediction_ml.ipynb Jupyter Notebook. This performs data loading, preprocessing, exploratory data analysis (EDA), model training (using Random Forest, Linear Regression, and LightGBM), hyperparameter tuning, and model evaluation.

  2. Examine results: The notebook outputs visualizations (histograms, scatter plots, heatmaps, learning curves) during EDA and model evaluation. Numerical results (RMSE, MAE, R-squared) for each model are displayed in the notebook's output. A series of bar charts compares the performance of the different models across various metrics and configurations (with/without feature selection and hyperparameter tuning). The notebook also provides a legend to interpret the model names.
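
For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how RMSE, MAE, and R-squared are defined. It is not taken from the notebook; the function name `regression_metrics` and the sample temperatures are assumptions made up for illustration, and only the Python standard library is used.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, and R-squared for two equal-length sequences.

    Hypothetical helper for illustration; not part of the repository.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # MAE: average absolute error, in the same unit as the target.
    mae = sum(abs(e) for e in errors) / n

    # RMSE: square root of the mean squared error; penalizes large errors more.
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    # R-squared: 1 minus the ratio of residual to total sum of squares.
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"rmse": rmse, "mae": mae, "r2": r2}


# Made-up temperatures in degrees Celsius, for illustration only.
observed = [20.0, 22.0, 24.0, 26.0]
predicted = [21.0, 22.0, 23.0, 27.0]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

Lower RMSE and MAE indicate smaller prediction errors, while an R-squared closer to 1 indicates that the model explains more of the variance in the observed temperatures.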

Features

  • Air Temperature Prediction: Predicts next-day minimum air temperature using machine learning.
  • Data Exploration and Visualization: Loads, cleans, and visualizes the dataset using various plots (histograms, scatter plots, box plots, heatmaps) to understand data distribution, trends, and relationships between features.
  • Data Preprocessing: Handles missing values (mean/median imputation) and outliers (IQR and Z-score methods).
  • Feature Engineering: Creates new features from existing ones (e.g., year, month, season, daily temperature range).
  • Feature Selection: Employs feature selection techniques (Sequential Feature Selection (SFS), Sequential Backward Selection (SBS), SelectKBest) to identify the most relevant features for prediction.
  • Model Training and Evaluation: Trains and evaluates multiple regression models (Random Forest, Linear Regression, LightGBM) using metrics like RMSE, MAE, and R².
  • Hyperparameter Tuning: Uses GridSearchCV to optimize model hyperparameters.
  • Learning Curve Analysis: Plots learning curves to assess model performance and identify potential overfitting.
  • Model Comparison: Compares the performance of different models and feature selection strategies.
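
The preprocessing and feature engineering bullets above can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the notebook's code: all function names, the sample values, the 1.5 IQR multiplier, the Z-score threshold, and the meteorological season mapping are assumptions chosen for the example. It requires Python 3.8 or later for `statistics.quantiles`.

```python
import statistics
from datetime import date


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]


def zscore_outliers(values, threshold=3.0):
    """Flag values whose absolute Z-score exceeds the threshold."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [False] * len(values)
    return [abs((v - mean) / std) > threshold for v in values]


# Assumed mapping: meteorological seasons for the Northern Hemisphere.
SEASONS = {12: "winter", 1: "winter", 2: "winter",
           3: "spring", 4: "spring", 5: "spring",
           6: "summer", 7: "summer", 8: "summer",
           9: "autumn", 10: "autumn", 11: "autumn"}


def date_features(day, t_max, t_min):
    """Derive year, month, season, and daily temperature range."""
    return {
        "year": day.year,
        "month": day.month,
        "season": SEASONS[day.month],
        "temp_range": t_max - t_min,
    }


# Made-up readings in degrees Celsius; None marks a missing value.
raw = [21.0, 22.5, None, 23.0, 22.0, 45.0, 21.5]
imputed = impute_median(raw)
iqr_flags = iqr_outliers(imputed)
# A threshold of 2.0 is used here because a seven-value sample
# cannot produce an absolute Z-score above about 2.45.
z_flags = zscore_outliers(imputed, threshold=2.0)
features = date_features(date(2017, 8, 30), t_max=30.5, t_min=22.0)
print(imputed, iqr_flags, z_flags, features)
```

Mean imputation is a one-line change (swap `statistics.median` for `statistics.fmean`). In practice a data-frame library would normally handle these steps column by column; the standard-library version is used here only to keep the definitions visible.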
