AI-powered spatiotemporal imputation and prediction of chlorophyll-a concentration in coastal oceans

This repository contains the code for STIMP, an advanced AI framework for imputing and predicting chlorophyll-a (Chl_a) concentration across a broad spatiotemporal scale in coastal oceans. STIMP's results can be used to diagnose and analyze the ecosystem health of coastal oceans from remote sensing measurements.

Reproducibility

We provide source code for reproducing the experiments of the paper "AI-powered spatiotemporal imputation and prediction of chlorophyll-a concentration in coastal oceans".

Step1. Install

git clone https://github.com/YangLabHKUST/STIMP.git
cd /path/to/STIMP
conda create -n stimp python=3.9
conda activate stimp
pip install -r requirements.txt

Step2. Prepare data

All data used in this work are publicly available online. The chlorophyll-a observations are 8-day averaged Level 3 mapped products from the Moderate Resolution Imaging Spectroradiometer (MODIS) Aqua project with a spatial resolution of 4 km, available at https://search.earthdata.nasa.gov/search?q=10.5067/AQUA/MODIS/L3M/CHL/2022. You can select the relevant files by using .8D..4km.nc as the filter.
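If the files have already been listed or downloaded, the same selection can be done locally. The sketch below is only an illustration: the glob pattern `*.8D.*.4km.nc` is an assumption about what the filter is meant to match, and the file names are hypothetical examples, not a listing from the archive.

```python
from fnmatch import fnmatchcase

# Assumed pattern: 8-day (".8D.") products at 4 km resolution in NetCDF format.
PATTERN = "*.8D.*.4km.nc"

def select_8day_4km(filenames):
    """Keep only the file names that match the assumed 8-day, 4 km pattern."""
    return [name for name in filenames if fnmatchcase(name, PATTERN)]

# Hypothetical file names, for illustration only.
candidates = [
    "AQUA_MODIS.20200101_20200108.L3m.8D.CHL.chlor_a.4km.nc",
    "AQUA_MODIS.20200101_20200108.L3m.8D.CHL.chlor_a.9km.nc",
    "AQUA_MODIS.20200101.L3m.DAY.CHL.chlor_a.4km.nc",
]
selected = select_8day_4km(candidates)
print(selected)
```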

We have also uploaded the datasets to Zenodo at https://doi.org/10.5281/zenodo.14724760. After downloading data.zip, move it into the repository and unzip it:

mv data.zip /path/to/STIMP/
unzip data.zip

Prepare the dataset from the raw data

We generate four datasets (Pearl River Estuary, the northern Gulf of Mexico, Chesapeake Bay and Yangtze River Estuary) following the tutorials in this repository. The generated datasets are also included in data.zip.

Step3. Train the imputation function $p_\theta$ of STIMP

Taking the Pearl River Estuary as an example, we construct 9 datasets with different missing rates. We train STIMP on each dataset:

for i in {0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9}
do
  python imputation/train_stimp.py --missing_ratio $i --area PRE
done
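How the nine datasets are constructed is not shown in this README. As a rough illustration of what a missing ratio means, the sketch below hides a random fraction of the observed entries, assuming the entries are hidden uniformly at random. The function name `hide_observations` and the flat-list representation are hypothetical and are not taken from the repository.

```python
import random

def hide_observations(observed, missing_ratio, seed=0):
    """Return a copy of `observed` with a random fraction of True entries set to False.

    observed: flat list of booleans, True where a value was actually measured.
    missing_ratio: fraction of the measured entries to hide, between 0 and 1.
    """
    if not 0.0 <= missing_ratio <= 1.0:
        raise ValueError("missing_ratio must be between 0 and 1")
    rng = random.Random(seed)
    observed_idx = [i for i, flag in enumerate(observed) if flag]
    n_hide = round(missing_ratio * len(observed_idx))
    hidden = set(rng.sample(observed_idx, n_hide))
    # Entries that were never observed stay False; hidden entries become False.
    return [flag and i not in hidden for i, flag in enumerate(observed)]

# Toy grid: 100 observed entries followed by 20 that were never observed.
toy_observed = [True] * 100 + [False] * 20
for ratio in (0.1, 0.5, 0.9):
    kept = hide_observations(toy_observed, ratio)
    print(ratio, sum(toy_observed) - sum(kept))
```

The hidden entries can then serve as held-out targets when evaluating an imputation model, since their true values are known.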

The baselines can be trained in the same way:

for i in {0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9}
do
  python imputation/train_cf.py --missing_ratio $i --area PRE
  python imputation/train_csdi.py --missing_ratio $i --area PRE
  python imputation/train_dineof.py --missing_ratio $i --area PRE
  python imputation/train_imputeformer.py --missing_ratio $i --area PRE
  python imputation/train_inpainter.py --missing_ratio $i --area PRE
  python imputation/train_lin_itp.py --missing_ratio $i --area PRE
  python imputation/train_mae.py --missing_ratio $i --area PRE
  python imputation/train_mean.py --missing_ratio $i --area PRE
  python imputation/train_trmf.py --missing_ratio $i --area PRE
done

Some visualization results are provided in the tutorial "Imputation in Pearl River Estuary".

For the other coastal ocean areas, STIMP and the baselines are trained by replacing PRE in the --area argument with MEXICO, Chesapeake or Yangtze.
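To run every area and missing ratio in one pass, the commands can be generated programmatically. The sketch below only builds the command strings from the script path and flags shown above; the helper name `build_commands` is hypothetical, and running the commands (for example with `subprocess.run`) is left to the reader.

```python
import shlex

AREAS = ["PRE", "MEXICO", "Chesapeake", "Yangtze"]
MISSING_RATIOS = [round(0.1 * k, 1) for k in range(1, 10)]  # 0.1, 0.2, ..., 0.9

def build_commands(script="imputation/train_stimp.py"):
    """Return one shell command string per (area, missing ratio) pair."""
    commands = []
    for area in AREAS:
        for ratio in MISSING_RATIOS:
            args = ["python", script, "--missing_ratio", str(ratio), "--area", area]
            commands.append(shlex.join(args))
    return commands

commands = build_commands()
print(len(commands))
print(commands[0])
```

Passing a different `script` value, such as one of the baseline training scripts, produces the corresponding baseline commands.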

Step4. Impute the observation

The Chl_a observations in the Pearl River Estuary are imputed with the trained imputation function:

python dataset/generate_data_with_stimp.py --area PRE
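Conceptually, imputation fills the gaps in the observations with model output. The sketch below shows one common convention, which keeps every measured value and uses the imputed value only where the measurement is missing. Whether `generate_data_with_stimp.py` follows this convention is an assumption; the function `fill_missing` and the use of `None` as the missing marker are hypothetical.

```python
def fill_missing(observed_values, imputed_values):
    """Keep measured values and use imputed values only where the measurement is missing.

    observed_values: list of floats, with None marking a missing observation.
    imputed_values: list of floats of the same length, produced by an imputation model.
    """
    if len(observed_values) != len(imputed_values):
        raise ValueError("inputs must have the same length")
    return [imp if obs is None else obs
            for obs, imp in zip(observed_values, imputed_values)]

# Toy values, for illustration only.
observed = [0.8, None, 1.5, None]
imputed = [0.7, 1.1, 1.4, 2.0]
filled = fill_missing(observed, imputed)
print(filled)
```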

Step5. Train the prediction function $p_\Phi$ of STIMP

We sample 10 different imputed Chl_a distributions from $p_\theta(\mathbf{X}|\mathbf{X}^{ob})$. Because each sample gives a different input $\mathbf{X}$, we learn 10 different prediction functions $p_\Phi(\tilde{\mathbf{Y}}|\mathbf{X})$:

for i in {0..9}  
do  
  python prediction/train.py --index $i --area PRE
done
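This README does not say how the 10 predictors are combined afterwards. A simple option, shown here as an assumption rather than as the method used in the paper, is to take the pointwise mean across models as the final prediction and the standard deviation as a measure of spread. The function `combine_predictions` is hypothetical.

```python
from statistics import mean, pstdev

def combine_predictions(predictions):
    """Combine per-model predictions into a pointwise mean and spread.

    predictions: list of equally long lists of floats, one list per trained model.
    Returns (means, spreads), where spreads holds the population standard
    deviation across models at each position.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("all predictions must have the same length")
    columns = list(zip(*predictions))  # regroup values by position
    return [mean(c) for c in columns], [pstdev(c) for c in columns]

# Toy predictions from two models at two positions, for illustration only.
toy = [[1.0, 2.0], [3.0, 2.0]]
means, spreads = combine_predictions(toy)
print(means, spreads)
```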

Baselines are learned based on the original observations $\mathbf{X}^{ob}$:

python prediction/train_without_spatial_imputation.py --method "CrossFormer" --area PRE
python prediction/train_without_spatial_imputation.py --method "iTransformer" --area PRE
python prediction/train_without_spatial_imputation.py --method "TSMixer" --area PRE
python prediction/train_without_imputation.py --method "MTGNN" --area PRE
python prediction/train_as_image_without_imputation.py --method "PredRNN" --area PRE
python prediction/train_xgboost_without_imputation.py --area PRE

We also train the baselines on the imputed Chl_a distributions (see the supplementary material):

for i in {0..9}  
do  
  python prediction/train_without_spatial.py --method "CrossFormer" --area PRE --index $i
  python prediction/train_without_spatial.py --method "iTransformer" --area PRE --index $i
  python prediction/train_without_spatial.py --method "TSMixer" --area PRE --index $i
  python prediction/train.py --method "MTGNN" --area PRE --index $i
  python prediction/train_as_image.py --method "PredRNN" --area PRE --index $i
  python prediction/train_xgboost.py --area PRE --index $i
done

We provide the source code for the overall prediction performance in each coastal ocean area, and some case studies are included in the tutorials.
