To incorporate physical information into pure deep learning (DL) models, we developed a hybrid model based on an attention mechanism and condition hybrid schemes. We further proposed an ensemble model that combines the outputs of various hybrid models using simple average, condition, and attention methods. We applied the two proposed models to improve ConvLSTM with GFS forecasts and compared them with three widely used hybrid methods using in-situ and gridded data. The results show that the proposed ensemble hybrid model achieves the best overall performance among all hybrid models for 1- to 16-day forecasts and is applicable across different soil conditions. Notably, the ensemble model improves R by at least 65% relative to ConvLSTM for 16-day forecasts and outperforms it at 79.5% of in-situ stations. Moreover, our proposed attention-based hybrid model, which detects 60.6% and 56.8% of drought events for 1-week and 2-week forecasts, respectively, achieves the best drought-event predictability over arid, temperate, cold, and polar regions. Our findings emphasize that the proposed hybrid models can address the weakness of pure DL models in long-term and extreme-event forecasting and can break the performance ceiling imposed by the training datasets.
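To make the ensemble idea concrete, here is a minimal sketch of two of the combination methods named above: a simple average and an attention-style weighted average. It is an illustration under assumptions, not the HybridHydro implementation: the function names, the toy forecasts, and the use of a softmax over fixed scores (instead of learned attention weights) are all hypothetical. The condition method is not sketched because its rule is not described here.

```python
import math


def simple_average(forecasts):
    """Element-wise mean over member forecasts (one list per member)."""
    n = len(forecasts)
    return [sum(vals) / n for vals in zip(*forecasts)]


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_average(forecasts, scores):
    """Weighted mean of member forecasts.

    ASSUMPTION: `scores` are hypothetical per-member relevance scores;
    a trained attention layer would produce them from the inputs.
    """
    weights = softmax(scores)
    return [sum(w * v for w, v in zip(weights, vals)) for vals in zip(*forecasts)]


# Toy soil-moisture forecasts from three hypothetical hybrid members.
members = [[0.20, 0.30], [0.40, 0.50], [0.30, 0.10]]
avg = simple_average(members)
att = attention_average(members, [0.0, 0.0, 0.0])  # equal scores
```

With equal scores the attention-style average reduces to the simple average; a much larger score for one member makes the result follow that member.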
1. Clone our repo
git clone git@github.com:leelew/HybridHydro.git
2. Download dataset
The dataset used in the paper can be downloaded from Zenodo. To test the repo, you can download individual sub-tasks rather than the whole dataset. Please create an input directory and move the downloaded data into this folder.
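The staging step can also be scripted. The sketch below creates the input directory and moves every downloaded item into it. The helper name `stage_downloads`, the `downloads` folder, and the file name `subtask_0.nc` are placeholders chosen for illustration; the actual archive names on Zenodo may differ.

```python
import shutil
import tempfile
from pathlib import Path


def stage_downloads(download_dir, input_dir):
    """Create input_dir if needed and move everything from download_dir into it."""
    src = Path(download_dir)
    dst = Path(input_dir)
    dst.mkdir(parents=True, exist_ok=True)
    moved = []
    for item in sorted(src.iterdir()):
        target = dst / item.name
        shutil.move(str(item), str(target))
        moved.append(target.name)
    return moved


# Demo in a throwaway directory so nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    downloads = Path(tmp) / "downloads"        # hypothetical download location
    downloads.mkdir()
    (downloads / "subtask_0.nc").write_text("placeholder")  # hypothetical file name
    moved = stage_downloads(downloads, Path(tmp) / "input")
```

In practice you would call `stage_downloads` with your real download folder and the input directory inside the cloned repo.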
3. Wandb init
Training is monitored with Weights & Biases (Wandb). Please register and run wandb init on your own server (choose "Create New" when you first set up the wandb project).
4. (Optional) Search for best parameters
We use Wandb Sweeps to automate hyperparameter search with Bayesian methods (all logs are shown in logs). If you want to explore the best parameters of HybridHydro, you can train models with our sweep configuration (sweep.yml).
(1) Initialize sweep and get the Sweep ID:
wandb sweep sweep.yml
(2) Start sweep agent
wandb agent sweep_id
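For readers unfamiliar with sweeps, the sketch below shows the general shape of a Bayesian sweep configuration as a Python dictionary, plus a small sanity check. The top-level keys (`method`, `metric`, `parameters`) follow the Wandb sweep configuration format, but the metric name `val_loss` and the parameters `learning_rate` and `batch_size` are assumptions for illustration; the repo's actual sweep.yml may tune different values.

```python
# ASSUMPTION: metric and parameter names are hypothetical, not taken from sweep.yml.
sweep_config = {
    "method": "bayes",  # Bayesian search
    "metric": {"name": "val_loss", "goal": "minimize"},
    "parameters": {
        "learning_rate": {"min": 1e-4, "max": 1e-2},  # continuous range
        "batch_size": {"values": [16, 32, 64]},       # discrete choices
    },
}


def check_bayes_sweep(config):
    """Check the fields a Bayesian sweep needs before launching it."""
    if config.get("method") != "bayes":
        raise ValueError("expected method 'bayes'")
    metric = config.get("metric", {})
    if "name" not in metric or metric.get("goal") not in ("minimize", "maximize"):
        raise ValueError("a Bayesian sweep needs a metric name and a goal")
    if not config.get("parameters"):
        raise ValueError("at least one parameter to search is required")
    return True


is_valid = check_bayes_sweep(sweep_config)
```

A dictionary like this mirrors what a YAML sweep file expresses; the Bayesian method needs a metric to optimize, which is why the check insists on one.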
5. (Optional) Change configuration
Change configs.py if you want to train the HybridHydro model with your own configuration. Otherwise, it will train with the default parameters.
6. Run
Set your work paths (e.g., forecast, saved models), job numbers, and model name in run.sh, then run:
bash run.sh
7. Postprocess
The model forecasts are saved in the forecast path defined in run.sh. If you have trained the models for all 24 sub-tasks, change the model name in postprocess.sh and run:
bash postprocess.sh
HybridHydro has five versions, kept in different branches.
V1: train local model for each patch (112 x 112)
V2: train global model for all patches (112 x 112)
V3: train global model for all patches (28 x 28)
V4: train local model for each patch via transfer learning from V2 (112 x 112)
V5: train local model for each patch via transfer learning from V3 (28 x 28)
- Adding ancillary data (DEM, land cover) slightly decreases performance.
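The patch sizes in the version list can be illustrated with a small sketch that splits a 2-D grid into non-overlapping square patches. This is only an illustration under assumptions: the function name `split_into_patches` and the 224 x 224 toy grid are hypothetical, and the repo's own tiling (grid extent, overlap, ordering) may differ.

```python
def split_into_patches(grid, size):
    """Split a 2-D grid (list of rows) into non-overlapping size x size patches.

    Patches are returned in row-major order.
    """
    rows, cols = len(grid), len(grid[0])
    if rows % size or cols % size:
        raise ValueError("grid dimensions must be multiples of the patch size")
    patches = []
    for r in range(0, rows, size):
        for c in range(0, cols, size):
            patches.append([row[c:c + size] for row in grid[r:r + size]])
    return patches


# ASSUMPTION: a 224 x 224 toy grid; each cell stores its flat index.
grid = [[r * 224 + c for c in range(224)] for r in range(224)]
patches_112 = split_into_patches(grid, 112)  # patch size used by V1, V2, V4
patches_28 = split_into_patches(grid, 28)    # patch size used by V3, V5
```

A local model would then be trained on one patch at a time, while a global model would see samples drawn from all patches.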
NOTE: The paper has not been accepted yet.
If you use HybridHydro in your research or work, please cite this GitHub repository:
Copyright (c) 2022, Lu Li