StormCast training code improvements (#738)
* adding stormcast raw files
* major cleanup, refactor and consolidation
* More cleanup and init readme
* port command line args to standard argparse
* remove unused network and loss wrappers
* add torchrun instructions
* drop dnnlib utils
* use Modulus DistributedManager, streamline cmd args
* Use standard torch checkpoints instead of pickles
* Standardize model configs and channel selection across training and inference
* checkpoint format standardization for train/inference
* finalize additional deps
* format and linting
* drop docker and update changelog
* Address feedback
* add variables to readme, rename network types
* swap stormcast to modulus nn and loss defs
* Swap to modulus checkpoint save and load utils
* Swap to modulus networks/losses, use modulus checkpointing and logging
* add power spectrum to modulus metrics, remove unused utils
* Readme update and unit tests
* drop unused files
* drop unused diffusions files
* update changelog

Signed-off-by: Peter Harrington <pharrington@nvidia.com>
Co-authored-by: nvssh nssswitch user account <mnabian@ngcdgx104.nsv.sjc4.nvmetal.net>
1 parent c933538, commit c6dab48
Showing 37 changed files with 1,677 additions and 2,941 deletions.
70 changes: 70 additions & 0 deletions
examples/generative/stormcast/config/dataset/hrrr_era5.yaml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,70 @@ | ||
# SPDX-FileCopyrightText: Copyright (c) 2023 - 2024 NVIDIA CORPORATION & AFFILIATES. | ||
# SPDX-FileCopyrightText: All rights reserved. | ||
# SPDX-License-Identifier: Apache-2.0 | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
|
||
# Main dataset | ||
location: 'data' # Path to the dataset | ||
conus_dataset_name: 'hrrr_v3' # Version name for the dataset | ||
hrrr_stats: 'stats_v3_2019_2021' # Summary stats name for the dataset | ||
|
||
# Domain | ||
hrrr_img_size: [512, 640] # Image dimensions of the HRRR region of interest | ||
boundary_padding_pixels: 0 # set this to 0 for no padding of ERA5 beyond HRRR domain, | ||
# 32 for 32 pixels of padding in each direction, etc. | ||
|
||
# Temporal selection | ||
dt: 1 # Timestep between samples (in multiples of the base HRRR 1hr timestep) | ||
train_years: [2018, 2019, 2020, 2021] # Years to use for training | ||
valid_years: [2022] # Years to use for validation | ||
|
||
# Variable selection | ||
invariants: ["lsm", "orog"] # Invariant quantities to include | ||
input_channels: 'all' #'all' or list of channels to condition on | ||
diffusion_channels: "all" #'all' or list of channels for the diffusion model to generate | ||
exclude_channels: # Dataset channels to exclude from inputs/predictions | ||
- u35 | ||
- u40 | ||
- v35 | ||
- v40 | ||
- t35 | ||
- t40 | ||
- q35 | ||
- q40 | ||
- w1 | ||
- w2 | ||
- w3 | ||
- w4 | ||
- w5 | ||
- w6 | ||
- w7 | ||
- w8 | ||
- w9 | ||
- w10 | ||
- w11 | ||
- w13 | ||
- w15 | ||
- w20 | ||
- w25 | ||
- w30 | ||
- w35 | ||
- w40 | ||
- p25 | ||
- p30 | ||
- p35 | ||
- p40 | ||
- z35 | ||
- z40 | ||
- tcwv | ||
- vil |
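The variable-selection and domain settings in this file can be read as two small rules: a channel setting is either `'all'` or an explicit list, with `exclude_channels` removed from the result, and `boundary_padding_pixels` widens the conditioning grid around the HRRR region. The sketch below illustrates both under stated assumptions; it is not the repository's data-loading code. The function names `select_channels` and `padded_size`, the channel names `u10m`, `v10m` and `t2m`, and the reading of "in each direction" as padding on all four sides are hypothetical.

```python
from __future__ import annotations

from typing import Sequence


def select_channels(
    available: Sequence[str],
    requested: str | Sequence[str],
    exclude: Sequence[str] = (),
) -> list[str]:
    """Resolve a channel setting ('all' or an explicit list) against the dataset."""
    excluded = set(exclude)
    if requested == "all":
        candidates = list(available)
    else:
        missing = [c for c in requested if c not in available]
        if missing:
            raise ValueError(f"Unknown channels requested: {missing}")
        candidates = list(requested)
    # Keep the original ordering while dropping excluded names.
    return [c for c in candidates if c not in excluded]


def padded_size(img_size: Sequence[int], padding: int) -> tuple[int, int]:
    """Grid size after adding `padding` pixels on every side (an assumption)."""
    height, width = img_size
    return height + 2 * padding, width + 2 * padding


# Hypothetical dataset channels; only u35, w1 and tcwv appear in the config.
available = ["u10m", "v10m", "t2m", "u35", "w1", "tcwv"]
kept = select_channels(available, "all", exclude=["u35", "w1", "tcwv"])
print(kept)                          # ['u10m', 'v10m', 't2m']
print(padded_size([512, 640], 32))   # (576, 704)
```

With `boundary_padding_pixels: 0`, as configured here, `padded_size` returns the HRRR image size unchanged.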
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,45 @@ | ||
# SPDX-FileCopyrightText: Copyright (c) 2023 - 2024 NVIDIA CORPORATION & AFFILIATES. | ||
# SPDX-FileCopyrightText: All rights reserved. | ||
# SPDX-License-Identifier: Apache-2.0 | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
|
||
# Defaults | ||
defaults: | ||
|
||
# Dataset | ||
- dataset/hrrr_era5 | ||
|
||
# Model | ||
- model/stormcast | ||
|
||
# Training | ||
- training/default | ||
|
||
# Sampler | ||
- sampler/edm_deterministic | ||
|
||
# Hydra | ||
- hydra/default | ||
|
||
- _self_ | ||
|
||
# Diffusion model specific changes | ||
model: | ||
use_regression_net: True | ||
regression_weights: "stormcast_checkpoints/regression/StormCastUNet.0.0.mdlus" | ||
previous_step_conditioning: True | ||
spatial_pos_embed: True | ||
|
||
training: | ||
loss: 'edm' |
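Because `_self_` is the last entry in the defaults list, the `model:` and `training:` keys set in this file are merged after the composed defaults and therefore take precedence over them. The sketch below models that ordering with a plain recursive dictionary merge. It is a simplified illustration, not Hydra's implementation, and the contents of `from_defaults` are hypothetical because the referenced default files are not shown here.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a copy of `base` with `override` merged in recursively; override wins."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical values standing in for model/stormcast and training/default.
from_defaults = {
    "model": {"use_regression_net": False, "spatial_pos_embed": False},
    "training": {"loss": "regression", "outdir": "rundir"},
}

# Values taken from this config file, applied last because of `_self_`.
this_file = {
    "model": {
        "use_regression_net": True,
        "regression_weights": "stormcast_checkpoints/regression/StormCastUNet.0.0.mdlus",
        "previous_step_conditioning": True,
        "spatial_pos_embed": True,
    },
    "training": {"loss": "edm"},
}

config = deep_merge(from_defaults, this_file)
print(config["training"])  # {'loss': 'edm', 'outdir': 'rundir'}
```

Keys that this file does not mention, such as the hypothetical `outdir`, keep whatever value the defaults supplied.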
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
# SPDX-FileCopyrightText: Copyright (c) 2023 - 2024 NVIDIA CORPORATION & AFFILIATES. | ||
# SPDX-FileCopyrightText: All rights reserved. | ||
# SPDX-License-Identifier: Apache-2.0 | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
|
||
run: | ||
dir: ${training.outdir}/${training.experiment_name}/${training.run_id} |
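The `dir` value uses `${...}` interpolation, so the run directory is assembled from three keys under `training` when the config is resolved. A minimal sketch of that substitution follows. In practice the resolution is handled by the configuration library; the helpers `lookup` and `resolve` and the sample values `rundir`, `stormcast` and `0` are hypothetical and only illustrate the path layout.

```python
import re
from typing import Any

# Matches ${some.dotted.key} and captures the dotted key.
_PATTERN = re.compile(r"\$\{([^}]+)\}")


def lookup(config: dict, dotted_key: str) -> Any:
    """Walk a nested dict following a dotted key such as 'training.outdir'."""
    node = config
    for part in dotted_key.split("."):
        node = node[part]
    return node


def resolve(template: str, config: dict) -> str:
    """Replace every ${dotted.key} in `template` with its value from `config`."""
    return _PATTERN.sub(lambda m: str(lookup(config, m.group(1))), template)


# Hypothetical training settings, for illustration only.
config = {
    "training": {"outdir": "rundir", "experiment_name": "stormcast", "run_id": "0"}
}
template = "${training.outdir}/${training.experiment_name}/${training.run_id}"
print(resolve(template, config))  # rundir/stormcast/0
```

Changing `run_id` alone gives each run its own subdirectory under the same experiment name.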