Missing and Unexpected keys in state_dict #5

Open

Mvk122 opened this issue Feb 26, 2024 · 0 comments

I am trying to run inference using baseline_m.toml; however, the provided checkpoint files appear to have the wrong state_dict keys.

Given the following command:

 accelerate launch /mnt/c/Users/madha/code/spiking-fullsubnet/recipes/intel_ndns/spiking_fullsubnet/run.py -C /mnt/c/Users/madha/code/spiking-fullsubnet/recipes/intel_ndns/spiking_fullsubnet/baseline_m.toml -M test --ckpt_path /mnt/c/Users/madha/code/spiking-fullsubnet/model_zoo/intel_ndns/spike_fsb/baseline_m/checkpoints/best

I am getting the following error:

02-26 19:19:32: Initialized logger with log file in /mnt/c/Users/madha/code/spiking-fullsubnet/recipes/intel_ndns/spiking_fullsubnet/exp/baseline_m.
Loading dataset from /mnt/c/Users/madha/code/spiking-fullsubnet/datasets/validation_set/...
Found 3243 files.
02-26 19:19:37: Configuration file is saved to /mnt/c/Users/madha/code/spiking-fullsubnet/recipes/intel_ndns/spiking_fullsubnet/exp/baseline_m/config__2024_02_26--19_19_36.toml.
02-26 19:19:37: Environment information:
- `Accelerate` version: 0.27.2
- Platform: Linux-5.10.102.1-microsoft-standard-WSL2-x86_64-with-glibc2.31
- Python version: 3.10.13
- Numpy version: 1.26.4
- PyTorch version (GPU?): 2.2.1 (True)
- System RAM: 12.26 GB
- GPU Available: True
- GPU IDs: 1
- GPU type: NVIDIA T500
02-26 19:19:37:
 ==========================================================================================
Layer (type:depth-idx)                                            Param #
==========================================================================================
OptimizedModule                                                   --
├─SpikingFullSubNet: 1-1                                          --
│    └─SequenceModel: 2-1                                         --
│    │    └─LayerNorm: 3-1                                        128
│    │    └─StackedGSU: 3-2                                       330,240
│    │    └─Linear: 3-3                                           20,544
│    │    └─Identity: 3-4                                         --
│    └─SubbandModel: 2-2                                          --
│    │    └─ModuleList: 3-5                                       603,500
==========================================================================================
Total params: 954,412
Trainable params: 954,412
Non-trainable params: 0
==========================================================================================
Using device: 0
02-26 19:19:38: Begin testing...
02-26 19:19:38: Loading states from /mnt/c/Users/madha/code/spiking-fullsubnet/model_zoo/intel_ndns/spike_fsb/baseline_m/checkpoints/best
Traceback (most recent call last):
  File "/mnt/c/Users/madha/code/spiking-fullsubnet/recipes/intel_ndns/spiking_fullsubnet/run.py", line 151, in <module>
    run(config, args.resume)
  File "/mnt/c/Users/madha/code/spiking-fullsubnet/recipes/intel_ndns/spiking_fullsubnet/run.py", line 97, in run
    trainer.test(test_dataloaders, config["meta"]["ckpt_path"])
  File "/home/madhav/miniconda3/envs/spiking-fullsubnet/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/c/Users/madha/code/spiking-fullsubnet/audiozen/trainer.py", line 537, in test
    self._load_checkpoint(ckpt_path)
  File "/mnt/c/Users/madha/code/spiking-fullsubnet/audiozen/trainer.py", line 225, in _load_checkpoint
    self.accelerator.load_state(ckpt_path, map_location="cpu")
  File "/home/madhav/miniconda3/envs/spiking-fullsubnet/lib/python3.10/site-packages/accelerate/accelerator.py", line 2922, in load_state
    load_accelerator_state(
  File "/home/madhav/miniconda3/envs/spiking-fullsubnet/lib/python3.10/site-packages/accelerate/checkpointing.py", line 205, in load_accelerator_state
    models[i].load_state_dict(state_dict, **load_model_func_kwargs)
  File "/home/madhav/miniconda3/envs/spiking-fullsubnet/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2153, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for SpikingFullSubNet:
        Missing key(s) in state_dict: "fb_model.pre_layer_norm.weight", "fb_model.pre_layer_norm.bias", "fb_model.proj.weight", "fb_model.proj.bias", "sb_model.sb_models.0.pre_layer_norm.weight", "sb_model.sb_models.0.pre_layer_norm.bias", "sb_model.sb_models.0.proj.weight", "sb_model.sb_models.0.proj.bias", "sb_model.sb_models.1.pre_layer_norm.weight", "sb_model.sb_models.1.pre_layer_norm.bias", "sb_model.sb_models.1.proj.weight", "sb_model.sb_models.1.proj.bias", "sb_model.sb_models.2.pre_layer_norm.weight", "sb_model.sb_models.2.pre_layer_norm.bias", "sb_model.sb_models.2.proj.weight", "sb_model.sb_models.2.proj.bias".
        Unexpected key(s) in state_dict: "fb_model.fc_output_layer.weight", "fb_model.fc_output_layer.bias", "sb_model.sb_models.0.fc_output_layer.weight", "sb_model.sb_models.0.fc_output_layer.bias", "sb_model.sb_models.1.fc_output_layer.weight", "sb_model.sb_models.1.fc_output_layer.bias", "sb_model.sb_models.2.fc_output_layer.weight", "sb_model.sb_models.2.fc_output_layer.bias".
Traceback (most recent call last):
  File "/home/madhav/miniconda3/envs/spiking-fullsubnet/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/home/madhav/miniconda3/envs/spiking-fullsubnet/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 47, in main
    args.func(args)
  File "/home/madhav/miniconda3/envs/spiking-fullsubnet/lib/python3.10/site-packages/accelerate/commands/launch.py", line 1023, in launch_command
    simple_launcher(args)
  File "/home/madhav/miniconda3/envs/spiking-fullsubnet/lib/python3.10/site-packages/accelerate/commands/launch.py", line 643, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/home/madhav/miniconda3/envs/spiking-fullsubnet/bin/python', '/mnt/c/Users/madha/code/spiking-fullsubnet/recipes/intel_ndns/spiking_fullsubnet/run.py', '-C', '/mnt/c/Users/madha/code/spiking-fullsubnet/recipes/intel_ndns/spiking_fullsubnet/baseline_m.toml', '-M', 'test', '--ckpt_path', '/mnt/c/Users/madha/code/spiking-fullsubnet/model_zoo/intel_ndns/spike_fsb/baseline_m/checkpoints/best']' returned non-zero exit status 1.
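For reference, the checkpoint's parameter names can be listed directly to confirm the mismatch. This is only a minimal sketch, not code from the repository; the checkpoint file name (model.safetensors vs. pytorch_model.bin) depends on the accelerate/safetensors versions used when the state was saved, so both are tried:

import os
import torch

ckpt_dir = "/mnt/c/Users/madha/code/spiking-fullsubnet/model_zoo/intel_ndns/spike_fsb/baseline_m/checkpoints/best"

# accelerate.save_state writes either model.safetensors or pytorch_model.bin
# depending on the library versions; try both file names.
safetensors_path = os.path.join(ckpt_dir, "model.safetensors")
if os.path.exists(safetensors_path):
    from safetensors.torch import load_file
    state_dict = load_file(safetensors_path)
else:
    state_dict = torch.load(os.path.join(ckpt_dir, "pytorch_model.bin"), map_location="cpu")

# Print only the parameter names to compare against the current model definition.
for key in sorted(state_dict):
    print(key)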

Is additional configuration required to run inference on this model?
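In case it helps with debugging, the checkpoint can also be loaded non-strictly to see exactly which tensors line up. This is a diagnostic sketch, not a workaround: genuinely missing parameters (e.g. the pre_layer_norm weights) would be left at their random initialization. It assumes state_dict was loaded as above and that model is the SpikingFullSubNet instance built from baseline_m.toml (both names are placeholders here):

# load_state_dict with strict=False reports mismatched keys instead of raising.
missing, unexpected = model.load_state_dict(state_dict, strict=False)
print("Missing keys:", missing)
print("Unexpected keys:", unexpected)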
