
TypeError: 'NoneType' object is not subscriptable during multi-image fine-tuning of Qwen2.5 #7477

han-lx opened this issue Mar 25, 2025 · 0 comments
Status: Open · 1 task done
Labels: bug (Something isn't working), pending (This problem is yet to be addressed)

han-lx commented Mar 25, 2025

Reminder

  • I have read the above rules and searched the existing issues.

System Info

  • llamafactory version: 0.9.3.dev0
  • Platform: Linux-5.15.0-133-generic-x86_64-with-glibc2.35
  • Python version: 3.10.4
  • PyTorch version: 2.6.0+cu124 (GPU)
  • Transformers version: 4.49.0
  • Datasets version: 2.18.0
  • Accelerate version: 1.4.0
  • PEFT version: 0.12.0
  • TRL version: 0.9.6
  • GPU type: NVIDIA RTX A6000
  • GPU number: 4
  • GPU memory: 44.45GB
  • DeepSpeed version: 0.16.4
  • Bitsandbytes version: 0.45.3
  • Git commit: d8a5571

Reproduction

[rank3]: Traceback (most recent call last):
[rank3]:   File "/home/hx/Qwen/Qwen2.5-VL-7B-machine/LLaMA-Factory/src/llamafactory/launcher.py", line 23, in <module>
[rank3]:     launch()
[rank3]:   File "/home/hx/Qwen/Qwen2.5-VL-7B-machine/LLaMA-Factory/src/llamafactory/launcher.py", line 19, in launch
[rank3]:     run_exp()
[rank3]:   File "/home/hx/Qwen/Qwen2.5-VL-7B-machine/LLaMA-Factory/src/llamafactory/train/tuner.py", line 107, in run_exp
[rank3]:     _training_function(config={"args": args, "callbacks": callbacks})
[rank3]:   File "/home/hx/Qwen/Qwen2.5-VL-7B-machine/LLaMA-Factory/src/llamafactory/train/tuner.py", line 69, in _training_function
[rank3]:     run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
[rank3]:   File "/home/hx/Qwen/Qwen2.5-VL-7B-machine/LLaMA-Factory/src/llamafactory/train/sft/workflow.py", line 52, in run_sft
[rank3]:     model = load_model(tokenizer, model_args, finetuning_args, training_args.do_train)
[rank3]:   File "/home/hx/Qwen/Qwen2.5-VL-7B-machine/LLaMA-Factory/src/llamafactory/model/loader.py", line 135, in load_model
[rank3]:     model = load_unsloth_pretrained_model(config, model_args)
[rank3]:   File "/home/hx/Qwen/Qwen2.5-VL-7B-machine/LLaMA-Factory/src/llamafactory/model/model_utils/unsloth.py", line 55, in load_unsloth_pretrained_model
[rank3]:     model, _ = FastLanguageModel.from_pretrained(**unsloth_kwargs)
[rank3]:   File "/home/hx/miniconda3/envs/Qwen/lib/python3.10/site-packages/unsloth/models/loader.py", line 308, in from_pretrained
[rank3]:     return FastModel.from_pretrained(
[rank3]:   File "/home/hx/miniconda3/envs/Qwen/lib/python3.10/site-packages/unsloth/models/loader.py", line 714, in from_pretrained
[rank3]:     model, tokenizer = FastBaseModel.from_pretrained(
[rank3]:   File "/home/hx/miniconda3/envs/Qwen/lib/python3.10/site-packages/unsloth/models/vision.py", line 258, in from_pretrained
[rank3]:     model_type_arch = model_types[0]
[rank3]: TypeError: 'NoneType' object is not subscriptable
[rank2], [rank1], [rank0]: (tracebacks identical to rank3 above, each ending in TypeError: 'NoneType' object is not subscriptable at unsloth/models/vision.py, line 258)
Unsloth: WARNING `trust_remote_code` is True.
Are you certain you want to do remote code execution?
[rank0]:[W325 08:29:49.570195329 ProcessGroupNCCL.cpp:1496] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown  (function operator())
W0325 08:29:51.881845 2541207 site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 2541332 closing signal SIGTERM
W0325 08:29:51.882242 2541207 site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 2541333 closing signal SIGTERM
W0325 08:29:51.883206 2541207 site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 2541335 closing signal SIGTERM
E0325 08:29:54.067641 2541207 site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: 1) local_rank: 2 (pid: 2541334) of binary: /home/hx/miniconda3/envs/Qwen/bin/python
Traceback (most recent call last):
  File "/home/hx/miniconda3/envs/Qwen/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/home/hx/miniconda3/envs/Qwen/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
  File "/home/hx/miniconda3/envs/Qwen/lib/python3.10/site-packages/torch/distributed/run.py", line 918, in main
    run(args)
  File "/home/hx/miniconda3/envs/Qwen/lib/python3.10/site-packages/torch/distributed/run.py", line 909, in run
    elastic_launch(
  File "/home/hx/miniconda3/envs/Qwen/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/hx/miniconda3/envs/Qwen/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError: 
============================================================
/home/hx/Qwen/Qwen2.5-VL-7B-machine/LLaMA-Factory/src/llamafactory/launcher.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2025-03-25_08:29:51
  host      : g1a6000
  rank      : 2 (local_rank: 2)
  exitcode  : 1 (pid: 2541334)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html 
============================================================
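For context, the failing call chain can probably be reproduced outside LLaMA-Factory with a direct unsloth call along the following lines. This is a minimal sketch reconstructed from the traceback; the model path and keyword arguments are assumptions, since the actual training config is not included in this issue.

```python
# Minimal repro sketch (assumed model path; not the actual training config).
from unsloth import FastLanguageModel

# Per the traceback, LLaMA-Factory's load_unsloth_pretrained_model() ends up
# here: FastLanguageModel -> FastModel -> FastBaseModel.from_pretrained.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen2.5-VL-7B-Instruct",  # assumption: any Qwen2.5-VL checkpoint
    trust_remote_code=True,  # matches the Unsloth warning in the log above
)
# If the installed unsloth build cannot map this architecture, vision.py
# reaches `model_type_arch = model_types[0]` with model_types = None and
# raises TypeError: 'NoneType' object is not subscriptable.
```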

Others

During training, GPU memory usage across the cards was always uneven, so I tried unsloth for acceleration, but then this error appeared.
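A quick way to check whether the problem is architecture resolution (rather than the multi-GPU setup) is to confirm the model type that unsloth later indexes into. A hedged sketch; the checkpoint path is again an assumption:

```python
# Pre-flight check (sketch): verify the model_type that unsloth must recognize.
from transformers import AutoConfig

config = AutoConfig.from_pretrained(
    "Qwen/Qwen2.5-VL-7B-Instruct",  # assumed checkpoint path
    trust_remote_code=True,
)
print(config.model_type)  # expected: "qwen2_5_vl"
```

If the installed unsloth version does not recognize this model type, setting `use_unsloth: false` in the LLaMA-Factory config and relying on DeepSpeed ZeRO sharding (DeepSpeed 0.16.4 is already installed per the system info) may even out GPU memory without hitting this code path.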
