Issues: modelscope/ms-swift
Issues list
Does swift support this type of data for training a multimodal reward model (with a value head)?
#3888 opened Apr 15, 2025 by zhang123434
Is there a specific method for training GRPO using Qwen2.5-VL-3B-Instruct with LoRA?
#3882 opened Apr 15, 2025 by sms-s
AssertionError: quant_method: bnb, quantized model and does not support merge-lora.
#3859 opened Apr 12, 2025 by cahya-wirawan
GRPO: when reward_model is set instead of --reward_funcs, the reward model and the model are both loaded onto the same GPU
#3843 opened Apr 11, 2025 by wellhowtosay
grpo TypeError: CosineReward.__call__() missing 1 required positional argument: 'solution'
#3840 opened Apr 11, 2025 by kanqgg