
DS Chat Step 3 - Add separate Lora Adam optimizer group #633

Merged (3 commits into master, Jul 12, 2023)

Conversation

lekurile (Contributor) commented on Jul 10, 2023

This PR adds a separate LoRA Adam optimizer group (for the lora_right_weight and lora_left_weight params) with a LoRA-specific learning rate of lr=5e-4. After this change, Step 3 training convergence with LoRA enabled improved across various configurations when using ZeRO stage 2.

Thanks to @yaozhewei for the insight!
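
For readers skimming the PR, here is a minimal sketch of the idea: collect the LoRA parameters into their own optimizer parameter group and attach a group-level learning rate. The helper name, the weight-decay split, and the no_decay filter below are illustrative assumptions and not the exact PR diff; only the lora_right_weight / lora_left_weight parameter names and the 5e-4 LoRA learning rate come from this PR.

```python
# Minimal sketch (assumed helper, not the exact PR diff): split trainable
# parameters into groups so LoRA weights get their own learning rate.
def get_optimizer_grouped_parameters(model,
                                     weight_decay,
                                     lora_lr=5e-4,
                                     no_decay=("bias", "LayerNorm.weight")):
    decay_params, no_decay_params, lora_params = [], [], []

    for name, param in model.named_parameters():
        if not param.requires_grad:
            continue
        # Parameter names from the PR description: the LoRA adapter matrices.
        if "lora_right_weight" in name or "lora_left_weight" in name:
            lora_params.append(param)
        elif any(nd in name for nd in no_decay):
            no_decay_params.append(param)
        else:
            decay_params.append(param)

    return [
        {"params": decay_params, "weight_decay": weight_decay},
        {"params": no_decay_params, "weight_decay": 0.0},
        # Separate group: LoRA params override the optimizer's default lr.
        {"params": lora_params, "weight_decay": weight_decay, "lr": lora_lr},
    ]
```

torch.optim.AdamW (and DeepSpeed's Adam variants) apply the optimizer's default learning rate to any group that does not set its own "lr" key, so only the LoRA adapters pick up the higher 5e-4 rate while the remaining groups keep the configured base rate.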

BEFORE: (Step 3 convergence plot)

AFTER: (Step 3 convergence plot)

lekurile merged commit b093f58 into master on Jul 12, 2023 (2 checks passed)
LeetJoe pushed a commit to LeetJoe/DeepSpeedExamples that referenced this pull request Sep 15, 2023