In smart schedule, R operations do not overlap with C operations #213

Open
WhatBrain opened this issue Sep 29, 2024 · 5 comments

Comments

@WhatBrain

Describe the bug
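I am using the Megatron-LM v2.5 patch, and the command I run is
FMOE_FASTER_SHADOW_ENABLE=1 FMOE_FASTER_SCHEDULE_ENABLE=1 FMOE_FASTER_GROUP_SIZE=4 bash pretrain_gpt_distributed.sh
I am running GPT-2 + MoE on a single node with 8 GPUs and a total of 16 experts. The profiler shows that each GPU holds 2 experts, split into 2 groups, and that each expert runs twice.
image
However, the 4 R operations only run together after the C operations of all experts have finished:
image
What is going on here? Many thanks to anyone who can answer this question.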
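
For readers checking the numbers above, here is a minimal sketch of the arithmetic. It assumes experts are spread evenly across GPUs, that `FMOE_FASTER_GROUP_SIZE` splits the GPUs into equal groups, and that every local expert runs once per group. This reading is inferred from the figures in this report, not taken from the FastMoE source, and `schedule_counts` is a hypothetical helper, not a FastMoE API.

```python
def schedule_counts(n_gpus, n_experts, group_size):
    """Return (experts_per_gpu, n_groups, compute_ops_per_gpu).

    Assumption (not confirmed against FastMoE): experts are divided evenly
    over the GPUs, the GPUs are divided evenly into groups, and each local
    expert runs once per group.
    """
    if n_experts % n_gpus != 0:
        raise ValueError("n_experts must be divisible by n_gpus")
    if n_gpus % group_size != 0:
        raise ValueError("n_gpus must be divisible by group_size")
    experts_per_gpu = n_experts // n_gpus
    n_groups = n_gpus // group_size
    # One C (compute) operation per local expert per group.
    compute_ops_per_gpu = experts_per_gpu * n_groups
    return experts_per_gpu, n_groups, compute_ops_per_gpu


# Configuration from this report: 8 GPUs, 16 experts, group size 4.
print(schedule_counts(8, 16, 4))
```

Under these assumptions the result is 2 experts per GPU, 2 groups, and 4 C operations per GPU, which matches the 4 R operations seen in the profiler.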

Logs
If applicable, add logs to help explain your problem.

Platform

  • Device: [e.g. NVIDIA A100]
  • CUDA version: [12.1]
  • NCCL version: [2.18.1]
  • PyTorch version: [2.1.0]
@laekov
Owner

laekov commented Sep 29, 2024

@zms1999 Have you observed the same issue before?

@WhatBrain
Author

image
how to use @SagarChandra07

@laekov
Owner

laekov commented Sep 29, 2024

image how to use @SagarChandra07

Please proceed with caution. This post appears to be phishing.

@thelabcat

image

how to use @SagarChandra07

Do not run that file! The account has been spamming links to malware all over GitHub. In at least some cases they use a password protected zip archive to evade the auto-check by MediaFire.

@WhatBrain
Author

image
how to use @SagarChandra07

Do not run that file! The account has been spamming links to malware all over GitHub. In at least some cases they use a password protected zip archive to evade the auto-check by MediaFire.

OK. Thank you very much.

@github-staff github-staff deleted a comment Sep 29, 2024
@github-staff github-staff deleted a comment Oct 1, 2024