loss=nan #18

05063112lcs · 2024-04-22T09:01:31Z

Hello, thank you very much for your excellent work. When I tried to reproduce it recently, training failed with "loss=nan". I am training on an A5000 GPU. What could be the reason for this?

ryf1123 · 2024-04-26T09:00:36Z

Hi, thank you for posting this issue. We did not observe this problem in our testing. It may be caused by some corrupted data samples, but I am not sure. Can you first try filtering the loss with the function torch.nan_to_num and see if it helps?
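For anyone trying this suggestion, here is a minimal sketch of what filtering the loss with torch.nan_to_num could look like. The helper name `sanitize_loss` and the per-sample loss values are made up for illustration and are not taken from this repository's training code.

```python
import torch

def sanitize_loss(loss: torch.Tensor) -> torch.Tensor:
    # Hypothetical helper: replace NaN and +/-inf entries with 0.0
    # so they do not poison the averaged loss.
    return torch.nan_to_num(loss, nan=0.0, posinf=0.0, neginf=0.0)

# Made-up per-sample losses, two of them non-finite.
per_sample_loss = torch.tensor([0.5, float("nan"), 1.5, float("inf")])

clean = sanitize_loss(per_sample_loss)
mean_loss = clean.mean()  # averaged over all four entries, including the zeroed ones
```

Note that this only masks the non-finite values in the loss. It does not show where they came from, so it is best treated as a quick diagnostic rather than a fix.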

05063112lcs · 2024-04-27T06:21:34Z

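> Hi, thank you for posting this issue. We did not observe this problem in our testing. I am not sure whether this is caused by some corrupted data samples. Can you first try to filter the loss with function torch.nan_to_num and see if it helps?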

Hello, thank you very much for your reply. The problem occurs during training even though I did not make any changes: "loss=nan" appears in the first or second round of training. I tried adjusting the learning rate, but it did not help. I am using an A5000 graphics card for the reproduction.
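Since the NaN shows up within the first round or two, one way to narrow it down is to check each batch for non-finite values before and after the forward pass. The sketch below assumes a toy `torch.nn.Linear` model and random data in place of the real model and dataset, and `first_nonfinite` is a hypothetical helper, not part of this repository.

```python
import torch

def first_nonfinite(named_tensors):
    """Return the name of the first tensor containing NaN or inf, else None."""
    for name, tensor in named_tensors:
        if not torch.isfinite(tensor).all():
            return name
    return None

# Stand-in model and batch (assumptions for illustration only).
model = torch.nn.Linear(4, 1)
inputs = torch.randn(8, 4)
inputs[3, 0] = float("nan")  # simulate one corrupted data sample
targets = torch.randn(8, 1)

loss = torch.nn.functional.mse_loss(model(inputs), targets)

# Check in pipeline order, so the earliest offender is reported.
culprit = first_nonfinite([("inputs", inputs), ("targets", targets), ("loss", loss)])
print(culprit)
```

If the inputs and targets are always finite but the loss is not, the problem is more likely in the model or the loss computation than in the data. In that case `torch.autograd.set_detect_anomaly(True)` can help locate the operation that produces NaN during the backward pass.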
