Distillation loss only goes up, never down #387

Open
aiyuedeyang opened this issue Jan 16, 2025 · 2 comments

@aiyuedeyang

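Hello, during training my distillation loss keeps rising, then oscillates, and after 30-some epochs it suddenly becomes NaN. The l1 and tea losses do decrease, but once the distillation loss becomes NaN it directly affects l1 and tea as well. Has the author run into this problem? I should be able to rule out bad samples.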

@hzwer
Owner

hzwer commented Jan 20, 2025

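Hi, I sometimes run into this when I switch to a different dataset for training.
Usually I reduce the weight coefficient of the distillation loss and increase the optimizer's weight decay.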
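
A minimal sketch of those two knobs, not the repository's actual training code. The function name `combine_losses`, the parameter `distill_weight`, and every numeric value here are hypothetical and only illustrate where the adjustments would go. The function uses plain arithmetic, so it should work with Python floats and with scalar tensors alike.

```python
def combine_losses(loss_l1, loss_tea, loss_distill, distill_weight=0.001):
    """Sum the three loss terms, scaling only the distillation term.

    distill_weight is a hypothetical name; lowering it reduces how much
    the distillation term contributes to the gradient.
    """
    return loss_l1 + loss_tea + distill_weight * loss_distill


# Hypothetical optimizer settings (assumed values, not taken from the thread).
# In a PyTorch-style setup these would be passed as keyword arguments to an
# optimizer such as torch.optim.AdamW, which accepts `lr` and `weight_decay`.
optimizer_kwargs = {"lr": 1e-4, "weight_decay": 1e-2}

if __name__ == "__main__":
    total = combine_losses(1.0, 2.0, 100.0, distill_weight=0.01)
    print(total)
```

Keeping the weight as an explicit argument makes it easy to try several values across runs without touching the rest of the loss code.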

@aiyuedeyang
Author

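!! That is what I have been doing too, but training still collapses. The main problem is that I don't know why it suddenly becomes NaN. I'll keep looking into it. Thanks for the reply.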
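
One way to narrow down where the NaN first appears is to check each loss term every step, before it is summed and backpropagated. The sketch below is an assumption about how that could be done, not something from the repository. The helper name `first_non_finite` and the dictionary keys are hypothetical. It relies on `float(value)`, so it should accept Python floats as well as single-element tensors.

```python
import math


def first_non_finite(losses):
    """Return the name of the first loss term that is NaN or inf, else None.

    `losses` maps a term name to a scalar value.
    """
    for name, value in losses.items():
        if not math.isfinite(float(value)):
            return name
    return None


if __name__ == "__main__":
    # Made-up values for illustration only.
    step_losses = {"l1": 0.03, "tea": 0.04, "distill": float("nan")}
    bad = first_non_finite(step_losses)
    if bad is not None:
        # In a real loop one might log the step and batch here and skip the update.
        print("non-finite loss term:", bad)
```

Logging the step index and the offending term at the first hit shows whether the distillation term really goes bad before l1 and tea, or whether all three break in the same step.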
