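Hello, during training my distillation loss keeps rising, then oscillates, and finally turns to NaN abruptly after 30-odd epochs. The l1 and tea losses do decrease, but once the distillation loss becomes NaN it directly affects l1 and tea as well. Has the author run into this problem? I believe bad samples can be ruled out.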
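Hello, I sometimes run into this when I train on a different dataset. I usually reduce the weight coefficient of the distillation loss and increase the optimizer's weight decay.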
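As a rough illustration of the first suggestion, here is a minimal, framework-free sketch of scaling down the distillation term in the total loss. The function name `combine_losses`, the parameter `distill_weight`, and all numeric values are hypothetical and are not taken from this repository. Weight decay would be set separately on whatever optimizer the training script uses.

```python
def combine_losses(l1_loss, tea_loss, distill_loss, distill_weight=0.125):
    """Return the weighted total loss.

    distill_weight is a hypothetical knob: lowering it reduces how much
    a large or unstable distillation term contributes to the total.
    """
    return l1_loss + tea_loss + distill_weight * distill_loss


# Same per-term losses, two different distillation weights (made-up values).
total_full = combine_losses(0.5, 0.25, 8.0, distill_weight=1.0)
total_small = combine_losses(0.5, 0.25, 8.0, distill_weight=0.125)
print(total_full, total_small)
```

With the smaller weight, a distillation loss that has grown large dominates the total far less, which is the effect the suggestion aims for.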
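That's what I did too, but training still collapses. The main issue is that I don't know why it suddenly becomes NaN. I'll look into it further. Thanks for the reply!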
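One possible way to narrow down where the NaN first appears is to check each loss term for finiteness at every step, before backpropagating. The sketch below is framework-free and assumes the per-step losses are available as plain floats. The helper `first_non_finite` and the example values are hypothetical, not part of this project.

```python
import math


def first_non_finite(losses):
    """Return the name of the first loss term that is NaN or infinite, else None."""
    for name, value in losses.items():
        if not math.isfinite(value):
            return name
    return None


# Made-up losses for one training step; the distillation term has gone NaN.
step_losses = {"l1": 0.42, "tea": 0.31, "distill": float("nan")}
bad_term = first_non_finite(step_losses)
if bad_term is not None:
    # In a real loop this is where one might log the batch and skip the update.
    print(f"non-finite loss term: {bad_term}; skipping this update")
```

Logging the step and batch at the first non-finite term can help tell a slow divergence apart from a single problematic input.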