
OOM when running run_classifier #43

Open
zhengchang231 opened this issue Oct 6, 2019 · 5 comments

Comments

@zhengchang231

Does the model have a minimum GPU memory requirement? I have turned seq_len, batch_size, and the other settings all the way down to 1 and still get OOM. My GPU has 8 GB of memory and I am running the large version. Thanks.

@brightmart
Owner

The GPU memory requirements are the same as in the table below (these figures come from the Google BERT README and were measured on a single 12 GB GPU):

System      Seq Length  Max Batch Size
BERT-Base   64          64
BERT-Base   128         32
BERT-Base   256         16
BERT-Base   320         14
BERT-Base   384         12
BERT-Base   512         6
BERT-Large  64          12
BERT-Large  128         6
BERT-Large  256         2
BERT-Large  320         1
BERT-Large  384         0
BERT-Large  512         0
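For reference, a rough sketch of how those limits map onto run_classifier.py flags, assuming the standard BERT-style arguments; the task name, data paths, and the $BERT_LARGE_DIR variable are placeholders, and per the table an 8 GB card may not fit BERT-Large even at batch size 1:

# Hypothetical invocation; paths and task name are placeholders.
# On a 12 GB card, BERT-Large only fits up to seq length 320 at batch size 1.
python run_classifier.py \
  --task_name=my_task \
  --do_train=true \
  --data_dir=./data \
  --vocab_file=$BERT_LARGE_DIR/vocab.txt \
  --bert_config_file=$BERT_LARGE_DIR/bert_config.json \
  --init_checkpoint=$BERT_LARGE_DIR/bert_model.ckpt \
  --max_seq_length=320 \
  --train_batch_size=1 \
  --output_dir=./output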

@currenttime

Same here: a 2080 GPU with 8 GB of memory. I have already set the parameters very small:
--max_seq_length=32 --train_batch_size=2
and it still OOMs:
Check failed: err == cudaSuccess || err == cudaErrorInvalidValue Unexpected CUDA error: out of memory

@brightmart
Owner

Did you ever find the cause? Could the GPU memory be occupied by another application?

@currenttime

I found the cause: after a run that hits OOM, the GPU memory has to be released manually. If the old process is not killed, training keeps OOMing no matter how small the parameters are set.
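A minimal sketch of that manual cleanup; the PID is whatever nvidia-smi reports for the stale python process:

nvidia-smi       # shows which processes are still holding GPU memory
kill -9 <PID>    # kill the leftover python process from the failed run
nvidia-smi       # run again to confirm the memory was released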

@liucongg

@zhengchang231 Did you solve this? I ran into the same problem: with the bert_large model, even batch_size=2 and max_len=32 OOM on an 11 GB GPU.
However, I found that training runs fine with a plain Adam optimizer, while the optimizer bundled with BERT runs out of memory.
