The Transformers `test_cpu_offload` test and a few other tests fail for the `blip`, `dab_detr`, `roberta` and `vilt` models when running with the XPU backend. Note: I can't reproduce this issue with CUDA on an A10.
When the issue happens, the `self.tied_params_map[value_pointer]` set is empty. A trivial `if` condition that checks whether it is non-empty avoids the issue, and the tests pass. As I noted, I don't see this issue happening for CUDA. I also see that `self.tied_pointers_to_remove` is populated twice with the same values for XPU, and then `post_forward()` is called twice in a row, with the issue happening on the second pass.
See #3403 for such a fix: the `if` condition helps to avoid the `KeyError`. However, I am not sure why this situation happens. I am afraid that I have not addressed the actual issue and have only fixed a symptom. Can someone suggest a better fix or explain why this fix would be the correct one?
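To make the failure mode concrete, here is a minimal, self-contained sketch of the pattern described above. It is not the accelerate code: `remove_tied_pointers`, its `guarded` flag, and the plain dict standing in for `tied_params_map` are hypothetical, and the guard is only my assumption of what a "check that the entry is not empty" condition would look like. It shows that running the same removal pass twice over identical `(value_pointer, device)` pairs raises `KeyError` on the second pass, and that the emptiness check skips the second deletion.

```python
def remove_tied_pointers(tied_params_map, tied_pointers_to_remove, guarded):
    """Delete each (value_pointer, device) entry; return the number of skipped pairs.

    Hypothetical helper for illustration only, not part of accelerate.
    """
    skipped = 0
    for value_pointer, device in tied_pointers_to_remove:
        per_device = tied_params_map[value_pointer]
        # Assumed form of the guard: skip pointers whose entry is already empty.
        if guarded and not per_device:
            skipped += 1
            continue
        del per_device[device]  # raises KeyError if the device key is already gone
    return skipped


if __name__ == "__main__":
    # One tied pointer with a single device entry (values are placeholders).
    tied_params_map = {1234: {"xpu:0": "weights"}}
    to_remove = {(1234, "xpu:0")}

    remove_tied_pointers(tied_params_map, to_remove, guarded=False)  # first pass succeeds
    try:
        remove_tied_pointers(tied_params_map, to_remove, guarded=False)  # same pairs again
    except KeyError as err:
        print("second unguarded pass raised KeyError:", err)

    skipped = remove_tied_pointers(tied_params_map, to_remove, guarded=True)
    print("guarded pass skipped", skipped, "pair(s)")
```

The guard makes the second pass a no-op, but it says nothing about why the same pairs are queued twice or why the pass runs twice on XPU, which is the open question here.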
With:
On:
Issue happens here:
accelerate/src/accelerate/hooks.py
Lines 397 to 399 in 8039158
CC: @SunMarc @faaany @zucchini-nlp