
Commit

Clarify the gradient accumulation in TensorFlow
IvanUkhov committed Dec 5, 2024
1 parent b112933 commit 602d76c
Showing 1 changed file with 4 additions and 4 deletions.
8 changes: 4 additions & 4 deletions _posts/2024-01-31-gradient-accumulation.md
@@ -101,10 +101,10 @@
 It is important to note that the learning rate keeps on changing (if variable)
 and the weights keep on decaying (if enabled) during accumulation. Therefore,
 one should account for this when configuring the optimizer at hand.
 
-One should also note that Keras does support gradient accumulation, which is
-controlled via the `gradient_accumulation_steps` option of optimizers. However,
-it does not play well with distributed training strategies, which will hopefully
-be rectified in the future.
+One should also note that TensorFlow does support gradient accumulation as of
+version 2.16, which is controlled by the `gradient_accumulation_steps` option of
+Keras optimizers. However, it does not play well with distributed training
+strategies, which will hopefully be rectified in the future.
 
 # Acknowledgments
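For reference, a minimal sketch of how the `gradient_accumulation_steps` option mentioned in the added paragraph is passed to a Keras optimizer. It assumes TensorFlow 2.16 (which bundles Keras 3); the toy model and data are illustrative only and not part of the post.

```python
import numpy as np
import keras

# Toy data and model; shapes and sizes are illustrative only.
x = np.random.rand(256, 8).astype("float32")
y = np.random.rand(256, 1).astype("float32")
model = keras.Sequential([keras.layers.Dense(1)])

# Accumulate gradients over 4 steps before applying an update,
# emulating an effective batch size of 4 * 16 = 64.
optimizer = keras.optimizers.Adam(
    learning_rate=1e-3,
    gradient_accumulation_steps=4,
)

model.compile(optimizer=optimizer, loss="mse")
model.fit(x, y, batch_size=16, epochs=1)
```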
