
Commit

Clarify the gradient accumulation in TensorFlow
IvanUkhov committed Dec 5, 2024
1 parent b112933 commit 602d76c
Showing 1 changed file with 4 additions and 4 deletions.
8 changes: 4 additions & 4 deletions _posts/2024-01-31-gradient-accumulation.md
@@ -101,10 +101,10 @@
 It is important to note that the learning rate keeps on changing (if variable)
 and the weights keep on decaying (if enabled) during accumulation. Therefore,
 one should account for this when configuring the optimizer at hand.
 
-One should also note that Keras does support gradient accumulation, which is
-controlled via the `gradient_accumulation_steps` option of optimizers. However,
-it does not play well with distributed training strategies, which will hopefully
-be rectified in the future.
+One should also note that TensorFlow does support gradient accumulation as of
+version 2.16, which is controlled by the `gradient_accumulation_steps` option of
+Keras optimizers. However, it does not play well with distributed training
+strategies, which will hopefully be rectified in the future.
 
 # Acknowledgments
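For reference, a minimal sketch of how the `gradient_accumulation_steps` option mentioned in the added paragraph is passed to a Keras optimizer. It assumes TensorFlow 2.16 (which bundles Keras 3); the toy model and data are illustrative only and not part of the post.

```python
import numpy as np
import keras

# Toy data and model; shapes and sizes are illustrative only.
x = np.random.rand(256, 8).astype("float32")
y = np.random.rand(256, 1).astype("float32")
model = keras.Sequential([keras.layers.Dense(1)])

# Accumulate gradients over 4 steps before applying an update,
# emulating an effective batch size of 4 * 16 = 64.
optimizer = keras.optimizers.Adam(
    learning_rate=1e-3,
    gradient_accumulation_steps=4,
)

model.compile(optimizer=optimizer, loss="mse")
model.fit(x, y, batch_size=16, epochs=1)
```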
