[1.18.17]
Changed
- Updated to MXNet 1.2
- Switched to the new LayerNormalization operator to save GPU memory.
[1.18.16]
Fixed
- Removed summation of gradient arrays when logging gradients.
The summation gradually filled the memory of the primary GPU device when many checkpoints were written.
Gradient histograms are now logged to TensorBoard separately for each device.