## [2.2.0]

### Changed

- Replaced multi-head attention with `interleaved_matmul_encdec` operators, which removes previously needed transposes and improves performance.
- Beam search states and model layers now assume time-major format (see the layout sketch below).
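
A minimal sketch, assuming nothing beyond NumPy, of the batch-major vs. time-major layouts mentioned above. The dimension names are illustrative and this is not Sockeye's actual code; per the entries above, the interleaved attention operators work on time-major inputs, which is why per-layer transposes can be dropped.

```python
import numpy as np

# Illustrative dimensions only.
batch_size, seq_len, model_dim = 4, 10, 512

# Batch-major layout: (batch, time, hidden) -- the layout layers used to assume.
batch_major = np.random.rand(batch_size, seq_len, model_dim).astype(np.float32)

# Time-major layout: (time, batch, hidden) -- now assumed by beam search states
# and model layers, so inputs no longer need to be transposed per attention layer.
time_major = batch_major.transpose(1, 0, 2)

assert time_major.shape == (seq_len, batch_size, model_dim)
```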

## [2.1.26]

### Fixed

- Fixes a backwards incompatibility introduced in 2.1.17, which would prevent models trained with prior versions from being used for inference.

## [2.1.25]

### Changed

- Reverting PR #772 as it causes issues with `amp`.

## [2.1.24]

### Changed

- Make sure to write a final checkpoint when stopping with `--max-updates`, `--max-samples` or `--max-num-epochs` (sketch below).
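
A hedged sketch of the intended behavior only: the loop structure and the `model.update`/`model.save_checkpoint` helpers are hypothetical and not Sockeye's actual training code.

```python
def train(model, batches, max_updates=None, max_samples=None, max_num_epochs=None):
    """Hypothetical training loop: whichever stopping criterion fires,
    a final checkpoint is written before the loop exits."""
    updates = samples = epochs = 0
    while True:
        for batch in batches:
            model.update(batch)          # hypothetical single optimizer step
            updates += 1
            samples += len(batch)
            if (max_updates is not None and updates >= max_updates) or \
               (max_samples is not None and samples >= max_samples):
                model.save_checkpoint()  # final checkpoint on --max-updates / --max-samples
                return
        epochs += 1
        if max_num_epochs is not None and epochs >= max_num_epochs:
            model.save_checkpoint()      # final checkpoint on --max-num-epochs
            return
```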

## [2.1.23]

### Changed

- Updated to MXNet 1.7.0.
- Re-introduced use of softmax with length parameter in `DotAttentionCell` (see PR #772; conceptual sketch below).
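
For context, a plain-NumPy sketch of what a length-aware softmax does in dot attention: key positions beyond each sequence's valid length are excluded from normalization. This is a conceptual illustration only, not the MXNet operator or Sockeye's `DotAttentionCell` code.

```python
import numpy as np

def length_masked_softmax(scores, lengths):
    """Softmax over the last axis, ignoring key positions >= the valid length.

    scores:  (batch, query_len, key_len) attention logits
    lengths: (batch,) number of valid key positions per sequence
    """
    key_len = scores.shape[-1]
    # (batch, 1, key_len) mask: True where the key position is within the valid length.
    mask = np.arange(key_len)[None, None, :] < lengths[:, None, None]
    masked = np.where(mask, scores, -np.inf)
    masked = masked - masked.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(masked)
    return exp / exp.sum(axis=-1, keepdims=True)

probs = length_masked_softmax(np.random.rand(2, 3, 5), np.array([5, 2]))
assert np.allclose(probs.sum(axis=-1), 1.0)
assert np.allclose(probs[1, :, 2:], 0.0)  # padded key positions receive zero weight
```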

## [2.1.22]

### Added

- Re-introduced `--softmax-temperature` flag for `sockeye.score` and `sockeye.translate` (conceptual sketch below).
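
As background, softmax temperature divides the logits by a constant before normalization: values above 1.0 flatten the output distribution, values below 1.0 sharpen it. A minimal NumPy sketch of the concept (illustrative only, not Sockeye's implementation):

```python
import numpy as np

def softmax_with_temperature(logits, temperature=1.0):
    """Softmax after dividing logits by the temperature.

    temperature > 1.0 -> flatter distribution; temperature < 1.0 -> more peaked.
    """
    scaled = logits / temperature
    scaled = scaled - scaled.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(scaled)
    return exp / exp.sum(axis=-1, keepdims=True)

logits = np.array([2.0, 1.0, 0.1])
print(softmax_with_temperature(logits, temperature=1.0))
print(softmax_with_temperature(logits, temperature=2.0))  # flatter than above
```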