Releases: lucidrains/native-sparse-attention-pytorch
0.0.50
0.0.49
found issue with intermittent backwards error, get e2e train script f…
0.0.48
dq down
0.0.47
redo with approach of using compressed similarities for interpolation…
0.0.45
mask out the block diagonal for the importance score, as block causal…
0.0.44
when doing interpolation of importance score, remask to 0 for illegal…
0.0.43
default to one mem kv for compressed attn
0.0.42
Full Changelog: 0.0.41...0.0.42
0.0.41
ready to be compared with full attention.
0.0.40
oops