v0.1.2: LightGlue-ONNX-MP-Flash
LightGlue-ONNX Flash Attention Models
This release provides exported ONNX LightGlue models with Flash Attention enabled, in both full- (`*_flash.onnx`) and mixed-precision (`*_mp_flash.onnx`) variants. Both standalone models and end-to-end pipelines (`*_end2end_*.onnx`) are provided. Mixed precision combined with Flash Attention produces the fastest inference times. Please refer to EVALUATION.md for detailed speed comparisons.
All models are exported with `flash-attn==1.0.8`. Note that `flash-attn` does NOT need to be installed for inference.