Development files

This directory contains intermediate development artifacts that are not part of the final mlkem-native sources.

It is only relevant to you if you are developing mlkem-native or would like to understand the origin of the assembly source files.

AArch64 arithmetic assembly

Clean

aarch64_clean contains the 'clean' assembly underlying the AArch64 native backend of mlkem-native. The files in this directory are handwritten and kept readable through the extensive use of register aliases and macros.

Optimized

aarch64_opt contains the results of running the SLOTHY superoptimizer on the clean assembly files in aarch64_clean. The optimized sections are 'raw' assembly in the sense that they no longer use register macros or aliases, but the surrounding code (such as the function preamble and postamble) typically still uses them. The macro and alias definitions themselves are also retained.

Final

The final AArch64 arithmetic assembly in mlkem/native/aarch64/src is auto-generated from the optimized assembly using the simpasm script, which simplifies it by assembling and then disassembling it. The final assembly no longer contains any register aliases or macros.

The final assembly is auto-generated from the optimized assembly by running the autogen script. Non-assembly files are synchronized by copy between this directory and mlkem.
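To make concrete what "no register aliases" means, here is a toy Python sketch that expands aliases textually. It illustrates the effect only: simpasm works by assembling and disassembling, not by text substitution. The sketch assumes GNU-assembler-style `.req` alias definitions. The function name `expand_register_aliases` and the input snippet are hypothetical and not taken from aarch64_clean or aarch64_opt.

```python
import re


def expand_register_aliases(asm: str) -> str:
    """Toy expansion of `name .req register` aliases (illustration only).

    Alias definitions are dropped and every later use of an alias is
    replaced by the register it names. Aliases that point at other
    aliases, `.unreq`, and macros are deliberately not handled.
    """
    aliases = {}
    body = []
    for line in asm.splitlines():
        m = re.match(r"\s*(\w+)\s+\.req\s+(\w+)\s*$", line)
        if m:
            aliases[m.group(1)] = m.group(2)  # remember alias -> register
        else:
            body.append(line)
    if not aliases:
        return "\n".join(body)
    # Match alias names as whole words only, so `a` does not hit `add`.
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, aliases)) + r")\b")
    return "\n".join(
        pattern.sub(lambda m: aliases[m.group(1)], line) for line in body
    )


# Made-up input in the style of aliased assembly.
clean = """\
src   .req x0
count .req x1
    ldr q0, [src], #16
    subs count, count, #1
"""
print(expand_register_aliases(clean))
```

The printed result contains only the two instructions, with `src` and `count` replaced by `x0` and `x1`.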

Testing clean/optimized assembly

To test the clean assembly, run autogen --aarch64-clean. This imports the clean backend into mlkem/native/aarch64/*, replacing the optimized one. With autogen --aarch64-clean --no-simplify or autogen --no-simplify, you can additionally reinstate the non-simplified assembly in the main source tree.
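The flag combinations described above can be summarized in a small Python helper that builds the command line. This is only a sketch: `autogen_command` is a hypothetical name, and it uses just the two flags named in this section. Where the autogen script lives in a checkout is not stated here, so the bare name `autogen` is an assumption.

```python
from typing import List


def autogen_command(clean: bool = False, simplify: bool = True) -> List[str]:
    """Build an autogen command line from the two flags named in this README.

    `clean` selects the clean backend instead of the optimized one;
    `simplify=False` keeps the non-simplified assembly in the source tree.
    """
    cmd = ["autogen"]  # assumed to be on PATH; adjust to your checkout
    if clean:
        cmd.append("--aarch64-clean")
    if not simplify:
        cmd.append("--no-simplify")
    return cmd


# The three invocations mentioned in the text.
for clean, simplify in [(True, True), (True, False), (False, False)]:
    print(" ".join(autogen_command(clean, simplify)))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root once the script path has been adjusted.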

Alternatively, you can manually copy the entire aarch64_clean or aarch64_opt tree into mlkem/native/aarch64/.
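The manual copy can be sketched in Python as follows. The helper name `import_backend` is hypothetical. The sketch assumes that the layout inside the chosen tree mirrors the layout of the destination, and it requires Python 3.8 or later for `dirs_exist_ok`.

```python
import shutil
from pathlib import Path


def import_backend(variant_dir: Path, dest_dir: Path) -> None:
    """Copy one development tree over the backend in the main source tree.

    Files in dest_dir with the same relative path are overwritten; files
    that exist only in dest_dir are left untouched.
    """
    if not variant_dir.is_dir():
        raise FileNotFoundError(f"no such backend tree: {variant_dir}")
    shutil.copytree(variant_dir, dest_dir, dirs_exist_ok=True)
```

For example, `import_backend(Path("aarch64_clean"), Path("../mlkem/native/aarch64"))` would import the clean backend, assuming it is run from this directory and that the main sources sit one level up. Both relative paths are assumptions about the checkout layout.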

AArch64 FIPS-202 assembly

As with the AArch64 arithmetic assembly, the final FIPS-202 assembly is the result of running simpasm on the assembly in fips202/aarch64/src. Non-assembly files are synchronized by copy.

x86_64 arithmetic assembly

As with the AArch64 arithmetic assembly, the final x86_64 arithmetic assembly is the result of running simpasm on the assembly in x86_64/src. Non-assembly files are synchronized by copy.