
Benchmark ML-KEM #4653

onvej-sl (Contributor) commented Feb 20, 2025

The purpose of this pull request is to test the viability of running ML-KEM on Trezor. ML-KEM (Module-Lattice-Based Key-Encapsulation Mechanism, formerly known as CRYSTALS-Kyber) is a quantum-resistant key encapsulation mechanism that NIST standardized as FIPS 203 after selecting Kyber in its post-quantum cryptography standardization process.

ML-KEM is a post-quantum replacement for the Diffie-Hellman protocol. Unlike Diffie-Hellman, the protocol is not symmetric: only one party, the one that encapsulates, contributes entropy to the resulting shared secret.

There are three variants: ML-KEM-512, ML-KEM-768, and ML-KEM-1024, listed in order of increasing security strength and decreasing performance.

I adopted this implementation because its authors claim it is resistant to side-channel attacks. However, I have not conducted an in-depth survey of the available implementations.

| | ML-KEM-512 | ML-KEM-768 | ML-KEM-1024 |
|---|---|---|---|
| bit strength | 128 | 192 | 256 |
| generate_keypair (time / stack usage) | 4.9 ms / 10 kB | 7.9 ms / 14 kB | 11 ms / 20 kB |
| encapsulate (time / stack usage) | 4.8 ms / 13 kB | 7.8 ms / 17 kB | 11 ms / 23 kB |
| decapsulate (time / stack usage) | 5.6 ms / 13 kB | 9.0 ms / 18 kB | 13 ms / 25 kB |
| encapsulation key size | 800 bytes | 1184 bytes | 1568 bytes |
| decapsulation key size | 1632 bytes | 2400 bytes | 3168 bytes |
| ciphertext size | 768 bytes | 1088 bytes | 1568 bytes |
| shared secret size | 32 bytes | 32 bytes | 32 bytes |
| implementation size | 13 kB | 13 kB | 13 kB |
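
The key and ciphertext sizes in the table follow directly from the ML-KEM parameter sets. As a sanity check, here is a minimal sketch that recomputes them from the FIPS 203 parameters (`k`, `du`, `dv`). The class and function names are made up for illustration and are not part of this pull request.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MlKemParams:
    """Subset of the FIPS 203 parameters that determines the byte sizes."""
    name: str
    k: int   # module rank
    du: int  # bits per coefficient of the compressed vector u
    dv: int  # bits per coefficient of the compressed polynomial v


PARAMS = [
    MlKemParams("ML-KEM-512", k=2, du=10, dv=4),
    MlKemParams("ML-KEM-768", k=3, du=10, dv=4),
    MlKemParams("ML-KEM-1024", k=4, du=11, dv=5),
]


def encapsulation_key_size(p: MlKemParams) -> int:
    # k polynomials of 384 bytes each, plus a 32-byte seed
    return 384 * p.k + 32


def decapsulation_key_size(p: MlKemParams) -> int:
    # secret vector (384k) + encapsulation key (384k + 32)
    # + 32-byte hash of the encapsulation key + 32-byte rejection value
    return 768 * p.k + 96


def ciphertext_size(p: MlKemParams) -> int:
    # 256 coefficients per polynomial, du bits for each of the k
    # polynomials in u and dv bits for v: 256 * (du*k + dv) / 8 bytes
    return 32 * (p.du * p.k + p.dv)


for p in PARAMS:
    print(
        p.name,
        encapsulation_key_size(p),
        decapsulation_key_size(p),
        ciphertext_size(p),
    )
```

In a key exchange, one side sends the encapsulation key and the other replies with the ciphertext, so a full ML-KEM-768 exchange transfers 1184 + 1088 = 2272 bytes.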

The benchmark was run on the Trezor Model T. I don't have a way to measure RAM usage, but I had to increase the size of the C stack from 16 KB to 32 KB. The implementation size includes a SHA-3 implementation that takes 5.5 kB and is probably a duplicate of existing code.

EDITED: Thanks to @romanz, I added stack usage.


romanz (Contributor) commented Feb 21, 2025

Maybe #4595 can help with estimating the stack usage?

romanz and others added 4 commits February 21, 2025 14:40
By zeroing the stack memory before the workflow runs,
we can estimate how much of it has been used (by reading
the stack memory and looking for the first non-zero value).

[no changelog]
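
The stack-painting idea described in the commit message can be illustrated with a small simulation. This is only a sketch of the technique, not the code from the commits: the stack is modelled as a `bytearray`, it is assumed to grow downward from the highest address, and all names (`paint`, `run_workload`, `estimate_usage`) are hypothetical.

```python
STACK_SIZE = 32 * 1024  # bytes; matches the enlarged C stack mentioned above


def paint(stack: bytearray) -> None:
    """Zero the whole stack region before the workload runs."""
    stack[:] = bytes(len(stack))


def run_workload(stack: bytearray, depth: int) -> None:
    """Simulate a workload that touches the top `depth` bytes.

    Assumption: the stack grows downward, so the used part occupies
    the highest addresses of the region.
    """
    start = len(stack) - depth
    stack[start:] = b"\xA5" * depth


def estimate_usage(stack: bytearray) -> int:
    """Scan from the lowest address for the first non-zero byte.

    Everything from that byte up to the top of the region is
    counted as used.
    """
    for offset, value in enumerate(stack):
        if value != 0:
            return len(stack) - offset
    return 0


stack = bytearray(STACK_SIZE)
paint(stack)
run_workload(stack, depth=25_000)
print(estimate_usage(stack))
```

The estimate is a lower bound: if the deepest bytes the workload wrote happen to be zero, or if stack space was reserved but never written, the scan stops short of the true maximum depth.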