Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed length list support #1992

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

cpetig
Copy link
Contributor

@cpetig cpetig commented Feb 2, 2025

Implement fixed length list support as defined in WebAssembly/component-model#384

  • wit-parser
  • wasm-encoder
  • wasmparser
  • wasmprinter
  • wat
  • wit-encoder
  • wit-smith

@cpetig cpetig marked this pull request as draft February 2, 2025 16:51
@cpetig cpetig changed the title Draft: Fixed length list support Fixed length list support Feb 2, 2025
@cpetig cpetig force-pushed the fixed-length-list branch from a0c1bc4 to e5a20d0 Compare February 9, 2025 21:57
@cpetig cpetig marked this pull request as ready for review February 9, 2025 23:21
@cpetig
Copy link
Contributor Author

cpetig commented Feb 9, 2025

Fixed size lists became a separate type because they have a different ABI representation and map to different Rust/C types.

Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this! In addition to the comments below, mind adding some more tests? For example:

  • WIth a feature gate there should be a test that using this type is invalid without the feature gate (e.g. tests/local/missing-features)
  • Tests for the *.wat syntax in tests/local/component-model/*. This also tests round-trip parsing and such
  • Tests for list<ty, 10000000000000000> (something out of the 64-bit range but still looks integer-like)

Also with the new support in wit-smith (thanks!) mind running the fuzzer for a few minutes locally to ensure it doesn't turn up anything? (FUZZER=roundtrip_wit cargo +nightly fuzz run -s none run -j30)

@@ -1134,6 +1136,9 @@ impl ComponentDefinedType {
lowered_types,
),
Self::List(_) => lowered_types.push(ValType::I32) && lowered_types.push(ValType::I32),
Self::FixedSizeList(ty, length) => {
(0..*length).all(|_n| ty.push_wasm_types(types, lowered_types))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To reference a comment from above, this is where I'd be worried that if an arbitrary length was allowed (e.g. u32::MAX) then a tiny component could take a very long time here in validation which we try to avoid. (e.g. it's an DoS attack vector)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hesitated about limits because I also have support for efficient storage of AI tensors or image data in mind, but of course with these you want the outermost list a normal list to avoid expanding all elements on the stack.

On the other hand if you nest fixed width lists of a maximum size of 1000 you can still deflate bomb types here, perhaps a global fuel like argument/result limit makes sense.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense to me yeah. My basic worry is loops like this (there's a few throughout this PR) which run a risk of taking time proportional to N which is not a property validation/processing otherwise has at this time. I think a larger limit like 128k or something may be ok (probably coupled with fuel) but I suspect we probably shouldn't allow u32::MAX regardless

@@ -91,7 +91,7 @@ enum Key {
Flags(Vec<String>),
Tuple(Vec<Type>),
Enum(Vec<String>),
List(Type),
List(Type, Option<usize>),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to other changes, mind splitting this to its own FixedSizeList?

@alexcrichton
Copy link
Member

Also, as a possible bikeshed, which you can take or leave, in Rust at least there's the term "slice" which is &[T] which is sort of like a WIT "list", and in Rust the term for [T; N] is "array" which could possibly be used here instead of "fixed size list". It's easy to confuse "list" and "array" though and harder to confuse "fixed sized list" and "list". The downside of "fixed size list" is that it's a bit of a mouthful to say and type

@cpetig
Copy link
Contributor Author

cpetig commented Feb 11, 2025

@alexcrichton do you prefer me to resolve the ones I implemented or do you want to check first?

I will implement wat roundtrip tests later today (I didn't find them although I did one manually) and think about limiting on-stack elements.

@cpetig
Copy link
Contributor Author

cpetig commented Feb 11, 2025

Also with the new support in wit-smith (thanks!) mind running the fuzzer for a few minutes locally to ensure it doesn't turn up anything? (FUZZER=roundtrip_wit cargo +nightly fuzz run -s none run -j30)

It found something ;-)
Fixed size lists require the component model fixed size list feature (at offset 0x26)
so I need to figure out where to activate this feature inside the fuzzer.

@cpetig
Copy link
Contributor Author

cpetig commented Feb 11, 2025

I need to figure out where to activate this feature inside the fuzzer.

The default Validator used for the component encoder has this feature disabled (as it is correctly disabled by default), I don't see an easy way to only activate it for the WIT fuzzing, is there a secret environment variable?

@cpetig
Copy link
Contributor Author

cpetig commented Feb 11, 2025

Thanks for this!

Thank you for your in-depth review!

* Tests for `list<ty, 10000000000000000>` (something out of the 64-bit range but still looks integer-like)

I tried but wasn't able to figure out the right way to set up this test: https://github.com/bytecodealliance/wasm-tools/pull/1992/files#diff-1532b9f29c0a59c8d44ce87b39476ec5242d1b2a2329e0b72bf7b2f23c2ae485R24-R38

@alexcrichton
Copy link
Member

do you prefer me to resolve the ones I implemented or do you want to check first?

Feel free to do so yourself, I'll give the code a once-over anyway once it's ready too.

The default Validator used for the component encoder has this feature disabled (as it is correctly disabled by default), I don't see an easy way to only activate it for the WIT fuzzing, is there a secret environment variable?

For this I'd recommend in the various locations validation is performed for tests/etc to just enable the new feature unconditionally. Adding for example Validator::new_with_features(WasmFeatures::all()) I think is reasonable since the validation in the tooling is mostly test assertions and we want to exercise it all during fuzzing anyway.

I tried but wasn't able to figure out the right way to set up this test

Ah yeah it's a lexical error right in *.wat if the integer literal is out-of-bounds so I think it's reasonable to skip the *.wat test, but I think *.wit may want to be tested? It might also be reasonable to just leave the *.wat test around to ensure that some error message comes out, and it'd use assert_malformed I think or whatever directive supports module quote ...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants