Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Catching traps #15

Closed
pepyakin opened this issue Aug 16, 2018 · 7 comments
Closed

Catching traps #15

pepyakin opened this issue Aug 16, 2018 · 7 comments

Comments

@pepyakin
Copy link
Contributor

pepyakin commented Aug 16, 2018

Ref #14

We need to catch traps generated by page faults, ud2 and probably others (e.g. div by zero exceptions, but I'm not familiar how they are handled in cranelift).

As far as I know, we need to use signals on unix-like platforms. I have no idea how to handle these cases on other platforms (and even what platforms we would like to support at all).

I wonder can we provide this functionality out-of-box? Or should we require to setup all machinery from the user and just provide means to, for example, lookup trap codes?

@sunfishcode
Copy link
Member

Yeah. The TrapSink interface allows for the creation of a map from PC addresses to wasm trap ID and bytecode offset, but it does require that signals be caught and have access to the saved state. SpiderMonkey does this, on all its supported platforms, so it is doable.

So we have a few options here. One is to start building up a signal handling library and handling the signals. I imagine we'd start by just exiting the process cleanly, which shouldn't be too complex. We could then incrementally work on printing out the trap code and/or bytecode offset, or further, unwinding the stack and allowing the embedder to recover, which are doable, but more work.

Another would be to add a feature to cranelift for calling a designated callback when a trap would otherwise occur. This would make the generated code bigger, and preclude the heap guard optimizations and require explicit bounds checks on all heap accesses, but it would make it easier to embed wasmtime in environments where signals aren't available.

@pepyakin
Copy link
Contributor Author

pepyakin commented Aug 16, 2018

Hm, correct me if my thinking is too naive, but is it that difficult to start with signals with unwinding right away?

I thought that it is as simple as:

  1. before jumping to the generated code in execute call setjmp
  2. set the signal handler for SIGILL (for ud2), SIGBUS and SIGSEGV
  3. in the handler check if the signal comes from the generated code, translate signal info to trapcode and fetch location etc. Save that info somewhere, possibly fetch somehow vmctx or store it thru the thread_local (if that's signal safe enough ¯_(ツ)_/¯)
  4. longjmp back to execute. Return trap info as Err

would that work or am I missing something?

@sunfishcode
Copy link
Member

Interesting idea. I don't know how reliable longjmp from signal handlers is on various platforms these days. I believe does work on at least some though, so I wouldn't be opposed to having that as an option.

If we ever allow wasm to call into arbitrary native code and vice versa, we'd probably want to do a new setjmp each time we call back into the wasm code, so that we don't unwind through native Rust code with setjmp, but that's doable.

In the future, another option would be to do an unwind. Cranelift doesn't yet support .eh_frame or other metadata needed by native unwinders, but if we added that, then we could do a plain unwind all the way through both wasm and native code at once.

@pepyakin
Copy link
Contributor Author

pepyakin commented Aug 16, 2018

I don't know how reliable longjmp from signal handlers is on various platforms these days. I believe does work on at least some though, so I wouldn't be opposed to having that as an option.

Ha! Just implemented (or rather hacked :) ) setting a signal handler for SIGILL and catching unreachable/ud2 on a macOS machine and it works! I think, there is a good chance that it will work on Linux as well. Theoretically I can also test on Windows machine.

Regarding the second option: is it actually safe? For example, what if that native code wasn't compiled with unwind metadata?

@sunfishcode
Copy link
Member

Fun!

For unwinding, yeah, that may require all native code to be unwindable. On x86-64 System-V ABIs, LLVM and GCC both emit .eh_frame sections for all code, including C code. Other unwinders also have the ability to at least follow frame pointers. Ultimately we'd have to check each platform to see what's supported, but it'd be an option, and it's one we might need to explore eventually anyway when wasm gets support for EH.

@pepyakin
Copy link
Contributor Author

I've verified that my code runs correctly on the linux machine!

However, I've ran into a problem: it's unclear how to read and write data from the signal handler. We need to read data for longjmp use and we need to write data to pass illegal instruction address or faulting address back to execute to translate this data to useful form.

It turned out that thread_local is actually not safe enough (rust-lang/rust#43146), and we can't use global variables for that since it's not thread-safe.
Also, it is not clear to me how to get vmctx without resorting to black magic.

How does SpiderMonkey solve this issue?

@sunfishcode
Copy link
Member

The short answer is that SpiderMonkey doesn't use Rust's std::thread_local to do this.

I don't know of a way to do this in safe Rust. With unsafe code, we could have a global (not thread-local) variable hold a raw vmctx pointer. It can be statically initialized to null, and assigned the vmctx value before we call into any JIT code. Then, the signal handler can read it, and if it's null, it means we're not in JIT code. If it's non-null, then it points to a data structure where we can keep the known ranges of JIT code and use them to determine if that's where the fault happened.

And we can have variations on that if we want to support multiple wasmtime instances in the same process.

syrusakbary added a commit to wasmerio/wasmer that referenced this issue Nov 6, 2018
sunfishcode added a commit to sunfishcode/wasmtime that referenced this issue Nov 27, 2018
This adds signal handlers based on SpiderMonkey's signal-handler code.
The functionality for looking up the trap code and wasm bytecode offset
isn't yet implemented, but this is a start.

I considered rewriting this code in Rust, but decided against it for now
as C++ allows us to talk to the relevant OS APIs more directly.

Fixes bytecodealliance#15.
sunfishcode added a commit to sunfishcode/wasmtime that referenced this issue Nov 27, 2018
This adds signal handlers based on SpiderMonkey's signal-handler code.
The functionality for looking up the trap code and wasm bytecode offset
isn't yet implemented, but this is a start.

I considered rewriting this code in Rust, but decided against it for now
as C++ allows us to talk to the relevant OS APIs more directly.

Fixes bytecodealliance#15.
sunfishcode added a commit that referenced this issue Nov 27, 2018
* Implement wasm trap handlers.

This adds signal handlers based on SpiderMonkey's signal-handler code.
The functionality for looking up the trap code and wasm bytecode offset
isn't yet implemented, but this is a start.

I considered rewriting this code in Rust, but decided against it for now
as C++ allows us to talk to the relevant OS APIs more directly.

Fixes #15.

* Compile with -std=c++11.

* Refactor InstallState initialization.

* Compile with -fPIC.

* Factor out the code for calling a wasm function with a given index.

* Fix unclear wording in a comment.
kubkon added a commit to kubkon/wasmtime that referenced this issue Mar 11, 2020
* Draft out IntDatatype in wiggle-generate

This commit drafts out basic layout for `IntDatatype` structure in
`wiggle`. As it currently stands, an `Int` type is represented as
a one-element tuple struct much like `FlagDatatype`, however, with
this difference that we do not perform any checks on the input
underlying representation since any value for the prescribed type
is legal.

* Finish drafting IntDatatype support in wiggle

This commit adds necessary marshal stubs to properly pass `IntDatatype`
in and out of interface functions. It also adds a basic proptest.
grishasobol pushed a commit to grishasobol/wasmtime that referenced this issue Nov 29, 2021
Fuzz loading/validation against wabt.
pchickey pushed a commit to pchickey/wasmtime that referenced this issue May 16, 2023
This updates CI to upload a `wasi_snapshot_preview1.wasm` release
artifact from CI. One is stored per CI job in case it needs to be
inspected, and additionally a github action is used to maintain a
"latest" release for pushes to `main` to ensure that the latest copy of
`wasi_snapshot_preview1.wasm` is available.

I haven't thought much about official releases or tags or things like
that but figured that this was a good starting place at least.
dhil pushed a commit to dhil/wasmtime that referenced this issue Sep 29, 2023
This PR removes the libcall `cont_obj_has_state_invoked` and replaces it with a direct implementation.

To this end, we derive `IntoPrimitive` and `TryFromPrimitive` for the `State` enum, using the `num_enum` crate (already used elsewhere within wasmtime), to convert between `State` and integers.
mooori pushed a commit to mooori/wasmtime that referenced this issue Dec 20, 2023
…nce#15)

* zkasm: rename some types and structures away from RV64

* zkASM: remove the unwinding code

* Revert "zkasm: split AdjustSp to separate variants for Reserve and Release stack space (bytecodealliance#14)"

This reverts commit 0798ff7.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants