Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MSVC workarounds #1061

Merged
merged 19 commits into from
Oct 11, 2023
Merged

MSVC workarounds #1061

merged 19 commits into from
Oct 11, 2023

Conversation

vasama
Copy link
Contributor

@vasama vasama commented Sep 2, 2023

With these changes stdexec now builds and tests pass using the latest Visual Studio preview release as well as using Clang-CL.

@copy-pr-bot
Copy link

copy-pr-bot bot commented Sep 2, 2023

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@vasama
Copy link
Contributor Author

vasama commented Sep 2, 2023

I'll have to see about adding MSVC CI before this is merged.

@ericniebler
Copy link
Collaborator

/ok to test

@ericniebler
Copy link
Collaborator

I like all of this with the exception of the coroutine hack. It's an impressive hack, but it comes at a high cost, doubling the number of coroutines created. I'm curious how other coroutine libraries like cppcoro deal with this.

IIUC, this issue only effects at_coroutine_exit() and the other scope_guard-like features (on_coroutine_[succeeded|stopped|failed]). For now, I think I'd prefer stdexec simply didn't offer that functionality on msvc. Or offered it with an opt-in config macro.

@maikel
Copy link
Collaborator

maikel commented Sep 6, 2023

This is amazing! Thank you for the wonderful work

@maikel
Copy link
Collaborator

maikel commented Sep 6, 2023

Can we have a ci job for msvc without blocking PRs if it fails? Or just support msvc from now on?

@vasama
Copy link
Contributor Author

vasama commented Sep 6, 2023

I like all of this with the exception of the coroutine hack.

Fair enough. I wasn't too worried about performance for a temporary compiler bug workaround. How do you feel about this alternative solution which only creates a single extra coroutine per thread?

namespace stdexec {
  // MSVCBUG https://developercommunity.visualstudio.com/t/Incorrect-codegen-in-await_suspend-aroun/10454102

  // MSVC incorrectly allocates the return buffer for await_suspend calls within the suspended coroutine
  // frame. When the suspended coroutine is destroyed within await_suspend, the continuation coroutine handle
  // is not only used after free, but also overwritten by the debug malloc implementation when NRVO is in play.

  // This workaround delays the destruction of the suspended coroutine by wrapping the continuation in another
  // coroutine which destroys the former and transfers execution to the original continuation.

  // The wrapping coroutine is thread-local and is reused within the thread for each destroy-and-continue sequence.
  // The wrapping coroutine itself is destroyed at thread exit.

  namespace __destroy_and_continue_msvc {
    struct __task {
      struct promise_type {
        __task get_return_object() noexcept {
          return { __coro::coroutine_handle<promise_type>::from_promise(*this) };
        }

        static std::suspend_never initial_suspend() noexcept {
          return {};
        }

        static std::suspend_always final_suspend() noexcept {
          return {};
        }

        static void return_void() noexcept {
          STDEXEC_ASSERT(!"Should never get here");
        }

        static void unhandled_exception() noexcept {
          STDEXEC_ASSERT(!"Should never get here");
        }
      };

      __coro::coroutine_handle<> __coro_;
    };

    struct __continue_t {
      static constexpr bool await_ready() noexcept {
        return false;
      }

      __coro::coroutine_handle<> await_suspend(__coro::coroutine_handle<> __h) noexcept {
        return __continue_;
      }

      static void await_resume() noexcept {
      }

      __coro::coroutine_handle<> __continue_;
    };

    struct __context {
      __coro::coroutine_handle<> __destroy_;
      __coro::coroutine_handle<> __continue_;
    };

    inline __task __co_impl(__context& __c) {
      while (true) {
        co_await __continue_t{ __c.__continue_ };
        __c.__destroy_.destroy();
      }
    }

    struct __context_and_coro {
      __context_and_coro() {
        __context_.__continue_ = __coro::noop_coroutine();
        __coro_ = __co_impl(__context_).__coro_;
      }

      ~__context_and_coro() {
        __coro_.destroy();
      }

      __context __context_;
      __coro::coroutine_handle<> __coro_;
    };

    inline __coro::coroutine_handle<> __impl(
      __coro::coroutine_handle<> __destroy,
      __coro::coroutine_handle<> __continue) {
      static thread_local __context_and_coro __c;
      __c.__context_.__destroy_ = __destroy;
      __c.__context_.__continue_ = __continue;
      return __c.__coro_;
    }
  } // namespace __destroy_and_continue_msvc
} // namespace stdexec

#define STDEXEC_DESTROY_AND_CONTINUE(__destroy, __continue) \
  (::stdexec::__destroy_and_continue_msvc::__impl(__destroy, __continue))
#else
#define STDEXEC_DESTROY_AND_CONTINUE(__destroy, __continue) \
  (__destroy.destroy(), __continue)
#endif

@vasama
Copy link
Contributor Author

vasama commented Sep 6, 2023

That nvc++ CI failure looks like it uncovered a compiler bug. I couldn't quickly reproduce it using CE.

@ericniebler
Copy link
Collaborator

How do you feel about this alternative solution which only creates a single extra coroutine per thread?

I feel awed. I'll take it. Ty!

Can we have a ci job for msvc without blocking PRs if it fails?

I would also like this if it's possible.

@ericniebler
Copy link
Collaborator

/ok to test

@vasama
Copy link
Contributor Author

vasama commented Sep 9, 2023

Looks like the GitHub Windows images don't provide the Visual Studio preview version out of the box. The next minor release probably won't happen for a few months at least. I'll see if i can work around the relevant bugs using the Visual Studio release version.

@vasama
Copy link
Contributor Author

vasama commented Sep 10, 2023

It looks like making this work in the current VS release version would be a lot more work. The next release will probably be some time around november. I'm okay with waiting for that before merging this. Many of these workarounds should also become unnecessary at that time as the compiler bugs i reported are fixed.

@ericniebler
Copy link
Collaborator

/ok to test

@ericniebler
Copy link
Collaborator

@vasama do you have any objections to my merging this now instead of waiting until November? I don't want to have to keep fixing merge conflicts.

@vasama
Copy link
Contributor Author

vasama commented Oct 10, 2023

No objections if that's what you prefer. I was going to just fix the conflicts and add any new workarounds that might be needed when the next VS version is released. But i'm also fine with merging this and making another PR when that time comes.

@ericniebler
Copy link
Collaborator

/ok to test

@ericniebler ericniebler merged commit fb6a5e0 into NVIDIA:main Oct 11, 2023
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants