-
Notifications
You must be signed in to change notification settings - Fork 173
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
exec::reduce including a customization for static_thread_pool #992
base: main
Are you sure you want to change the base?
Conversation
Pull requests from external contributors require approval from a |
/ok to test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this looks really great, thanks Andre. A few nits, and needs some more type-checking.
auto reducer = [&](auto& input) { | ||
const auto [first, last] = even_share(std::ranges::size(input), tid, total_threads); | ||
if (first != last) { | ||
auto begin = std::ranges::begin(input) + first; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where do we require that the input range is random-access? Is that over-constraining?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know that I still need to add more type checking.
If it were up to me I would constrain it to random access ranges. On the other hand std::reduce
requires only input iterators or forward iterators for the execution policy overloads. I don't know if anyone would actually expect exec::reduce
to work with an input range.
include/exec/static_thread_pool.hpp
Outdated
// deallocate | ||
sh_state.partials_ = {}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This won't actually free any memory; it'll just set the size to zero. If you really need to free the memory early, the surest way is to swap it with a default-constructed vector.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't believe I fell for this age-old trap.
sh_state.apply(finalize); | ||
} | ||
}; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't appear to be handling exceptions that may be thrown from the reduction function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know. I left exception handling out, just wanted stuff to work first. I'll add it.
_InitT, | ||
_RedOp>) | ||
&& (!stdexec::tag_invocable<reduce_t, _Sender, _InitT, _RedOp>) | ||
STDEXEC_DETAIL_CUDACC_HOST_DEVICE __sender<_Sender, _InitT, _RedOp> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should require that _RedOp is callable with (_InitT, _InitT), right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right.
template <stdexec::__decays_to<__t> _Self, class _Env> | ||
friend auto tag_invoke(stdexec::get_completion_signatures_t, _Self&&, _Env&&) | ||
-> stdexec::completion_signatures<stdexec::set_value_t(_InitT)> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point, we have all the type information we need in order to do full type-checking. We can and should check that the sender can only complete one way, with a single value that is a range whose element type is the same as _InitT.
We also know whether the reduction function is potentially throwing, and if it is, to add set_error_t(exception_ptr)
as a possible completion signature.
If you're feeling ambitious, you can see how an adaptor like then
detects when the types don't match and tries to return a type that describes the error -- or propagates type errors from inner senders. If that's too much, don't sweat it. We can improve diagnostics in a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've seen the error handling in other sender adaptors. I think I should add it as part of this PR.
template <class T> | ||
struct dummy_ { | ||
using __t = T; | ||
}; | ||
|
||
using inrange_t = // | ||
stdexec::__decay_t<stdexec::__t<stdexec::__t< stdexec::__value_types_of_t< | ||
Sender, | ||
stdexec::env_of_t<Receiver>, | ||
stdexec::__q<dummy_>, | ||
stdexec::__q<dummy_>>>>>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is equivalent:
template <class T> | |
struct dummy_ { | |
using __t = T; | |
}; | |
using inrange_t = // | |
stdexec::__decay_t<stdexec::__t<stdexec::__t< stdexec::__value_types_of_t< | |
Sender, | |
stdexec::env_of_t<Receiver>, | |
stdexec::__q<dummy_>, | |
stdexec::__q<dummy_>>>>>; | |
using inrange_t = // | |
stdexec::__decay_t< | |
stdexec::__value_types_of_t< | |
Sender, | |
stdexec::env_of_t<Receiver>, | |
stdexec::__q<stdexec::__midentity>, | |
stdexec::__q<stdexec::__midentity>>>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I still need to get a better overview over what metaprogramming helpers exist.
#include "../stdexec/execution.hpp" | ||
|
||
#include <numeric> | ||
#include <ranges> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's ok for it to be an error to include exec/reduce.hpp if the stdlib doesn't support <ranges>
, but the reduce test should simply not run in that case instead of causing a test failure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed
template < | ||
stdexec::__same_as<stdexec::set_value_t> _Tag, | ||
class _Range, | ||
class _Value = stdexec::range_value_t<_Range>> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::ranges::range_value_t
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops, that's a leftover I forgot to remove
e88b0fe
to
cc964a1
Compare
/ok to test |
This adds
exec::reduce
exec::reduce
for running onexec::static_thread_pool
's schedulerThis does not (yet) introduce a common customization point with
nvexec::reduce
mainly because this would require a common default reduction operation (nvexec::reduce
usescub::Sum
by default whileexec::reduce
usesstd::plus<>
).If this is desired then one should probably also provide an entire set of common reduction operations like min, max, sum, product, etc.