-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix map! undefined value exposure #56673
base: master
Are you sure you want to change the base?
Conversation
pretty sure build failures are unrelated |
This looks like a simple typo: Line 3466 in a17db2b
That's not how |
typo aside, it's checking the wrong thing. that's checking that literally all the indices are equal.
but for a correct implementation to match the docstring, I think this is one of those cases where Julia skipped directly to "make it fast" before "make it right" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for addressing this significant issue. Much appreciated. A related issue that you might want to be aware of, but don't need to fix in this PR is #56696.
I agree that it is acceptable to take some performance regressions in exchange for fixing this bug. I have some refactoring suggestions to make it a bit simpler.
I'm planning to run benchmarks once we agree on an implementation, but if you want me to run them right away instead I'd be happy to do that.
base/abstractarray.jl
Outdated
idxs1 = eachindex(LinearIndices(As[1])) | ||
@boundscheck begin | ||
idxs1 = intersect(map(eachindex ∘ LinearIndices, As)...) | ||
checkbounds(dest, idxs1) | ||
end | ||
for i = idxs1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
idxs1 = eachindex(LinearIndices(As[1])) | |
@boundscheck begin | |
idxs1 = intersect(map(eachindex ∘ LinearIndices, As)...) | |
checkbounds(dest, idxs1) | |
end | |
for i = idxs1 | |
idxs = mapreduce(LinearIndices, intersect, As) | |
checkbounds(dest, inds) | |
for i = idxs |
- I don't see why we need
eachindex
is here - We should only include bounds-check elision if benchmarks reveal that it makes a difference. In this case, we would also need to add
@propagate_inbounds
tomap!
to make it take effect. - I think that
mapreduce(LinearIndices, intersect, As)
is more elegant thanintersect(map(LinearIndices, As)...)
, unless there is some performance reason to use the latter. - If always compute the intersection of indices then
inds
is a better name thaninds1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you for the comments!
the eachindex
I added for the following reason
julia> mapreduce(LinearIndices, intersect, [A, B])
3-element Vector{Int64}:
1
2
3
julia> mapreduce(eachindex ∘ LinearIndices, intersect, [A, B])
Base.OneTo(3)
so I figured that if we can hit fast-path intersection for ranges rather than collecting the indices we should. Another solution might be to add a fast intersect(::LinearIndices, ::LinearIndices)
?
the mapreduce
formulation indeed looks nicer to me as well.
- If always compute the intersection of indices then
inds
is a better name thaninds1
Agreed! but I suppose we should decide if in the first place @inbounds
should allow skipping the check and iterate through the indices of the first argument? I think this is a decision on semantics rather than implementation that I am not empowered to make. the docs should probably be updated if this is intended
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the
eachindex
I added for the following reason
Good reason.
I suppose we should decide if in the first place
@inbounds
should allow skipping the check and iterate through the indices of the first argument?
@inbounds
should never expose documented semantics that are otherwise not present. This is because you can't count on bounds checks being removed. For exmaple, Julia could be run with -check-bound=yes
or the boundschecking might not be inlined into the callsite where @inbounds
is present. In either case, the bounds checks will remain. Consequently I think the answer to that question is no: @inbounds
should not change the semantics other than to skip bounds checks.
test/abstractarray.jl
Outdated
@@ -910,6 +910,9 @@ include("generic_map_tests.jl") | |||
generic_map_tests(map, map!) | |||
@test_throws ArgumentError map!(-, [1]) | |||
|
|||
# Issue #30624 | |||
@test map!(+, [0,0,0], [1,2], [10,20,30], [100]) == [111,0,0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you also add a test for #36235 (comment), please?
…eted_inplace_map_idxs
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously, map!(+, ones(1), ones(2, 2))
and map!(+, ones(1), ones(2, 2), ones(2, 2))
worked and this makes them throw (as the docstring indicates they should).
Performance is, at a glance, sometimes slightly better and sometimes slightly worse than it was on master. Correctness first, performance optimization second so I'm not too concerned.
I like that the implementation for map_n!
is a bit simpler now and that it clearly aligns with the implementations of 1-arg and 2-arg map.
@@ -910,6 +910,26 @@ include("generic_map_tests.jl") | |||
generic_map_tests(map, map!) | |||
@test_throws ArgumentError map!(-, [1]) | |||
|
|||
function test_30624() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not a testset?
Unfortunately we have to deal with them before merging because otherwise existing, valid, code that uses OffsetArrays will break. For example, julia> R = fill(0, -4:-1);
julia> B = OffsetArray(reshape(1:12, 4, 3), -5, 6);
julia> minimum!(R, B)
4-element OffsetArray(::Vector{Int64}, -4:-1) with eltype Int64 with indices -4:-1:
1
2
3
4 This is a test failure from CI which fails because the implementation of julia> x = fill(0, 1:3);
julia> y = fill(1, 2:4);
julia> map!(identity, x, y)
3-element OffsetArray(::Vector{Int64}, 1:3) with eltype Int64 with indices 1:3:
1
1
1 @nanosoldier |
I guess we'll have to iterate through destination idxs separately to fix the
I could be wrong but this seems more likely to hurt performance. Would there be a meaningful difference here between iterating through |
The package evaluation job you requested has completed - possible new issues were detected. |
Oh, wow! Nanosoldier looks clean. Rerunning failures to be sure: @nanosoldier |
Given how nice PkgEval is looking, another option would be to fix the
I doubt it, but hopefully that won't be necessary. If we can implement it in a way that isn't practically breaking, I'd love to make |
I'm finding all the indirection in but I guess the "real" issue is in
in your example, we have
so then
on |
The package evaluation job you requested has completed - possible new issues were detected. |
at the very least, this PR would swap "bad" bugs (segfaults, undefined behavior) for "regular" bugs (something throwing a |
hopefully more targeted / less disruptive than #50647 , would (at least partially) fix #36235
may need extra documentation explaining that with this implementation,
@inbounds map!(dest, As)
will always iterate through the indices ofAs[1]
regardless of the other sizestbh, the existing implementation is pretty fundamentally incorrect and does not match the behavior explicitly described in the docstring, so I think that this PR (or #50647 , or something else similar) should merge regardless of benchmarks impact. however, I would appreciate if someone could trigger a benchmark run just so we can see the impact :)