-
Notifications
You must be signed in to change notification settings - Fork 169
Fixing benchmarks with overflows from nla-digibench #1155
Conversation
The program has an overflow so the verdict should be true.
@divyeshunadkat In the future, we would like to maintain the original source files as .c files, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commit 21cbb7d introduces a problem:
"no-overflow" is the specification, that is, the specification requires that the program does not have an overflow.
Since the program has an overflow, this specification will be violated, which means the expected verdict is false
.
Could you please revert this last commit (or correct me if I misunderstood).
@dbeyer The original c files are added which do not require pre-processing. Hence, I have reverted the check for pre-processing in the comment. Do let me know if I have understood this correctly and if any other changes are needed. |
@divyeshunadkat can you please explain me why the new version of the programs (using unsigned int) cannot have an overflow? While you now have one extra bit, you are using I think the same kind of problem is present in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as I commented, I don't think the changes guarantee no-overflow
@hernanponcedeleon I would also like to specify the reachability property with an appropriate verdict if possible when the program has only well-defined behaviors. I believe that irrespective of the overflow issue the reachability property can be specified along with an appropriate verified for all the benchmarks, due to the papers: [1], [2] , [3] However, in order to adhere to the sv-comp rules for benchmarks with user-specified properties which contain overflows, and to be sensitive to the participating tools, I am trying to make these changes as per suggestions. I understand, from the issue #1166 , that you are also working towards correcting the task definitions and verdicts of programs with overflows to adhere to sv-comp rules and your comments there are helpful. Previous year, I too had given an attempt to removing the overflow from several benchmarks in nla-digibench but wasn't able to and am still trying to do so. Please do let me know if there are any issues in my reasoning so that I can make appropriate changes to the benchmarks in this PR. |
This is only partially true. While you are right that the expected result is wrong for the no-overflow property, I'am also concerned (and thus #1166) that we have benchmarks with overflow participating in the reachability category. I know the CI will complain if it encounters a benchmarks with the no-data-race property set to false and any expected behaviour for reachability. In principle this check should be performed for any property that implies undefined behaviour, e.g. no-overflow, but I'm not sure this is the case. Maybe @dbeyer can comment on this. I think the main point is that any benchmarks with overflow will result in undefined behaviour and thus we cannot claim any expected result for reachability. My other concern (again raised in #1166) is what to do with benchmarks that are currently in the reachability category but we discover they have undefined behaviour (overflow in this case). In the ideal world we would keep a copy of the benchmark, use it for the corresponding category (e.g. no-overflow, no-data-race) and solve the problem in the benchmark for reachability, but as you said (and I experience the same during the last week trying to fix the benchmarks I reported in #1166), this is not always easy. |
The commit b3fb86c fixes the overflow and hence in commit 7b81d2a I have changed the reachability verdict to true, thus restoring the program complexity such that the assertions in the program hold. I believe that this fix can be applied to most benchmarks in nla-digibench (#1166 ) as well as other benchmarks with overflow (#1159 ). |
The changes look fine to me and I think this PR can be merged. |
If there is an overflow of an An overflow of an @hernanponcedeleon If you see overflows of Programs with undefined behavior are only good to verify that the program has undefined behavior; other properties do not make sense. @divyeshunadkat I think that your latest commits are unnecessary because the overflow was for |
Thanks for the clarification, I was not aware of this. However this might impact the verification, so we should be cautious:
would be TRUE for the reachability category. I guess this is the kind of this the BitVector sub-category is dealing with.
I'm not sure I fully understand. |
The latest commits ( b3fb86c and 7b81d2a ) make the assertion to hold with the full complexity of the program and remove all the overflows. Without the change in b3fb86c , the values of variables can still wrap around and violate the property. This is what was requested in #1105 and something that we have been aspiring to for all the programs with overflows in nla-digibench and other folders as well. |
One possibility would be to keep three benchmarks
|
Sure. Will do. Can I also make these changes for the other benchmarks in nla-digibench as mentioned in #1166 ? |
Sure. It might be that for some benchmarks (I suspect |
Modifications in lcm2.c and hard-u.c
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there are still problems with some of the benchmarks.
bresenham-ll
: there are still overflows in lines 33,39,41,46. Applying the same solution as in line 28 should work.egcd2-ll
: it hasint y
ands*y*y
which can overflowgeo3
: it hasint z
,int a
anda*z*y
which can overflowprod4br-ll
:int x
,int y
andx*y
which can overflowegcd3-ll
: I still suspect there is some problem with this benchmark, but it is hard to explain why. I'm running the verification on the code after using -O3 optimisations. The counter examples I get in the optimised code do not lead to counter examples in the original code. However I suspect this miss match between the result is due to the compiler performing aggressive optimisations due to undefined behaviour. Are you performing any analysis to check there are no overflows or just manually inspecting the benchmark? Also, in the last year, none of the tools managed to obtained a result for this benchmark so it might be just that the expected result is wrong.- hard-u: I don't think the expected result should be false. I imagine you are giving this expected result based on the counter example from Corrected verdict of hard.c (from PR #808) and added its correct version #920, but I don't think that is valid now that we use unsigned. If you still think the expected result should be
false
, can you please provide a counter example?
@hernanponcedeleon Many thanks for the review. Implicit type conversion takes care of the overflows in For |
If by Implicit type conversion you mean that whenever we have an expression where one of the operands is e.g. The problem are non atomic expressions, e.g. Here are some parts of the llvm code generated where the above becomes clear:
this corresponds to
and the casting is correctly converted to a 64-bits multiplication. However
where you can see the multiplication are only 32-bits and thus the overflow is possible. I think the above gives problem in Thanks for the counter example for Note: there was a small typo (fixed already) in my previous comment: the long comment was about |
For completeness: C does not have atomic vs. non-atomic expressions, and there are no intermediate variables involved here. What LLVM does is just an implementation detail of one particular implementation. What the C standard says, however, is that for certain operators (including the arithmetic operators) the "usual arithmetic conversions" (such as converting the operands to the "common real type") are performed before the operator is evaluated. This makes it clear that one has to look at each operator individually, deduce the common real type from its operand types according to §6.3.1.8 of the C standard and then check whether it can overflow. Expressions and in particular whether an operator is part of a larger expression are not relevant in this process. (Source: search for "usual arithmetic conversions" in the C standard) So this does not change the outcome of what you claim, this is just the official reason for it. |
Thanks @PhilippWendler for the clarification. |
@hernanponcedeleon @PhilippWendler Many thanks for your detailed, insightful and well articulated comments. This begs the question that can an llvm pass be implemented that can propagate the type conversions to sub-expressions (if possible in an optimal way)? I believe that such a pass would be useful to many tools that are based on llvm IR, especially when detecting overflows. I resorted to implicit type conversions to retain the readability of these files. |
@hernanponcedeleon @PhilippWendler @dbeyer Are there any other changes required before this PR can be merged? |
I haven't seen any commits after my last comment, so from my point of view, the benchmarks still have problems. |
llvm has a local handling of the implicity casting when generating the type cast instructions in the IR. This commit can be reverted if llvm starts looking at the entire expression as given in a C progam, rather than just the type of the variables involved in the current expression, before generating the needed type casts.
Have pushed the changes with casts for each sub-expression that would require one. |
I still get problems with 3 benchmarks. For I managed to implement solutions for the overflow in For documentation, I explain the problems and solutions here.
|
Can be reverted with same reason as the previous commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Many thanks @hernanponcedeleon for your meticulous reviews. Happy that together we resolved the issues in the benchmarks in a way that is useful to the competition. :) 👍 |
@dbeyer I have fixed the last change that you had requested about specifying the no-overflow property. The C files do not have any pre-processor directives and hence pre-processed files are not added. Kindly let me know if I must still add the pre-processed files or if there are any other changes required. Thanks. |
Created the pull request based on the points and suggestions of @MartinSpiessl in the PR #920 and #1105
programs added to new and appropriately named directory
license present and acceptable (either in separate file or as comment at beginning of program)
contributed-by present (either in README file or as comment at beginning of program)
programs added to a
.set
file of an existing category, or new sub-category established (if justified)intended property matches the corresponding
.prp
fileprograms and expected answer added to a
.yml
file according to task definitions