Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create rchk GHA #6263

Merged
merged 28 commits into from
Aug 5, 2024
Merged

Create rchk GHA #6263

merged 28 commits into from
Aug 5, 2024

Conversation

MichaelChirico
Copy link
Member

@MichaelChirico MichaelChirico commented Jul 15, 2024

Part of #6257.

@MichaelChirico
Copy link
Member Author

This runs surprisingly quickly (3m). Maybe we can just run it every time.

Copy link

github-actions bot commented Jul 22, 2024

Comparison Plot

Generated via commit c5d5241

Download link for the artifact containing the test results: ↓ atime-results.zip

Time taken to finish the standard R installation steps: 11 minutes and 47 seconds

Time taken to run atime::atime_pkg on the tests: 6 minutes and 0 seconds

@MichaelChirico
Copy link
Member Author

@jangorecki indeed rchk did not like our changes in #4386 😆

@MichaelChirico
Copy link
Member Author

@kalibera we're getting hit multiple times now with

"unsupported form of unprotect, unprotecting all variables"

I see that marked as FIXME in rchk, does that mean we should ignore this report, or that the report could be more helpful? (I didn't try reading the rchk code very carefully but couldn't immediately figure out the issue)

https://github.com/kalibera/rchk/blob/8fe3f2769e834cf183577d5ddfa97f9a224fe612/src/freshvars.cpp#L408-L409

Or is there some different recommendation for how we're doing protect/unprotect in those functions (I have pored over several times & don't see anything unusual...)

@MichaelChirico
Copy link
Member Author

MichaelChirico commented Jul 29, 2024

Reading rchk code a bit more, I guess this is just false positive related to hitting the max protect stack depth (64):

https://github.com/kalibera/rchk/blob/8fe3f2769e834cf183577d5ddfa97f9a224fe612/src/freshvars.cpp#L334-L340

So we should ignore it. That said, I don't really see why we're hitting a stack depth issue, we only PROTECT() a few times, particularly in rbindlist() (I can see rchk more easily getting tripped up in fcase(), whose logic is a bit harder to parse statically).

@kalibera
Copy link

@kalibera we're getting hit multiple times now with

"unsupported form of unprotect, unprotecting all variables"

I see that marked as FIXME in rchk, does that mean we should ignore this report, or that the report could be more helpful? (I didn't try reading the rchk code very carefully but couldn't immediately figure out the issue)

The message means that rchk doesn't understand this code, so it cannot check it. I would consider simplifying the code: rchk has been tuned to understand most PROTECT/UNPROTECT patterns present in R and R packages, and making the code simpler for the tool often makes it also simpler cognitively for humans.

The FIXME means that indeed, more patterns that those supported by rchk may be correct.

https://github.com/kalibera/rchk/blob/8fe3f2769e834cf183577d5ddfa97f9a224fe612/src/freshvars.cpp#L408-L409

Or is there some different recommendation for how we're doing protect/unprotect in those functions (I have pored over several times & don't see anything unusual...)

I only had a quick look at your function (in the code being checked by rchk), but I would use a simple protection counter. That is, unprotect using "UNPROTECT(nprotect)" unconditionally and whenever calling PROTECT(), increment also nprotect to reflect the number of pointers protected.

@kalibera
Copy link

Reading rchk code a bit more, I guess this is just false positive related to hitting the max protect stack depth (64):

https://github.com/kalibera/rchk/blob/8fe3f2769e834cf183577d5ddfa97f9a224fe612/src/freshvars.cpp#L334-L340

So we should ignore it. That said, I don't really see why we're hitting a stack depth issue, we only PROTECT() a few times, particularly in rbindlist() (I can see rchk more easily getting tripped up in fcase(), whose logic is a bit harder to parse statically).

I haven't tried to debug this case, but I don't think this is related: whether a form of PROTECT/UNPROTECT is supported is not related to the maximum supported pointer protection stack when modeling which variables are protected.

Yes, one can ignore the cases when rchk doesn't understand the code (for the price of potentially missing some true errors). One can also simplify the code to use the supported forms of PROTECT/UNPROTECT, which is what I would do in this case.

To know exactly what is going on, one needs to look at the assembly (the LLVM IR) of the code being checked, possibly enable diagnostic output in rchk and possibly run in a debugger.

In some cases in the past, new problems about "unsupported form" of UNPROTECT etc appeared after changes in LLVM due to which the LLVM IR looked different from what it has been when the corresponding rchk code was implemented. In such a case it is worth fixing in rchk, if possible. If you ran into such a case and have a reproducible example, I am happy to have a look (or have a look at a patch).

@ben-schwen
Copy link
Member

Interestingly we are running into the same error as arrow does which says that it cannot install the packages:
data.table
arrow

Looks like a bug in rchk?

I also stumbled upon #4625 which suggests everything is fine 🔥

@MichaelChirico
Copy link
Member Author

it cannot install the packages:

Yea, I saw that, possibly the docker image is just under-configured. But for {rchk}, all that matters is the C code builds & can be analyzed, which it does; my read is this is a consequence of this note in the rchk README:

One can reduce the number of required R package dependencies by only installing LinkingTo dependencies of the package and then installing the package with --libs-only option (only shared libraries are built and installed). This is enough to build shared libraries of most but not all packages. Docker and singularity rchk containers for non-interactive use do this, see scripts/utils.r and definitions of the containers for more details.

@kalibera
Copy link

kalibera commented Aug 5, 2024

Interestingly we are running into the same error as arrow does which says that it cannot install the packages: data.table arrow

Looks like a bug in rchk?

In the checks you reference, "arrow" fails to build because libarrow is not available:

 * installing *source* package 'arrow' ...
not using staged install with --libs-only
** using non-staged installation
*** pkg-config found.
*** Latest available nightly for 16.1.0.9000: 16.1.0.100000450
*** Linux binaries incompatible with libc++
*** Proceeding without libarrow (no local source)
------------------------- NOTE ---------------------------
There was an issue building the Arrow C++ libraries.
See https://arrow.apache.org/docs/r/articles/install.html
---------------------------------------------------------

That needs to be analyzed looking at how the arrow package installs (and, as Michael writes, focus on --libs-only).

Re "data.table". First in principle, you might want to have a less stringent check in the container. There is some code that the tool won't understand. In some cases, you probably want to simplify the code, but in some cases it might be too intrusive. You could even have a stoplist for certain functions.

It is not a bug of "rchk" that in some cases it is not sure about the code. Any bug finding tool in principle works like that. In all cases, I would though recommend checking the rchk reports time to time manually. Often one can find true errors by looking into functions which rchk cannot analyze.

Concretely, in these cases, it seems that the code needs more work updating:

Function fcaseR
  [UP] protect stack is too deep, unprotecting all variables, results will be incomplete
  [UP] protect stack is too deep, unprotecting all variables, results will be incomplete
  [UP] protect stack is too deep, unprotecting all variables, results will be incomplete
  [UP] unsupported form of unprotect, unprotecting all variables, results will be incomplete /rchk/packages/build/bCoYRFDw/data.table/src/fifelse.c:402

^^^ this function seems to have some UNPROTECT calls using a constant literal without an update to the protection counter ("nprotect"), so the code doesn't seem correct. I would simply UNPROTECT(nprotect) at all local returns from the function and count all protections by incrementing "nprotect".

Function rbindlist
  [UP] protect stack is too deep, unprotecting all variables, results will be incomplete
  [UP] protect stack is too deep, unprotecting all variables, results will be incomplete
  [UP] unsupported form of unprotect, unprotecting all variables, results will be incomplete /rchk/packages/build/bCoYRFDw/data.table/src/rbindlist.c:556

^^^ there is still a mixture in the code including the protection counter ("nprotect") and the old "listprotect" variable; probably switching fully to "nprotect" would solve the issue. As above, I would increment nprotect at all protections and then UNPROTECT(nprotect) at all local returns.

I also stumbled upon #4625 which suggests everything is fine 🔥

This is yet another case:

ERROR: too many states (abstraction error?) in function strptime_internal
ERROR: too many states (abstraction error?) in function RunGenCollect
ERROR: too many states (abstraction error?) in function rbindlist
ERROR: too many states (abstraction error?) in function freadMain
ERROR: too many states (abstraction error?) in function rbindlist

This is an error during the checking, not an error of the checked code. It means rchk cannot analyze some of these functions because it has already exhausted the number of states it is allowed to use to represent the code. This can happen in some complicated loops (with complicated conditions around memory protection functions), etc - and often in cases when you wouldn't want to rewrite the source code, when it is as said a limited expressiveness of the abstractions used for the analyzed code. One can increase the number of states, but it often wouldn't help. In principle, again, such issues are inevitable in bug finding tools based on static analysis. You might want to have a quick look at these functions (those of them from the package), and it may be that you will find some issues in protection, and it certainly may be that after you fix the issues in rbindlist, the problem would disappear. But it may be some of these won't be fixable, and certainly at least those referring to functions from base R. I would ignore those.

@MichaelChirico
Copy link
Member Author

MichaelChirico commented Aug 5, 2024

It is not a bug of "rchk" that in some cases it is not sure about the code.

ah, of course very much appreciate the deeper review, thank you for your time! but just to clarify, we are referring to the rchk log having "Could not install package data.table" being the putative bug, not the rchk analysis. sorry for the mixup.

@MichaelChirico
Copy link
Member Author

Merging now to get this going on future PRs. More to do here, as noted, but glad this GHA is "prod-ready"

@MichaelChirico MichaelChirico merged commit 846b512 into master Aug 5, 2024
5 checks passed
@MichaelChirico MichaelChirico deleted the rchk-gha branch August 5, 2024 20:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants