Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't use function string matching for do.call(what=) or lapply(FUN=) #6217

Merged
merged 5 commits into from
Jul 8, 2024

Conversation

MichaelChirico
Copy link
Member

@MichaelChirico MichaelChirico commented Jul 3, 2024

Related: #6192. First caught my eye because I know this makes it harder for static analyzers, e.g. any code looking for cbind() usage will have to special-case the possibility that cbind is passed as a string, in an appropriate context, etc.

This is not a terribly big deal for base functions but is best practice when using dependencies since it can be hard to tell if downstreams get broken, statically.

It's also a bit slower -- match.fun() exits immediately when passed a function, takes an extra step when passed a string. That core part of function matching in lapply() is 10x slower. do.call() is roughly the same:

microbenchmark(times=1e6, match.fun("is.na"), match.fun(is.na))
# Unit: nanoseconds
#                expr  min   lq     mean median   uq       max neval cld
#  match.fun("is.na") 6892 7183 8837.238   7290 7696 127587928 1e+06   b
#    match.fun(is.na)  575  677 1073.283    730  832  64672336 1e+06  a 
l=list(1)
microbenchmark(times=1e6, do.call("identity", l), do.call(identity, l))
# Unit: microseconds
#                    expr   min    lq     mean median    uq      max neval cld
#  do.call("identity", l) 1.885 2.010 2.910809  2.094 2.451 42517.85 1e+06   a
#    do.call(identity, l) 1.850 1.966 2.915019  2.049 2.417 77993.08 1e+06   a

PS Shouldn't we use NextMethod(), rather than call the S3 method explicitly?

@MichaelChirico MichaelChirico changed the title Don't use function string matching for is.na method Don't use function string matching for do.call(what=) or lapply(FUN=) Jul 3, 2024
Copy link

github-actions bot commented Jul 3, 2024

Comparison Plot

Generated via commit 6fb1c1f

Download link for the artifact containing the test results: ↓ atime-results.zip

Time taken to finish the standard R installation steps: 12 minutes and 29 seconds

Time taken to run atime::atime_pkg on the tests: 3 minutes and 50 seconds

Copy link
Member

@tdhock tdhock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, thanks

@ben-schwen
Copy link
Member

ben-schwen commented Jul 3, 2024

LGTM, but I'm not sure I would apply it to test cases since some of them might explictly check for working with characters, as e.g., 23c88e7

@MichaelChirico
Copy link
Member Author

LGTM, but I'm not sure I would apply it to test cases since some of them might explictly check for working with characters, as e.g., 23c88e7

Thanks for flagging! Indeed I skipped some test cases that are testing for strings usage specifically. In general you're right that altering test cases can be risky. Those changes I did include are clearly designed to test other things -- if behavior w.r.t. strings is intended, that should get its own specific test. The footprint here is small enough I'm confident we're OK.

@MichaelChirico MichaelChirico merged commit 19a73c7 into master Jul 8, 2024
4 checks passed
@MichaelChirico MichaelChirico deleted the isna-string branch July 8, 2024 20:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants