benchmark #4517

Draft
wants to merge 14 commits into
base: master

jangorecki
Member

@jangorecki jangorecki commented May 30, 2020

After a discussion with Matt on Slack, we decided to narrow the scope of #4687, so that the new benchmarking feature is more usable and does not introduce the extra maintenance burden that tracking historical timings, and the other features initially listed here, would require.

As a starting point I took the system.time tests that had already been taken out of the tests.Rraw file (to reduce the run time of the main test script). Those tests have been moved to a new benchmark() function that is meant to replace the test() function when system.time is needed.
To keep things simple, we don't need a new benchmark.data.table(), as we can just call test.data.table("benchmarks.Rraw"), or cc("benchmarks.Rraw") in dev-mode. It already recognizes benchmark() calls in the test script.
This is still very much a starting point, so any feedback is very welcome.

Ideas for improvement:

  • include a times argument to run an expression multiple times and compare the mean/median. This would allow a tighter tolerance (test 1110).
  • test for available memory at start and stop early if less than necessary
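
The first idea above could be sketched roughly as follows. This is only an illustration in base R: time_median() and its times argument are hypothetical names and are not part of data.table or of this PR, and the 5 second limit is an arbitrary value chosen for the example.

```r
# Hypothetical helper (not part of data.table): evaluate an expression
# `times` times and return the median elapsed time in seconds.
time_median = function(expr, times = 5L) {
  expr = substitute(expr)   # capture the expression unevaluated
  env = parent.frame()      # evaluate it where the caller's variables live
  elapsed = vapply(seq_len(times), function(i) {
    system.time(eval(expr, env))[["elapsed"]]
  }, numeric(1))
  median(elapsed)
}

x = runif(1e5)
med = time_median(sort(x), times = 5L)
# The limit below is an arbitrary illustrative value, not a real tolerance.
stopifnot(med < 5)
```

Taking the median of several runs makes a single slow run (for example one hit by garbage collection) less likely to fail the check, which is what would permit a tighter limit.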

Initial proposal at bc8a8be

This PR brings a new set of scripts and internal functions to measure the performance of data.table. They are not run in any of our workflows as of now; they should be run manually. For now there is no point in merging this branch. I am opening the PR to make it easier to document and to refer to from GitHub issues.

For example, it addresses "add timing test for many .SD cols #3797", for which scripts are defined in the benchmarks.Rraw file. Yet to close #3797 we need to add rules to be checked after all benchmarks, to confirm that optimize=0 is not much different from optimize=Inf.

> data.table:::benchmark.data.table(libs=list.dirs("library/gcc", recursive=FALSE))
benchmark.data.table() running: benchmarks.Rraw
R_LIBS_USER=library/gcc/O0 R_DATATABLE_NUM_THREADS=1 R_DATATABLE_NUM_PROCS_PERCENT=100 Rscript inst/benchmarks/benchmarks.Rraw
R_LIBS_USER=library/gcc/O0 R_DATATABLE_NUM_THREADS=4 R_DATATABLE_NUM_PROCS_PERCENT=100 Rscript inst/benchmarks/benchmarks.Rraw
...
R_LIBS_USER=library/gcc/O0 R_DATATABLE_NUM_THREADS=40 R_DATATABLE_NUM_PROCS_PERCENT=100 Rscript inst/benchmarks/benchmarks.Rraw
R_LIBS_USER=library/gcc/O0-g R_DATATABLE_NUM_THREADS=1 R_DATATABLE_NUM_PROCS_PERCENT=100 Rscript inst/benchmarks/benchmarks.Rraw
...
R_LIBS_USER=library/gcc/O0-g R_DATATABLE_NUM_THREADS=40 R_DATATABLE_NUM_PROCS_PERCENT=100 Rscript inst/benchmarks/benchmarks.Rraw
R_LIBS_USER=library/gcc/O2 R_DATATABLE_NUM_THREADS=1 R_DATATABLE_NUM_PROCS_PERCENT=100 Rscript inst/benchmarks/benchmarks.Rraw
...
> data.table:::summary.benchmark()[-c(1:3)]
               cflags   num    fun          args         desc    th user_self sys_self elapsed
               <char> <num> <char>        <char>       <char> <int>     <num>    <num>   <num>
 1:               -O0  1.01      [ DT,,.SD,by=st  optimize=0L     1     1.395    0.230   1.625
 2:               -O0  1.02      [ DT,,.SD,by=st optimize=Inf     1     1.600    0.238   1.838
...
85: -O3 -mtune=native  1.01      [ DT,,.SD,by=st  optimize=0L     1     1.336    0.231   1.566
86: -O3 -mtune=native  1.02      [ DT,,.SD,by=st optimize=Inf     1     1.484    0.279   1.763
...
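
A rule of the kind needed to close #3797 might look something like the sketch below. It uses only the four elapsed timings visible in the summary output above, copied into a plain data.frame. The max_ratio threshold is a hypothetical value picked for illustration, and nothing here is an existing data.table function.

```r
# Timings copied from the four rows shown in the summary output (th = 1).
timings = data.frame(
  cflags  = c("-O0", "-O0", "-O3 -mtune=native", "-O3 -mtune=native"),
  desc    = c("optimize=0L", "optimize=Inf", "optimize=0L", "optimize=Inf"),
  elapsed = c(1.625, 1.838, 1.566, 1.763),
  stringsAsFactors = FALSE
)

# Put the optimize=0L and optimize=Inf timings side by side per cflags.
base = timings[timings$desc == "optimize=0L", ]
opt  = timings[timings$desc == "optimize=Inf", ]
cmp  = merge(base, opt, by = "cflags", suffixes = c("_0", "_Inf"))

# Hypothetical rule: neither setting may be more than 25% slower than the other.
max_ratio = 1.25
cmp$ratio = cmp$elapsed_Inf / cmp$elapsed_0
cmp$ok = cmp$ratio <= max_ratio & cmp$ratio >= 1 / max_ratio

print(cmp[, c("cflags", "elapsed_0", "elapsed_Inf", "ratio", "ok")])
stopifnot(all(cmp$ok))
```

Expressing the rule as a ratio rather than an absolute difference keeps it meaningful across compiler flags and machines with different baseline speeds.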

@jangorecki
Member Author

jangorecki commented Aug 13, 2020

Should also cover comment in #4666

Possibly higher priority than valgrind is to run benchmark.Rraw in GLCI. We've had a few performance regressions recently that it'd be nice to nail down so they don't come back. It'd be good to add focused low-level benchmarks like the [[-by-group to benchmark.Rraw.

@jangorecki jangorecki added the WIP label Aug 13, 2020
@jangorecki jangorecki mentioned this pull request Aug 25, 2020
9 tasks
@codecov

codecov bot commented Oct 6, 2020

Codecov Report

Merging #4517 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master    #4517   +/-   ##
=======================================
  Coverage   99.44%   99.44%           
=======================================
  Files          73       73           
  Lines       14539    14539           
=======================================
  Hits        14458    14458           
  Misses         81       81           
Impacted Files Coverage Δ
R/test.data.table.R 100.00% <ø> (ø)


@jangorecki jangorecki linked an issue Oct 9, 2020 that may be closed by this pull request
9 tasks
@MichaelChirico
Member

cc @Anirban166 this old draft PR has some potential new benchmarking tests for #6078. If we extract the good tests from here I think we could close this PR too.


Successfully merging this pull request may close these issues.

add timing test for many .SD cols
Continuous Benchmarking
2 participants