[experimental] make use of data.table
#877
Closed
A recent post on the r-devel mailing list reminded me of the `data.table` package. We have quite a few data frames, or, to be precise, I built this package around data frames. There might be some gains from using the `data.table` package, but unfortunately I have not used `data.table` for quite some time, and even when I did, my knowledge was quite limited. I do like the package, its long-time maintainer, and its development ideas, so if we saw larger gains, I would be willing to switch to it. However, I lack the time and motivation to do it just for a few microseconds. In the example below (writing a 10,000 x 1,000 matrix), this improves the profvis runtime by about 1,000 ms, which is nice, but still just 1 s.
Since these small performance gains are hard to measure, even benchmarking whether something is faster takes a lot of time. And while hunting for a few seconds can be fun, more often it is not.