Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent code formatting / minor fixes to vignettes #6521

Merged
2 changes: 1 addition & 1 deletion vignettes/datatable-faq.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -632,7 +632,7 @@ Yes, for both 32-bit and 64-bit on all platforms. Thanks to CRAN. There are no s
## I think it's great. What can I do?
Please file suggestions, bug reports and enhancement requests on our [issues tracker](https://github.com/Rdatatable/data.table/issues). This helps make the package better.

Please do star the package on [GitHub](https://github.com/Rdatatable/data.table/wiki). This helps encourage the developers and helps other R users find the package.
Please do star the package on [GitHub](https://github.com/Rdatatable/data.table). This helps encourage the developers and helps other R users find the package.

You can submit pull requests to change the code and/or documentation yourself; see our [Contribution Guidelines](https://github.com/Rdatatable/data.table/blob/master/.github/CONTRIBUTING.md).

Expand Down
4 changes: 2 additions & 2 deletions vignettes/datatable-importing.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@ h2 {
}
</style>

This document is focused on using `data.table` as a dependency in other R packages. If you are interested in using `data.table` C code from a non-R application, or in calling its C functions directly, jump to the [last section](#non-r-API) of this vignette.
This document is focused on using `data.table` as a dependency in other R packages. If you are interested in using `data.table` C code from a non-R application, or in calling its C functions directly, jump to the [last section](#importing-from-non-r-applications-non-r-api) of this vignette.
MichaelChirico marked this conversation as resolved.
Show resolved Hide resolved

Importing `data.table` is no different from importing other R packages. This vignette is meant to answer the most common questions arising around that subject; the lessons presented here can be applied to other R packages.

Expand Down Expand Up @@ -138,7 +138,7 @@ The option mechanism in R is _global_. Meaning that if a user sets a `data.table

If you face any problems in creating a package that uses data.table, please confirm that the problem is reproducible in a clean R session using the R console: `R CMD check package.name`.

Some of the most common issues developers are facing are usually related to helper tools that are meant to automate some package development tasks, for example, using `roxygen` to generate your `NAMESPACE` file from metadata in the R code files. Others are related to helpers that build and check the package. Unfortunately, these helpers sometimes have unintended/hidden side effects which can obscure the source of your troubles. As such, be sure to double check using R console (run R on the command line) and ensure the import is defined in the `DESCRIPTION` and `NAMESPACE` files following the [instructions](#DESCRIPTION) [above](#NAMESPACE).
Some of the most common issues developers are facing are usually related to helper tools that are meant to automate some package development tasks, for example, using `roxygen` to generate your `NAMESPACE` file from metadata in the R code files. Others are related to helpers that build and check the package. Unfortunately, these helpers sometimes have unintended/hidden side effects which can obscure the source of your troubles. As such, be sure to double check using R console (run R on the command line) and ensure the import is defined in the `DESCRIPTION` and `NAMESPACE` files following the [instructions](#description-file-description) [above](#namespace-file-namespace).
MichaelChirico marked this conversation as resolved.
Show resolved Hide resolved
MichaelChirico marked this conversation as resolved.
Show resolved Hide resolved

If you are not able to reproduce problems you have using the plain R console build and check, you may try to get some support based on past issues we've encountered with `data.table` interacting with helper tools: [devtools#192](https://github.com/r-lib/devtools/issues/192) or [devtools#1472](https://github.com/r-lib/devtools/issues/1472).

Expand Down
2 changes: 1 addition & 1 deletion vignettes/datatable-intro.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -665,7 +665,7 @@ We can do much more in `i` by keying a `data.table`, which allows for blazing fa

3. Compute on columns: `DT[, .(sum(colA), mean(colB))]`.

4. Provide names if necessary: `DT[, .(sA =sum(colA), mB = mean(colB))]`.
4. Provide names if necessary: `DT[, .(sA = sum(colA), mB = mean(colB))]`.

5. Combine with `i`: `DT[colA > value, sum(colB)]`.

Expand Down
4 changes: 2 additions & 2 deletions vignettes/datatable-reference-semantics.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -68,7 +68,7 @@ DF$c <- 18:13 # (1) -- replace entire column
DF$c[DF$ID == "b"] <- 15:13 # (2) -- subassign in column 'c'
```

both (1) and (2) resulted in deep copy of the entire data.frame in versions of `R` versions `< 3.1`. [It copied more than once](https://stackoverflow.com/q/23898969/559784). To improve performance by avoiding these redundant copies, *data.table* utilised the [available but unused `:=` operator in R](https://stackoverflow.com/q/7033106/559784).
both (1) and (2) resulted in deep copy of the entire data.frame in versions of `R` `< 3.1`. [It copied more than once](https://stackoverflow.com/q/23898969/559784). To improve performance by avoiding these redundant copies, *data.table* utilised the [available but unused `:=` operator in R](https://stackoverflow.com/q/7033106/559784).
MichaelChirico marked this conversation as resolved.
Show resolved Hide resolved
MichaelChirico marked this conversation as resolved.
Show resolved Hide resolved

Great performance improvements were made in `R v3.1` as a result of which only a *shallow* copy is made for (1) and not *deep* copy. However, for (2) still, the entire column is *deep* copied even in `R v3.1+`. This means the more columns one subassigns to in the *same query*, the more *deep* copies R does.

Expand Down Expand Up @@ -247,7 +247,7 @@ head(flights)

* We use the `LHS := RHS` form. We store the input column names and the new columns to add in separate variables and provide them to `.SDcols` and for `LHS` (for better readability).

* Note that since we allow assignment by reference without quoting column names when there is only one column as explained in [Section 2c](#delete-convenience), we can not do `out_cols := lapply(.SD, max)`. That would result in adding one new column named `out_col`. Instead we should do either `c(out_cols)` or simply `(out_cols)`. Wrapping the variable name with `(` is enough to differentiate between the two cases.
* Note that since we allow assignment by reference without quoting column names when there is only one column as explained in [Section 2c](#delete-convenience), we can not do `out_cols := lapply(.SD, max)`. That would result in adding one new column named `out_cols`. Instead we should do either `c(out_cols)` or simply `(out_cols)`. Wrapping the variable name with `(` is enough to differentiate between the two cases.

* The `LHS := RHS` form allows us to operate on multiple columns. In the RHS, to compute the `max` on columns specified in `.SDcols`, we make use of the base function `lapply()` along with `.SD` in the same way as we have seen before in the *"Introduction to data.table"* vignette. It returns a list of two elements, containing the maximum value corresponding to `dep_delay` and `arr_delay` for each group.

Expand Down
2 changes: 1 addition & 1 deletion vignettes/datatable-reshape.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -241,7 +241,7 @@ melt(two.iris, measure.vars = measure(value.name, dim, sep="."))
```

Using the code above we get one value column per flower part. If we
instead want a value column for each measurement dimension, we can do
instead want a value column for each measurement dimension, we can do:

```{r}
melt(two.iris, measure.vars = measure(part, value.name, sep="."))
Expand Down
Loading