diff --git a/_posts/2022-12-05-a-collection-of-r-resources/img/GitHub_Logo.png b/_posts/2022-12-05-a-collection-of-r-resources/img/GitHub_Logo.png
deleted file mode 100644
index e03d8dd..0000000
Binary files a/_posts/2022-12-05-a-collection-of-r-resources/img/GitHub_Logo.png and /dev/null differ
diff --git a/_posts/2022-12-05-a-collection-of-r-resources/resources.Rmd b/_posts/2022-12-05-a-collection-of-r-resources/resources.Rmd
index 3a587e2..36930c6 100644
--- a/_posts/2022-12-05-a-collection-of-r-resources/resources.Rmd
+++ b/_posts/2022-12-05-a-collection-of-r-resources/resources.Rmd
@@ -6,6 +6,7 @@ date: 2022-12-05
output:
distill::distill_article:
self_contained: false
+ toc: true
---
```{r setup, include = FALSE}
@@ -84,6 +85,36 @@ z) {
# styler::style_dir # directory
```
+## Reproducibility
+
+### Print environment information with sessionInfo()
+
+It's very helpful to have a record of which packages were used in an analysis. One approach is to print the `sessionInfo()`.
+
+Show session info
+
+```{r code}
+sessionInfo()
+```
+
+See also the `sessioninfo` package, which provides more details:
+
+```{r}
+sessioninfo::session_info()
+```
+
+
Read the Guide to RMarkdown for an exhaustive description of the various formats and options for using RMarkdown documents. Note that the HTML pages for this class were all made from Rmd, using the distill blog format
styler
, clean up code rea
# styler::style_dir # directory
It’s very helpful to have a record of which packages were used in an analysis. One approach is to print the sessionInfo()
.
sessionInfo()
+#> R version 4.2.0 (2022-04-22)
+#> Platform: x86_64-apple-darwin17.0 (64-bit)
+#> Running under: macOS Big Sur/Monterey 10.16
+#>
+#> Matrix products: default
+#> BLAS: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
+#> LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
+#>
+#> locale:
+#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
+#>
+#> attached base packages:
+#> [1] stats graphics grDevices utils datasets methods
+#> [7] base
+#>
+#> other attached packages:
+#> [1] here_1.0.1 forcats_0.5.1 stringr_1.4.1 dplyr_1.0.10
+#> [5] purrr_0.3.5 readr_2.1.2 tidyr_1.2.0 tibble_3.1.8
+#> [9] ggplot2_3.3.6 tidyverse_1.3.1
+#>
+#> loaded via a namespace (and not attached):
+#> [1] lubridate_1.8.0 assertthat_0.2.1 rprojroot_2.0.3
+#> [4] digest_0.6.30 utf8_1.2.2 prettycode_1.1.0
+#> [7] R6_2.5.1 cellranger_1.1.0 backports_1.4.1
+#> [10] reprex_2.0.1 evaluate_0.16 httr_1.4.4
+#> [13] pillar_1.8.1 rlang_1.0.6 readxl_1.4.0
+#> [16] rstudioapi_0.13 jquerylib_0.1.4 R.utils_2.12.0
+#> [19] R.oo_1.25.0 rmarkdown_2.14 styler_1.7.0
+#> [22] munsell_0.5.0 broom_0.8.0 compiler_4.2.0
+#> [25] modelr_0.1.8 xfun_0.32 pkgconfig_2.0.3
+#> [28] htmltools_0.5.2 downlit_0.4.2 tidyselect_1.2.0
+#> [31] fansi_1.0.3 crayon_1.5.2 tzdb_0.3.0
+#> [34] dbplyr_2.2.1 withr_2.5.0 R.methodsS3_1.8.2
+#> [37] grid_4.2.0 jsonlite_1.8.3 gtable_0.3.0
+#> [40] lifecycle_1.0.3 DBI_1.1.3 magrittr_2.0.3
+#> [43] scales_1.2.0 cli_3.4.1 stringi_1.7.8
+#> [46] cachem_1.0.6 fs_1.5.2 xml2_1.3.3
+#> [49] bslib_0.3.1 ellipsis_0.3.2 generics_0.1.3
+#> [52] vctrs_0.4.1 distill_1.5 tools_4.2.0
+#> [55] R.cache_0.15.0 glue_1.6.2 hms_1.1.2
+#> [58] fastmap_1.1.0 yaml_2.3.6 colorspace_2.0-3
+#> [61] rvest_1.0.2 memoise_2.0.1 knitr_1.39
+#> [64] haven_2.5.0 sass_0.4.1
+See also the sessioninfo
package, which provides more details:
sessioninfo::session_info()
+#> ─ Session info ─────────────────────────────────────────────────────
+#> setting value
+#> version R version 4.2.0 (2022-04-22)
+#> os macOS Big Sur/Monterey 10.16
+#> system x86_64, darwin17.0
+#> ui X11
+#> language (EN)
+#> collate en_US.UTF-8
+#> ctype en_US.UTF-8
+#> tz America/Denver
+#> date 2022-12-16
+#> pandoc 2.19.2 @ /Applications/RStudio.app/Contents/MacOS/quarto/bin/tools/ (via rmarkdown)
+#>
+#> ─ Packages ─────────────────────────────────────────────────────────
+#> package * version date (UTC) lib source
+#> assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.2.0)
+#> backports 1.4.1 2021-12-13 [1] CRAN (R 4.2.0)
+#> broom 0.8.0 2022-04-13 [1] CRAN (R 4.2.0)
+#> bslib 0.3.1 2021-10-06 [1] CRAN (R 4.2.0)
+#> cachem 1.0.6 2021-08-19 [1] CRAN (R 4.2.0)
+#> cellranger 1.1.0 2016-07-27 [1] CRAN (R 4.2.0)
+#> cli 3.4.1 2022-09-23 [1] CRAN (R 4.2.0)
+#> colorspace 2.0-3 2022-02-21 [1] CRAN (R 4.2.0)
+#> crayon 1.5.2 2022-09-29 [1] CRAN (R 4.2.0)
+#> DBI 1.1.3 2022-06-18 [1] CRAN (R 4.2.0)
+#> dbplyr 2.2.1 2022-06-27 [1] CRAN (R 4.2.0)
+#> digest 0.6.30 2022-10-18 [1] CRAN (R 4.2.0)
+#> distill 1.5 2022-09-07 [1] CRAN (R 4.2.0)
+#> downlit 0.4.2 2022-07-05 [1] CRAN (R 4.2.0)
+#> dplyr * 1.0.10 2022-09-01 [1] CRAN (R 4.2.0)
+#> ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.2.0)
+#> evaluate 0.16 2022-08-09 [1] CRAN (R 4.2.0)
+#> fansi 1.0.3 2022-03-24 [1] CRAN (R 4.2.0)
+#> fastmap 1.1.0 2021-01-25 [1] CRAN (R 4.2.0)
+#> forcats * 0.5.1 2021-01-27 [1] CRAN (R 4.2.0)
+#> fs 1.5.2 2021-12-08 [1] CRAN (R 4.2.0)
+#> generics 0.1.3 2022-07-05 [1] CRAN (R 4.2.0)
+#> ggplot2 * 3.3.6 2022-05-03 [1] CRAN (R 4.2.0)
+#> glue 1.6.2 2022-02-24 [1] CRAN (R 4.2.0)
+#> gtable 0.3.0 2019-03-25 [1] CRAN (R 4.2.0)
+#> haven 2.5.0 2022-04-15 [1] CRAN (R 4.2.0)
+#> here * 1.0.1 2020-12-13 [1] CRAN (R 4.2.0)
+#> hms 1.1.2 2022-08-19 [1] CRAN (R 4.2.0)
+#> htmltools 0.5.2 2021-08-25 [1] CRAN (R 4.2.0)
+#> httr 1.4.4 2022-08-17 [1] CRAN (R 4.2.0)
+#> jquerylib 0.1.4 2021-04-26 [1] CRAN (R 4.2.0)
+#> jsonlite 1.8.3 2022-10-21 [1] CRAN (R 4.2.0)
+#> knitr 1.39 2022-04-26 [1] CRAN (R 4.2.0)
+#> lifecycle 1.0.3 2022-10-07 [1] CRAN (R 4.2.0)
+#> lubridate 1.8.0 2021-10-07 [1] CRAN (R 4.2.0)
+#> magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.2.0)
+#> memoise 2.0.1 2021-11-26 [1] CRAN (R 4.2.0)
+#> modelr 0.1.8 2020-05-19 [1] CRAN (R 4.2.0)
+#> munsell 0.5.0 2018-06-12 [1] CRAN (R 4.2.0)
+#> pillar 1.8.1 2022-08-19 [1] CRAN (R 4.2.0)
+#> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.2.0)
+#> prettycode 1.1.0 2019-12-16 [1] CRAN (R 4.2.0)
+#> purrr * 0.3.5 2022-10-06 [1] CRAN (R 4.2.0)
+#> R.cache 0.15.0 2021-04-30 [1] CRAN (R 4.2.0)
+#> R.methodsS3 1.8.2 2022-06-13 [1] CRAN (R 4.2.0)
+#> R.oo 1.25.0 2022-06-12 [1] CRAN (R 4.2.0)
+#> R.utils 2.12.0 2022-06-28 [1] CRAN (R 4.2.0)
+#> R6 2.5.1 2021-08-19 [1] CRAN (R 4.2.0)
+#> readr * 2.1.2 2022-01-30 [1] CRAN (R 4.2.0)
+#> readxl 1.4.0 2022-03-28 [1] CRAN (R 4.2.0)
+#> reprex 2.0.1 2021-08-05 [1] CRAN (R 4.2.0)
+#> rlang 1.0.6 2022-09-24 [1] CRAN (R 4.2.0)
+#> rmarkdown 2.14 2022-04-25 [1] CRAN (R 4.2.0)
+#> rprojroot 2.0.3 2022-04-02 [1] CRAN (R 4.2.0)
+#> rstudioapi 0.13 2020-11-12 [1] CRAN (R 4.2.0)
+#> rvest 1.0.2 2021-10-16 [1] CRAN (R 4.2.0)
+#> sass 0.4.1 2022-03-23 [1] CRAN (R 4.2.0)
+#> scales 1.2.0 2022-04-13 [1] CRAN (R 4.2.0)
+#> sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.2.0)
+#> stringi 1.7.8 2022-07-11 [1] CRAN (R 4.2.0)
+#> stringr * 1.4.1 2022-08-20 [1] CRAN (R 4.2.0)
+#> styler 1.7.0 2022-03-13 [1] CRAN (R 4.2.0)
+#> tibble * 3.1.8 2022-07-22 [1] CRAN (R 4.2.0)
+#> tidyr * 1.2.0 2022-02-01 [1] CRAN (R 4.2.0)
+#> tidyselect 1.2.0 2022-10-10 [1] CRAN (R 4.2.0)
+#> tidyverse * 1.3.1 2021-04-15 [1] CRAN (R 4.2.0)
+#> tzdb 0.3.0 2022-03-28 [1] CRAN (R 4.2.0)
+#> utf8 1.2.2 2021-07-24 [1] CRAN (R 4.2.0)
+#> vctrs 0.4.1 2022-04-13 [1] CRAN (R 4.2.0)
+#> withr 2.5.0 2022-03-03 [1] CRAN (R 4.2.0)
+#> xfun 0.32 2022-08-10 [1] CRAN (R 4.2.0)
+#> xml2 1.3.3 2021-11-30 [1] CRAN (R 4.2.0)
+#> yaml 2.3.6 2022-10-18 [1] CRAN (R 4.2.0)
+#>
+#> [1] /Library/Frameworks/R.framework/Versions/4.2/Resources/library
+#>
+#> ────────────────────────────────────────────────────────────────────
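Printing the session information records it only in the rendered document. As a minimal sketch, the same information can also be written to a text file kept alongside the analysis. The file name `session_info.txt` and the use of `tempdir()` are assumptions made for illustration, not something the post prescribes.

```r
# Capture the printed sessionInfo() output as a character vector of lines
info_lines <- capture.output(sessionInfo())

# Hypothetical output location; in a real project a path inside the
# project directory is usually more useful than tempdir()
out_file <- file.path(tempdir(), "session_info.txt")
writeLines(info_lines, out_file)

# Versions of the attached non-base packages as a named character vector
# (empty if no packages have been attached with library())
attached <- sessionInfo()$otherPkgs
pkg_versions <- vapply(attached, function(pkg) pkg$Version, character(1))
print(pkg_versions)
```

Writing the file at the end of an analysis script gives a record that can be committed together with the results.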
+The renv
package allows you to have a separate set of R packages for each project. It can also record and restore the set of R packages used in a project. This is very helpful when you need to return to a project months (or years) later and want to have the same set of packages. It also makes it easier to share your packages with collaborators.
See also:
+conda for managing various command line programs (python, R, c, etc.)
docker for generating a fully reproducible operating system environment.
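The record-and-restore workflow described for renv above might look roughly like the sketch below. It assumes the renv package is installed and that the project uses renv's default lockfile name, `renv.lock`. The helpers `has_lockfile()` and `sync_packages()` are hypothetical names introduced here for illustration and are not part of renv.

```r
# TRUE when the project already has an renv lockfile
# (assumes renv's default lockfile name, renv.lock)
has_lockfile <- function(project = ".") {
  file.exists(file.path(project, "renv.lock"))
}

# Hypothetical helper: restore the recorded packages when a lockfile exists,
# otherwise set up renv for the project, which also writes a first lockfile
sync_packages <- function(project = ".") {
  if (!requireNamespace("renv", quietly = TRUE)) {
    message("renv is not installed; run install.packages('renv') first")
    return(invisible("skipped"))
  }
  if (has_lockfile(project)) {
    renv::restore(project = project, prompt = FALSE)  # reinstall recorded versions
    invisible("restored")
  } else {
    renv::init(project = project)                     # create project library and lockfile
    invisible("initialised")
  }
}
```

After installing or updating packages later on, `renv::snapshot()` updates the lockfile so collaborators can restore the same versions.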
microbenchmark
and profvis
m
#> Unit: milliseconds
#> expr min lq mean median uq max neval
-#> base 3300 3400 3500 3500 3500 3600 5
-#> readr 300 300 400 320 390 690 5
+#> base 3400 3400 3500 3400 3600 3700 5
+#> readr 300 300 390 310 410 650 5
m
})
p
R has a debugger built in. You can debug a function:
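As a minimal sketch of how a function is flagged for the built-in debugger, the example below uses base R's `debug()`, `isdebugged()`, and `undebug()`. The function `add_one_doubled()` is a made-up example, and the flag is removed before the function is called so the code also runs non-interactively.

```r
# A small function to step through
add_one_doubled <- function(x) {
  doubled <- x * 2
  doubled + 1
}

debug(add_one_doubled)                  # flag the function: the next call opens the browser
flagged <- isdebugged(add_one_doubled)  # TRUE while the flag is set
undebug(add_one_doubled)                # remove the flag again

result <- add_one_doubled(3)            # runs normally now and returns 7
```

In an interactive session, calling the function while it is flagged steps through its body line by line. `debugonce()` sets the flag for a single call only.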
@@ -1644,9 +1856,7 @@It’s surprisingly easy, particularly with Rstudio, to write your own R package to store your code. Putting your code in a package makes it much easier to debug, document, add tests, and distribute your code.
The command line can be accessed via the Terminal app on macOS, or using the windows subsystem for linux (WSL).
The command line is a place where you can run executable programs (C, Python, R, or anything else). It’s what using a computer looked like before the existence of a Graphical User Interface. It is impossible to conduct data analysis without gaining some experience with working on the command line.
R is an executable, and we can pull up an R console using:
-R
R
In RMarkdown you can also include other languages, including bash (a common language of the command line). You need to change the r
to bash
in the code chunk (or to python or other languages).
You can run simple commands by using Rscript
or R
with the -e
option.
R -e "print('hello')"
-#>
-#> R version 4.2.0 (2022-04-22) -- "Vigorous Calisthenics"
-#> Copyright (C) 2022 The R Foundation for Statistical Computing
-#> Platform: x86_64-apple-darwin17.0 (64-bit)
-#>
-#> R is free software and comes with ABSOLUTELY NO WARRANTY.
-#> You are welcome to redistribute it under certain conditions.
-#> Type 'license()' or 'licence()' for distribution details.
-#>
-#> Natural language support but running in an English locale
-#>
-#> R is a collaborative project with many contributors.
-#> Type 'contributors()' for more information and
-#> 'citation()' on how to cite R or R packages in publications.
-#>
-#> Type 'demo()' for some demos, 'help()' for on-line help, or
-#> 'help.start()' for an HTML browser interface to help.
-#> Type 'q()' to quit R.
-#>
-#> > print('hello')
-#> [1] "hello"
-#> >
-#> >
R -e "print('hello')"
+#>
+#> R version 4.2.0 (2022-04-22) -- "Vigorous Calisthenics"
+#> Copyright (C) 2022 The R Foundation for Statistical Computing
+#> Platform: x86_64-apple-darwin17.0 (64-bit)
+#>
+#> R is free software and comes with ABSOLUTELY NO WARRANTY.
+#> You are welcome to redistribute it under certain conditions.
+#> Type 'license()' or 'licence()' for distribution details.
+#>
+#> Natural language support but running in an English locale
+#>
+#> R is a collaborative project with many contributors.
+#> Type 'contributors()' for more information and
+#> 'citation()' on how to cite R or R packages in publications.
+#>
+#> Type 'demo()' for some demos, 'help()' for on-line help, or
+#> 'help.start()' for an HTML browser interface to help.
+#> Type 'q()' to quit R.
+#>
+#> > print('hello')
+#> [1] "hello"
+#> >
+#> >
Rscript -e "print('hello')"
-#> [1] "hello"
Rscript -e "print('hello')"
+#> [1] "hello"
Alternatively, you can write an R script, which can then be called with Rscript. For example, suppose we wrote an R script called cool_function.R
.
#!/usr/bin/env Rscript # allows calling with ./cool_function.R if executable
-
-args = commandArgs(trailingOnly=TRUE) # collect command line arguments
-print(args) # args is a character vector, e.g. argument1 argument2...
#!/usr/bin/env Rscript # allows calling with ./cool_function.R if executable
+
+args = commandArgs(trailingOnly=TRUE) # collect command line arguments
+print(args) # args is a character vector, e.g. argument1 argument2...
We could call on the command line:
-Rscript path/to/cool_function.R argument1 argument2 ...
-#or
-path/to/cool_function.R argument1 argument2 ...
Rscript path/to/cool_function.R argument1 argument2 ...
+#or
+path/to/cool_function.R argument1 argument2 ...
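Inside a script like cool_function.R, the arguments arrive as a character vector, so anything numeric has to be converted explicitly. The sketch below shows one way the script body might check and unpack them. The helper `summarise_args()` and the usage message are hypothetical and only for illustration.

```r
# Hypothetical helper: validate and unpack command line arguments.
# By default it reads the real arguments; a vector can be passed for testing.
summarise_args <- function(args = commandArgs(trailingOnly = TRUE)) {
  if (length(args) == 0) {
    stop("usage: cool_function.R argument1 [argument2 ...]", call. = FALSE)
  }
  list(
    n     = length(args),  # number of arguments supplied
    first = args[[1]],     # first argument, still a character string
    rest  = args[-1]       # remaining arguments (possibly empty)
  )
}

# Simulate the call `Rscript cool_function.R argument1 argument2`
res <- summarise_args(c("argument1", "argument2"))
print(res$first)
```

Passing the arguments in as a function parameter keeps the parsing logic testable without having to launch the script from the command line.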
Git is a command line tool for version control, which allows us to:
roll back code to a previous state if needed
develop on branches, tackling individual issues/tasks
collaborate with others
Git was first created by Linus Torvalds for coordinating development of Linux. Read this guide for Getting started, check out this interactive guide, and check out this Tutorial written from an R data analyst perspective.
# for bioinformatics, get comfortable with command line too
-
-ls
-git status # list changes to tracked files
-git blame resources.Rmd # see who contributed
-git commit -m "added something cool" # save state
-git push # push commits to a remote git repository (e.g. GitHub)
-git pull # pull changes from git repository
# for bioinformatics, get comfortable with command line too
+
+ls
+git status # list changes to tracked files
+git blame resources.Rmd # see who contributed
+git commit -m "added something cool" # save state
+git push # push commits to a remote git repository (e.g. GitHub)
+git pull # pull changes from git repository
This can be handled by Rstudio as well (new tab next to Connections
and Build
)
#> 😄
2,000+ R packages dedicated to bioinformatics. Includes a coherent framework of data structures (e.g. SummarizedExperiment) built by dedicated Core members. Also includes many annotation and experimental datasets built into R packages and objects (see AnnotationHub and ExperimentHub)