diff --git a/_posts/2022-12-05-a-collection-of-r-resources/img/GitHub_Logo.png b/_posts/2022-12-05-a-collection-of-r-resources/img/GitHub_Logo.png deleted file mode 100644 index e03d8dd..0000000 Binary files a/_posts/2022-12-05-a-collection-of-r-resources/img/GitHub_Logo.png and /dev/null differ diff --git a/_posts/2022-12-05-a-collection-of-r-resources/resources.Rmd b/_posts/2022-12-05-a-collection-of-r-resources/resources.Rmd index 3a587e2..36930c6 100644 --- a/_posts/2022-12-05-a-collection-of-r-resources/resources.Rmd +++ b/_posts/2022-12-05-a-collection-of-r-resources/resources.Rmd @@ -6,6 +6,7 @@ date: 2022-12-05 output: distill::distill_article: self_contained: false + toc: true --- ```{r setup, include = FALSE} @@ -84,6 +85,36 @@ z) { # styler::style_dir # directory ``` +## Reproducibility + +### Print environment information with sessionInfo() + +It's very helpful to have a record of which packages were used in an analysis. One approach is to print the `sessionInfo()`. + +
Show session info + +```{r code} +sessionInfo() +``` + +See also the `sessioninfo` package, which provides more details: + +```{r} +sessioninfo::session_info() +``` + +
+ + +### Use renv to manage package dependencies + +The [`renv`](https://rstudio.github.io/renv/articles/renv.html) package allows you to have a separate set of R packages for each project. It can also record and restore the set of R packages used in a project. This is very helpful when you need to return to a project months (or years) later and want to have the same set of packages. It also makes it easier to share your packages with collaborators. + +See also: + +- [conda](https://docs.conda.io/en/latest/) for managing various command line programs (python, R, c, etc.) + +- [docker](https://www.docker.com/) for generating a fully reproducible operating system environment. ## Benchmarking, with `microbenchmark` and `profvis` @@ -150,9 +181,9 @@ cool_function(1) traceback() ``` -```{r, echo = FALSE} -knitr::include_graphics("img/traceback.png") -``` +![](https://github.com/rnabioco/bmsc-7810-pbda/raw/main/_posts/2022-12-05-a-collection-of-r-resources/img/traceback.png) + + ## Building your own R package It's surprisingly easy, particularly with Rstudio, to write your own R package to store your code. Putting your code in a package makes it much easier to debug, document, add tests, and distribute your code. @@ -245,9 +276,9 @@ path/to/cool_function.R argument1 argument2 ... ## Git and Github -```{r, echo =FALSE, fig.cap="From https://jmcglone.com/guides/github-pages/"} -knitr::include_graphics("img/git-basics.png") -``` +![From https://jmcglone.com/guides/github-pages/](https://github.com/rnabioco/bmsc-7810-pbda/raw/main/_posts/2022-12-05-a-collection-of-r-resources/img/git-basics.png) + + Git is a command line tool for version control, which allows us to: @@ -257,9 +288,10 @@ Git is a command line tool for version control, which allows us to: 3. 
collaboration -```{r, echo =FALSE, fig.cap="From https://blog.programster.org/git-workflows"} -knitr::include_graphics("img/github-flow.png") -``` + +![From https://blog.programster.org/git-workflows](https://github.com/rnabioco/bmsc-7810-pbda/raw/main/_posts/2022-12-05-a-collection-of-r-resources/img/github-flow.png) + + Git was first created by Linus Torvalds for coordinating development of Linux. Read this guide for [Getting started](https://git-scm.com/book/en/v2/Getting-Started-About-Version-Control), check out this [interactive guide](https://learngitbranching.js.org/) and check out this [Tutorial](https://happygitwithr.com) written from an R data analyst perspective. @@ -322,9 +354,8 @@ emo::ji("smile") ## Bioconductor -```{r, echo = FALSE} -knitr::include_graphics("https://bioconductor.org/images/logo_bioconductor.gif") -``` +![](https://bioconductor.org/images/logo_bioconductor.gif) + 2,000+ R packages dedicated to bioinformatics. Includes a coherent framework of data structures (e.g. SummarizedExperiment) built by dedicated Core members. Also includes many annotation and experimental datasets built into R packages and objects (See AnnotationHub and ExperimentHub) diff --git a/_posts/2022-12-05-a-collection-of-r-resources/resources.html b/_posts/2022-12-05-a-collection-of-r-resources/resources.html index 7324b48..324450b 100644 --- a/_posts/2022-12-05-a-collection-of-r-resources/resources.html +++ b/_posts/2022-12-05-a-collection-of-r-resources/resources.html @@ -110,12 +110,12 @@ @@ -1524,6 +1524,57 @@

Class wrap up: Data analysis, tips and resources

+
+ +

Rmarkdown

Read the Guide to RMarkdown for an exhaustive description of the various formats and options for using RMarkdown documents. Note that the HTML pages for this class were all made from Rmd, using the distill blog format.

Caching

@@ -1578,6 +1629,167 @@

styler, clean up code rea # styler::style_dir # directory

+

Reproducibility

+ +

It’s very helpful to have a record of which packages were used in an analysis. One approach is to print the sessionInfo().

+
+ +Show session info + +
+ +
#> R version 4.2.0 (2022-04-22)
+#> Platform: x86_64-apple-darwin17.0 (64-bit)
+#> Running under: macOS Big Sur/Monterey 10.16
+#> 
+#> Matrix products: default
+#> BLAS:   /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
+#> LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
+#> 
+#> locale:
+#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
+#> 
+#> attached base packages:
+#> [1] stats     graphics  grDevices utils     datasets  methods  
+#> [7] base     
+#> 
+#> other attached packages:
+#>  [1] here_1.0.1      forcats_0.5.1   stringr_1.4.1   dplyr_1.0.10   
+#>  [5] purrr_0.3.5     readr_2.1.2     tidyr_1.2.0     tibble_3.1.8   
+#>  [9] ggplot2_3.3.6   tidyverse_1.3.1
+#> 
+#> loaded via a namespace (and not attached):
+#>  [1] lubridate_1.8.0   assertthat_0.2.1  rprojroot_2.0.3  
+#>  [4] digest_0.6.30     utf8_1.2.2        prettycode_1.1.0 
+#>  [7] R6_2.5.1          cellranger_1.1.0  backports_1.4.1  
+#> [10] reprex_2.0.1      evaluate_0.16     httr_1.4.4       
+#> [13] pillar_1.8.1      rlang_1.0.6       readxl_1.4.0     
+#> [16] rstudioapi_0.13   jquerylib_0.1.4   R.utils_2.12.0   
+#> [19] R.oo_1.25.0       rmarkdown_2.14    styler_1.7.0     
+#> [22] munsell_0.5.0     broom_0.8.0       compiler_4.2.0   
+#> [25] modelr_0.1.8      xfun_0.32         pkgconfig_2.0.3  
+#> [28] htmltools_0.5.2   downlit_0.4.2     tidyselect_1.2.0 
+#> [31] fansi_1.0.3       crayon_1.5.2      tzdb_0.3.0       
+#> [34] dbplyr_2.2.1      withr_2.5.0       R.methodsS3_1.8.2
+#> [37] grid_4.2.0        jsonlite_1.8.3    gtable_0.3.0     
+#> [40] lifecycle_1.0.3   DBI_1.1.3         magrittr_2.0.3   
+#> [43] scales_1.2.0      cli_3.4.1         stringi_1.7.8    
+#> [46] cachem_1.0.6      fs_1.5.2          xml2_1.3.3       
+#> [49] bslib_0.3.1       ellipsis_0.3.2    generics_0.1.3   
+#> [52] vctrs_0.4.1       distill_1.5       tools_4.2.0      
+#> [55] R.cache_0.15.0    glue_1.6.2        hms_1.1.2        
+#> [58] fastmap_1.1.0     yaml_2.3.6        colorspace_2.0-3 
+#> [61] rvest_1.0.2       memoise_2.0.1     knitr_1.39       
+#> [64] haven_2.5.0       sass_0.4.1
+
+

See also the sessioninfo package, which provides more details:

+
+
+
sessioninfo::session_info()
+
+
#> ─ Session info ─────────────────────────────────────────────────────
+#>  setting  value
+#>  version  R version 4.2.0 (2022-04-22)
+#>  os       macOS Big Sur/Monterey 10.16
+#>  system   x86_64, darwin17.0
+#>  ui       X11
+#>  language (EN)
+#>  collate  en_US.UTF-8
+#>  ctype    en_US.UTF-8
+#>  tz       America/Denver
+#>  date     2022-12-16
+#>  pandoc   2.19.2 @ /Applications/RStudio.app/Contents/MacOS/quarto/bin/tools/ (via rmarkdown)
+#> 
+#> ─ Packages ─────────────────────────────────────────────────────────
+#>  package     * version date (UTC) lib source
+#>  assertthat    0.2.1   2019-03-21 [1] CRAN (R 4.2.0)
+#>  backports     1.4.1   2021-12-13 [1] CRAN (R 4.2.0)
+#>  broom         0.8.0   2022-04-13 [1] CRAN (R 4.2.0)
+#>  bslib         0.3.1   2021-10-06 [1] CRAN (R 4.2.0)
+#>  cachem        1.0.6   2021-08-19 [1] CRAN (R 4.2.0)
+#>  cellranger    1.1.0   2016-07-27 [1] CRAN (R 4.2.0)
+#>  cli           3.4.1   2022-09-23 [1] CRAN (R 4.2.0)
+#>  colorspace    2.0-3   2022-02-21 [1] CRAN (R 4.2.0)
+#>  crayon        1.5.2   2022-09-29 [1] CRAN (R 4.2.0)
+#>  DBI           1.1.3   2022-06-18 [1] CRAN (R 4.2.0)
+#>  dbplyr        2.2.1   2022-06-27 [1] CRAN (R 4.2.0)
+#>  digest        0.6.30  2022-10-18 [1] CRAN (R 4.2.0)
+#>  distill       1.5     2022-09-07 [1] CRAN (R 4.2.0)
+#>  downlit       0.4.2   2022-07-05 [1] CRAN (R 4.2.0)
+#>  dplyr       * 1.0.10  2022-09-01 [1] CRAN (R 4.2.0)
+#>  ellipsis      0.3.2   2021-04-29 [1] CRAN (R 4.2.0)
+#>  evaluate      0.16    2022-08-09 [1] CRAN (R 4.2.0)
+#>  fansi         1.0.3   2022-03-24 [1] CRAN (R 4.2.0)
+#>  fastmap       1.1.0   2021-01-25 [1] CRAN (R 4.2.0)
+#>  forcats     * 0.5.1   2021-01-27 [1] CRAN (R 4.2.0)
+#>  fs            1.5.2   2021-12-08 [1] CRAN (R 4.2.0)
+#>  generics      0.1.3   2022-07-05 [1] CRAN (R 4.2.0)
+#>  ggplot2     * 3.3.6   2022-05-03 [1] CRAN (R 4.2.0)
+#>  glue          1.6.2   2022-02-24 [1] CRAN (R 4.2.0)
+#>  gtable        0.3.0   2019-03-25 [1] CRAN (R 4.2.0)
+#>  haven         2.5.0   2022-04-15 [1] CRAN (R 4.2.0)
+#>  here        * 1.0.1   2020-12-13 [1] CRAN (R 4.2.0)
+#>  hms           1.1.2   2022-08-19 [1] CRAN (R 4.2.0)
+#>  htmltools     0.5.2   2021-08-25 [1] CRAN (R 4.2.0)
+#>  httr          1.4.4   2022-08-17 [1] CRAN (R 4.2.0)
+#>  jquerylib     0.1.4   2021-04-26 [1] CRAN (R 4.2.0)
+#>  jsonlite      1.8.3   2022-10-21 [1] CRAN (R 4.2.0)
+#>  knitr         1.39    2022-04-26 [1] CRAN (R 4.2.0)
+#>  lifecycle     1.0.3   2022-10-07 [1] CRAN (R 4.2.0)
+#>  lubridate     1.8.0   2021-10-07 [1] CRAN (R 4.2.0)
+#>  magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.2.0)
+#>  memoise       2.0.1   2021-11-26 [1] CRAN (R 4.2.0)
+#>  modelr        0.1.8   2020-05-19 [1] CRAN (R 4.2.0)
+#>  munsell       0.5.0   2018-06-12 [1] CRAN (R 4.2.0)
+#>  pillar        1.8.1   2022-08-19 [1] CRAN (R 4.2.0)
+#>  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.2.0)
+#>  prettycode    1.1.0   2019-12-16 [1] CRAN (R 4.2.0)
+#>  purrr       * 0.3.5   2022-10-06 [1] CRAN (R 4.2.0)
+#>  R.cache       0.15.0  2021-04-30 [1] CRAN (R 4.2.0)
+#>  R.methodsS3   1.8.2   2022-06-13 [1] CRAN (R 4.2.0)
+#>  R.oo          1.25.0  2022-06-12 [1] CRAN (R 4.2.0)
+#>  R.utils       2.12.0  2022-06-28 [1] CRAN (R 4.2.0)
+#>  R6            2.5.1   2021-08-19 [1] CRAN (R 4.2.0)
+#>  readr       * 2.1.2   2022-01-30 [1] CRAN (R 4.2.0)
+#>  readxl        1.4.0   2022-03-28 [1] CRAN (R 4.2.0)
+#>  reprex        2.0.1   2021-08-05 [1] CRAN (R 4.2.0)
+#>  rlang         1.0.6   2022-09-24 [1] CRAN (R 4.2.0)
+#>  rmarkdown     2.14    2022-04-25 [1] CRAN (R 4.2.0)
+#>  rprojroot     2.0.3   2022-04-02 [1] CRAN (R 4.2.0)
+#>  rstudioapi    0.13    2020-11-12 [1] CRAN (R 4.2.0)
+#>  rvest         1.0.2   2021-10-16 [1] CRAN (R 4.2.0)
+#>  sass          0.4.1   2022-03-23 [1] CRAN (R 4.2.0)
+#>  scales        1.2.0   2022-04-13 [1] CRAN (R 4.2.0)
+#>  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.2.0)
+#>  stringi       1.7.8   2022-07-11 [1] CRAN (R 4.2.0)
+#>  stringr     * 1.4.1   2022-08-20 [1] CRAN (R 4.2.0)
+#>  styler        1.7.0   2022-03-13 [1] CRAN (R 4.2.0)
+#>  tibble      * 3.1.8   2022-07-22 [1] CRAN (R 4.2.0)
+#>  tidyr       * 1.2.0   2022-02-01 [1] CRAN (R 4.2.0)
+#>  tidyselect    1.2.0   2022-10-10 [1] CRAN (R 4.2.0)
+#>  tidyverse   * 1.3.1   2021-04-15 [1] CRAN (R 4.2.0)
+#>  tzdb          0.3.0   2022-03-28 [1] CRAN (R 4.2.0)
+#>  utf8          1.2.2   2021-07-24 [1] CRAN (R 4.2.0)
+#>  vctrs         0.4.1   2022-04-13 [1] CRAN (R 4.2.0)
+#>  withr         2.5.0   2022-03-03 [1] CRAN (R 4.2.0)
+#>  xfun          0.32    2022-08-10 [1] CRAN (R 4.2.0)
+#>  xml2          1.3.3   2021-11-30 [1] CRAN (R 4.2.0)
+#>  yaml          2.3.6   2022-10-18 [1] CRAN (R 4.2.0)
+#> 
+#>  [1] /Library/Frameworks/R.framework/Versions/4.2/Resources/library
+#> 
+#> ────────────────────────────────────────────────────────────────────
+
+
+
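Printing the session info only helps if the output is kept with the results. As a minimal sketch, assuming a plain-text record is enough, the output of `sessionInfo()` can be captured and written to a file. The file name `session_info.txt` and the use of `tempdir()` are illustrative choices for this example, not something the post prescribes.

```r
# Capture the printed session info and write it to a text file.
# "session_info.txt" is a hypothetical file name chosen for this sketch;
# in a real project you would likely write it next to your outputs.
info_file <- file.path(tempdir(), "session_info.txt")
writeLines(capture.output(sessionInfo()), info_file)

# Read the record back; the first lines hold the R version and platform.
recorded <- readLines(info_file)
head(recorded, 3)
```

Keeping this file under version control alongside the analysis gives a dated record of the package versions that produced each result.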

Use renv to manage package dependencies

+

The renv package allows you to have a separate set of R packages for each project. It can also record and restore the set of R packages used in a project. This is very helpful when you need to return to a project months (or years) later and want to have the same set of packages. It also makes it easier to share your packages with collaborators.
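The record-and-restore cycle described above roughly maps onto three renv functions: `renv::init()`, `renv::snapshot()`, and `renv::restore()`. The sketch below wraps them in a small helper so the three steps sit side by side. The helper name `renv_step()` is hypothetical and not part of renv, and the renv package is assumed to be installed before any step is run.

```r
# Hypothetical helper (not part of renv) that names the three usual steps.
# Assumes renv is installed; nothing runs until the function is called.
renv_step <- function(step = c("init", "snapshot", "restore")) {
  step <- match.arg(step)
  switch(step,
    init     = renv::init(),     # set up a project library and renv.lock
    snapshot = renv::snapshot(), # record installed package versions in renv.lock
    restore  = renv::restore()   # reinstall the versions listed in renv.lock
  )
}
```

In practice most people call the three renv functions directly from the console: `init` once per project, `snapshot` after installing or upgrading packages, and `restore` on a fresh checkout or a collaborator's machine.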

+

See also:

+

Benchmarking, with microbenchmark and profvis

@@ -1593,8 +1805,8 @@

Benchmarking, with m

#> Unit: milliseconds
 #>   expr  min   lq mean median   uq  max neval
-#>   base 3300 3400 3500   3500 3500 3600     5
-#>  readr  300  300  400    320  390  690     5
+#>   base 3400 3400 3500   3400 3600 3700     5
+#>  readr  300  300  390    310  410  650     5
@@ -1609,8 +1821,8 @@

Benchmarking, with m }) p

-
- +
+

Debugging R code

R has a debugger built in. You can debug a function:

@@ -1644,9 +1856,7 @@

Look at the call stack with trace traceback() -
-

-
+

Building your own R package

It’s surprisingly easy, particularly with Rstudio, to write your own R package to store your code. Putting your code in a package makes it much easier to debug, document, add tests, and distribute your code.