---
title: "R Bootcamp for Biologists"
---
## Overview
This course introduces the R programming language with a focus on biological data analysis. Its purpose is to teach scientists (students, postdocs, PIs) in the biological and medical sciences to use R for the data analysis tasks they routinely encounter, including sequence analysis and other bioinformatics tasks. No prior knowledge of R is expected, and attendees can expect to come away with a skill set that they can apply immediately to their own data.
## Learning Objectives
At the end of the workshop you will be able to:
* Install and update R
* Use the RStudio IDE
* Understand what CRAN and Bioconductor are and what the differences are between them
* Install and update R packages from CRAN and Bioconductor
* Import a wide variety of data types into R
* Understand the basic data types: integer, numeric, logical, character
* Understand R's basic data structures: vector, matrix, list, data.frame
* Understand basic programming concepts: functions, objects, loops, vectorization, conditionals
* Manipulate data structures by subsetting and indexing
* Understand key base R functions: seq, apply (and friends)
* Manipulate data with dplyr and friends
* Make plots with ggplot2
* Find help about any function
* Understand some common R errors and how to deal with them
* Find and evaluate R packages needed for a particular analysis
* Understand the difference between `<-` and `=` and make your own choice about which one to use
## Preparation
Attendees are expected to bring their own laptops, to have installed R and RStudio beforehand, and to have completed at least one of the following online tutorials:
- https://www.datacamp.com/courses/free-introduction-to-r
- http://tryr.codeschool.com/
- http://swirlstats.com/
- http://rforcats.net/
This small bit of preparation will allow us to move quickly through the basics and get to the good stuff.
### Course reference
Main text:
[R for Data Science](http://r4ds.had.co.nz/) by Garrett Grolemund and Hadley Wickham
Secondary reading:
[Advanced R](http://adv-r.had.co.nz/) by Hadley Wickham
## Course Materials
### Understanding R
* R History
* Packages
* Assignment
* Environments
* Other key R functions, including basic statistics
* Errors and getting help
### Data Structures
* Vectors
* Lists
* Factors
* Matrices
* Data frames
### Subsetting
* Ways to subset
* Subsetting operators
### Programming Concepts
* Functions
* Conditionals
* Loops
### Practical Data Management
* Tidy data
* Pipes: Ceci n'est pas un pipe
* Intro to dplyr and tidyr
* Restructuring and transforming data
* Regular expressions and stringr
### Data Visualization
* ggplot2
* Heatmaps
* What makes a visualization effective
* Building up a complex visualization
### Introduction to Bioconductor
* Finding and installing Bioconductor packages
* Learning what packages do and how to evaluate them
* Intro to some key data structures: XStringSet, IRanges, ExpressionSet, etc.
### Reproducibility
* Rapid introduction to managing and reproducing your analysis: R Markdown, Git and GitHub, best practices
* Writing your own functions