-
Notifications
You must be signed in to change notification settings - Fork 4
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
Showing
73 changed files
with
1,082 additions
and
86 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,21 +1,278 @@ | ||
# Freestyle | ||
|
||
> ... in which we venture into the unknown to show an example data | ||
analysis. | ||
> ... in which we learn about ANOVA, explore ggplot extensions and | ||
build and interactive web application with Shiny. | ||
|
||
::: { .alert .alert-info} | ||
These are just the notes for the lecture that | ||
will be filled during the recording. | ||
Check back later to the the completed script. | ||
::: {.video-container} | ||
<iframe class="video" src="https://www.youtube.com/embed/gZ0u6b_1ej4?vq=hd1080" allowfullscreen></iframe> | ||
::: | ||
|
||
## Setup | ||
|
||
Today we are using quite a bunch of packages. | ||
In the lecture these are of course introduced one by one, | ||
but because it is a good habit to have all your imports | ||
and dependencies near the top of your analysis they | ||
show up here already. | ||
|
||
```{r} | ||
library(multcomp) | ||
library(patchwork) | ||
library(gganimate) | ||
library(palmerpenguins) | ||
library(broom) | ||
library(tidyverse) | ||
``` | ||
|
||
## ANOVA | ||
|
||
Let's get started. | ||
ANOVA stands for Analysis of Variance. | ||
We start with our familiar penguins dataset and ask | ||
the question: "Is there a difference in bill length between the species?" | ||
|
||
```{r} | ||
penguins %>% | ||
ggplot(aes(species, bill_length_mm)) + | ||
geom_boxplot(outlier.color = NA) + | ||
geom_jitter(alpha = 0.2, width = 0.2) | ||
``` | ||
|
||
Optically, it certainly looks like it. | ||
We could run a bunch of t-tests to compare between the groups, | ||
which would be 3 in total (Adelie–Chinstrap, Adelie–Gentoo, Gentoo–Chinstrap). | ||
And then we would have to correct for multiple testing. | ||
At this point, only running one of tests (like Adelie vs. Chinstrap) | ||
because it looks so promising is not an option! | ||
Looking at the data visually is a form of comparison, | ||
so we would be cheating if we formed our hypothesis only after | ||
looking at the data. | ||
|
||
**ANOVA** is a way of comparing multiple groups and tells us, if there | ||
is a difference at all between the groups. | ||
It does not however tell us, between which groups the difference exists, | ||
just that there is an overall difference. | ||
In order to get p-values for direct comparisons between the groups | ||
we run a **post-hoc** (Latin for "after the fact") on our anova result. | ||
|
||
First I am sliding in a little extra code chunk, because later down | ||
the line we might want to compare all groups to e.g. a baseline or | ||
control group (see the section about Dunnet below) and this baseline | ||
is decided to be the first factor-level of the grouping variable. | ||
With `fct_relevel` we can move a lovel to the first (or any other) | ||
position. | ||
|
||
```{r} | ||
penguins <- penguins %>% | ||
mutate(species = fct_relevel(species, "Gentoo")) | ||
``` | ||
|
||
Not we create an anova fit with `aov`. | ||
|
||
```{r} | ||
anova_fit <- aov(bill_length_mm ~ species, data = penguins) | ||
anova_fit | ||
``` | ||
|
||
And explore if further with various functions. | ||
|
||
```{r} | ||
summary(anova_fit) | ||
``` | ||
|
||
```{r} | ||
tidy(anova_fit) | ||
``` | ||
|
||
### Tukey Post-Hoc Test | ||
|
||
The most straighforward post-hoc test, which is already built into R, | ||
is "Tukey's Honest Significant Differences", | ||
which compares all groups with all other groups. | ||
|
||
```{r} | ||
TukeyHSD(anova_fit) %>% | ||
tidy() | ||
``` | ||
|
||
### Dunnet Post-Hoc Test | ||
|
||
If we have a control group we use Dunnet's test to compare all other | ||
groups to the control group. | ||
This means we have to do less comparisons and end up with a higher statistical power. | ||
|
||
Unfortunately, Dunnet is not built into R, but we can get the necessary | ||
functions from the `multcomp` package. | ||
When loading the `multcomp` package we have to be careful | ||
to load it before the `tidyverse`! | ||
Because `multcomp` also loads a bunch of other packages and one | ||
of them brings it's own `select` function, so loading it after | ||
the `tidyverse` would overwrite tidyverse `select` and make our life very hard. | ||
|
||
```{r} | ||
glht(anova_fit, mcp(species = "Dunnet")) %>% | ||
tidy() | ||
``` | ||
|
||
Note, that this package can also do Tukey's test by changing "Dunnet" | ||
to "Tukey". | ||
|
||
## ggplot extensions | ||
|
||
## Build Apps with Shiny | ||
In order to plot these nice significance stars on top of of our | ||
comparisons let me first introduce you to ggplot extensions: | ||
|
||
<https://exts.ggplot2.tidyverse.org/> | ||
|
||
This is just a collection of packages that extend ggplot in some shape or form | ||
or simply work very well with it. | ||
|
||
### `ggsignif`, `ggpubr` | ||
|
||
One of these is `ggsignif`, which we can use to add comparison | ||
brackets to our plot. | ||
|
||
```{r} | ||
penguins %>% | ||
ggplot(aes(species, bill_length_mm)) + | ||
geom_boxplot(outlier.color = NA) + | ||
geom_jitter(alpha = 0.2, width = 0.2) + | ||
ggsignif::geom_signif( | ||
comparisons = list(c("Gentoo", "Adelie")), | ||
annotations = c("hello") | ||
) | ||
``` | ||
|
||
`ggsignif` can also run it's own comparisons, which is | ||
a wilcoxon rank sum test by default, but especially | ||
for statistical tests I prefer to run them myself first | ||
before trusting more packages. | ||
Additionally, we have of course a different type of test (ANOVA), | ||
and we already have our values, so we use it just to manually | ||
add text to the annotation: | ||
|
||
```{r} | ||
stars <- function(p) { | ||
case_when( | ||
p <= 0.001 ~ "***", | ||
p <= 0.01 ~ "**", | ||
p <= 0.05 ~ "*", | ||
TRUE ~ "ns" | ||
) | ||
} | ||
dunnet <- glht(anova_fit, mcp(species = "Dunnet")) %>% | ||
tidy() %>% | ||
mutate(contrast = str_split(contrast, " - "), | ||
stars = stars(adj.p.value)) | ||
plt <- penguins %>% | ||
ggplot(aes(species, bill_length_mm)) + | ||
geom_boxplot(outlier.color = NA) + | ||
geom_jitter(alpha = 0.2, width = 0.2) + | ||
ggsignif::geom_signif( | ||
comparisons = dunnet$contrast, | ||
annotations = dunnet$stars, | ||
y_position = c(60, 65) | ||
) | ||
plt | ||
``` | ||
|
||
|
||
### `patchwork` | ||
|
||
I also saved the plot to a variable and will now create a second | ||
one just to show you the `patchwork` package, | ||
which can combine plots into neat layouts: | ||
|
||
```{r} | ||
plt2 <- ggplot(penguins, aes(bill_length_mm, bill_depth_mm, color = species)) + | ||
geom_point() | ||
((plt | plt2) / plt2 ) + plot_layout(guides = 'collect') | ||
``` | ||
|
||
### `ggfortify` | ||
|
||
`ggfortiy` might not strictly be necessary but I want to mention it | ||
to talk about `autoplot`. | ||
`autoplot` is in `ggplot2` already by default. | ||
It creates automatic plot for certain objects, like models for example, | ||
or an anova result. | ||
`ggfortify` makes more objects have the ability to generate an autoplot. | ||
|
||
```{r} | ||
library(ggfortify) | ||
autoplot(anova_fit) | ||
``` | ||
|
||
I want to mention this here because it has an important implication | ||
for communication. | ||
There are two purposes too plots. | ||
The first purpose is to generate insight for you, the data analyst. | ||
Autoplots are often very helpful for this. | ||
But the second purpose is to communicate your findings to a reader! | ||
It takes time and effort to refine a plot from something that generated | ||
and insight for you after having spent a lot of time with your data | ||
to also generating insights for a reader that sees the data for the first time. | ||
Too many people stop at the first step and publish their plots | ||
as is, without thinking much about the reader. | ||
Keep this in mind as you advance in your scientific career. | ||
|
||
> Communication is key! | ||
Both code and plots are means of communicating. | ||
|
||
### `esquisse` | ||
|
||
Two more cool packages: The `esquisse` package let's you | ||
create ggplots interactively with a gui! | ||
But make sure to always save the code this created to ensure | ||
reproducibility! | ||
|
||
### `gganimate` | ||
|
||
And `gganimate` extends `ggplot` with the ability to use time as a dimension: | ||
|
||
```{r} | ||
gapminder::gapminder %>% | ||
ggplot(aes(gdpPercap, lifeExp, size = pop, color = country)) + | ||
geom_point() + | ||
scale_x_log10() + | ||
# facet_wrap(~year) + | ||
transition_time(year) + | ||
scale_color_manual(values = gapminder::country_colors) + | ||
guides(color = "none", | ||
size = "none") | ||
``` | ||
|
||
## Build Apps with `shiny` | ||
|
||
You can find the code written during the lecture in: `example-app/`. | ||
|
||
## Exercises | ||
|
||
### The Whole Deal | ||
|
||
Data analysis takes practice. | ||
For this last exercise you get the option to show what you | ||
have learned on a fresh dataset. | ||
But people have different interests, so I am leaving | ||
the choice open. | ||
|
||
Choose a dataset from the TidyTuesday project: | ||
|
||
<https://github.com/rfordatascience/tidytuesday#datasets> | ||
|
||
and write your data analysis report. | ||
Aim to write down some questions you have about the | ||
topic and answer them with the available data. | ||
Explore it with plots and tables and try | ||
to end up with 1 or 2 high quality plots | ||
that clearly communicate you findings. | ||
Above all, have fun! | ||
|
||
## Feedback | ||
|
||
I will send round a link with a feedback form. | ||
It is anonymous. | ||
I will send round a link for a feedback form. | ||
It is anonymous. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
base | ||
mvtnorm | ||
survival | ||
MASS | ||
TH.data | ||
multcomp | ||
patchwork | ||
ggplot2 | ||
gganimate | ||
palmerpenguins | ||
broom | ||
tidyverse | ||
tibble | ||
tidyr | ||
readr | ||
purrr | ||
dplyr | ||
stringr | ||
forcats | ||
ggfortify |
Binary file added
BIN
+3.09 KB
...own_files/08-freestyle_cache/html/unnamed-chunk-10_5909845bb1be2ca981659eddb68dfa1b.RData
Binary file not shown.
Binary file added
BIN
+144 KB
...kdown_files/08-freestyle_cache/html/unnamed-chunk-10_5909845bb1be2ca981659eddb68dfa1b.rdb
Binary file not shown.
Binary file added
BIN
+841 Bytes
...kdown_files/08-freestyle_cache/html/unnamed-chunk-10_5909845bb1be2ca981659eddb68dfa1b.rdx
Binary file not shown.
Binary file added
BIN
+2.85 KB
...own_files/08-freestyle_cache/html/unnamed-chunk-11_259d7dca34481e0643c13da7b10204cc.RData
Binary file not shown.
Binary file added
BIN
+50.4 KB
...kdown_files/08-freestyle_cache/html/unnamed-chunk-11_259d7dca34481e0643c13da7b10204cc.rdb
Binary file not shown.
Binary file added
BIN
+456 Bytes
...kdown_files/08-freestyle_cache/html/unnamed-chunk-11_259d7dca34481e0643c13da7b10204cc.rdx
Binary file not shown.
Binary file added
BIN
+2.76 KB
...own_files/08-freestyle_cache/html/unnamed-chunk-12_f9da657235b4e571c228f1f80c7a8315.RData
Binary file not shown.
Empty file.
Binary file added
BIN
+122 Bytes
...kdown_files/08-freestyle_cache/html/unnamed-chunk-12_f9da657235b4e571c228f1f80c7a8315.rdx
Binary file not shown.
Binary file added
BIN
+2.92 KB
...own_files/08-freestyle_cache/html/unnamed-chunk-13_992e791e3218119f6e69489db3073403.RData
Binary file not shown.
Empty file.
Binary file added
BIN
+122 Bytes
...kdown_files/08-freestyle_cache/html/unnamed-chunk-13_992e791e3218119f6e69489db3073403.rdx
Binary file not shown.
Binary file added
BIN
+207 Bytes
...down_files/08-freestyle_cache/html/unnamed-chunk-1_d4a2e0629694bee8a55b86f2182c3e17.RData
Binary file not shown.
Empty file.
Binary file added
BIN
+122 Bytes
_bookdown_files/08-freestyle_cache/html/unnamed-chunk-1_d4a2e0629694bee8a55b86f2182c3e17.rdx
Binary file not shown.
Binary file added
BIN
+2.82 KB
...down_files/08-freestyle_cache/html/unnamed-chunk-2_c96f65c9baa85bfc4780425f3603d203.RData
Binary file not shown.
Empty file.
Binary file added
BIN
+122 Bytes
_bookdown_files/08-freestyle_cache/html/unnamed-chunk-2_c96f65c9baa85bfc4780425f3603d203.rdx
Binary file not shown.
Binary file added
BIN
+2.72 KB
...down_files/08-freestyle_cache/html/unnamed-chunk-3_709945f7e6222aa954fc079aaaaeecec.RData
Binary file not shown.
Binary file added
BIN
+2.88 KB
_bookdown_files/08-freestyle_cache/html/unnamed-chunk-3_709945f7e6222aa954fc079aaaaeecec.rdb
Binary file not shown.
Binary file added
BIN
+147 Bytes
_bookdown_files/08-freestyle_cache/html/unnamed-chunk-3_709945f7e6222aa954fc079aaaaeecec.rdx
Binary file not shown.
Binary file added
BIN
+2.93 KB
...down_files/08-freestyle_cache/html/unnamed-chunk-4_2ca37ae030b07b3b7675b8846261238c.RData
Binary file not shown.
Binary file added
BIN
+5.58 KB
_bookdown_files/08-freestyle_cache/html/unnamed-chunk-4_2ca37ae030b07b3b7675b8846261238c.rdb
Binary file not shown.
Binary file added
BIN
+148 Bytes
_bookdown_files/08-freestyle_cache/html/unnamed-chunk-4_2ca37ae030b07b3b7675b8846261238c.rdx
Binary file not shown.
Binary file added
BIN
+2.88 KB
...down_files/08-freestyle_cache/html/unnamed-chunk-5_977ceacacbfcde8cf98c9813ffa8a9bb.RData
Binary file not shown.
Empty file.
Binary file added
BIN
+122 Bytes
_bookdown_files/08-freestyle_cache/html/unnamed-chunk-5_977ceacacbfcde8cf98c9813ffa8a9bb.rdx
Binary file not shown.
Binary file added
BIN
+3.07 KB
...down_files/08-freestyle_cache/html/unnamed-chunk-6_456b9451dfea655ea2faad1add677fa7.RData
Binary file not shown.
Empty file.
Binary file added
BIN
+122 Bytes
_bookdown_files/08-freestyle_cache/html/unnamed-chunk-6_456b9451dfea655ea2faad1add677fa7.rdx
Binary file not shown.
Binary file added
BIN
+3.17 KB
...down_files/08-freestyle_cache/html/unnamed-chunk-7_5bf54bcacd5cf92d03bb7038bf7c7957.RData
Binary file not shown.
Empty file.
Binary file added
BIN
+122 Bytes
_bookdown_files/08-freestyle_cache/html/unnamed-chunk-7_5bf54bcacd5cf92d03bb7038bf7c7957.rdx
Binary file not shown.
Binary file added
BIN
+3.17 KB
...down_files/08-freestyle_cache/html/unnamed-chunk-8_4aab09226e59d55def71f9dccca883fa.RData
Binary file not shown.
Empty file.
Binary file added
BIN
+122 Bytes
_bookdown_files/08-freestyle_cache/html/unnamed-chunk-8_4aab09226e59d55def71f9dccca883fa.rdx
Binary file not shown.
Binary file added
BIN
+2.91 KB
...down_files/08-freestyle_cache/html/unnamed-chunk-9_de1a4e9b7ef15310e81f77d8f22a36ce.RData
Binary file not shown.
Empty file.
Binary file added
BIN
+122 Bytes
_bookdown_files/08-freestyle_cache/html/unnamed-chunk-9_de1a4e9b7ef15310e81f77d8f22a36ce.rdx
Binary file not shown.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added
BIN
+2.04 MB
_bookdown_files/08-freestyle_files/figure-html/unnamed-chunk-13-1.gif
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file not shown.
Binary file not shown.
Oops, something went wrong.