-
Notifications
You must be signed in to change notification settings - Fork 4
/
supplemental-figures.Rmd
100 lines (73 loc) · 6.15 KB
/
supplemental-figures.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
\clearpage
# Supplemental figures
<!-- Case studies: leopold2020host -->
```{r leopold2020host-variation, fig.cap = '(ref:cap-leopold2020host-variation)'}
fs::path(
'notebook/_posts/2022-01-08-leopold2020host-case-study',
'leopold2020host-case-study_files/figure-html5',
'variation-in-proportions-and-mean-efficiency-1.svg'
) %>%
include_svg
```
(ref:cap-leopold2020host-variation) **In the pre-infection samples from @leopold2020host, multiplicative variation in taxa proportions is much larger than that in the mean efficiency.** Panel A shows the distribution of the proportions of each commensal isolate (denoted by its genus) across all samples collected prior to pathogen inoculation; Panel C shows the distribution of the (estimated) sample mean efficiency across these same samples on the same scale; and Panel B shows the efficiency of each taxon estimated from DNA mock communities as point estimates and 90% bootstrap percentile confidence intervals. Efficiencies are shown relative to the most efficiently measured taxon (_Fusarium_).
<br><br>
```{r leopold2020host-infection-mean-efficiency-dist, fig.cap = '(ref:cap-leopold2020host-infection-mean-efficiency-dist)'}
fs::path(
'notebook/_posts/2022-01-08-leopold2020host-case-study',
'leopold2020host-case-study_files/figure-html5',
'infection-mean-efficiency-dist-1.svg'
) %>%
include_svg
```
(ref:cap-leopold2020host-infection-mean-efficiency-dist) **The mean efficiency tends to increase after infection due to the high proportion of the pathogen.**
<br><br>
```{r leopold2020host-infection-lfc, fig.cap = '(ref:cap-leopold2020host-infection-lfc)'}
fs::path(
'notebook/_posts/2022-01-08-leopold2020host-case-study',
'leopold2020host-case-study_files/figure-html5',
'infection-gp-lfc-estimates-1.svg'
) %>%
include_svg
```
(ref:cap-leopold2020host-infection-lfc) **Bias correction increases the estimated log fold changes (LFCs) in commensal species proportions in response to pathogen infection.** Shown are the estimated LFCs with 95% Bayesian credible intervals (CIs) derived from a Bayesian gamma Poisson regression. Negative values indicate that the species' proportion decreased on average in response to infection, which is the default expectation given growth of the pathogen and the sum-to-one constraint of proportions. Bias leads to artificially low estimates, as the increase in the pathogen proportion increases the mean efficiency. In Western genotypes, several species whose proportion appears to decrease are instead found to remain stable or even to increase when bias is corrected.
<br><br>
<!-- Case study: MOMSPI -->
```{r momspi-mean-efficiency-fcs, fig.cap = '(ref:cap-momspi-mean-efficiency-fcs)'}
fs::path(
"notebook/_posts/2021-11-01-momspi-summary/momspi-summary_files",
"figure-html5/momspi-mean-efficiency-fcs-1.svg"
) %>%
include_svg
```
(ref:cap-momspi-mean-efficiency-fcs) **Fold changes in the mean efficiency within and between women in the MOMS-PI study.**
<br><br>
```{r momspi-trajectory, fig.cap = '(ref:cap-momspi-trajectory)', out.width = '75%'}
fs::path(
"notebook/_posts/2021-11-01-momspi-summary/momspi-summary_files",
"figure-html5/trajectories-within-subject-high-var-1.svg"
) %>%
include_svg
```
(ref:cap-momspi-trajectory) **In vaginal microbiome measurements, shifts between _Lactobacillus_ and _Gardnerella_ dominance can drive spurious fold changes in other, lower-abundance species.** The figure shows species proportions and mean efficiency trajectories over consecutive clinical visits for a subject in the MOMS-PI study whose microbiome samples showed substantial variation in mean efficiency. The subject's samples are dominated by _Gardnerella vaginalis_ and _Lachnospiraceae BVAB1_ during the first three visits before transitioning to being dominated by _Lactobacillus iners_ between visits 3 and 4. This transition drives a sharp increase in the mean efficiency, which significantly distorts the fold changes in the observed (uncalibrated) microbiome measurements for species with less dramatic fold changes. Two exemplar species are shown to illustrate the magnitude (_Ureaplasma cluster 23_) and direction (_Megasphaera OTU70 type1_) errors that can arise in this situation.
<br><br>
```{r momspi-regression, fig.cap = '(ref:cap-momspi-regression)'}
fs::path(
"notebook/_posts/2021-12-15-momspi-regression-diversity/momspi-regression-diversity_files",
"figure-html5/regression-gp-1.svg"
) %>%
include_svg
```
(ref:cap-momspi-regression) **Bias inflates estimated log fold changes (LFCs) in species proportions with diversity in vaginal microbiome samples from the MOMS-PI study.** Samples were split into low, medium, and high diversity groups based on Shannon diversity in observed (uncalibrated) microbiome profiles. The LFC in proportion from low- to high-diversity samples was estimated for 30 common species using gamma-Poisson regression with and without bias correction. Panel A shows the distribution of point estimates; Panel B shows the point estimates and 95% Bayesian credible intervals for each species.
<br><br>
<!-- Case studies: Gut -->
```{r gut, fig.cap = '(ref:cap-gut)'}
fs::path(
"notebook/_posts/2022-01-30-hmp-stool-vagina-comparison/",
"hmp-stool-vagina-comparison_files",
"figure-html5/main-1.svg"
) %>%
include_svg
```
(ref:cap-gut) **Compared to vaginal samples, stool samples have higher diversity, lower variation in mean efficiency, and lower variation in species proportions.** This figure summarizes results from our case study of shotgun-sequenced stool and vaginal samples in the Human Microbiome Project. Panel A shows the distribution of order-2 alpha diversity (equal to the Inverse Simpson Index) across samples. Panel B shows the variation in the mean efficiency, as quantified by the GSD, across samples of a given type for each of 1000 simulated bias vectors. Panel C shows the variation in the proportion of each species commonly found in a given sample type. For Panel C, variation is measured by the GSD when a pseudo-value of $10^{-4}$ is added to the raw proportions. Different choices of the pseudo-value substantially change the measured GSDs, but not the difference (ratio) between gut and vaginal species.
<br><br>
<!-- Solutions -->