-
Notifications
You must be signed in to change notification settings - Fork 7
/
README.Rmd
162 lines (115 loc) · 7.39 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
---
output: github_document
editor_options:
chunk_output_type: console
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, echo = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
fig.align = "center",
fig.height = 4.5,
fig.width = 6.5,
out.width = "85%",
dpi = 200
)
options(width = 100)
```
# {rsimsum}: Analysis of Simulation Studies Including Monte Carlo Error <img src="man/figures/hex.png" width = "200" align="right" />
<!-- badges: start -->
[![R-CMD-check](https://github.com/ellessenne/rsimsum/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/ellessenne/rsimsum/actions/workflows/R-CMD-check.yaml)
[![Codecov Test Coverage](https://codecov.io/gh/ellessenne/rsimsum/branch/master/graph/badge.svg)](https://app.codecov.io/gh/ellessenne/rsimsum?branch=master)
[![CRAN Status](https://www.r-pkg.org/badges/version/rsimsum)](https://CRAN.R-project.org/package=rsimsum)
[![CRAN Logs Monthly Downloads](https://cranlogs.r-pkg.org/badges/rsimsum)](https://CRAN.R-project.org/package=rsimsum)
[![CRAN Logs Total Downloads](https://cranlogs.r-pkg.org/badges/grand-total/rsimsum)](https://CRAN.R-project.org/package=rsimsum)
[![JOSS DOI](https://joss.theoj.org/papers/10.21105/joss.00739/status.svg)](https://doi.org/10.21105/joss.00739)
[![Lifecycle: Stable](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://lifecycle.r-lib.org/articles/stages.html#stable)
<!-- badges: end -->
`rsimsum` is an R package that can compute summary statistics from simulation studies. `rsimsum` is modelled upon a similar package available in Stata, the user-written command `simsum` (White I.R., 2010).
The aim of `rsimsum` is to help to report simulation studies, including understanding the role of chance in results of simulation studies: Monte Carlo standard errors and confidence intervals based on them are computed and presented to the user by default. `rsimsum` can compute a wide variety of summary statistics: bias, empirical and model-based standard errors, relative precision, relative error in model standard error, mean squared error, coverage, bias. Further details on each summary statistic are presented elsewhere (White I.R., 2010; Morris _et al_, 2019).
The main function of `rsimsum` is called `simsum` and can handle simulation studies with a single estimand of interest at a time. Missing values are excluded by default, and it is possible to define boundary values to drop estimated values or standard errors exceeding such limits. It is possible to define a variable representing methods compared with the simulation study, and it is possible to define _by_ factors, that is, factors that vary between the different simulated scenarios (data-generating mechanisms, DGMs). However, methods and DGMs are not strictly required: in that case, a simulation study with a single scenario and a single method is assumed. Finally, `rsimsum` provides a function named `multisimsum` that allows summarising simulation studies with multiple estimands as well.
An important step of reporting a simulation study consists in visualising the results; therefore, `rsimsum` exploits the R package [`ggplot2`](https://CRAN.R-project.org/package=ggplot2) to produce a portfolio of opinionated data visualisations for quick exploration of results, inferring colours and facetting by data-generating mechanisms. `rsimsum` includes methods to produce (1) plots of summary statistics with confidence intervals based on Monte Carlo standard errors (forest plots, lolly plots), (2) zipper plots to graphically visualise coverage by directly plotting confidence intervals, (3) plots for method-wise comparisons of estimates and standard errors (scatter plots, Bland-Altman plots, ridgeline plots), and (4) heat plots. The latter is a visualisation type that has not been traditionally used to present results of simulation studies, and consists in a mosaic plot where the factor on the x-axis is the methods compared with the current simulation study and the factor on the y-axis is the data-generating factors. Each tile of the mosaic plot is coloured according to the value of the summary statistic of interest, with a red colour representing values above the target value and a blue colour representing values below the target.
## Installation
You can install `rsimsum` from CRAN:
```{r cran-installation, eval = FALSE}
install.packages("rsimsum")
```
Alternatively, it is possible to install the development version from GitHub using the `remotes` package:
```{r gh-installation, eval = FALSE}
# install.packages("remotes")
remotes::install_github("ellessenne/rsimsum")
```
## Example
This is a basic example using data from a simulation study on missing data (type `help("MIsim", package = "rsimsum")` in the R console for more information):
```{r simsum}
library(rsimsum)
data("MIsim", package = "rsimsum")
s <- simsum(data = MIsim, estvarname = "b", true = 0.5, se = "se", methodvar = "method", x = TRUE)
s
```
We set `x = TRUE` as it will be required for some plot types.
Summarising the results:
```{r summary}
summary(s)
```
## Vignettes
`rsimsum` comes with 5 vignettes. In particular, check out the introductory one:
```r
vignette(topic = "A-introduction", package = "rsimsum")
```
The list of vignettes could be obtained by typing the following in the R console:
```r
vignette(package = "rsimsum")
```
## Visualising results
As of version `0.2.0`, `rsimsum` can produce a variety of plots: among others, lolly plots, forest plots, zipper plots, etc.:
```{r lolly}
library(ggplot2)
autoplot(s, type = "lolly", stats = "bias")
```
```{r zipper}
autoplot(s, type = "zip")
```
With `rsimsum` `0.5.0` the plotting functionality has been completely rewritten, and new plot types have been implemented:
* Scatter plots for method-wise comparisons, including Bland-Altman type plots;
```{r ba}
autoplot(s, type = "est_ba")
```
* Ridgeline plots.
```{r ridgeline}
autoplot(s, type = "est_ridge")
```
Nested loop plots have been implemented in `rsimsum` `0.6.0`:
```{r nlp}
data("nlp", package = "rsimsum")
s.nlp <- rsimsum::simsum(
data = nlp, estvarname = "b", true = 0, se = "se",
methodvar = "model", by = c("baseline", "ss", "esigma")
)
autoplot(s.nlp, stats = "bias", type = "nlp")
```
Finally, as of `rsimsum` `0.7.1` contour plots and hexbin plots have been implemented as well:
```{r density}
autoplot(s, type = "est_density")
```
```{r hex}
autoplot(s, type = "est_hex")
```
They provide a useful alternative when there are several data points with large overlap (e.g. in a scatterplot).
The plotting functionality now extend the S3 generic `autoplot`: see `?ggplot2::autoplot` and `?rsimsum::autoplot.simsum` for further details.
More details and information can be found in the vignettes dedicated to plotting:
```{r vignette-plotting, eval = FALSE}
vignette(topic = "C-plotting", package = "rsimsum")
vignette(topic = "D-nlp", package = "rsimsum")
```
# Citation
If you find `rsimsum` useful, please cite it in your publications:
```{r citation}
citation("rsimsum")
```
# References
* White, I.R. 2010. _simsum: Analyses of simulation studies including Monte Carlo error_. The Stata Journal, 10(3): 369-385
* Morris, T.P., White, I.R. and Crowther, M.J. 2019. _Using simulation studies to evaluate statistical methods_. Statistics in Medicine, 38: 2074-2102
* Gasparini, A. 2018. _rsimsum: Summarise results from Monte Carlo simulation studies_. Journal of Open Source Software, 3(26):739