```{r setup,include=FALSE}
# set the knitr options ... for everyone!
# if you unset this, then vignette build bonks. oh, joy.
#opts_knit$set(progress=TRUE)
opts_knit$set(eval.after='fig.cap')
# for a package vignette, you do want to echo.
# opts_chunk$set(echo=FALSE,warning=FALSE,message=FALSE)
opts_chunk$set(warning=FALSE,message=FALSE)
#opts_chunk$set(results="asis")
opts_chunk$set(cache=TRUE,cache.path="cache/")
#opts_chunk$set(fig.path="figure/",dev=c("pdf","cairo_ps"))
opts_chunk$set(fig.path="tools/figure/",dev=c("png"))
opts_chunk$set(fig.width=7,fig.height=6,dpi=100,out.width='700px',out.height='600px')
# doing this means that png files are made of figures;
# the savings is small, and it looks like shit:
#opts_chunk$set(fig.path="figure/",dev=c("png","pdf","cairo_ps"))
#opts_chunk$set(fig.width=4,fig.height=4)
# for figures? this is sweave-specific?
#opts_knit$set(eps=TRUE)
# this would be for figures:
#opts_chunk$set(out.width='.8\\textwidth')
# for text wrapping:
options(width=96,digits=2)
opts_chunk$set(size="small")
opts_chunk$set(tidy=TRUE,tidy.opts=list(width.cutoff=50,keep.blank.line=TRUE))
library(ggplot2)
library(BWStest)
library(dplyr)
library(tidyr)
library(moments)
library(microbenchmark)
# chicken and egg dept:
#[![CRAN](http://www.r-pkg.org/badges/version/BWStest)](http://cran.rstudio.com/package=BWStest)
#[![CRAN](http://www.r-pkg.org/badges/version/BWStest)](https://cran.rstudio.com/web/packages/BWStest/index.html)
#[![CRAN](http://www.r-pkg.org/badges/version/BWStest)](https://cran.r-project.org/web/packages/BWStest/index.html)
#[![ALSO BAD CRAN](http://www.r-pkg.org/badges/version/BWStest)](https://cran.rstudio.com/web/packages/BWStest/index.html)
#[![BAD CRAN](http://www.r-pkg.org/badges/version/BWStest)](http://cran.rstudio.com/package=BWStest)
#[![Downloads](http://cranlogs.r-pkg.org/badges/BWStest?color=green)](http://www.r-pkg.org/pkg/BWStest)
#[![Total](http://cranlogs.r-pkg.org/badges/grand-total/BWStest?color=green)](http://www.r-pkg.org/pkg/BWStest)
```
# BWStest
[![Build Status](https://github.com/shabbychef/BWStest/workflows/R-CMD-check/badge.svg)](https://github.com/shabbychef/BWStest/actions)
[![codecov.io](http://codecov.io/github/shabbychef/BWStest/coverage.svg?branch=master)](http://codecov.io/github/shabbychef/BWStest?branch=master)
[![CRAN](http://www.r-pkg.org/badges/version/BWStest)](https://cran.r-project.org/package=BWStest)
[![Downloads](http://cranlogs.r-pkg.org/badges/BWStest?color=green)](http://www.r-pkg.org/pkg/BWStest)
[![Total](http://cranlogs.r-pkg.org/badges/grand-total/BWStest?color=green)](http://www.r-pkg.org/pkg/BWStest)
![RCpp](https://img.shields.io/badge/RCpp-inside-blue.svg)
Performs the [Baumgartner-Weiß-Schindler 2-sample test](http://doai.io/10.2307/2533862) of equal probability
distributions.
-- Steven E. Pav, shabbychef@gmail.com
## Installation
This package can be installed
from CRAN,
via [drat](https://github.com/eddelbuettel/drat "drat"), or
from github:
```{r install,eval=FALSE,echo=TRUE}
# via CRAN:
install.packages("BWStest")
# via drat:
if (require(drat)) {
drat:::add("shabbychef")
install.packages("BWStest")
}
# get snapshot from github (may be buggy)
if (require(devtools)) {
install_github('shabbychef/BWStest')
}
```
## What is it?
The Baumgartner-Weiß-Schindler ('BWS') test is a non-parametric hypothesis test of the null
hypothesis that two samples were drawn from the same probability distribution, not unlike the
two-sample Kolmogorov-Smirnov test or the Wilcoxon test.
The BWS test is more powerful than these other tests under certain
alternatives, as shown in the original paper and replicated below.
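As a quick illustration of that kinship (a minimal sketch, not evaluated here; the seed and shift are illustrative), all three tests can be run on the same pair of samples and their p-values compared:
```{r compare_quick,eval=FALSE,echo=TRUE}
library(BWStest)
set.seed(2468)
x <- rnorm(50)
y <- rnorm(50,mean=0.75)
# p-values of the three two-sample tests on the same shifted data;
# all test the same null of equal distributions, with differing power:
c(BWS=bws_test(x,y)$p.value,
  KS=ks.test(x,y)$p.value,
  Wilcoxon=wilcox.test(x,y)$p.value)
```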
## Basic Usage
The front end for the hypothesis test is the function `bws_test`. By default it performs
the classical Baumgartner-Weiß-Schindler test for a two-sided alternative, returning an `htest` object:
```{r basic1,eval=TRUE,echo=TRUE}
require(BWStest)
set.seed(12345)
# under the null:
x <- rnorm(200)
y <- rnorm(200)
hval <- bws_test(x,y)
show(hval)
# under the alternative:
z <- rnorm(200,mean=1)
hval <- bws_test(x,z)
show(hval)
```
The code also supports the alternative test statistics of Neuhäuser and Murakami, as well as
one-sided alternatives for some of the tests:
```{r basic2,eval=TRUE,echo=TRUE}
set.seed(12345)
# under the alternative:
x <- rnorm(200)
y <- rnorm(200,mean=1)
hval <- bws_test(x,y,alternative='less')
show(hval)
x <- rnorm(8)
y <- rnorm(8,mean=1)
hval <- bws_test(x,y,method='B3',alternative='two.sided')
show(hval)
```
Note that the `B3` through `B5` tests do _not_ achieve their nominal size
for large sample sizes, and should _only_ be used when each of the two samples
has about 12 or fewer observations.
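For instance (a minimal sketch, not evaluated here; it assumes `method='B4'` and `method='B5'` are accepted by `bws_test` in the same way as the `'B3'` shown above), these variants would be used on small samples like so:
```{r b3_b5_small,eval=FALSE,echo=TRUE}
set.seed(321)
# small samples, so the B3-B5 variants are appropriate here:
x <- rnorm(10)
y <- rnorm(10,mean=1)
bws_test(x,y,method='B4',alternative='two.sided')  # 'B4' assumed accepted analogously to 'B3'
bws_test(x,y,method='B5',alternative='two.sided')  # likewise for 'B5'
```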
## Checking the null
_Doveryai, no proveryai._ (Trust, but verify.) -- Russian proverb.
Here we perform 5000 simulations of the BWS test under the null hypothesis, then
evaluate the putative CDF at the simulated test statistics. If the code is correct, the resulting
p-values should be uniform, so I Q-Q plot them against the uniform law:
```{r babysteps,eval=TRUE,echo=TRUE,fig.width=5,fig.height=4,out.width='500px',out.height='400px'}
require(BWStest)
# now compute a bunch under the null:
set.seed(1234)
bvals <- replicate(5000,bws_stat(rnorm(100),rnorm(100)))
# compute the approximate p-values under the null:
pvals <- bws_cdf(bvals)
require(ggplot2)
ph <- ggplot(data.frame(pv=pvals),aes(sample=pv)) +
stat_qq(distribution=stats::qunif)
print(ph)
```
Looks good to me!
## Under the alternative
Here we replicate figure 2A of Baumgartner _et al._ We draw two samples from the normal distribution,
both with unit standard deviation, letting _a_ be the difference in their means.
We check the empirical rejection rate at the 0.05 level for a few different tests.
As in Baumgartner, we find that the lowly t-test
is the most powerful in this case, with the BWS, Cramer-Von Mises, and Wilcoxon tests displaying similar power,
and the KS test the least powerful. Note that the Kolmogorov-Smirnov test does not appear to achieve
its nominal size under the null, probably due to the small sample size.
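That last claim is easy to check on its own (a small sketch, not evaluated here and not part of the figure replication): estimate the empirical size of the KS test at this sample size.
```{r ks_size_check,eval=FALSE,echo=TRUE}
set.seed(999)
# empirical rejection rate of the two-sided KS test under the null at n = m = 10;
# the discrete null distribution of the statistic at this size tends to make the test conservative:
mean(replicate(2000,ks.test(rnorm(10),rnorm(10))$p.value <= 0.05))
```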
```{r sim_two_A,eval=TRUE,echo=TRUE}
n.sim <- 10000
avals <- seq(0,3.2,length.out=17)
alpha <- 0.05
mnsize <- 10
# this is archived on CRAN, unfortunately:
library(CvM2SL2Test)
# find the CVM critical value.
critv <- uniroot(function(x) { cvmts.pval(x,mnsize,mnsize) - alpha },lower=0,upper=100,maxiter=5000)$root
set.seed(1234)
simul <- sapply(avals,function(a) {
rejs <- replicate(n.sim,{
x <- rnorm(mnsize,mean=0)
y <- rnorm(mnsize,mean=a)
bws <- bws_cdf(bws_stat(x,y),lower_tail=FALSE) <= alpha
ttv <- t.test(x,y,alternative='two.sided')$p.value <= alpha
cvm <- cvmts.test(x,y) >= critv
ksv <- ks.test(x,y,alternative='two.sided')$p.value <= alpha
wcx <- wilcox.test(x,y,alternative='two.sided')$p.value <= alpha
c(bws,ttv,cvm,ksv,wcx)
})
rejrate <- rowMeans(rejs)
names(rejrate) <- c('BWS test','t test','Cramer-Von Mises test','Kolmogorov Smirnov test','Wilcoxon test')
rejrate
},simplify='matrix')
Arejrates <- data.frame(t(simul))
Arejrates$a <- avals
```
```{r fig_two_A,eval=TRUE,echo=TRUE,fig.width=8.0,fig.height=4.5,out.width='700px',out.height='500px'}
library(tidyr)
library(dplyr)
library(ggplot2)
plotdf <- tidyr::gather(Arejrates,'test','rejection_rate',-a) %>%
dplyr::mutate(test=gsub('\\.',' ',test))
ph <- ggplot(plotdf,aes(x=a,y=rejection_rate,group=test,colour=test)) +
geom_line() + geom_point() + labs(x='a, difference in means',y='rejection rate')
print(ph)
```
Here we replicate figure 2B of Baumgartner _et al._ We draw two samples from the normal distribution,
both with zero mean, one with unit standard deviation, the other with standard deviation _sigma_.
We compute the empirical rejection rate at the 0.05 level, dropping the t-test
since it is not relevant for this formulation.
As in Baumgartner,
we find the BWS test is the most powerful, followed by the KS test, then Cramer-Von Mises,
then Wilcoxon, which is basically useless in this simulation.
```{r sim_two_B,eval=TRUE,echo=TRUE}
n.sim <- 10000
svals <- seq(1,45,length.out=10)
alpha <- 0.05
mnsize <- 10
# this is archived on CRAN, unfortunately:
library(CvM2SL2Test)
# find the CVM critical value.
critv <- uniroot(function(x) { cvmts.pval(x,mnsize,mnsize) - alpha },lower=0,upper=100,maxiter=5000)$root
set.seed(1234)
simul <- sapply(svals,function(s) {
rejs <- replicate(n.sim,{
x <- rnorm(mnsize,mean=0,sd=1)
y <- rnorm(mnsize,mean=0,sd=s)
bws <- bws_cdf(bws_stat(x,y),lower_tail=FALSE) <= alpha
cvm <- cvmts.test(x,y) >= critv
ksv <- ks.test(x,y,alternative='two.sided')$p.value <= alpha
wcx <- wilcox.test(x,y,alternative='two.sided')$p.value <= alpha
c(bws,cvm,ksv,wcx)
})
rejrate <- rowMeans(rejs)
names(rejrate) <- c('BWS test','Cramer-Von Mises test','Kolmogorov Smirnov test','Wilcoxon test')
rejrate
},simplify='matrix')
Brejrates <- data.frame(t(simul))
Brejrates$sigma <- svals
```
```{r fig_two_B,eval=TRUE,echo=TRUE,fig.width=8.0,fig.height=4.5,out.width='700px',out.height='500px'}
plotdf <- tidyr::gather(Brejrates,'test','rejection_rate',-sigma) %>%
dplyr::mutate(test=gsub('\\.',' ',test))
ph <- ggplot(plotdf,aes(x=sigma,y=rejection_rate,group=test,colour=test)) +
geom_line() + geom_point() + labs(x='sigma, ratio of standard deviations',y='rejection rate')
print(ph)
```
Here we replicate figure 3A of Baumgartner _et al._ We draw two samples from the exponential distribution,
letting _l_ be the ratio of the rate parameters of the two populations.
We compute the empirical rejection rate at the 0.05 level.
As in Baumgartner,
we find the BWS test is the most powerful, followed by Wilcoxon, then Cramer-Von Mises, then the
KS test.
```{r sim_three_A,eval=TRUE,echo=TRUE}
n.sim <- 10000
lvals <- seq(1,12)
alpha <- 0.05
mnsize <- 10
# this is archived on CRAN, unfortunately:
library(CvM2SL2Test)
# find the CVM critical value.
critv <- uniroot(function(x) { cvmts.pval(x,mnsize,mnsize) - alpha },lower=0,upper=100,maxiter=5000)$root
set.seed(1234)
simul <- sapply(lvals,function(l) {
rejs <- replicate(n.sim,{
x <- rexp(mnsize,rate=1)
y <- rexp(mnsize,rate=l)
bws <- bws_cdf(bws_stat(x,y),lower_tail=FALSE) <= alpha
cvm <- cvmts.test(x,y) >= critv
ksv <- ks.test(x,y,alternative='two.sided')$p.value <= alpha
wcx <- wilcox.test(x,y,alternative='two.sided')$p.value <= alpha
c(bws,cvm,ksv,wcx)
})
rejrate <- rowMeans(rejs)
names(rejrate) <- c('BWS test','Cramer-Von Mises test','Kolmogorov Smirnov test','Wilcoxon test')
rejrate
},simplify='matrix')
Crejrates <- data.frame(t(simul))
Crejrates$lratio <- lvals
```
```{r fig_three_A,eval=TRUE,echo=TRUE,fig.width=8.0,fig.height=4.5,out.width='700px',out.height='500px'}
plotdf <- tidyr::gather(Crejrates,'test','rejection_rate',-lratio) %>%
dplyr::mutate(test=gsub('\\.',' ',test))
ph <- ggplot(plotdf,aes(x=lratio,y=rejection_rate,group=test,colour=test)) +
geom_line() + geom_point() + labs(x='l, ratio of rate parameters',y='rejection rate')
print(ph)
```
Here we replicate figure 3B of Baumgartner _et al._ We draw two samples, one from the normal distribution
with zero mean and variance one-twelfth, the other uniform on -0.5 to 0.5. We take equal sample sizes
from these two populations, then vary the sample size, checking the
empirical rejection rate at the 0.05 level. Since the first two moments are equal, the Wilcoxon test
is useless here and is not applied.
As in Baumgartner, we find the BWS test is the most powerful,
followed by the KS and Cramer-Von Mises tests.
Based on the power plots here, I theorize that Baumgartner _et al._ are plotting the _total_ sample
size on the _x_ axis, that is, drawing _n_ from both distributions, then plotting empirical power
versus _2n_. We follow that convention, which makes our plots match those of Baumgartner.
```{r sim_three_B,eval=TRUE,echo=TRUE}
n.sim <- 10000
mvals <- seq(10,670,by=60)
alpha <- 0.05
# this is archived on CRAN, unfortunately:
library(CvM2SL2Test)
set.seed(1234)
simul <- sapply(mvals,function(mnsize) {
# find the CVM critical value.
# note that this basically converged for mnsize > 100 or so, so we take the min..
# for reasons of speed.
critv <- uniroot(function(x) { cvmts.pval(x,min(mnsize,80),min(mnsize,80)) - alpha },lower=0,upper=2,maxiter=100)$root
rejs <- replicate(n.sim,{
x <- rnorm(mnsize,mean=0,sd=1.0/sqrt(12.0))
y <- runif(mnsize,min=-0.5,max=0.5)
bws <- bws_cdf(bws_stat(x,y),lower_tail=FALSE) <= alpha
cvm <- cvmts.test(x,y) >= critv
ksv <- ks.test(x,y,alternative='two.sided')$p.value <= alpha
c(bws,cvm,ksv)
})
rejrate <- rowMeans(rejs)
names(rejrate) <- c('BWS test','Cramer-Von Mises test','Kolmogorov Smirnov test')
rejrate
},simplify='matrix')
Drejrates <- data.frame(t(simul))
Drejrates$ssize <- 2 * mvals
```
```{r fig_three_B,eval=TRUE,echo=TRUE,fig.width=8.0,fig.height=4.5,out.width='700px',out.height='500px'}
plotdf <- tidyr::gather(Drejrates,'test','rejection_rate',-ssize) %>%
dplyr::mutate(test=gsub('\\.',' ',test))
ph <- ggplot(plotdf,aes(x=ssize,y=rejection_rate,group=test,colour=test)) +
geom_line() + geom_point() + labs(x='m+n, total sample size',y='rejection rate')
print(ph)
```
## Murakami tests
[Neuhäuser](http://doai.io/10.1007/BF02762032) and
[Murakami](http://doai.io/10.1080/00949655.2010.551516) described
some modifications to the original test of Baumgartner. Neuhäuser's
test allows one to test against directional alternatives. Murakami
enumerated some modifications to the weighting scheme. These are
available via the `murakami_stat` function, whose `flavor` argument
corresponds to the test number from Murakami's paper, namely:
* 0 corresponds to the original BWS statistic.
* 1 corresponds to Murakami's first modification.
* 2 corresponds to Neuhäuser's statistic, which can take negative values.
* 3 through 5 correspond to different weighting schemes of Murakami's.
Here we take these through the paces as above.
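First, a minimal sketch (not evaluated here; the seed and sample sizes are illustrative) of calling `murakami_stat` and `murakami_cdf` directly, one flavor at a time, on a single small pair of samples:
```{r murakami_quick,eval=FALSE,echo=TRUE}
library(BWStest)
set.seed(111)
x <- rnorm(9)
y <- rnorm(9,mean=1)
# statistic and upper tail of the permutation CDF for each flavor;
# the upper tail is what the power simulations below use as the rejection criterion:
sapply(0L:5L,function(flavor) {
  stat <- murakami_stat(x,y,flavor=flavor)
  ptail <- murakami_cdf(stat,n1=length(x),n2=length(y),flavor=flavor,lower_tail=FALSE)
  c(flavor=flavor,stat=stat,upper_tail=ptail)
})
```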
### Under the null
As above, we draw samples under the null, evaluate the putative CDF at the resulting statistics,
and Q-Q plot the values against the uniform law:
```{r murakami_null,eval=TRUE,echo=TRUE,fig.width=5,fig.height=6,out.width='500px',out.height='600px'}
require(BWStest)
# now compute a bunch under the null:
n1 <- 9
n2 <- n1
set.seed(1234)
allpvs <- lapply(0L:5L,function(flavor) {
bvals <- replicate(5000,murakami_stat(rnorm(n1),rnorm(n2),flavor=flavor))
# compute the approximate p-values under the null:
pvals <- murakami_cdf(bvals,n1=n1,n2=n2,flavor=flavor)
data.frame(pv=pvals,flavor=flavor)
})
df <- do.call(rbind,allpvs)
require(ggplot2)
ph <- ggplot(df,aes(sample=pv)) +
facet_grid(flavor ~ .) +
stat_qq(distribution=stats::qunif)
print(ph)
```
While these all look fine, they are based on small sample sizes. The CDF is approximated by evaluating
all the permutations (with memoisation to tame the computational requirements), but this can only be
done up to some reasonably small sample size. If the test statistic does not converge beyond that
sample size, the CDF approximation will not be accurate. This appears to be the case for flavors 3 through 5,
as demonstrated below:
```{r murakami_null_2,eval=TRUE,echo=TRUE,fig.width=5,fig.height=6,out.width='500px',out.height='600px'}
require(BWStest)
# now compute a bunch under the null:
n1 <- 50
n2 <- n1
set.seed(1234)
allpvs <- lapply(0L:5L,function(flavor) {
bvals <- replicate(5000,murakami_stat(rnorm(n1),rnorm(n2),flavor=flavor))
# compute the approximate p-values under the null:
pvals <- murakami_cdf(bvals,n1=n1,n2=n2,flavor=flavor)
data.frame(pv=pvals,flavor=flavor)
})
df <- do.call(rbind,allpvs)
require(ggplot2)
ph <- ggplot(df,aes(sample=pv)) +
facet_grid(flavor ~ .) +
stat_qq(distribution=stats::qunif)
print(ph)
```
So that's not so great.
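The size limitation behind the permutation CDF is easy to see (a small sketch of the combinatorics, not of the package internals): the exhaustive enumeration has to cover every distinct way of splitting the pooled sample into the two groups, a count that grows very quickly with the sample sizes.
```{r permutation_count,eval=FALSE,echo=TRUE}
# number of distinct group assignments in the permutation null for equal sample sizes n:
n <- c(5,9,12,20,50)
setNames(choose(2*n,n),paste0("n=",n))
```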
### Under the alternative
We can perform the power comparisons of Baumgartner _et al._ using the Murakami test statistics.
Here we consider the setup of figure 2A: samples of size 10 from the normal distribution with
equal variances but different means. Neuhäuser's test is, as expected, more powerful for detecting
this one-sided alternative, probably on par with the t-test, while the BWS test (basically equivalent
to Murakami 1) is next, then flavors 5, 3, and 4.
```{r murakami_sim_two_A,eval=TRUE,echo=TRUE}
n.sim <- 10000
avals <- seq(0,3.2,length.out=17)
alpha <- 0.05
mnsize <- 10
set.seed(1234)
simul <- sapply(avals,function(a) {
statvs <- replicate(n.sim,{
x <- rnorm(mnsize,mean=0)
y <- rnorm(mnsize,mean=a)
bws <- bws_stat(x,y)
murav <- sapply(1L:5L,function(flavor) {
murakami_stat(x,y,flavor=flavor)
})
c(bws,murav)
})
rej_bws <- mean(bws_cdf(statvs[1,],lower_tail=FALSE) <= alpha)
rej_mur <- sapply(2:6,function(rown) {
mean(murakami_cdf(statvs[rown,],mnsize,mnsize,flavor=rown-1,lower_tail=FALSE) <= alpha)
})
rejrate <- c(rej_bws,rej_mur)
names(rejrate) <- c('BWS test','Murakami 1','Neuhauser','Murakami 3','Murakami 4','Murakami 5')
rejrate
},simplify='matrix')
Arejrates <- data.frame(t(simul))
Arejrates$a <- avals
```
```{r murakami_fig_two_A,eval=TRUE,echo=TRUE,fig.width=8.0,fig.height=4.5,out.width='700px',out.height='500px'}
library(tidyr)
library(dplyr)
library(ggplot2)
plotdf <- tidyr::gather(Arejrates,'test','rejection_rate',-a) %>%
dplyr::mutate(test=gsub('\\.',' ',test))
ph <- ggplot(plotdf,aes(x=a,y=rejection_rate,group=test,colour=test)) +
geom_line() + geom_point() + labs(x='a, difference in means',y='rejection rate')
print(ph)
```
Now we consider figure 2B, with two samples of equal size from the normal distribution with zero
mean but different standard deviations. Neuhäuser's test is nearly useless here because the
means of the populations are equal. Murakami 3, 4, and 5 are the most powerful, followed by BWS.
```{r murakami_sim_two_B,eval=TRUE,echo=TRUE}
n.sim <- 10000
svals <- seq(1,45,length.out=10)
alpha <- 0.05
mnsize <- 10
set.seed(1234)
simul <- sapply(svals,function(s) {
statvs <- replicate(n.sim,{
x <- rnorm(mnsize,mean=0,sd=1)
y <- rnorm(mnsize,mean=0,sd=s)
bws <- bws_stat(x,y)
murav <- sapply(1L:5L,function(flavor) {
murakami_stat(x,y,flavor=flavor)
})
c(bws,murav)
})
rej_bws <- mean(bws_cdf(statvs[1,],lower_tail=FALSE) <= alpha)
rej_mur <- sapply(2:6,function(rown) {
mean(murakami_cdf(statvs[rown,],mnsize,mnsize,flavor=rown-1,lower_tail=FALSE) <= alpha)
})
rejrate <- c(rej_bws,rej_mur)
names(rejrate) <- c('BWS test','Murakami 1','Neuhauser','Murakami 3','Murakami 4','Murakami 5')
rejrate
},simplify='matrix')
Brejrates <- data.frame(t(simul))
Brejrates$sigma <- svals
```
```{r murakami_fig_two_B,eval=TRUE,echo=TRUE,fig.width=8.0,fig.height=4.5,out.width='700px',out.height='500px'}
plotdf <- tidyr::gather(Brejrates,'test','rejection_rate',-sigma) %>%
dplyr::mutate(test=gsub('\\.',' ',test))
ph <- ggplot(plotdf,aes(x=sigma,y=rejection_rate,group=test,colour=test)) +
geom_line() + geom_point() + labs(x='sigma, ratio of standard deviations',y='rejection rate')
print(ph)
```
Now to figure 3A, with
two equal sized samples from the exponential distribution,
letting _l_ be the ratio of the rate parameters of the two populations. The BWS test
is the most powerful here, apparently.
```{r murakami_sim_three_A,eval=TRUE,echo=TRUE}
n.sim <- 10000
lvals <- seq(1,12)
alpha <- 0.05
mnsize <- 10
set.seed(1234)
simul <- sapply(lvals,function(l) {
statvs <- replicate(n.sim,{
x <- rexp(mnsize,rate=1)
y <- rexp(mnsize,rate=l)
bws <- bws_stat(x,y)
murav <- sapply(1L:5L,function(flavor) {
murakami_stat(x,y,flavor=flavor)
})
c(bws,murav)
})
rej_bws <- mean(bws_cdf(statvs[1,],lower_tail=FALSE) <= alpha)
rej_mur <- sapply(2:6,function(rown) {
mean(murakami_cdf(statvs[rown,],mnsize,mnsize,flavor=rown-1,lower_tail=FALSE) <= alpha)
})
rejrate <- c(rej_bws,rej_mur)
names(rejrate) <- c('BWS test','Murakami 1','Neuhauser','Murakami 3','Murakami 4','Murakami 5')
rejrate
},simplify='matrix')
Crejrates <- data.frame(t(simul))
Crejrates$lratio <- lvals
```
```{r murakami_fig_three_A,eval=TRUE,echo=TRUE,fig.width=8.0,fig.height=4.5,out.width='700px',out.height='500px'}
plotdf <- tidyr::gather(Crejrates,'test','rejection_rate',-lratio) %>%
dplyr::mutate(test=gsub('\\.',' ',test))
ph <- ggplot(plotdf,aes(x=lratio,y=rejection_rate,group=test,colour=test)) +
geom_line() + geom_point() + labs(x='l, ratio of rate parameters',y='rejection rate')
print(ph)
```
Now figure 3B, with a normal sample and a uniform sample, with equal first and second moments, varying
the sample size. Because flavors 3 through 5 are not well behaved, we drop them here. Once again
the BWS test is the most powerful.
```{r murakami_sim_three_B,eval=TRUE,echo=TRUE}
n.sim <- 10000
mvals <- seq(10,670,by=60)
alpha <- 0.05
set.seed(1234)
simul <- sapply(mvals,function(mnsize) {
statvs <- replicate(n.sim,{
x <- rnorm(mnsize,mean=0,sd=1.0/sqrt(12.0))
y <- runif(mnsize,min=-0.5,max=0.5)
bws <- bws_stat(x,y)
murav <- sapply(1L:2L,function(flavor) {
murakami_stat(x,y,flavor=flavor)
})
c(bws,murav)
})
rej_bws <- mean(bws_cdf(statvs[1,],lower_tail=FALSE) <= alpha)
rej_mur <- sapply(2:3,function(rown) {
mean(murakami_cdf(statvs[rown,],mnsize,mnsize,flavor=rown-1,lower_tail=FALSE) <= alpha)
})
rejrate <- c(rej_bws,rej_mur)
names(rejrate) <- c('BWS test','Murakami 1','Neuhauser')
rejrate
},simplify='matrix')
Drejrates <- data.frame(t(simul))
Drejrates$ssize <- 2 * mvals
```
```{r murakami_fig_three_B,eval=TRUE,echo=TRUE,fig.width=8.0,fig.height=4.5,out.width='700px',out.height='500px'}
plotdf <- tidyr::gather(Drejrates,'test','rejection_rate',-ssize) %>%
dplyr::mutate(test=gsub('\\.',' ',test))
ph <- ggplot(plotdf,aes(x=ssize,y=rejection_rate,group=test,colour=test)) +
geom_line() + geom_point() + labs(x='m+n, total sample size',y='rejection rate')
print(ph)
```
## Future work
I anticipate the following:
* Modifications of flavors 3 through 5 so that they converge for large
sample sizes.