Version 0.1.0
MCHT is a package implementing an interface for creating and using Monte Carlo tests. The primary function of the package is `MCHTest()`, which creates functions with S3 class `MCHTest` that perform a Monte Carlo test.

MCHT is not presently available on CRAN. You can download and install MCHT from GitHub using devtools via the R command `devtools::install_github("ntguardian/MCHT")`.
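For instance, from an R session (the commented `install.packages()` line is only needed if devtools is not already installed):

```r
# Install devtools first if it is not already installed
# install.packages("devtools")

# Install the development version of MCHT from GitHub
devtools::install_github("ntguardian/MCHT")
```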
Monte Carlo testing is a form of hypothesis testing where the $p$-values are computed using the empirical distribution of the test statistic computed from data simulated under the null hypothesis. These tests are used when the distribution of the test statistic under the null hypothesis is intractable or difficult to compute, or as an exact test (that is, a test where the distribution used to compute $p$-values is appropriate for any sample size, not just large sample sizes).
Suppose that $s$ is the observed value of the test statistic and large values of $s$ are evidence against the null hypothesis; normally, $p$-values would be computed as $p = 1 - F(s)$, where $F$ is the cumulative distribution function of $S$, the random variable version of $s$. We cannot use $F$ for some reason; it's intractable, or the $F$ provided is only appropriate for large sample sizes.

Instead of using $F$ we will use $\hat{F}_N$, which is the empirical CDF of the same test statistic computed from simulated data following the distribution prescribed by the null hypothesis of the test. For the sake of simplicity in this presentation, assume that $S$ is a continuous random variable. Now our $p$-value is $\hat{p} = 1 - \hat{F}_N(s)$, where

$$\hat{F}_N(s) = \frac{1}{N} \sum_{j = 1}^{N} I(S_j \leq s),$$

$I$ is the indicator function, and $S_j$ is an independent random copy of $S$ computed from simulated data with a sample size of $n$.
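As a concrete illustration of this idea (a minimal sketch only, not the internals of MCHT), the following simulates the null distribution of a one-sample $t$-statistic and computes the empirical $p$-value, assuming the null hypothesis completely specifies the data-generating distribution:

```r
# Sketch of a Monte Carlo p-value computation (illustration only, not MCHT
# internals). Assumes the null distribution of the data is standard normal.
set.seed(123)

n <- 10      # sample size of the observed data
N <- 10000   # number of Monte Carlo replications

x <- rnorm(n, mean = 0.5)            # hypothetical observed data
s <- sqrt(n) * mean(x) / sd(x)       # observed test statistic (one-sample t)

# Simulate N independent copies of the statistic under the null hypothesis
S <- replicate(N, {
  x_sim <- rnorm(n)
  sqrt(n) * mean(x_sim) / sd(x_sim)
})

# Empirical p-value: 1 - F_N(s); large values of s count against the null
p_hat <- mean(S >= s)
p_hat
```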
The power of these tests increases with $N$ (see [1]), but modern computers are able to simulate large $N$ quickly, so this is rarely an issue. The procedure above also assumes that there are no nuisance parameters and the distribution of $S$ can effectively be known precisely when the null hypothesis is true (and all other conditions of the test are met, such as distributional assumptions). A different procedure needs to be applied when nuisance parameters are not explicitly stated under the null hypothesis. [2] suggests a procedure using optimization techniques (recommending simulated annealing specifically) to adversarially select values for nuisance parameters valid under the null hypothesis that maximize the $p$-value computed from the simulated data. This procedure is often called maximized Monte Carlo (MMC) testing. That is the procedure employed here. (In fact, the tests created by `MCHTest()` are the tests described in [2].) Unfortunately, MMC, while conservative and exact, has much less power than if the unknown parameters were known, perhaps due to the behavior of samples under distributions with parameter values distant from the true parameter values (see [3]).
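To make the idea concrete, here is a heavily simplified sketch of MMC (an illustration with hypothetical names and a deliberately non-pivotal statistic; it is not the implementation inside `MCHTest()`, which follows [2] and uses simulated annealing for the optimization step):

```r
# Sketch of maximized Monte Carlo (MMC): report the largest Monte Carlo p-value
# over plausible values of a nuisance parameter. Null hypothesis: mu = 0 for
# normal data with unknown sigma (the nuisance parameter). The statistic
# sqrt(n) * xbar is non-pivotal: its null distribution depends on sigma.
set.seed(123)

n <- 15
x <- rnorm(n, mean = 0.4)              # hypothetical observed data
s <- sqrt(n) * mean(x)                 # observed test statistic
N <- 2000                              # Monte Carlo replications

# Fix the underlying random draws so the p-value is a deterministic
# function of sigma (common random numbers across candidate values)
Z <- matrix(rnorm(n * N), nrow = n)

mc_pval <- function(sigma) {
  S <- sqrt(n) * colMeans(sigma * Z)   # simulated statistics under the null
  mean(S >= s)                         # 1 - empirical CDF at s
}

# MMC: adversarially pick sigma from a plausible range to maximize the p-value
opt <- optimize(mc_pval, interval = c(0.1, 5), maximum = TRUE)
opt$objective                          # the maximized (conservative) p-value
```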
Bootstrap statistical testing is very similar to Monte Carlo testing; the key difference is that bootstrap testing uses information from the sample. For example, a parametric bootstrap test would estimate the parameters of the distribution the data is assumed to follow and generate datasets from that distribution using those estimates as the actual parameter values. A permutation test (like Fisher's permutation test; see [4]) would use the original dataset values but randomly shuffle the labels (stating which sample an observation belongs to) to generate new datasets and thus new simulated test statistics. $p$-values are essentially computed the same way.

Unlike Monte Carlo tests and MMC, these tests are not exact tests. That said, they often have good finite-sample properties. (See [3].)
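As a small illustration of the permutation idea (a hypothetical two-sample example, not part of the MCHT interface):

```r
# Minimal sketch of a two-sample permutation test of equal means, in the
# spirit of Fisher's permutation test [4]; illustration only
set.seed(123)

x <- c(2.1, 3.4, 1.9, 2.8, 3.0)        # hypothetical sample 1
y <- c(1.2, 2.0, 1.5, 1.8, 2.2)        # hypothetical sample 2
s <- mean(x) - mean(y)                 # observed test statistic

pooled <- c(x, y)
N <- 10000

# Reshuffle the group labels and recompute the statistic for each permutation
S <- replicate(N, {
  idx <- sample(length(pooled), length(x))
  mean(pooled[idx]) - mean(pooled[-idx])
})

mean(abs(S) >= abs(s))                 # two-sided permutation p-value
```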
See the documentation for more details and references.
`MCHTest()` is the main function of the package and can create functions with S3 class `MCHTest` that perform Monte Carlo hypothesis tests.

For example, the code below creates the Monte Carlo equivalent of a $t$-test.
library(MCHT)
#> .------..------..------..------.
#> |M.--. ||C.--. ||H.--. ||T.--. |
#> | (\/) || :/\: || :/\: || :/\: |
#> | :\/: || :\/: || (__) || (__) |
#> | '--'M|| '--'C|| '--'H|| '--'T|
#> `------'`------'`------'`------' v. 0.1.0
#> Type citation("MCHT") for citing this R package in publications
library(doParallel)
#> Loading required package: foreach
#> Loading required package: iterators
#> Loading required package: parallel
registerDoParallel(detectCores()) # Necessary for parallelization, and if not
# done the resulting function will complain
# on the first use
ts <- function(x, mu = 0) {sqrt(length(x)) * (mean(x) - mu)/sd(x)}
sg <- function(x, mu = 0) {
  x <- x + mu
  ts(x)
}
rg <- rnorm
mc.t.test <- MCHTest(ts, sg, rg, seed = 20181001, test_params = "mu",
                     lock_alternative = FALSE,
                     method = "Monte Carlo One Sample t-Test")
The object `mc.t.test` is a callable function with S3 class `MCHTest`.
class(mc.t.test)
#> [1] "MCHTest"
`print()` will print relevant information about the construction of the test.
mc.t.test
#>
#> Details for Monte Carlo One Sample t-Test
#>
#> Seed: 20181001
#> Replications: 10000
#> Tested Parameters: mu
#> Default mu: 0
#>
#> Memoisation enabled
Once this object is created, we can use it for performing hypothesis tests.
dat <- c(2.3, -0.13, 1.42, 1.51, 3.43, -0.96, 0.59, 0.62, 1.28, 4.07)
t.test(dat, mu = 1, alternative = "two.sided") # For reference
#>
#> One Sample t-test
#>
#> data: dat
#> t = 0.84975, df = 9, p-value = 0.4175
#> alternative hypothesis: true mean is not equal to 1
#> 95 percent confidence interval:
#> 0.3135303 2.5124697
#> sample estimates:
#> mean of x
#> 1.413
mc.t.test(dat, mu = 1)
#> Loading required package: rngtools
#> Loading required package: pkgmaker
#> Loading required package: registry
#>
#> Attaching package: 'pkgmaker'
#> The following object is masked from 'package:base':
#>
#> isFALSE
#>
#> Monte Carlo One Sample t-Test
#>
#> data: dat
#> S = 0.84975, p-value = 0.9885
mc.t.test(dat, mu = 1, alternative = "two.sided")
#>
#> Monte Carlo One Sample t-Test
#>
#> data: dat
#> S = 0.84975, p-value = 0.023
#> alternative hypothesis: true mu is not equal to 1
This is the simplest example; `MCHTest()` can create more involved Monte Carlo tests. See other documentation for details.
Features planned for future versions of the package include:

- A function for making diagnostic-type plots for tests, such as a function creating a plot of the rejection probability function (RPF) as described in [5]
- A function that accepts an `MCHTest`-class object and returns a function that, rather than returning an `htest`-class object, gives the test statistic, simulated test statistics, and a $p$-value in a list; this could be useful for diagnostic work
- A. C. A. Hope, A simplified Monte Carlo significance test procedure, JRSSB, vol. 30 (1968) pp. 582-598
- J-M Dufour, Monte Carlo tests with nuisance parameters: A general approach to finite-sample inference and nonstandard asymptotics, Journal of Econometrics, vol. 133 no. 2 (2006) pp. 443-477
- J. G. MacKinnon, Bootstrap hypothesis testing in Handbook of computational econometrics (2009) pp. 183-213
- R. A. Fisher, The design of experiments (1935)
- R. Davidson and J. G. MacKinnon, The size distortion of bootstrap tests, Econometric Theory, vol. 15 (1999) pp. 361-376