Monte Carlo tests with nuisance parameters: A general approach to finite-sample inference and nonstandard asymptotics

ARTICLE IN PRESS
Journal of Econometrics 133 (2006) 443–477
www.elsevier.com/locate/jeconom
Monte Carlo tests with nuisance parameters:
A general approach to finite-sample inference
and nonstandard asymptotics
Jean-Marie Dufour
Département de sciences économiques, Université de Montréal, C.P. 6128 succursale Centre-ville,
Montréal, Québec, Canada H3C 3J7
Available online 30 August 2005
Abstract
The technique of Monte Carlo (MC) tests [Dwass (1957, Annals of Mathematical Statistics
28, 181–187); Barnard (1963, Journal of the Royal Statistical Society, Series B 25, 294)]
provides a simple method for building exact tests from statistics whose finite sample
distribution is intractable but can be simulated (when no nuisance parameter is involved). We
extend this method in two ways: first, by allowing for MC tests based on exchangeable
possibly discrete test statistics; second, by generalizing it to statistics whose null distribution
involves nuisance parameters [maximized MC (MMC) tests]. Simplified asymptotically
justified versions of the MMC method are also proposed: these provide a simple way of
improving standard asymptotics and dealing with nonstandard asymptotics.
© 2005 Elsevier B.V. All rights reserved.
JEL classification: C12; C15; C2; C52; C22
Keywords: Monte Carlo test; Maximized Monte Carlo test; Finite-sample test; Exact test; Nuisance
parameter; Bounds; Bootstrap; Parametric bootstrap; Simulated annealing; Asymptotics; Nonstandard
asymptotic distribution
Tel.: +1 514 343 2400; fax: +1 514 343 5831.
E-mail address: [email protected].
URL: http://www.fas.umontreal.ca/SCECO/Dufour.
0304-4076/$ - see front matter © 2005 Elsevier B.V. All rights reserved.
doi:10.1016/j.jeconom.2005.06.007
1. Introduction
During the last 25 years, the development of faster and cheaper computers has
made Monte Carlo techniques more affordable and attractive in statistical analysis.
In particular, such techniques may now be used routinely for data analysis.
Important developments in this area include the use of bootstrap techniques for
improving standard asymptotic approximations (for reviews, see Efron, 1982; Beran
and Ducharme, 1991; Efron and Tibshirani, 1993; Hall, 1992; Jeong and Maddala,
1993; Vinod, 1993; Shao and Tu, 1995; Davison and Hinkley, 1997; Chernick, 1999;
Horowitz, 1997) and techniques where estimators and forecasts are obtained from
criteria evaluated by simulation (see McFadden, 1989; Mariano and Brown, 1993;
Hajivassiliou, 1993; Keane, 1993; Gouriéroux and Monfort, 1996; Gallant and
Tauchen, 1996).
With respect to tests and confidence sets, these techniques only have asymptotic
justifications and do not yield inferences that are provably valid (in the sense of
correct levels) in finite samples. Here, it is of interest to note that the use of
simulation in the execution of tests was suggested much earlier than recent bootstrap
and simulation-based techniques. For example, randomized tests were
proposed long ago as a way of obtaining tests with any given level from statistics
with discrete distributions (e.g., sign and rank tests); see Lehmann (1986). A second
interesting possibility is the technique of Monte Carlo tests originally suggested by
Dwass (1957) for implementing permutation tests and later extended by Barnard
(1963), Hope (1968) and Birnbaum (1974). This technique has the great attraction of
providing exact (randomized) tests based on any statistic whose finite-sample
distribution may be intractable but can be simulated. The validity of the tests so
obtained does not depend at all on the number of replications made (which can be
small). Only the power of the procedure is influenced by the number of replications,
but the power gains associated with lengthy simulations are typically rather small.
For further discussion of Monte Carlo tests, see Besag and Diggle (1977), Dufour
and Kiviet (1996, 1998), Edgington (1980), Edwards (1985), Edwards and Berry
(1987), Foutz (1980), Jöckel (1986), Kiviet and Dufour (1997), Marriott (1979) and
Ripley (1981).
An important limitation of the technique of Monte Carlo tests is the fact
that one needs to have a statistic whose distribution does not depend on
nuisance parameters. This obviously limits considerably its applicability. The main
objective of this paper is to extend the technique of Monte Carlo tests in order
to allow for the presence of nuisance parameters in the null distribution of the
test statistic.
In Section 2, we summarize and extend results on Monte Carlo (MC) tests when
the null distribution of a test statistic does not involve nuisance parameters. In
particular, we put them in a form that will make their extension to cases with
nuisance parameters easy and intuitive, and we generalize them by allowing for MC
tests based on exchangeable (possibly nonindependent) replications and statistics
with discrete distributions. These generalizations allow, in particular, for various
nonparametric tests (e.g., permutation tests) as well as test statistics where certain
parameters are themselves evaluated by simulation. We deal with possibly
discrete distributions (or mixtures of continuous and discrete distributions) by exploiting
Hájek’s (1969) method of randomized ranks for breaking ties in rank tests,
which is both simple to implement and allows one to easily deal with exchangeable [as opposed to independent and identically distributed (i.i.d.)] simulations.
On the problem of discrete distributions, it is also of interest that the method
proposed by Jöckel (1986) was derived under the assumption of i.i.d. MC
replications.
In Section 3, we study how the power of Monte Carlo tests is related to the
number of replications used and the sensitivity of the conclusions to the randomized
nature of the procedure. In particular, given the observed (randomized) p-value of
the Monte Carlo test, we see that the probability of an eventual reversal of the
conclusion of the procedure (rejection or acceptance at a given level, e.g. 5%) can
easily be computed.
In Section 4, we present the extension to statistics whose null distribution depends
on nuisance parameters. This procedure is based on considering a simulated p-value
function which depends on nuisance parameters (under the null hypothesis). We
show that maximizing the latter with respect to the nuisance parameters yields a test
with provably exact level, irrespective of the sample size and the number of replications
used. For this reason, we call the latter maximized Monte Carlo (MMC) tests. As
one would expect for a statistic whose distribution depends on unknown nuisance
parameters, the probability of type I error for an MMC test can be lower (but not
higher) than the level of the test, so the procedure can be conservative. We also
discuss how this maximization can be achieved in practice, e.g. through simulated
annealing techniques.
In the next two sections, we discuss simplified (asymptotically justified)
approximate versions of the proposed procedures, which involve the use of
consistent set or point estimates of model parameters. In Section 5, we suggest a
method [the consistent set estimate MMC method (CSEMMC)] which is applicable
when a consistent set estimator of the nuisance parameters [e.g., a random subset of
the parameter space whose probability of covering the nuisance parameters
converges to one as the sample size goes to infinity] is available. The approach
proposed involves maximizing the simulated p-value function over the consistent set
estimate, as opposed to the full nuisance parameter space. This procedure may thus
be computationally much less costly. Using a consistent set estimator (or confidence
set), as opposed to a point estimate, to deal with nuisance parameters is especially
useful because it allows one to obtain asymptotically valid tests even when the test
statistic does not converge in distribution or when the asymptotic distribution
depends on nuisance parameters possibly in a discontinuous way. Consequently,
there is no need to study the asymptotic distribution of the test statistic considered or
even to establish its existence.¹ This consistent set estimator MMC method
(CSEMMC) may be viewed as an asymptotic Monte Carlo extension of finite-sample
two-stage procedures proposed in Dufour (1990), Dufour and Kiviet (1996, 1998),
Campbell and Dufour (1997), and Dufour et al. (1998). These features may be
contrasted with those of bootstrap methods which can fail to provide asymptotically valid tests when the test statistic simulated has an asymptotic distribution
involving nuisance parameters, especially if the asymptotic distribution has
discontinuities with respect to the nuisance parameters (see Athreya, 1987; Basawa
et al., 1991; Sriram, 1994; Andrews, 2000; Benkwitz et al., 2000; Inoue and Kilian,
2002, 2003).
¹ A case where the distribution of a test statistic does not converge in distribution is the one where the
associated sequence of distribution functions has several accumulation points, allowing different
subsequences to have different limiting distributions.
In Section 6, we consider the simplest form of a Monte Carlo test with nuisance
parameters, i.e. the one where the consistent set estimate has been replaced by a
consistent point estimate. In other words, the distribution of the test statistic is
simulated after replacing the nuisance parameters by a consistent point estimate.
Such a procedure can be interpreted as a parametric bootstrap test based on the
percentile method (see Efron and Tibshirani, 1993, Chapter 16; Hall, 1992). The
term “parametric” may however be misleading here, because such MC tests can be
applied as well to nonparametric (distribution-free) test statistics. We give general
conditions under which a Monte Carlo test obtained after replacing an unknown
nuisance parameter by a consistent point estimate yields an asymptotically valid test in cases where the limit
distribution of the test statistic involves nuisance parameters. Following the
general spirit of Monte Carlo testing and in contrast with typical bootstrap
arguments, the proofs take the number of Monte Carlo simulations as fixed (possibly
very small, such as 19 to obtain a test with level 0.05). As in standard bootstrap
arguments, the conditions considered involve a smooth (continuous) dependence
of the asymptotic distribution upon the nuisance parameters. It is, however,
important to note that these conditions are more restrictive and more difficult to
check than those under which CSEMMC procedures would be applicable. We
conclude in Section 7.
2. Monte Carlo tests without nuisance parameters
Let us consider a family of probability spaces $\{(\mathcal{Z}, \mathcal{A}_{\mathcal{Z}}, P_\theta) : \theta \in \Omega\}$, where $\mathcal{Z}$ is a
sample space, $\mathcal{A}_{\mathcal{Z}}$ a $\sigma$-algebra of subsets of $\mathcal{Z}$, and $\Omega$ a parameter space (possibly
infinite dimensional). Let also $S \equiv S(\omega)$, $\omega \in \mathcal{Z}$, be a real-valued $\mathcal{A}_{\mathcal{Z}}$-measurable
function whose distribution is determined by $P_{\theta_0}$, i.e., $\theta_0$ is the “true” parameter
vector. We wish to test the hypothesis
$$H_0 : \theta_0 \in \Omega_0, \tag{2.1}$$
where $\Omega_0$ is a nonempty subset of $\Omega$, using a critical region of the form $\{S \geq c\}$.
Although, in general, the distribution of $S$ under $H_0$ depends on the unknown value
of $\theta_0$, we shall assume in this section that this distribution does not depend on
(unknown) nuisance parameters, so that we can write
$$P_\theta[S \leq x] = F(x) \quad \text{for all } \theta \in \Omega_0, \tag{2.2}$$
where $F(x)$ is the unique distribution that $S$ can have under $H_0$. In view of this
assumption, we shall, until further notice, compute probabilities under the
(unique) $P \equiv P_{\theta_0}$ when $\theta_0 \in \Omega_0$. The constant $c$ is chosen so that
$$P[S \geq c] = 1 - F(c) + P[S = c] \leq \alpha, \tag{2.3}$$
where $\alpha$ is the desired level of the test $(0 < \alpha < 1)$. Note that the critical region $S \geq c$ can
also be put in two useful alternative forms, which are equivalent to $S \geq c$ with probability
one (i.e., they can differ from the critical region $S \geq c$ only on a set of zero probability):
$$G(S) \leq G(c), \tag{2.4}$$
$$S \geq F^{-1}[(F(c) - P[S = c])^{+}] = F^{-1}[(1 - G(c))^{+}], \tag{2.5}$$
where
$$G(x) = P[S \geq x] = 1 - F(x) + P[S = x] \tag{2.6}$$
is the “tail-area” or “p-value” function associated with $F$, and $F^{-1}$ is the quantile
function of $F$, with the conventions
$$F^{-1}(q^{+}) = \lim_{\varepsilon \downarrow 0} F^{-1}(q + \varepsilon) = \inf\{F^{-1}(q_0) : q_0 > q\}, \quad 0 \leq q \leq 1,$$
$F^{-1}(1^{+}) = \infty$ and $F^{-1}(0^{+}) = F^{-1}(0)$. For any probability distribution function $F(x)$,
the quantile function $F^{-1}(q)$ is defined as follows:
$$F^{-1}(q) = \begin{cases} \inf\{x : F(x) \geq q\} & \text{if } 0 < q < 1, \\ \inf\{x : F(x) > 0\} & \text{if } q = 0, \\ \sup\{x : F(x) < 1\} & \text{if } q = 1; \end{cases} \tag{2.7}$$
see Reiss (1989, p. 13). In general, $F^{-1}(q)$ takes its values in the extended real
numbers $\bar{\mathbb{R}} = \mathbb{R} \cup \{-\infty, +\infty\}$ and, for coherence, we set $F(-\infty) = 0$ and $F(\infty) = 1$.
Using (2.7), it is easy to see that
$$F^{-1}[(F(c) - P[S = c])^{+}] = c,$$
when $0 < F(c) < 1$ and either $P(S = c) > 0$ or $F(x)$ is continuous and strictly
monotonic in an open neighborhood containing $c$. However, formulation (2.5)
remains valid in all cases.
2.1. Monte Carlo tests based on statistics with continuous distributions
Consider now a situation where the distribution of $S$ under $H_0$ may not be easy to
compute analytically but can be simulated. Let $S_1, \ldots, S_N$ be a sample of identically
distributed real random variables with the same distribution as $S$. Typically, it is
assumed that $S_1, \ldots, S_N$ are also independent. However, we will observe that the
exchangeability of $S_1, \ldots, S_N$ is sufficient for most of the results presented below.²
This can accommodate a wide array of situations, where the simulated statistics are not independent because they involve common (conditioning) variables,
such as: statistics obtained by permuting randomly a given set of observations (permutation tests), tests which are simulated conditionally on a common
set of initial values (e.g., in time series models), common regressors or a
common subsample [see the Anderson–Rubin-type split-sample test described in
Dufour and Jasiak (2001)], tests that depend on a common independent simulation
[e.g., tests based on a criterion evaluated by a preliminary simulation, such as
the simulated method of moments or indirect inference; see Gouriéroux and Monfort
(1996)], etc.
² The elements of a random vector $(S_1, S_2, \ldots, S_N)'$ are exchangeable (or $P$-exchangeable) if
$(S_{r_1}, S_{r_2}, \ldots, S_{r_N})' \sim (S_1, S_2, \ldots, S_N)'$ for any permutation $(r_1, r_2, \ldots, r_N)$ of the integers $(1, 2, \ldots, N)$ under
the relevant probability measure $P$.
The technique of MC tests provides a simple method allowing one to replace the
theoretical distribution $F(x)$ by its sample analogue based on $S_1, \ldots, S_N$:
$$\hat{F}_N(x) \equiv \hat{F}_N[x; S(N)] = \frac{1}{N} \sum_{i=1}^{N} \mathbf{1}(S_i \leq x), \tag{2.8}$$
where $S(N) = (S_1, \ldots, S_N)'$ and $\mathbf{1}(C)$ is the indicator function associated with the
condition $C$:
$$\mathbf{1}(C) = \begin{cases} 1 & \text{if condition } C \text{ holds}, \\ 0 & \text{otherwise}. \end{cases} \tag{2.9}$$
In the latter notation, $C$ may also be replaced by an event $A \in \mathcal{A}_{\mathcal{Z}}$, in which case
$\mathbf{1}(A) \equiv \mathbf{1}(\omega \in A)$ where $\omega$ is the element drawn from the sample space.
We also consider the corresponding sample tail area (or survival) function:
$$\hat{G}_N[x; S(N)] = \frac{1}{N} \sum_{i=1}^{N} \mathbf{1}(S_i \geq x). \tag{2.10}$$
The sample distribution function is related to the ranks $R_1, \ldots, R_N$ of the variables
$S_1, \ldots, S_N$ (when put in ascending order) by the expression:
$$R_j = N \hat{F}_N[S_j; S(N)] = \sum_{i=1}^{N} \mathbf{1}(S_i \leq S_j), \quad j = 1, \ldots, N. \tag{2.11}$$
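As a minimal sketch of (2.8), (2.10) and (2.11), assuming the simulated statistics are held in a plain Python list, the sample functions can be computed as follows. The function names `ecdf`, `tail_area` and `ranks` are illustrative and do not come from the paper.

```python
def ecdf(x, sims):
    """F_hat_N[x; S(N)] in (2.8): share of simulated statistics <= x."""
    return sum(1 for s in sims if s <= x) / len(sims)


def tail_area(x, sims):
    """G_hat_N[x; S(N)] in (2.10): share of simulated statistics >= x."""
    return sum(1 for s in sims if s >= x) / len(sims)


def ranks(sims):
    """R_j in (2.11): number of S_i with S_i <= S_j, for each j."""
    return [sum(1 for s_i in sims if s_i <= s_j) for s_j in sims]
```

When no simulated value equals $x$, the two functions sum to one; when some simulated values are tied with $x$, the sum exceeds one, which is the situation handled by the randomized ranks of Section 2.2.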
The central property we shall exploit here is the following: to obtain critical values
or compute p-values, the “theoretical” null distribution $F(x)$ can be replaced by its
simulation-based “estimate” $\hat{F}_N(x)$ in a way that will preserve the level of the test in
finite samples, irrespective of the number $N$ of replications used. For continuous
distributions, this property is expressed by Proposition 2.2 below, which is easily
proved by using the following simple lemma.
Lemma 2.1 (Distribution of ranks when ties have zero probability). Let $(y_1, \ldots, y_N)'$
be a $N \times 1$ vector of $P$-exchangeable real random variables such that
$$P(y_i = y_j) = 0 \quad \text{for } i \neq j,\ i, j = 1, \ldots, N, \tag{2.12}$$
and let $R_j = \sum_{i=1}^{N} \mathbf{1}(y_i \leq y_j)$ be the rank of $y_j$ when $y_1, \ldots, y_N$ are ranked in
nondecreasing order $(j = 1, \ldots, N)$. Then, for $j = 1, \ldots, N$,
$$P(R_j/N \leq x) = I[xN]/N \quad \text{for } 0 \leq x \leq 1, \tag{2.13}$$
$$P(R_j/N \geq x) = \begin{cases} 1 & \text{if } x \leq 0, \\ (I[(1 - x)N] + 1)/N & \text{if } 0 < x \leq 1, \\ 0 & \text{if } x > 1, \end{cases} \tag{2.14}$$
where $I[x]$ is the largest integer less than or equal to $x$.
Note that we use the symbol $I[x]$ rather than the common notation $[x]$ to represent
the integer part of a number $x$, because we heavily use brackets elsewhere in the
paper, so that the notation $[\,\cdot\,]$ could lead to confusion. It is clear that condition
(2.12) is satisfied whenever the variables $y_1, \ldots, y_N$ are independent with continuous
distribution functions (see Hájek, 1969, pp. 20–21), or when the vector $(y_1, \ldots, y_N)'$
has an absolutely continuous distribution (with respect to the Lebesgue measure
on $\mathbb{R}^N$).
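The right-hand sides of (2.13) and (2.14) are the distribution and tail-area functions of a rank that is uniformly distributed on $\{1, \ldots, N\}$, which is what exchangeability without ties delivers. The following sketch checks this by direct counting, using exact rational arithmetic to avoid rounding at the jump points. The function names are illustrative assumptions, not notation from the paper.

```python
from fractions import Fraction
from math import floor


def rank_cdf(x, n):
    """Right-hand side of (2.13): I[xN]/N for 0 <= x <= 1."""
    return Fraction(floor(x * n), n)


def rank_tail(x, n):
    """Right-hand side of (2.14)."""
    if x <= 0:
        return Fraction(1)
    if x > 1:
        return Fraction(0)
    return Fraction(floor((1 - x) * n) + 1, n)


def uniform_rank_cdf(x, n):
    """P(R/N <= x) when R is uniform on {1, ..., N}, by direct counting."""
    return Fraction(sum(1 for r in range(1, n + 1) if Fraction(r, n) <= x), n)


def uniform_rank_tail(x, n):
    """P(R/N >= x) when R is uniform on {1, ..., N}, by direct counting."""
    return Fraction(sum(1 for r in range(1, n + 1) if Fraction(r, n) >= x), n)
```

For example, with $N = 7$ and $x = 3/7$, both `rank_tail` and the direct count give $5/7$, since the ranks $3, \ldots, 7$ satisfy $R/N \geq x$.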
Proposition 2.2 (Validity of Monte Carlo tests when ties have zero probability). Let
$(S_0, S_1, \ldots, S_N)'$ be a $(N + 1) \times 1$ vector of exchangeable real random variables such
that
$$P(S_i = S_j) = 0 \quad \text{for } i \neq j,\ i, j = 0, 1, \ldots, N, \tag{2.15}$$
let $\hat{F}_N(x) \equiv \hat{F}_N[x; S(N)]$, $\hat{G}_N(x) = \hat{G}_N[x; S(N)]$ and $\hat{F}_N^{-1}(x)$ be defined as in
(2.7)–(2.10), and set
$$\hat{p}_N(x) = \frac{N \hat{G}_N(x) + 1}{N + 1}. \tag{2.16}$$
Then,
$$P[\hat{G}_N(S_0) \leq \alpha_1] = P[\hat{F}_N(S_0) \geq 1 - \alpha_1] = \frac{I[\alpha_1 N] + 1}{N + 1} \quad \text{for } 0 \leq \alpha_1 \leq 1, \tag{2.17}$$
$$P[S_0 \geq \hat{F}_N^{-1}(1 - \alpha_1)] = \frac{I[\alpha_1 N] + 1}{N + 1} \quad \text{for } 0 < \alpha_1 < 1, \tag{2.18}$$
and
$$P[\hat{p}_N(S_0) \leq \alpha] = \frac{I[\alpha(N + 1)]}{N + 1} \quad \text{for } 0 \leq \alpha \leq 1. \tag{2.19}$$
The latter proposition can be used as follows: choose $\alpha_1$ and $N$ so that
$$\alpha = \frac{I[\alpha_1 N] + 1}{N + 1} \tag{2.20}$$
is the desired significance level. Provided $N$ is reasonably large, $\alpha_1$ will be very close
to $\alpha$; in particular, if $\alpha(N + 1)$ is an integer, we can take $\alpha_1 = \alpha - ((1 - \alpha)/N)$, in
which case we see easily that the critical region $\hat{G}_N(S_0) \leq \alpha_1$ is equivalent to $\hat{G}_N(S_0) < \alpha$.
Further, for $0 < \alpha < 1$, the randomized critical region $S_0 \geq \hat{F}_N^{-1}(1 - \alpha_1)$ has the
same level $(\alpha)$ as the nonrandomized critical region $S_0 \geq F^{-1}(1 - \alpha)$, or equivalently
the critical regions $\hat{p}_N(S_0) \leq \alpha$ and $\hat{G}_N(S_0) \leq \alpha_1$ have the same level as the critical
region $G(S_0) \equiv 1 - F(S_0) \leq \alpha$.
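To illustrate (2.16) and (2.19), here is a minimal sketch of the Monte Carlo p-value and of the resulting test, together with a brute-force check of its level. The check relies on the fact that, under exchangeability without ties, each of the $N + 1$ possible positions of $S_0$ among $S_0, S_1, \ldots, S_N$ is equally likely. The function names and the use of exact fractions are choices made for this illustration only.

```python
from fractions import Fraction


def mc_pvalue(s0, sims):
    """p_hat_N(S0) = (N * G_hat_N(S0) + 1) / (N + 1), as in (2.16)."""
    n = len(sims)
    count_ge = sum(1 for s in sims if s >= s0)  # equals N * G_hat_N(S0)
    return Fraction(count_ge + 1, n + 1)


def mc_test(s0, sims, alpha=Fraction(1, 20)):
    """Reject H0 when the Monte Carlo p-value is at most alpha."""
    return mc_pvalue(s0, sims) <= alpha


def exact_level(n, alpha=Fraction(1, 20)):
    """Rejection probability under H0 without ties, by enumerating the
    N + 1 equally likely positions of S0 among S0, S1, ..., SN."""
    values = list(range(n + 1))  # any n + 1 distinct numbers will do
    rejections = 0
    for position in values:
        sims = [v for v in values if v != position]
        if mc_test(position, sims, alpha):
            rejections += 1
    return Fraction(rejections, n + 1)
```

With $N = 19$ and $\alpha = 0.05$, the test rejects only when $S_0$ exceeds all 19 simulated values, and `exact_level(19)` returns exactly $1/20$. With $N = 29$, $\alpha(N + 1) = 1.5$ is not an integer and the level drops to $I[1.5]/30 = 1/30$, in line with (2.19).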
2.2. Monte Carlo tests based on general statistics
Assumption (2.15), which states that ties have zero probability, plays an important
role in proving Proposition 2.2. However, it is possible to prove analogous results for
general sequences of exchangeable random variables (which may exhibit ties with
positive probability), provided we consider a properly randomized empirical
distribution function. For this purpose, we introduce randomized ranks which are
obtained like ordinary ranks except that ties are ‘‘broken’’ according to a uniform
distribution. More precisely, let us associate with each variable $S_j$, $j = 1, \ldots, N$, a
random variable $U_j$, $j = 1, \ldots, N$, such that
$$U_1, \ldots, U_N \overset{\text{i.i.d.}}{\sim} U(0, 1), \tag{2.21}$$
$U(N) = (U_1, \ldots, U_N)'$ is independent of $S(N) = (S_1, \ldots, S_N)'$, where $U(0, 1)$ is the
uniform distribution on the interval $(0, 1)$. Then, we consider the pairs
$$Z_j = (S_j, U_j), \quad j = 1, \ldots, N, \tag{2.22}$$
which are ordered according to the lexicographic order:
$$(S_i, U_i) \leq (S_j, U_j) \iff \{S_i < S_j \text{ or } (S_i = S_j \text{ and } U_i \leq U_j)\}. \tag{2.23}$$
Using the indicator
$$\mathbf{1}[(x_1, u_1) \leq (x_2, u_2)] = \mathbf{1}(x_1 < x_2) + \delta(x_1 - x_2)\,\mathbf{1}(u_1 \leq u_2), \tag{2.24}$$
where $\delta(x) = 1$ if $x = 0$ and $\delta(x) = 0$ otherwise,
$S_1, \ldots, S_N$ are then ordered like the pairs $Z_1, \ldots, Z_N$ according to (2.23), which yields
“randomized ranks”:
$$\tilde{R}_j[S(N), U(N)] = \sum_{i=1}^{N} \mathbf{1}[(S_i, U_i) \leq (S_j, U_j)], \tag{2.25}$$
$j = 1, \ldots, N$. By the continuity of the uniform distribution, the ranks $\tilde{R}_j =
\tilde{R}_j[S(N), U(N)]$, $j = 1, \ldots, N$, are all distinct with probability 1, so that the
randomized rank vector $(\tilde{R}_1, \tilde{R}_2, \ldots, \tilde{R}_N)'$ is a permutation of $(1, 2, \ldots, N)'$ with
probability 1. Furthermore when $S_j \neq S_i$ for all $j \neq i$, we have $\tilde{R}_j = R_j$: if (2.15) holds,
then $\tilde{R}_j = R_j$, $j = 1, \ldots, N$, with probability 1 [where $R_j$ is defined in (2.11)]. We can
now state the following extension of Lemma 2.1.
Lemma 2.3 (Distribution of randomized ranks). Let $y(N) = (y_1, \ldots, y_N)'$ be a $N \times 1$
vector of exchangeable real random variables and let $\tilde{R}_j = \tilde{R}_j[y(N), U(N)]$ be defined as
in (2.25) where $U(N) = (U_1, \ldots, U_N)'$ is a vector of i.i.d. $U(0, 1)$ variables independent
of $y(N)$. Then, for $j = 1, \ldots, N$,
$$P(\tilde{R}_j/N \leq x) = I[xN]/N \quad \text{for } 0 \leq x \leq 1, \tag{2.26}$$
$$P(\tilde{R}_j/N \geq x) = \begin{cases} 1 & \text{if } x \leq 0, \\ (I[(1 - x)N] + 1)/N & \text{if } 0 < x \leq 1, \\ 0 & \text{if } x > 1. \end{cases} \tag{2.27}$$
To the above randomized rankings, it is natural to associate the following
randomized empirical (pseudo-)distribution function:
$$\tilde{F}_N[x; U_0, S(N), U(N)] = \frac{1}{N} \sum_{i=1}^{N} \mathbf{1}[(S_i, U_i) \leq (x, U_0)] = 1 - \hat{G}_N[x; S(N)] + T_N[x; U_0, S(N), U(N)], \tag{2.28}$$
where $U_0$ is a $U(0, 1)$ random variable independent of $S(N)$ and $U(N)$,
$$T_N[x; U_0, S(N), U(N)] = \frac{1}{N} \sum_{i=1}^{N} \delta(S_i - x)\,\mathbf{1}(U_i \leq U_0) = \frac{1}{N} \sum_{i \in E_N(x)} \mathbf{1}(U_i \leq U_0) \tag{2.29}$$
and $E_N(x) = \{i : S_i = x,\ 1 \leq i \leq N\}$. The function $\tilde{F}_N[x; \cdot]$ retains all the properties of
a probability distribution function, except for the fact that it may not be right
continuous at some of its jump points (where it may take values between its right and
left limits). We can also define the corresponding tail-area function:
$$\tilde{G}_N[x; U_0, S(N), U(N)] = \frac{1}{N} \sum_{i=1}^{N} \mathbf{1}[(S_i, U_i) \geq (x, U_0)] = 1 - \hat{F}_N[x; S(N)] + \bar{T}_N[x; U_0, S(N), U(N)], \tag{2.30}$$
$$\bar{T}_N[x; U_0, S(N), U(N)] = \frac{1}{N} \sum_{i=1}^{N} \delta(S_i - x)\,\mathbf{1}(U_i \geq U_0) = \frac{1}{N} \sum_{i \in E_N(x)} \mathbf{1}(U_i \geq U_0). \tag{2.31}$$
From (2.28)–(2.31), we see that the following inequalities must hold:
$$1 - \hat{G}_N[x; S(N)] \leq \tilde{F}_N[x; U_0, S(N), U(N)] \leq \hat{F}_N[x; S(N)], \tag{2.32}$$
$$1 - \hat{F}_N[x; S(N)] \leq \tilde{G}_N[x; U_0, S(N), U(N)] \leq \hat{G}_N[x; S(N)]. \tag{2.33}$$
When no element of $S(N)$ is equal to $x$ [i.e., when $E_N(x)$ is empty], we have:
$$\tilde{G}_N[x; U_0, S(N), U(N)] = \hat{G}_N[x; S(N)] = 1 - \hat{F}_N[x; S(N)] = 1 - \tilde{F}_N[x; U_0, S(N), U(N)]. \tag{2.34}$$
Using the above observations, it is then easy to establish the following proposition.
Proposition 2.4 (Validity of Monte Carlo tests for general statistics). Let
$(S_0, S_1, \ldots, S_N)'$ be a $(N + 1) \times 1$ vector of exchangeable real random variables, let
$(U_0, U_1, \ldots, U_N)'$ be a $(N + 1) \times 1$ vector of i.i.d. $U(0, 1)$ random variables
independent of $(S_0, S_1, \ldots, S_N)'$, let $\hat{F}_N(x) \equiv \hat{F}_N[x; S(N)]$, $\hat{G}_N(x) \equiv \hat{G}_N[x; S(N)]$,
$\tilde{F}_N(x) \equiv \tilde{F}_N[x; U_0, S(N), U(N)]$ and $\tilde{G}_N(x) \equiv \tilde{G}_N[x; U_0, S(N), U(N)]$ be defined as in
(2.8)–(2.10) and (2.28)–(2.30), with $S(N) = (S_1, \ldots, S_N)'$ and $U(N) = (U_1, \ldots, U_N)'$,
and let
$$\tilde{p}_N(x) = \frac{N \tilde{G}_N(x) + 1}{N + 1}. \tag{2.35}$$
Then for $0 \leq \alpha_1 \leq 1$,
$$P[\hat{G}_N(S_0) \leq \alpha_1] \leq P[\tilde{G}_N(S_0) \leq \alpha_1] = P[\tilde{F}_N(S_0) \geq 1 - \alpha_1] = \frac{I[\alpha_1 N] + 1}{N + 1} \leq P[\hat{F}_N(S_0) \geq 1 - \alpha_1] \tag{2.36}$$
with $P[\hat{F}_N(S_0) \geq 1 - \alpha_1] = P[S_0 \geq \hat{F}_N^{-1}(1 - \alpha_1)]$ for $0 < \alpha_1 < 1$, and defining $\hat{p}_N(x)$ as in
(2.16),
$$P[\hat{p}_N(S_0) \leq \alpha] \leq P[\tilde{p}_N(S_0) \leq \alpha] = \frac{I[\alpha(N + 1)]}{N + 1} \quad \text{for } 0 \leq \alpha \leq 1. \tag{2.37}$$
In view of the fact that $\tilde{G}_N(S_0) = \hat{G}_N(S_0)$ with probability one when the zero
probability tie condition [i.e. (2.15)] holds, it is straightforward to see that
Proposition 2.2 is entailed by Proposition 2.4.
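The randomized p-value (2.35) can be sketched as follows for a statistic that may exhibit ties. This is an illustrative implementation under the lexicographic order (2.23); the function names and the use of Python's `random.Random` generator for the uniform draws are assumptions of the example.

```python
import random
from fractions import Fraction


def randomized_tail_count(s0, u0, sims, us):
    """N * G_tilde_N in (2.30): number of pairs (S_i, U_i) that are
    >= (S0, U0) in the lexicographic order (2.23)."""
    return sum(1 for s, u in zip(sims, us) if s > s0 or (s == s0 and u >= u0))


def randomized_mc_pvalue(s0, sims, rng):
    """p_tilde_N(S0) in (2.35), drawing U_0, U_1, ..., U_N from rng."""
    u0 = rng.random()
    us = [rng.random() for _ in sims]
    count = randomized_tail_count(s0, u0, sims, us)
    return Fraction(count + 1, len(sims) + 1)
```

Simulated values strictly above $S_0$ always count, values tied with $S_0$ count only when their uniform draw is at least $U_0$, so the randomized count lies between the number of $S_i > S_0$ and the number of $S_i \geq S_0$, as in (2.33). When no simulated value is tied with $S_0$, the result coincides with the nonrandomized p-value (2.16).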
3. Power functions and concordance probabilities
The procedures described above are randomized in the sense that the results of the
tests depend on auxiliary simulations. This raises the issue of the sensitivity of the
results to these simulations. To study this more closely, let us suppose that
$S_0, S_1, \ldots, S_N$ are independent with
$$P(S_i \leq x) = F(x), \quad P(S_i \geq x) = G(x), \quad P(S_i = x) = g(x), \quad i = 1, \ldots, N,$$
$$P(S_0 \leq x) = H(x), \quad P(S_0 \geq x) = K(x). \tag{3.1}$$
Then, it is easy to see that, conditional on $U_0 = u$, $N \tilde{G}_N(x)$ follows a binomial distribution $Bi(N, p)$ with
number of trials $N$ and probability of “success” $p = \bar{G}(x, u)$, where
$$\bar{G}(x, u) = P(\mathbf{1}[(S_i, U_i) \geq (x, u)] = 1) = P(S_i > x) + P(S_i = x) P(U_i \geq u) = 1 - F(x) + g(x)(1 - u), \tag{3.2}$$
and we can compute the conditional probability given $(S_0, U_0)$ of the critical region
$\tilde{G}_N(S_0) \leq \alpha_1$:
$$P[\tilde{G}_N(S_0) \leq \alpha_1 \mid (S_0, U_0)] = P\left[\sum_{i=1}^{N} \mathbf{1}[(S_i, U_i) \geq (S_0, U_0)] \leq I[\alpha_1 N] \,\Big|\, (S_0, U_0)\right] = \sum_{k=0}^{I[\alpha_1 N]} \binom{N}{k} \bar{G}(S_0, U_0)^k [1 - \bar{G}(S_0, U_0)]^{N-k}, \tag{3.3}$$
where $\binom{N}{k} = N!/[k!(N - k)!]$. Similarly, we can also write
$$P[\hat{G}_N(S_0) \leq \alpha_1 \mid S_0] = \sum_{k=0}^{I[\alpha_1 N]} \binom{N}{k} G(S_0)^k (1 - G(S_0))^{N-k}. \tag{3.4}$$
When $F(x)$ is continuous, so that $g(x) = 0$, we have
$$P[\tilde{G}_N(S_0) \leq \alpha_1 \mid (S_0, U_0)] = P[\hat{G}_N(S_0) \leq \alpha_1 \mid S_0] = \sum_{k=0}^{I[\alpha_1 N]} \binom{N}{k} [1 - F(S_0)]^k F(S_0)^{N-k}. \tag{3.5}$$
Using (3.3), we can find a closed-form expression for the power of the randomized
test $\tilde{G}_N(S_0) \leq \alpha_1$ for any null hypothesis which entails that $S_0$ has the distribution $F(\cdot)$
against an alternative under which its distribution is $H(\cdot)$:
$$P[\tilde{G}_N(S_0) \leq \alpha_1] = \mathop{E}_{(S_0, U_0)} \{P[\tilde{G}_N(S_0) \leq \alpha_1 \mid (S_0, U_0)]\} = \sum_{k=0}^{I[\alpha_1 N]} \binom{N}{k} \int\!\!\int_0^1 \bar{G}(x, u)^k [1 - \bar{G}(x, u)]^{N-k} \, du \, dH(x). \tag{3.6}$$
Furthermore, when $F(x)$ is continuous everywhere, the latter expression simplifies
and we can write:
$$P[\tilde{G}_N(S_0) \leq \alpha_1] = P[\hat{G}_N(S_0) \leq \alpha_1] = \sum_{k=0}^{I[\alpha_1 N]} \binom{N}{k} \int [1 - F(x)]^k F(x)^{N-k} \, dH(x). \tag{3.7}$$
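The binomial sum in (3.4)–(3.5) is straightforward to evaluate numerically. The sketch below computes it for a given value of $G(S_0)$ and then averages it over $G(S_0) \sim U(0, 1)$, which is the special case $H = F$ of (3.7) with continuous $F$; the average should reproduce the level $(I[\alpha_1 N] + 1)/(N + 1)$ of (2.17). The function names, the argument `k_max` (standing for $I[\alpha_1 N]$) and the midpoint integration rule are choices made for this illustration.

```python
from math import comb


def cond_reject_prob(tail_prob, n, k_max):
    """Right-hand side of (3.4)/(3.5): P[Bi(N, G(S0)) <= I[alpha_1 N]].

    tail_prob is G(S0), equal to 1 - F(S0) for continuous F;
    k_max stands for I[alpha_1 N].
    """
    return sum(
        comb(n, k) * tail_prob ** k * (1.0 - tail_prob) ** (n - k)
        for k in range(k_max + 1)
    )


def rejection_prob_under_null(n, k_max, grid_size=10_000):
    """Midpoint-rule average of cond_reject_prob over G(S0) ~ U(0, 1),
    i.e. the case H = F in (3.7) when F is continuous."""
    step = 1.0 / grid_size
    return step * sum(
        cond_reject_prob((j + 0.5) * step, n, k_max) for j in range(grid_size)
    )
```

With $N = 19$ and $I[\alpha_1 N] = 0$, or with $N = 99$ and $I[\alpha_1 N] = 4$, the average is numerically indistinguishable from 0.05. Replacing the uniform weighting by the distribution of $G(S_0)$ under an alternative $H$ would give the power in (3.7).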
The above formulae will be useful in establishing the validity of simplified
asymptotic Monte Carlo tests in the presence of nuisance parameters. They also
allow one to compute the probability that the result of the randomized test
$\tilde{G}_N(S_0) \leq \alpha_1$ differs from that of the corresponding nonrandomized test $G(S_0) \leq \alpha$,
where $\alpha \equiv (I[N \alpha_1] + 1)/(N + 1)$. For example, let $\hat{\alpha}_0 = G(S_0)$ be the “p-value” one
would obtain if the function $G(x)$ were easy to compute (the p-value of the
“fundamental test”). The latter is generally different from the p-value $\tilde{p}_N(S_0)$ or
$\hat{p}_N(S_0)$ obtained from a Monte Carlo test based on $S_1, \ldots, S_N$. An interesting
question here is the probability that the Monte Carlo test yields a conclusion
different from the one based on $\hat{\alpha}_0$. To study this, we shall consider the test which
rejects the null hypothesis $H_0$ when $\hat{p}_N(S_0) \leq \alpha$ under assumptions (3.1).
If $\hat{\alpha}_0 > \alpha$ (in which case $H_0$ is not rejected at level $\alpha$ by the fundamental test), the
probability that $H_0$ be rejected at level $\alpha_0$ is
$$\begin{aligned} P[\hat{p}_N(S_0) \leq \alpha_0 \mid S_0] &= P[N \hat{G}_N(S_0) \leq (N + 1)\alpha_0 - 1 \mid S_0] \\ &= P[Bi(N, \hat{\alpha}_0) \leq (N + 1)\alpha_0 - 1 \mid S_0] \\ &\leq P[Bi(N, \alpha) \leq (N + 1)\alpha_0 - 1] \\ &= P\left[\frac{Bi(N, \alpha) - N\alpha}{(N\alpha(1 - \alpha))^{1/2}} \leq \frac{N(\alpha_0 - \alpha) - (1 - \alpha_0)}{(N\alpha(1 - \alpha))^{1/2}}\right], \end{aligned} \tag{3.8}$$
where the inequality follows on observing that $\hat{\alpha}_0 > \alpha$ and $Bi(N, p)$ denotes a binomial
random variable with number of trials $N$ and probability of success $p$. From (3.8), we
can bound the probability that a Monte Carlo p-value as low as $\alpha_0$ be obtained when
the fundamental test is not significant at level $\alpha$. In particular, for $\alpha_0 < \alpha$, this
probability decreases as the difference $|\alpha_0 - \alpha|$ and $N$ get larger. It is also interesting
to observe that
$$\lim_{N \to \infty} P[\hat{p}_N(S_0) \leq \alpha_0 \mid S_0] = \lim_{N \to \infty} P\left[\frac{Bi(N, \alpha) - N\alpha}{(N\alpha(1 - \alpha))^{1/2}} \leq \frac{N(\alpha_0 - \alpha) - (1 - \alpha_0)}{(N\alpha(1 - \alpha))^{1/2}}\right] = 0 \tag{3.9}$$
for $\alpha_0 < \alpha$, so that the probability of a discrepancy between the fundamental test and
the Monte Carlo test goes to zero as $N$ increases.
Similarly, for $\hat{\alpha}_0 < \alpha$ (in which case $H_0$ is rejected at level $\alpha$ by the fundamental
test), the probability that $H_0$ not be rejected at level $\alpha_0$ is
$$\begin{aligned} P[\hat{p}_N(S_0) > \alpha_0 \mid S_0] &= P[Bi(N, \hat{\alpha}_0) > (N + 1)\alpha_0 - 1 \mid S_0] \\ &\leq P\left[\frac{Bi(N, \alpha) - N\alpha}{(N\alpha(1 - \alpha))^{1/2}} > \frac{N(\alpha_0 - \alpha) - (1 - \alpha_0)}{(N\alpha(1 - \alpha))^{1/2}}\right], \end{aligned} \tag{3.10}$$
hence
$$\lim_{N \to \infty} P[\hat{p}_N(S_0) \geq \alpha_0 \mid S_0] = 0 \quad \text{for } \alpha_0 > \alpha. \tag{3.11}$$
Eq. (3.10) gives an upper bound on the probability of observing a p-value as high as
$\alpha_0$ when the fundamental test is significant at a level lower than $\alpha$. Again, the
probability of a discrepancy between the fundamental test and the Monte Carlo test
goes to zero as $N$ increases. The only case where the probability of a discrepancy
between the two tests does not go to zero as $N \to \infty$ is when $\hat{\alpha}_0 = \alpha$ (an event with
probability zero for statistics with continuous distributions).
The probabilities (3.8) and (3.10) may be computed a posteriori to assess the
probability of obtaining p-values as low (or as high) as $\hat{p}_N(S_0)$ when the result of the
corresponding fundamental test is actually not significant (or significant) at level $\alpha$.
Note also that similar (although somewhat different) calculations may be used to
determine the number $N$ of simulations that will ensure a given probability of
concordance between the fundamental and the Monte Carlo test (see Marriott,
1979).
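As an illustration of how such probabilities might be computed, the sketch below evaluates the exact binomial probabilities that appear on the first lines of (3.8) and (3.10), for a hypothetical value of $\hat{\alpha}_0$. The function names are assumptions of this example, and `alpha0` is passed as an exact fraction so that the integer part of $(N + 1)\alpha_0 - 1$ is not affected by rounding.

```python
from fractions import Fraction
from math import comb, floor


def binom_cdf(k_max, n, p):
    """P[Bi(n, p) <= k_max], with the convention that it is 0 when k_max < 0."""
    if k_max < 0:
        return 0.0
    k_max = min(k_max, n)
    return sum(comb(n, k) * p ** k * (1.0 - p) ** (n - k) for k in range(k_max + 1))


def prob_mc_rejects(a_hat, n, alpha0):
    """P[p_hat_N(S0) <= alpha0 | S0] = P[Bi(N, a_hat) <= (N + 1) alpha0 - 1],
    the exact probability in (3.8); alpha0 should be a Fraction."""
    k_max = floor((n + 1) * alpha0 - 1)
    return binom_cdf(k_max, n, a_hat)


def prob_mc_does_not_reject(a_hat, n, alpha0):
    """P[p_hat_N(S0) > alpha0 | S0], the exact probability in (3.10)."""
    return 1.0 - prob_mc_rejects(a_hat, n, alpha0)
```

For instance, if the fundamental p-value is $\hat{\alpha}_0 = 0.10$ and $\alpha_0 = 0.05$, then with $N = 19$ the Monte Carlo test rejects with probability $0.9^{19} \approx 0.135$; this probability shrinks when $N$ is raised to 99 and again to 999, consistent with (3.9).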
4. Monte Carlo tests with nuisance parameters
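We will now study the case where the distribution of the test statistic $S$ depends on
nuisance parameters. We consider a family of probability spaces $\{(\mathcal{Z}, \mathcal{A}_{\mathcal{Z}}, P_\theta) : \theta \in \Omega\}$
and suppose that $S$ is a real-valued $\mathcal{A}_{\mathcal{Z}}$-measurable function whose distribution
is determined by $P_{\theta_0}$ (i.e., $\theta_0$ is the “true” parameter vector). We wish to test the
hypothesis
$$H_0 : \theta_0 \in \Omega_0, \tag{4.1}$$
where $\Omega_0$ is a nonempty subset of $\Omega$. Again we take a critical region of the form $S \geq c$,
where $c$ is a constant which does not depend on $\theta$. The critical region $S \geq c$ has level $\alpha$
if and only if
$$P_\theta[S \geq c] \leq \alpha, \quad \forall \theta \in \Omega_0, \tag{4.2}$$
or equivalently,
$$\sup_{\theta \in \Omega_0} P_\theta[S \geq c] \leq \alpha. \tag{4.3}$$
Furthermore, $S \geq c$ has size $\alpha$ when
$$\sup_{\theta \in \Omega_0} P_\theta[S \geq c] = \alpha. \tag{4.4}$$
If we define the distribution and p-value functions,
$$F[x \mid \theta] = P_\theta[S \leq x], \quad x \in \bar{\mathbb{R}}, \tag{4.5}$$
$$G[x \mid \theta] = P_\theta[S \geq x], \quad x \in \bar{\mathbb{R}}, \tag{4.6}$$
where $\theta \in \Omega$, it is again easy to see that the critical regions
$$\sup_{\theta \in \Omega_0} G[S \mid \theta] \leq \alpha, \tag{4.7}$$
where $\alpha \equiv \sup_{\theta \in \Omega_0} G[c \mid \theta]$, and
$$S \geq \sup_{\theta \in \Omega_0} F^{-1}[(1 - G[c \mid \theta])^{+} \mid \theta] \equiv \bar{c} \tag{4.8}$$
are equivalent to $S \geq c$ in the sense that $c \leq \bar{c}$, with equality holding when $F[x \mid \theta]$ is
discontinuous at $x = c$ for all $\theta \in \Omega_0$ or both $F[x \mid \theta]$ and $F^{-1}[q \mid \theta]$ are continuous at
$x = c$ and $q = F(c)$ respectively for all $\theta \in \Omega_0$, and
$$\sup_{\theta \in \Omega_0} P_\theta[S \geq \bar{c}] \leq \sup_{\theta \in \Omega_0} P_\theta[S \geq c] = \sup_{\theta \in \Omega_0} P_\theta[\sup\{G[S \mid \theta_0] : \theta_0 \in \Omega_0\} \leq \alpha]. \tag{4.9}$$
We shall now extend Proposition 2.2 in order to allow for the presence of nuisance
parameters. For that purpose, we consider a real random variable $S_0$ and random
vectors of the form
$$S(N, \theta) = (S_1(\theta), \ldots, S_N(\theta))', \quad \theta \in \Omega, \tag{4.10}$$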
all defined on a common probability space ðZ; AZ ; PÞ, such that
the variables S 0 ; S1 ðy0 Þ; . . . ; S N ðy0 Þ are exchangeable for some y0 2 O,
each one with distribution function F ½x j y0 .
ð4:11Þ
Typically, S 0 will refer to a test statistic computed from observed data when the true
parameter vector is y0 (i.e., y ¼ y0 Þ, while S1 ðyÞ; . . . ; SN ðyÞ will refer to independent
and identically distributed (i.i.d.) replications of the test statistic obtained
independently (e.g., by simulation) under the assumption that the parameter vector
is y (i.e., P½Si ðyÞpx ¼ F ½x j yÞ.
Note that the basic probability measure P can be interpreted as Py0 ; while the
dependence of the distribution of the simulated statistics upon other values of the
parameter y is expressed by making S i ðyÞ a function of y (as well as o 2 ZÞ. In
parametric models, the statistic S will usually be simulated by first generating an
‘‘observation’’ vector y according to an equation of the form
y ¼ gðy; uÞ,
(4.12)
where u has a known distribution (which can be simulated) and then computing
SðyÞ S½gðy; uÞ gS ðy; uÞ.
(4.13)
In such cases, the above assumptions can be interpreted as follows: $S_0 = S[y(\theta_0, u_0)]$
and $S_i(\theta) = S[y(\theta, u_i)]$, $i = 1, \ldots, N$, where the random vectors $u_0, u_1, \ldots, u_N$ are
i.i.d. (or exchangeable). Note that $\theta$ may include the parameters of a disturbance
distribution in a model, such as covariance coefficients (or even its complete
distribution function), so that the assumption that $u$ has a known distribution is not
restrictive. Assumptions on the structure of the parameter space $\Omega$ (e.g., whether it is
finite-dimensional) will however entail real restrictions on the data-generating
process. More generally, it is always possible to consider that the variables
$S_0, S_1(\theta), \ldots, S_N(\theta)$ are $P$-measurable by considering their representation in terms of
uniform random variables (see Shorack and Wellner, 1986, Chapter 1, Theorem 1):
$S_0 = F^{-1}[V_0 \mid \theta_0]$ and $S_i(\theta) = F^{-1}[V_i \mid \theta]$, $i = 1, \ldots, N$, where $V_0, V_1, \ldots, V_N$ are
$P$-exchangeable with uniform marginal distributions [$V_i \sim U(0, 1)$, $i = 0, 1, \ldots, N$].
A more general setup that allows for nonparametric models would consist in
assuming that the null distribution of the test statistic depends on $y$ only through
some transformation $T(y)$ of the observation vector $y$, which in turn only depends
upon $\theta$ through some transformation $\theta_* = h(\theta)$, e.g. a subvector of $\theta$:
\[ T(y) = g[h(\theta), u] = g[\theta_*, u], \quad \theta_* \in \Omega_*, \tag{4.14} \]
where $\Omega_* = h(\Omega)$, hence
\[ S(\theta) = S(T(y)) = S(g[h(\theta), u]) \equiv g_S[h(\theta), u] = g_S(\theta_*, u). \tag{4.15} \]
The setup (4.14)–(4.15) allows for reductions of the nuisance parameter space (e.g.,
through invariance). In particular, nonparametric models may be considered by
taking appropriate distribution-free statistics (e.g., test statistics based on signs,
ranks, permutations, etc.). What matters for our purpose is the possibility of
simulating the test statistic, not necessarily the data themselves.
Let also
\[ \hat F_N[x \mid \theta] \equiv \hat F_N[x; S(N, \theta)], \quad \hat G_N[x \mid \theta] \equiv \hat G_N[x; S(N, \theta)], \tag{4.16} \]
\[ \hat p_N[x \mid \theta] = \frac{N \hat G_N[x \mid \theta] + 1}{N + 1} \tag{4.17} \]
be defined as in (2.8)–(2.10), and suppose the variables
$\sup\{\hat G_N[S_0 \mid \theta] : \theta \in \Omega_0\}$ and $\inf\{\hat F_N[S_0 \mid \theta] : \theta \in \Omega_0\}$ are $\mathcal{A}_{\mathcal{Z}}$-measurable,
where $\Omega_0$ is a nonempty subset of $\Omega$. (4.18)
For general discussions of measurability of conditions for extrema of random
functions, the reader may consult Debreu (1967), Brown and Purves (1973) and
Stinchcombe and White (1992).3 We then get the following proposition.
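Proposition 4.1 (Validity of MMC tests when ties have zero probability). Under the
assumptions and notations (4.10), (4.11) and (4.16)–(4.18), set $S_0(\theta_0) = S_0$ and suppose
that
\[ P[S_i(\theta_0) = S_j(\theta_0)] = 0 \quad \text{for } i \ne j, \ i, j = 0, 1, \ldots, N. \tag{4.19} \]
If $\theta_0 \in \Omega_0$, then for $0 \le \alpha_1 \le 1$,
\[ P[\sup\{\hat G_N[S_0 \mid \theta] : \theta \in \Omega_0\} \le \alpha_1] \le P[\inf\{\hat F_N[S_0 \mid \theta] : \theta \in \Omega_0\} \ge 1 - \alpha_1] \le \frac{I[\alpha_1 N] + 1}{N + 1}, \tag{4.20} \]
where $P[\inf\{\hat F_N[S_0 \mid \theta] : \theta \in \Omega_0\} \ge 1 - \alpha_1] = P[S_0 \ge \sup\{\hat F_N^{-1}[1 - \alpha_1 \mid \theta] : \theta \in \Omega_0\}]$ for
$0 < \alpha_1 < 1$, and
\[ P[\sup\{\hat p_N[S_0 \mid \theta] : \theta \in \Omega_0\} \le \alpha] \le \frac{I[\alpha(N + 1)]}{N + 1} \quad \text{for } 0 \le \alpha \le 1. \tag{4.21} \]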
Following the latter proposition, if we choose $\alpha_1$ and $N$ so that (2.20) holds, the
critical region $\sup\{\hat G_N[S_0 \mid \theta] : \theta \in \Omega_0\} \le \alpha_1$ has level $\alpha$ irrespective of the presence of
nuisance parameters in the distribution of the test statistic $S$ under the null
hypothesis $H_0 : \theta_0 \in \Omega_0$. The same also holds if we use the (almost) equivalent
critical regions $\inf\{\hat F_N[S_0 \mid \theta] : \theta \in \Omega_0\} \ge 1 - \alpha_1$ or $S_0 \ge \sup\{\hat F_N^{-1}[1 - \alpha_1 \mid \theta] : \theta \in \Omega_0\}$.
We shall call such tests MMC tests.
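As a quick numerical check of the bound in (4.21), taking $I[x]$ to be the integer part of $x$ (the convention used in the proof of Lemma 2.1), consider two choices of $N$ at $\alpha = 0.05$:
\[ N = 99: \ \frac{I[0.05 \times 100]}{100} = \frac{5}{100} = 0.05, \qquad N = 50: \ \frac{I[0.05 \times 51]}{51} = \frac{I[2.55]}{51} = \frac{2}{51} \approx 0.0392. \]
With $N = 99$ the bound coincides with the nominal level, whereas with $N = 50$ it lies strictly below $0.05$ because $\alpha(N + 1)$ is not an integer.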
3
If measurability is an issue, notions of ‘‘near-measurability’’ can be substituted (see Stinchcombe
and White, 1992). From the viewpoint of getting upper bounds on probabilities, the probability operator
can also be replaced by the associated outer measure which is always well-defined (see Dufour, 1989,
Footnote 5).
To be more explicit, if $S(\theta)$ is generated according to expressions of the form
(4.14)–(4.15), we have
\[ \hat G_N[S_0 \mid \theta] = \frac{1}{N} \sum_{i=1}^{N} 1[S_i(\theta) \ge S_0] = \frac{1}{N} \sum_{i=1}^{N} 1[S(g[h(\theta), u_i]) \ge S_0] = \frac{1}{N} \sum_{i=1}^{N} 1[g_S(\theta_*, u_i) \ge S_0]. \tag{4.22} \]
The function $\hat G_N[S_0 \mid \theta]$ (or $\hat p_N[S_0 \mid \theta]$) is then maximized with respect to $\theta \in \Omega_0$ [or
equivalently, with respect to $\theta_* \in \Omega_{0*} = h(\Omega_0)$], keeping the observed statistic $S_0$ and
the simulated disturbance vectors $u_1, \ldots, u_N$ fixed. The function $\hat G_N[S_0 \mid \theta]$ is a step-type
function which typically has zero derivatives almost everywhere, except on
isolated points (or manifolds) where it is not differentiable. Further, the supremum
of $\hat G_N[S_0 \mid \theta]$ is typically not unique, in the sense that several values of $\theta$ will yield the
required supremum. So it cannot be maximized with usual derivative-based
algorithms. However, the required maximizations can be performed by using
appropriate optimization algorithms that do not require differentiability, such as
simulated annealing. For further discussion of such algorithms, the reader may
consult Goffe et al. (1994).
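The maximization step can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm of Goffe et al. (1994): the annealing variant is deliberately simplified, and `stat_fn`, `bounds`, `start`, the cooling schedule and the toy statistic in the example are all hypothetical choices made here.

```python
import math
import random


def mc_p_value(s0, sims):
    """p_hat_N = (N * G_hat_N + 1) / (N + 1), with N * G_hat_N = #{sims >= s0}."""
    count = sum(1 for s in sims if s >= s0)
    return (count + 1) / (len(sims) + 1)


def mmc_p_value(s0, stat_fn, disturbances, bounds, start,
                n_iter=1000, temp0=0.1, step=0.1, seed=0):
    """Approximate the sup over a box of theta -> p_hat_N[s0 | theta].

    stat_fn(theta, u) plays the role of g_S(theta, u). The disturbances are
    held fixed, so the objective is a deterministic step function of theta.
    The search is a basic simulated annealing (an assumed, simplified variant).
    """
    rng = random.Random(seed)

    def p_at(theta):
        return mc_p_value(s0, [stat_fn(theta, u) for u in disturbances])

    current, p_cur = list(start), p_at(start)
    best, p_best = current, p_cur
    for k in range(n_iter):
        if p_best >= 1.0:  # a p-value cannot exceed 1, so stop early
            break
        temp = temp0 * (1.0 - k / n_iter) + 1e-12  # linear cooling schedule
        cand = [min(hi, max(lo, x + rng.gauss(0.0, step * (hi - lo))))
                for x, (lo, hi) in zip(current, bounds)]
        p_cand = p_at(cand)
        # Always accept improvements; accept a worse point with small probability.
        if p_cand >= p_cur or rng.random() < math.exp((p_cand - p_cur) / temp):
            current, p_cur = cand, p_cand
        if p_cur > p_best:
            best, p_best = current, p_cur
    return p_best, best


if __name__ == "__main__":
    rng = random.Random(1)
    U = [rng.gauss(0.0, 1.0) for _ in range(99)]
    toy_stat = lambda theta, u: theta[0] * abs(u)  # hypothetical g_S(theta, u)
    print(mmc_p_value(1.5, toy_stat, U, bounds=[(0.5, 2.0)], start=[1.0]))
```

A numerical search of this kind returns a value that can fall short of the true supremum, so the level guarantee of Proposition 4.1 applies to the exact supremum rather than to whatever the optimizer happens to reach; in practice the search effort would need to be chosen with that in mind.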
It is easy to extend Proposition 4.1 in order to relax the no-tie assumption (4.19).
For that purpose, we generate as in Proposition 2.4 a vector $(U_0, U_1, \ldots, U_N)$ of
$N + 1$ i.i.d. $U(0, 1)$ random variables independent of $S_0, S_1(\theta_0), \ldots, S_N(\theta_0)$, and we
consider the corresponding randomized distribution, tail area and p-value functions:
\[ \tilde F_N[x \mid \theta] \equiv \tilde F_N[x; U_0, S(N, \theta), U(N)], \quad \tilde G_N[x \mid \theta] \equiv \tilde G_N[x; U_0, S(N, \theta), U(N)], \tag{4.23} \]
\[ \tilde p_N[x \mid \theta] = \frac{N \tilde G_N[x \mid \theta] + 1}{N + 1}, \tag{4.24} \]
where
\[ U(N) = (U_1, \ldots, U_N) \quad \text{and} \quad U_0, U_1, \ldots, U_N \overset{\text{i.i.d.}}{\sim} U(0, 1). \tag{4.25} \]
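The randomized functions can be illustrated with a short sketch. It assumes the lexicographic ordering of the pairs $(S_i, U_i)$ that appears in the proofs in the Appendix: a pair counts as at least $(x, u_0)$ if $S_i > x$, or if $S_i = x$ and $U_i \ge u_0$. The function names are hypothetical.

```python
def g_hat(x, sims):
    """Non-randomized tail area: share of simulated statistics >= x."""
    return sum(1 for s in sims if s >= x) / len(sims)


def tie_broken_count(x, u0, sims, us):
    """Number of pairs (S_i, U_i) >= (x, u0) in lexicographic order."""
    return sum(1 for s, u in zip(sims, us) if s > x or (s == x and u >= u0))


def g_tilde(x, u0, sims, us):
    """Randomized tail area: ties with x are broken by the uniform draws."""
    return tie_broken_count(x, u0, sims, us) / len(sims)


def p_tilde(x, u0, sims, us):
    """Randomized p-value (N * G_tilde + 1) / (N + 1)."""
    return (tie_broken_count(x, u0, sims, us) + 1) / (len(sims) + 1)


if __name__ == "__main__":
    sims = [1, 1, 2, 3]          # a discrete statistic with ties
    us = [0.2, 0.7, 0.1, 0.9]    # uniform draws U_1, ..., U_N
    print(g_hat(1, sims), g_tilde(1, 0.5, sims, us), p_tilde(1, 0.5, sims, us))
```

In the small example, two simulated values tie with the observed value 1; only the one whose uniform draw is at least `u0 = 0.5` is counted, so the randomized tail area is smaller than the non-randomized one.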
Under the corresponding measurability assumption
$\sup\{\tilde G_N[S_0 \mid \theta] : \theta \in \Omega_0\}$ and $\inf\{\tilde F_N[S_0 \mid \theta] : \theta \in \Omega_0\}$ are $\mathcal{A}_{\mathcal{Z}}$-measurable,
where $\Omega_0$ is a nonempty subset of $\Omega$, (4.26)
we can then state the following generalization of Proposition 2.4.
Proposition 4.2 (Validity of MMC tests for general statistics). Under the assumptions
and notations (4.10), (4.11), (4.16)–(4.18) and (4.23)–(4.26), suppose $U_0, U_1, \ldots, U_N$
are independent of $S_0, S_1(\theta_0), \ldots, S_N(\theta_0)$. If $\theta_0 \in \Omega_0$, then for $0 \le \alpha_1 \le 1$ and for
$0 \le \alpha \le 1$,
\[ P[\sup\{\hat G_N[S_0 \mid \theta] : \theta \in \Omega_0\} \le \alpha_1] \le P[\sup\{\tilde G_N[S_0 \mid \theta] : \theta \in \Omega_0\} \le \alpha_1] \le \frac{I[\alpha_1 N] + 1}{N + 1}, \]
\[ P[\sup\{\hat G_N[S_0 \mid \theta] : \theta \in \Omega_0\} \le \alpha_1] \le P[\inf\{\tilde F_N[S_0 \mid \theta] : \theta \in \Omega_0\} \ge 1 - \alpha_1] \le \frac{I[\alpha_1 N] + 1}{N + 1}, \]
\[ P[\sup\{\hat p_N[S_0 \mid \theta] : \theta \in \Omega_0\} \le \alpha] \le P[\sup\{\tilde p_N[S_0 \mid \theta] : \theta \in \Omega_0\} \le \alpha] \le \frac{I[\alpha(N + 1)]}{N + 1}. \]
One should note here that the validity results of Propositions 4.1 and 4.2
differ from those of the corresponding Propositions 2.2 and 2.4 in the sense that
equalities have been replaced by inequalities. This entails that the corresponding
maximized MC tests are exact in the sense that the probability of type I error
cannot be larger than the nominal level $\alpha$ of the test, but their size may be lower
than the level (leading to a conservative procedure).4 In view of the fact that the
distribution of the test statistic involves nuisance parameters, this is not surprising:
since the distribution of the test statistic varies as a function of nuisance
parameters, we can expect the probability of type I error to be lower than the level
$\alpha$ for some distributions compatible with the null hypothesis, even if we use the
tightest possible critical value that allows one to control the level of the test. Neither
the fundamental (infeasible) test nor its MC version is similar. This is a feature
of the test statistic, not of its MC implementation. Of course, it is preferable from
the power viewpoint that the discrepancy between the size of the test and its level
be as small as possible. This discrepancy would disappear if we could estimate and
maximize without error the theoretical p-value function $G[x \mid \theta]$ or the appropriate
critical value, but this is not typically feasible. In general, the discrepancy
between the size and the nominal level of the test depends on the form of the test
statistic, the null hypothesis, and the number $N$ of replications of the MMC
procedure. Studying this sort of effect in any detail would go beyond the scope of
the present paper.5
5. Asymptotic Monte Carlo tests based on a consistent set estimator
In this section, we propose simplified approximate versions of the procedures
proposed in the previous section when a consistent point or set estimate of $\theta$ is
available. To do this, we shall need to reformulate the setup used previously in order
to allow for an increasing sample size.
Consider
\[ S_{T0}, S_{T1}(\theta), \ldots, S_{TN}(\theta), \quad T \ge I_0, \ \theta \in \Omega, \tag{5.1} \]
real random variables all defined on a common probability space $(\mathcal{Z}, \mathcal{A}_{\mathcal{Z}}, P)$, and set
\[ S_T(N, \theta) = (S_{T1}(\theta), \ldots, S_{TN}(\theta)), \quad T \ge I_0. \tag{5.2} \]
4
We say that a test procedure is conservative at level $\alpha$ if its size is strictly smaller than $\alpha$. Note that a
nonsimilar test is not conservative as long as its size is equal to the level $\alpha$ (even though the probability of type
I error is smaller than $\alpha$ for certain distributions compatible with the null hypothesis).
5
A question of interest here consists in studying the conditions under which the discrepancy will
disappear as the number of MC replications goes to infinity ($N \to \infty$). The reader will also find simulation
evidence on the size and power properties of MMC procedures in Dufour and Khalaf (2003a, b), Dufour
and Jouini (2005) and Dufour and Valéry (2005).
We will be primarily interested in situations where
the variables $S_{T0}, S_{T1}(\theta_0), \ldots, S_{TN}(\theta_0)$ are exchangeable for some $\theta_0 \in \Omega$,
each one with distribution function $F_T[x \mid \theta_0]$. (5.3)
Here $S_{T0}$ will normally refer to a test statistic with distribution function $F_T[\,\cdot \mid \theta]$
based on a sample of size $T$, while $S_{T1}(\theta), \ldots, S_{TN}(\theta)$ are i.i.d. replications of the
same test statistic obtained independently under the assumption that the parameter
vector is $\theta$ (i.e., $P[S_{Ti}(\theta) \le x] = F_T[x \mid \theta]$, $i = 1, \ldots, N$). Let also
\[ \hat F_{TN}[x \mid \theta] = \hat F_N[x; S_T(N, \theta)], \quad \hat G_{TN}[x \mid \theta] = \hat G_N[x; S_T(N, \theta)], \tag{5.4} \]
\[ \hat p_{TN}[x \mid \theta] = \frac{N \hat G_{TN}[x \mid \theta] + 1}{N + 1}, \tag{5.5} \]
and let $\hat F_{TN}^{-1}[x \mid \theta]$ be defined as in (2.7)–(2.10).
We consider first the situation where p-values are maximized over a subset $C_T$ of $\Omega$
(e.g., a confidence set for $\theta$) instead of $\Omega_0$. Consequently, we introduce the following
assumption:
$C_T$, $T \ge I_0$, is a sequence of (possibly random) subsets of $\Omega$ such that
$\sup\{\hat G_{TN}[S_{T0} \mid \theta] : \theta \in C_T\}$ and
$\inf\{\hat F_{TN}[S_{T0} \mid \theta] : \theta \in C_T\}$ are $\mathcal{A}_{\mathcal{Z}}$-measurable,
for all $T \ge I_0$, where $\Omega_0$ is a nonempty subset of $\Omega$. (5.6)
Then we have the following proposition.
Proposition 5.1 (Asymptotic validity of confidence-set restricted MMC tests:
continuous distributions). Under the assumptions and notations (5.1) to (5.6), set
$S_{T0}(\theta_0) = S_{T0}$, suppose
\[ P[S_{Ti}(\theta_0) = S_{Tj}(\theta_0)] = 0 \quad \text{for } i \ne j \text{ and } i, j = 0, 1, \ldots, N, \tag{5.7} \]
for all $T \ge I_0$, and let $C_T$, $T \ge I_0$, be a sequence of (possibly random) subsets of $\Omega$ such
that
\[ \lim_{T \to \infty} P[\theta_0 \in C_T] = 1. \tag{5.8} \]
If $\theta_0 \in \Omega_0$, then
\[ \lim_{T \to \infty} P[\sup\{\hat G_{TN}[S_{T0} \mid \theta] : \theta \in C_T\} \le \alpha_1] \le \lim_{T \to \infty} P[\inf\{\hat F_{TN}[S_{T0} \mid \theta] : \theta \in C_T\} \ge 1 - \alpha_1] \]
\[ = \lim_{T \to \infty} P[S_{T0} \ge \sup\{\hat F_{TN}^{-1}[1 - \alpha_1 \mid \theta] : \theta \in C_T\}] \le \frac{I[\alpha_1 N] + 1}{N + 1} \tag{5.9} \]
and
\[ \lim_{T \to \infty} P[\sup\{\hat p_{TN}[S_{T0} \mid \theta] : \theta \in C_T\} \le \alpha] \le \frac{I[\alpha(N + 1)]}{N + 1} \quad \text{for } 0 \le \alpha \le 1. \tag{5.10} \]
It is quite easy to find a consistent set estimate of $\theta_0$ whenever a consistent point
estimate $\hat\theta_T$ of $\theta_0$ is available. Suppose $\Omega \subseteq \mathbb{R}^k$ and
\[ \lim_{T \to \infty} P[\|\hat\theta_T - \theta_0\| < \varepsilon] = 1, \quad \forall \varepsilon > 0, \tag{5.11} \]
where $\|\cdot\|$ is the Euclidean norm in $\mathbb{R}^k$ [i.e., $\|x\| = (x'x)^{1/2}$, $x \in \mathbb{R}^k$]. Note that
condition (5.8) need only hold for the true value $\theta_0$ of the parameter vector $\theta$. Then
any set of the form $C_T = \{\theta \in \Omega : \|\hat\theta_T - \theta\| < c\}$ satisfies (5.8), whenever $c$ is a fixed
positive constant that does not depend on $T$. More generally, if there is a sequence of
(possibly random) matrices $A_T$ and a non-negative exponent $d$ such that
\[ \lim_{T \to \infty} P[T^d \|A_T(\hat\theta_T - \theta_0)\|^2 < c] = 1, \quad \forall c > 0, \tag{5.12} \]
then any set of the form
\[ C_T = \{\theta \in \Omega : (\hat\theta_T - \theta)' A_T' A_T (\hat\theta_T - \theta) < c/T^d\} = \{\theta \in \Omega : \|A_T(\hat\theta_T - \theta)\|^2 < c/T^d\}, \quad c > 0, \tag{5.13} \]
satisfies (5.8), since in this case,
\[ P[\theta_0 \in C_T] = P[(\hat\theta_T - \theta_0)' A_T' A_T (\hat\theta_T - \theta_0) < c/T^d] = P[T^d (\hat\theta_T - \theta_0)' A_T' A_T (\hat\theta_T - \theta_0) < c] \underset{T \to \infty}{\longrightarrow} 1. \]
In particular, (5.12) will hold whenever we can find $\bar d > 0$ (e.g., $\bar d = 1$) such that
$T^{\bar d/2} A_T(\hat\theta_T - \theta_0)$ has an asymptotic distribution (as $T \to \infty$) and $d$ is selected so that
$0 \le d < \bar d$. Whenever $d > 0$ and $\operatorname{plim}_{T \to \infty}(A_T' A_T) = C_0$ with $\det(C_0) \ne 0$, the diameter
of the set $C_T$ goes to zero, a fact which can greatly simplify the evaluation of the
variables $\sup\{\hat G_{TN}\}$, $\inf\{\hat F_{TN}\}$ and $\sup\{\hat p_{TN}\}$ in (5.9) and (5.10).
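A set of the form (5.13) is straightforward to work with numerically. The sketch below, with hypothetical function names and a plain grid as an assumed way of generating candidate points, checks membership in $C_T$ and lists grid points that fall inside it; the simulated p-value would then be maximized over those candidates.

```python
import itertools


def in_consistent_set(theta, theta_hat, A, c, T, d):
    """True if ||A (theta_hat - theta)||^2 < c / T**d, as in (5.13)."""
    diff = [th - t for th, t in zip(theta_hat, theta)]
    a_diff = [sum(a_ij * d_j for a_ij, d_j in zip(row, diff)) for row in A]
    return sum(v * v for v in a_diff) < c / T ** d


def grid_in_consistent_set(theta_hat, A, c, T, d, half_width, n_per_axis):
    """Points of a regular grid centred at theta_hat that lie in C_T.

    Requires n_per_axis >= 2. The grid is only an assumed discretization;
    a maximum over it is a lower bound on the supremum over C_T.
    """
    axes = [
        [th - half_width + 2 * half_width * j / (n_per_axis - 1)
         for j in range(n_per_axis)]
        for th in theta_hat
    ]
    return [list(p) for p in itertools.product(*axes)
            if in_consistent_set(p, theta_hat, A, c, T, d)]


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    for T in (100, 10000):
        pts = grid_in_consistent_set([0.0, 0.0], identity, c=1.0, T=T, d=0.5,
                                     half_width=0.5, n_per_axis=5)
        print(T, len(pts))
```

With $d > 0$ the set shrinks as $T$ grows, so fewer candidate points need to be evaluated, which reflects the simplification noted in the text.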
The above procedure may be especially useful when the distribution of the test
statistic is highly sensitive to nuisance parameters, in a way that would make its
asymptotic distribution discontinuous with respect to the nuisance parameters. In
such cases, a simulation-based procedure where the nuisance parameters are replaced
by a consistent point estimate—such as a parametric bootstrap procedure—may not
converge to the appropriate asymptotic distribution (because the point estimate does
not converge sufficiently fast to overcome the discontinuity). Here, possible
discontinuities in the asymptotic distribution are automatically taken into account
through a numerical maximization over a set that contains the correct value of the
nuisance parameter with a probability asymptotically equal to one: using a
consistent set estimator as opposed to a point estimate (which does not converge fast
enough) can overcome such a high sensitivity to nuisance parameters. Of course, the
procedure can also be helpful in situations where the finite-sample distribution is
highly sensitive to nuisance parameters, even though it does not lead to asymptotic
failure of the bootstrap.
Again, it is possible to extend Proposition 5.1 to statistics with general
(possibly discrete) distributions by considering properly randomized distribution,
tail-area and p-value functions:
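\[ \tilde F_{TN}[x \mid \theta] = \tilde F_N[x; U_0, S_T(N, \theta), U(N)], \tag{5.14} \]
\[ \tilde G_{TN}[x \mid \theta] = \tilde G_N[x; U_0, S_T(N, \theta), U(N)], \tag{5.15} \]
\[ \tilde p_{TN}[x \mid \theta] = \frac{N \tilde G_{TN}[x \mid \theta] + 1}{N + 1}, \tag{5.16} \]
where $\tilde F_N[\cdot]$, $\tilde G_N[\cdot]$, $U_0$ and $U(N)$ are defined as in (4.23)–(4.25).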
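Proposition 5.2 (Asymptotic validity of confidence-set restricted MMC tests: general
distributions). Under the assumptions and notations (5.1)–(5.6) and (5.14)–(5.16),
suppose the sets $C_T \subseteq \Omega$, $T \ge I_0$, satisfy (5.8). If $\theta_0 \in \Omega_0$, then for $0 \le \alpha_1 \le 1$ and
$0 \le \alpha \le 1$,
\[ \lim_{T \to \infty} P[\sup\{\hat G_{TN}[S_{T0} \mid \theta] : \theta \in C_T\} \le \alpha_1] \le \lim_{T \to \infty} P[\sup\{\tilde G_{TN}[S_{T0} \mid \theta] : \theta \in C_T\} \le \alpha_1] \le \frac{I[\alpha_1 N] + 1}{N + 1}, \tag{5.17} \]
\[ \lim_{T \to \infty} P[\sup\{\hat G_{TN}[S_{T0} \mid \theta] : \theta \in C_T\} \le \alpha_1] \le \lim_{T \to \infty} P[\inf\{\tilde F_{TN}[S_{T0} \mid \theta] : \theta \in C_T\} \ge 1 - \alpha_1] \le \frac{I[\alpha_1 N] + 1}{N + 1}, \tag{5.18} \]
\[ \lim_{T \to \infty} P[\sup\{\hat p_{TN}[S_{T0} \mid \theta] : \theta \in C_T\} \le \alpha] \le \lim_{T \to \infty} P[\sup\{\tilde p_{TN}[S_{T0} \mid \theta] : \theta \in C_T\} \le \alpha] \le \frac{I[\alpha(N + 1)]}{N + 1}. \tag{5.19} \]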
6. Asymptotic Monte Carlo tests based on consistent point estimate
Parametric bootstrap tests may be interpreted as a simplified form of the
procedures described in Propositions 5.1 and 5.2 where the consistent confidence set
$C_T$ has been replaced by a consistent point estimate $\hat\theta_T$. In other words, the
distribution of $S_T(\theta)$, $\theta \in \Omega_0$, is simulated at a single point $\hat\theta_T$, leading to a local (or
pointwise) MC test. It is well known that such bootstrap tests are not generally valid,
unless stronger regularity conditions are imposed. In the following proposition, we
extend earlier proofs of the asymptotic validity of such bootstrap tests. In particular,
we allow for the presence of nuisance parameters in the asymptotic distribution of
the test statistic considered. Further, our proofs have the interesting feature of being
cast in the MC test setup where the number of replications N is kept fixed even
asymptotically.
Such pointwise procedures require stronger regularity assumptions (such as
uniform continuity and convergence over the nuisance parameter space), so that
they may fail in irregular cases where the maximized procedures described in the
previous sections succeed in controlling the level of the test (at least asymptotically).
But they are simpler to implement and may be taken as a natural starting point for
implementing maximized procedures. In particular, if a pointwise MC p-value is
larger than the level $\alpha$ of the test (so that the pointwise MC test is not significant at
level $\alpha$), it is clear that the maximized p-value must be larger than $\alpha$ (so that the
maximized MC test is not significant at level $\alpha$).
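The shortcut described above can be sketched as a two-stage routine. The names `two_stage_mc_test`, `stat_fn` and `maximize` are hypothetical, and the sketch assumes that the point estimate lies in the set over which the p-value is maximized, so that the maximized p-value cannot be smaller than the pointwise one.

```python
def mc_p_value(s0, sims):
    """p_hat_N = (N * G_hat_N + 1) / (N + 1), with N * G_hat_N = #{sims >= s0}."""
    count = sum(1 for s in sims if s >= s0)
    return (count + 1) / (len(sims) + 1)


def two_stage_mc_test(s0, stat_fn, disturbances, theta_hat, alpha, maximize):
    """Pointwise MC p-value first; maximize only when it is at most alpha.

    stat_fn(theta, u) returns a simulated statistic for fixed disturbances u.
    maximize(p_at, theta_hat) is any user-supplied routine returning an
    approximate supremum of p_at over the relevant set (assumed to contain
    theta_hat).
    """
    def p_at(theta):
        return mc_p_value(s0, [stat_fn(theta, u) for u in disturbances])

    p_local = p_at(theta_hat)
    if p_local > alpha:
        # The supremum is at least p_local, so the MMC test cannot reject.
        return {"reject": False, "p_local": p_local, "p_max": None}
    p_max = max(p_local, maximize(p_at, theta_hat))
    return {"reject": p_max <= alpha, "p_local": p_local, "p_max": p_max}


if __name__ == "__main__":
    toy_stat = lambda theta, u: theta * u  # hypothetical g_S(theta, u)
    grid_max = lambda p_at, th: max(p_at(t) for t in (1.0, 10.0, 50.0))
    print(two_stage_mc_test(100.0, toy_stat, [1.0, 2.0, 3.0, 4.0],
                            theta_hat=1.0, alpha=0.25, maximize=grid_max))
```

The costly maximization is skipped whenever the pointwise test already fails to reject; when it does reject, the decision is left to the maximized p-value.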
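In order to establish a clear asymptotic validity result, we will use four basic
assumptions on the distributions of the statistics $S_T(\theta)$ as functions of the parameter
vector $\theta$:
$S_{T1}(\theta), \ldots, S_{TN}(\theta)$ are i.i.d. according to the distribution
$F_T[x \mid \theta] = P[S_T(\theta) \le x]$, $\forall \theta \in \Omega$, (6.1)
$\Omega$ is a nonempty subset of $\mathbb{R}^k$, (6.2)
for every $T \ge I_0$, $S_{T0}$ is a real random variable and $\hat\theta_T$ an estimator of $\theta$,
both measurable with respect to the probability space $(\mathcal{Z}, \mathcal{A}_{\mathcal{Z}}, P)$,
and $F_T[S_{T0} \mid \hat\theta_T]$ is a random variable; (6.3)
$\forall \varepsilon_0 > 0$, $\forall \varepsilon_1 > 0$, $\exists \delta > 0$ and a sequence of open subsets $D_{T0}(\varepsilon_0)$ in $\mathbb{R}$ such that
\[ \liminf_{T \to \infty} P[S_{T0} \in D_{T0}(\varepsilon_0)] \ge 1 - \varepsilon_0 \quad \text{and} \quad \|\theta - \theta_0\| \le \delta \Rightarrow \limsup_{T \to \infty} \Big\{ \sup_{x \in D_{T0}(\varepsilon_0)} |F_T[x \mid \theta] - F_T[x \mid \theta_0]| \Big\} \le \varepsilon_1. \tag{6.4} \]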
The first of these four conditions replaces the exchangeability assumption by an
assumption of i.i.d. variables. The two next ones simply make appropriate
measurability assumptions, while the last one may be interpreted as a local
equicontinuity condition (at $\theta = \theta_0$) on the sequence of distribution functions
$F_T(x \mid \theta)$, $T \ge I_0$. Note that $S_{T0}$ is not assumed to follow the same distribution as the
other variables $S_{T1}(\theta), \ldots, S_{TN}(\theta)$. Furthermore, $S_{T0}$ and $F_T(x \mid \theta)$ do not
necessarily converge to limits as $T \to \infty$. An alternative assumption of interest
would consist in assuming that $S_{T0}$ converges in probability ($\overset{p}{\to}$) to a random
variable $\bar S_0$ as $T \to \infty$, in which case the "global" equicontinuity condition (6.4) can
be weakened to a "local" one:
\[ S_{T0} \underset{T \to \infty}{\overset{p}{\longrightarrow}} \bar S_0, \tag{6.5} \]
$D_0$ is a subset of $\mathbb{R}$ such that $P[\bar S_0 \in D_0 \text{ and } S_{T0} \in D_0 \text{ for all } T \ge I_0] = 1$, (6.6)
$\forall x \in D_0$, $\forall \varepsilon > 0$, $\exists \delta > 0$ and an open neighborhood $B(x, \varepsilon)$ of $x$ such that
\[ \|\theta - \theta_0\| \le \delta \Rightarrow \limsup_{T \to \infty} \Big\{ \sup_{y \in B(x, \varepsilon) \cap D_0} |F_T[y \mid \theta] - F_T[y \mid \theta_0]| \Big\} \le \varepsilon. \tag{6.7} \]
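We can now show that Monte Carlo tests obtained by simulating $S_{Ti}(\theta)$,
$i = 1, \ldots, N$, with $\theta = \hat\theta_T$ are equivalent for large $T$ to those based on using the
true value $\theta = \theta_0$.
Proposition 6.1 (Asymptotic validity of bootstrap p-values). Under the assumptions
and notations (5.1), (5.2), (5.4), (5.5), (5.14)–(5.16) and (6.1)–(6.3), suppose the random
variable $S_{T0}$ and the estimator $\hat\theta_T$ are both independent of $S_T(N, \theta)$ and $U_0$. If
$\operatorname{plim}_{T \to \infty} \hat\theta_T = \theta_0$ and condition (6.4) or (6.5)–(6.7) hold, then for $0 \le \alpha_1 \le 1$ and
$0 \le \alpha \le 1$,
\[ \lim_{T \to \infty} \{P[\tilde G_{TN}[S_{T0} \mid \hat\theta_T] \le \alpha_1] - P[\tilde G_{TN}[S_{T0} \mid \theta_0] \le \alpha_1]\} = \lim_{T \to \infty} \{P[\hat G_{TN}[S_{T0} \mid \hat\theta_T] \le \alpha_1] - P[\hat G_{TN}[S_{T0} \mid \theta_0] \le \alpha_1]\} = 0 \tag{6.8} \]
and
\[ \lim_{T \to \infty} \{P[\tilde p_{TN}[S_{T0} \mid \hat\theta_T] \le \alpha] - P[\tilde p_{TN}[S_{T0} \mid \theta_0] \le \alpha]\} = \lim_{T \to \infty} \{P[\hat p_{TN}[S_{T0} \mid \hat\theta_T] \le \alpha] - P[\hat p_{TN}[S_{T0} \mid \theta_0] \le \alpha]\} = 0. \tag{6.9} \]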
It is worth noting that condition (6.7) holds whenever $F_T[x \mid \theta]$ converges to a
distribution function $F_\infty[x \mid \theta]$ which is continuous with respect to $(x, \theta')'$, for
$x \in D_0$, as follows:
$\forall x \in D_0$, $\forall \varepsilon > 0$, $\exists \delta_1 > 0$ and an open neighborhood $B_1(x, \varepsilon)$ of $x$ such that
\[ \|\theta - \theta_0\| \le \delta_1 \Rightarrow \limsup_{T \to \infty} \Big\{ \sup_{y \in B_1(x, \varepsilon) \cap D_0} |F_T[y \mid \theta] - F_\infty[y \mid \theta]| \Big\} \le \varepsilon, \tag{6.10} \]
$\forall x \in D_0$, $\forall \varepsilon > 0$, $\exists \delta_2 > 0$ and an open neighborhood $B_2(x, \varepsilon)$ of $x$ such that
\[ \|\theta - \theta_0\| \le \delta_2 \Rightarrow \sup_{y \in B_2(x, \varepsilon) \cap D_0} |F_\infty[y \mid \theta] - F_\infty[y \mid \theta_0]| \le \varepsilon. \tag{6.11} \]
It is then easy to see that (6.10)–(6.11) entail (6.7) on noting that
\[ |F_T[x \mid \theta] - F_T[x \mid \theta_0]| \le |F_T[x \mid \theta] - F_\infty[x \mid \theta]| + |F_T[x \mid \theta_0] - F_\infty[x \mid \theta_0]| + |F_\infty[x \mid \theta] - F_\infty[x \mid \theta_0]|, \quad \forall x. \]
Note also that (6.11) holds whenever $F_\infty[x \mid \theta]$ is continuous with respect to $(x, \theta')'$,
although the latter condition is not at all necessary (e.g., in models where $D_0$ is a
discrete set of points). In particular, (6.10)–(6.11) will hold when $F_T[x \mid \theta]$ admits an
expansion around a pivotal distribution:
\[ F_T[x \mid \theta] = F_\infty(x) + T^{-\gamma} g(x, \theta) + h_T(x, \theta), \tag{6.12} \]
where $F_\infty(x)$ is a distribution function that does not depend on $\theta$, $\gamma > 0$, with the
following assumptions on $g(x, \theta)$ and $h_T(x, \theta)$:
$\forall x \in D_0$, $\exists$ an open neighborhood $B(x, \theta_0)$ of $(x, \theta_0')'$ such that
$|g(y, \theta)| \le C(x, \theta)$, for all $(y, \theta')' \in B(x, \theta_0) \cap D_0$,
where $C(x, \theta)$ is a positive constant, and
$T^{\gamma} h_T(y, \theta) \underset{T \to \infty}{\longrightarrow} 0$ uniformly on $B(x, \theta_0) \cap D_0$. (6.13)
The latter is quite similar (although somewhat weaker) to the assumption considered
by Hall and Titterington (1989, Eq. (2.5)).
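When $S_{T0}$ is distributed like $S_T(\theta_0)$, i.e., $P[S_{T0} \le x] = F_T[x \mid \theta_0]$, we can apply
Proposition 2.4 and see that $P[\tilde G_{TN}[S_{T0} \mid \theta_0] \le \alpha_1] = (I[\alpha_1 N] + 1)/(N + 1)$, hence
\[ \lim_{T \to \infty} P[\tilde G_{TN}[S_{T0} \mid \hat\theta_T] \le \alpha_1] = \frac{I[\alpha_1 N] + 1}{N + 1}. \tag{6.14} \]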
7. Conclusion
In this paper, we have made four main contributions. First, for the case where we
have a test statistic whose distribution does not involve nuisance parameters under
the null hypothesis, we have proposed a general form of Monte Carlo testing which
allows for exchangeable (as opposed to i.i.d.) Monte Carlo replications of general
test statistics whose distribution can take an arbitrary form (continuous, discrete or
mixed). In particular, this form is not limited to permutation tests, which have
received considerable attention in the earlier literature on Monte Carlo tests (see
Dwass, 1957; Green, 1977; Vadiveloo, 1983; Keller-McNulty and Higgins, 1987;
Lock, 1991; Edgington, 1995; Manly, 1997; Noreen, 1989; Good, 1994). Second, we
have shown how the method can be extended to models with nuisance parameters as
long as the null distribution of the test statistic can be simulated once the nuisance
parameters have been specified. This leads to what we called maximized Monte Carlo
tests which were shown to satisfy the level constraint. Third, we proposed a
simplified version of the latter method which can lead to asymptotically valid tests,
even if the asymptotic distribution depends on nuisance parameters in a
discontinuous way. This method only requires one to use a consistent set estimator of the nuisance parameters, which is always feasible as long as a consistent point estimate of the nuisance parameters is available. Further, in the latter
case, no additional information is required on the asymptotic distribution of
the consistent estimator. Fourth, we showed that Monte Carlo tests obtained
upon replacing unknown nuisance parameters by consistent estimates also lead to
asymptotically valid tests. However, it is important to note that stronger conditions are needed for this to occur and such conditions may be difficult to check
in practice.
The main shortcoming of the proposed MMC tests comes from the fact that such
tests may be computationally demanding. We cannot study here the appropriate
numerical algorithms or detailed implementations of the theory described above. But
a number of such applications are presented in companion papers (which are based
on earlier versions of the present paper). For example, for an illustration of the
adjustment for discreteness proposed here, the reader may consult Dufour et al.
(1998) where it is used to correct the size of Kolmogorov–Smirnov tests (which
involve a discrete statistic) for the normality of errors in a linear regression. The
method of maximized Monte Carlo tests can of course be applied to a wide array of
models where nuisance-parameter problems show up: for example, inference in
seemingly unrelated regressions (Dufour and Khalaf, 2003a), simultaneous
equations models (Dufour and Khalaf, 2003b), dynamic models (Dufour and
Jouini, 2005; Dufour and Valéry, 2005), and models with limited dependent variables
(Jouneau-Sion and Torrès, 2005). It is clear that many more applications are possible.
The size and power properties of the proposed procedures are also studied by
simulation methods in this work.
Acknowledgements
The author thanks Torben Andersen, Benoît Perron, Russell Davidson, V.P.
Godambe, Joel Horowitz, Joanna Jasiak, Linda Khalaf, Jan Kiviet, James
MacKinnon, Charles Manski, Eric Renault, Christopher Sims, Olivier Torrès,
Pascale Valéry, Michael Veall, Mark Watson, and two anonymous referees for
several useful comments. This work was supported by the Canada Research Chair
Program (Chair in Econometrics, Université de Montréal), the Alexander-von-Humboldt
Foundation (Germany), the Canadian Network of Centres of Excellence
[program on Mathematics of Information Technology and Complex Systems
(MITACS)], the Canada Council for the Arts (Killam Fellowship), the Natural
Sciences and Engineering Research Council of Canada, the Social Sciences and
Humanities Research Council of Canada, the Fonds de recherche sur la société et la
culture (Québec), and the Fonds de recherche sur la nature et les technologies
(Québec).
Appendix. Proofs
Proof of Lemma 2.1. By condition (2.12), the variables $y_1, y_2, \ldots, y_N$ are all distinct
with probability 1, and the rank vector $(R_1, R_2, \ldots, R_N)'$ is with probability 1 a
permutation of the integers $(1, 2, \ldots, N)$. Furthermore, since the variables
$y_1, y_2, \ldots, y_N$ are exchangeable, the $N!$ distinct permutations of $(y_1, y_2, \ldots, y_N)$ have
the same probability $1/N!$. Consequently, we have:
\[ P(R_j = i) = \frac{1}{N}, \quad i = 1, 2, \ldots, N, \]
\[ P(R_j/N \le x) = I[xN]/N, \quad 0 \le x \le 1, \]
from which (2.13) follows and
\[ P(R_j/N < x) = \begin{cases} (I[xN] - 1)/N & \text{if } xN \in \mathbb{Z}_+, \\ I[xN]/N & \text{otherwise,} \end{cases} \]
where $\mathbb{Z}_+$ is the set of the positive integers. Since, for any real number $z$,
\[ I[N - z] = \begin{cases} N - z & \text{if } z \text{ is an integer,} \\ N - I[z] - 1 & \text{otherwise,} \end{cases} \]
we then have, for $0 \le x \le 1$,
\[ P(R_j/N \ge x) = 1 - P(R_j/N < x) = \begin{cases} (N - I[xN] + 1)/N & \text{if } xN \in \mathbb{Z}_+, \\ (N - I[xN])/N & \text{otherwise,} \end{cases} \]
hence
\[ P[R_j/N \ge x] = \begin{cases} (I[(1 - x)N] + 1)/N & \text{if } 0 < x \le 1, \\ 1 & \text{if } x = 0, \end{cases} \]
from which we get (2.14). □
Proof of Proposition 2.2. Assuming there are no ties among $S_0, S_1, \ldots, S_N$ (an event
with probability 1), we have
\[ \hat G_N(S_0) = \frac{1}{N} \sum_{i=1}^{N} 1(S_i \ge S_0) = \frac{1}{N} \sum_{i=1}^{N} [1 - 1(S_i \le S_0)] = 1 - \frac{1}{N} \sum_{i=1}^{N} 1(S_i \le S_0) \]
\[ = 1 - \frac{1}{N} \Big[ -1 + \sum_{i=0}^{N} 1(S_i \le S_0) \Big] = (N + 1 - R_0)/N, \]
where $R_0 = \sum_{i=0}^{N} 1(S_i \le S_0)$ is the rank of $S_0$ when the $N + 1$ variables $S_0, S_1, \ldots, S_N$
are ranked in nondecreasing order. Using (2.15) and Lemma 2.1, it then follows that
\[ P[\hat G_N(S_0) \le \alpha_1] = P\Big[ \frac{N + 1 - R_0}{N} \le \alpha_1 \Big] = P\Big[ \frac{R_0}{N + 1} \ge \frac{(1 - \alpha_1)N + 1}{N + 1} \Big] = \frac{I[\alpha_1 N] + 1}{N + 1} \]
for $0 \le \alpha_1 \le 1$. Furthermore, $\hat F_N(S_0) = 1 - \hat G_N(S_0)$ with probability 1, and (2.17)
follows. We then get (2.18) on observing that: $\hat F_N(y) \ge q \iff y \ge \hat F_N^{-1}(q)$ for $y \in \mathbb{R}$ and
$0 < q < 1$ (see Reiss, 1989, Appendix 1). Finally, to obtain (2.19), we note that
\[ \hat p_N(S_0) = \frac{N \hat G_N(S_0) + 1}{N + 1} \le \alpha \iff \hat G_N(S_0) \le \frac{\alpha(N + 1) - 1}{N}, \]
hence, since $0 \le \hat G_N(S_0) \le 1$ and using (2.17),
\[ P[\hat p_N(S_0) \le \alpha] = P\Big[ \hat G_N(S_0) \le \frac{\alpha(N + 1) - 1}{N} \Big] = \begin{cases} 0, & \text{if } \alpha < 1/(N + 1); \\ \dfrac{I[\alpha(N + 1) - 1] + 1}{N + 1} = \dfrac{I[\alpha(N + 1)]}{N + 1}, & \text{if } \dfrac{1}{N + 1} \le \alpha \le 1; \\ 1, & \text{if } \alpha > 1; \end{cases} \]
from which (2.19) follows upon observing that $I[\alpha(N + 1)] = 0$ for $0 \le \alpha < 1/(N + 1)$. □
Proof of Lemma 2.3. From (2.23) and the continuity of the $U(0, 1)$ distribution, we
see easily that $P[(y_i, U_i) = (y_j, U_j)] \le P[U_i = U_j] = 0$, for $i \ne j$, from which it follows
that $P[\tilde R_i = \tilde R_j] = 0$ for $i \ne j$ and the rank vector $(\tilde R_1, \tilde R_2, \ldots, \tilde R_N)$ is with probability 1
a random permutation of the integers $(1, 2, \ldots, N)$. Set $V_i = (y_i, U_i)$, $i = 1, \ldots, N$.
By considering all possible permutations $(V_{r_1}, V_{r_2}, \ldots, V_{r_N})$ of $(V_1, V_2, \ldots, V_N)$, and
since $(V_{r_1}, V_{r_2}, \ldots, V_{r_N}) \sim (V_1, V_2, \ldots, V_N)$ for all permutations $(r_1, r_2, \ldots, r_N)$ of
$(1, 2, \ldots, N)$ [by the exchangeability assumption], the elements of $(\tilde R_1, \tilde R_2, \ldots, \tilde R_N)$
are also exchangeable. The result then follows from Lemma 2.1. The reader may
note that an alternative proof could be obtained by modifying the proof of Theorem
29A of Hájek (1969) to relax the independence assumption for $y_1, \ldots, y_N$. □
Proof of Proposition 2.4. Since the pairs $(S_i, U_i)$, $i = 0, 1, \ldots, N$, are all distinct with
probability 1, we have almost surely:
\[ \tilde G_N(S_0) = \frac{1}{N} \sum_{i=1}^{N} 1[(S_i, U_i) \ge (S_0, U_0)] = 1 - \frac{1}{N} \sum_{i=1}^{N} 1[(S_i, U_i) \le (S_0, U_0)] \]
\[ = 1 - \frac{1}{N} \Big\{ -1 + \sum_{i=0}^{N} 1[(S_i, U_i) \le (S_0, U_0)] \Big\} = (N + 1 - \tilde R_0)/N, \]
where $\tilde R_0 = \sum_{i=0}^{N} 1[(S_i, U_i) \le (S_0, U_0)]$ is the randomized rank of $S_0$ obtained when
ranking in ascending order [according to (2.23)] the $N + 1$ pairs $(S_i, U_i)$, $i = 0, 1, \ldots, N$.
Using Lemma 2.3, it follows that
\[ P[\tilde G_N(S_0) \le \alpha_1] = P[(N + 1 - \tilde R_0)/N \le \alpha_1] = P\Big[ \frac{\tilde R_0}{N + 1} \ge \frac{(1 - \alpha_1)N + 1}{N + 1} \Big] = \begin{cases} 0 & \text{if } \alpha_1 < 0; \\ \dfrac{I[\alpha_1 N] + 1}{N + 1} & \text{if } 0 \le \alpha_1 \le 1; \\ 1 & \text{if } \alpha_1 > 1. \end{cases} \]
Since the pairs $(S_i, U_i)$, $i = 0, 1, \ldots, N$, are all distinct with probability 1, we also
have $\tilde F_N(S_0) = 1 - \tilde G_N(S_0)$ with probability 1, hence using inequalities (2.33)–(2.36),
\[ P[\hat G_N(S_0) \le \alpha_1] \le P[\tilde G_N(S_0) \le \alpha_1] = P[\tilde F_N(S_0) \ge 1 - \alpha_1] \le P[\hat F_N(S_0) \ge 1 - \alpha_1] \]
and (2.36) is established. The identity $P[\hat F_N(S_0) \ge 1 - \alpha_1] = P[S_0 \ge \hat F_N^{-1}(1 - \alpha_1)]$
follows from the equivalence: $\hat F_N(y) \ge q \iff y \ge \hat F_N^{-1}(q)$, $\forall y \in \mathbb{R}$, $0 < q < 1$. Finally,
to obtain (2.37), we observe that
\[ \tilde p_N(S_0) = \frac{N \tilde G_N(S_0) + 1}{N + 1} \le \alpha \iff \tilde G_N(S_0) \le \frac{\alpha(N + 1) - 1}{N}, \]
hence, using (2.36),
\[ P[\tilde p_N(S_0) \le \alpha] = P\Big[ \tilde G_N(S_0) \le \frac{\alpha(N + 1) - 1}{N} \Big] = \begin{cases} 0 & \text{if } \alpha < 1/(N + 1); \\ \dfrac{I[\alpha(N + 1) - 1] + 1}{N + 1} = \dfrac{I[\alpha(N + 1)]}{N + 1} & \text{if } \dfrac{1}{N + 1} \le \alpha \le 1; \end{cases} \]
from which (2.37) follows on observing that $I[\alpha(N + 1)] = 0$ for $0 \le \alpha < 1/(N + 1)$. □
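Proof of Proposition 4.1. Since
\[ \hat G_N[S_0 \mid \theta] \ge 1 - \hat F_N[S_0 \mid \theta], \tag{A.1} \]
we have
\[ P[\sup\{\hat G_N[S_0 \mid \theta] : \theta \in \Omega_0\} \le \alpha_1] \le P[\inf\{\hat F_N[S_0 \mid \theta] : \theta \in \Omega_0\} \ge 1 - \alpha_1]. \]
When $\theta_0 \in \Omega_0$, it is also clear that: $\inf_{\theta \in \Omega_0} \hat F_N[S_0 \mid \theta] \ge 1 - \alpha_1 \Rightarrow \hat F_N[S_0 \mid \theta_0] \ge 1 - \alpha_1$.
Thus, using Proposition 2.2,
\[ P[\inf\{\hat F_N[S_0 \mid \theta] : \theta \in \Omega_0\} \ge 1 - \alpha_1] \le P[\hat F_N[S_0 \mid \theta_0] \ge 1 - \alpha_1] = \frac{I[\alpha_1 N] + 1}{N + 1}. \]
Furthermore,
\[ \inf_{\theta \in \Omega_0} \hat F_N[S_0 \mid \theta] \ge 1 - \alpha_1 \iff \hat F_N[S_0 \mid \theta] \ge 1 - \alpha_1, \ \forall \theta \in \Omega_0 \iff S_0 \ge \hat F_N^{-1}[1 - \alpha_1 \mid \theta], \ \forall \theta \in \Omega_0 \iff S_0 \ge \sup_{\theta \in \Omega_0} \hat F_N^{-1}[1 - \alpha_1 \mid \theta] \tag{A.2} \]
so that, using Proposition 2.2,
\[ P[S_0 \ge \sup\{\hat F_N^{-1}[1 - \alpha_1 \mid \theta] : \theta \in \Omega_0\}] = P[\inf\{\hat F_N[S_0 \mid \theta] : \theta \in \Omega_0\} \ge 1 - \alpha_1] \le \frac{I[\alpha_1 N] + 1}{N + 1} \]
and (4.20) is established. Eq. (4.21) follows in the same way on observing that
$\sup_{\theta \in \Omega_0} \tilde p_N[S_0 \mid \theta] \le \sup_{\theta \in \Omega_0} \hat p_N[S_0 \mid \theta]$ and
\[ \sup_{\theta \in \Omega_0} \tilde p_N[S_0 \mid \theta] \le \alpha \Rightarrow \tilde p_N[S_0 \mid \theta_0] \le \alpha, \quad \text{when } \theta_0 \in \Omega_0. \ \square \]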
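Proof of Proposition 4.2. Using (2.32)–(2.33), we have:
\[ 1 - \hat F_N[S_0 \mid \theta] \le \tilde G_N[S_0 \mid \theta] \le \hat G_N[S_0 \mid \theta], \quad \forall \theta, \]
\[ 1 - \hat G_N[S_0 \mid \theta] \le \tilde F_N[S_0 \mid \theta] \le \hat F_N[S_0 \mid \theta], \]
hence
\[ \sup_{\theta \in \Omega_0} \tilde G_N[S_0 \mid \theta] \le \sup_{\theta \in \Omega_0} \hat G_N[S_0 \mid \theta], \]
\[ 1 - \sup_{\theta \in \Omega_0} \hat G_N[S_0 \mid \theta] = \inf_{\theta \in \Omega_0} \{1 - \hat G_N[S_0 \mid \theta]\} \le \inf_{\theta \in \Omega_0} \tilde F_N[S_0 \mid \theta], \]
\[ \sup_{\theta \in \Omega_0} \tilde p_N[S_0 \mid \theta] \le \sup_{\theta \in \Omega_0} \hat p_N[S_0 \mid \theta]. \]
Furthermore, when $\theta_0 \in \Omega_0$,
\[ \sup_{\theta \in \Omega_0} \tilde G_N[S_0 \mid \theta] \le \alpha_1 \Rightarrow \tilde G_N[S_0 \mid \theta_0] \le \alpha_1, \]
\[ \inf_{\theta \in \Omega_0} \tilde F_N[S_0 \mid \theta] \ge 1 - \alpha_1 \Rightarrow \tilde F_N[S_0 \mid \theta_0] \ge 1 - \alpha_1, \]
\[ \sup_{\theta \in \Omega_0} \tilde p_N[S_0 \mid \theta] \le \alpha_1 \Rightarrow \tilde p_N[S_0 \mid \theta_0] \le \alpha_1, \]
hence, using Proposition 2.4, for $0 \le \alpha_1 \le 1$ and for $0 \le \alpha \le 1$,
\[ P[\sup\{\hat G_N[S_0 \mid \theta] : \theta \in \Omega_0\} \le \alpha_1] \le P[\sup\{\tilde G_N[S_0 \mid \theta] : \theta \in \Omega_0\} \le \alpha_1] \le P[\tilde G_N[S_0 \mid \theta_0] \le \alpha_1] = \frac{I[\alpha_1 N] + 1}{N + 1}, \]
\[ P[\sup\{\hat G_N[S_0 \mid \theta] : \theta \in \Omega_0\} \le \alpha_1] \le P[\inf\{\tilde F_N[S_0 \mid \theta] : \theta \in \Omega_0\} \ge 1 - \alpha_1] \le P[\tilde F_N[S_0 \mid \theta_0] \ge 1 - \alpha_1] = \frac{I[\alpha_1 N] + 1}{N + 1}, \]
\[ P[\sup\{\hat p_N[S_0 \mid \theta] : \theta \in \Omega_0\} \le \alpha] \le P[\sup\{\tilde p_N[S_0 \mid \theta] : \theta \in \Omega_0\} \le \alpha] \le P[\tilde p_N[S_0 \mid \theta_0] \le \alpha] = \frac{I[\alpha(N + 1)]}{N + 1}. \ \square \]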
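Proof of Proposition 5.1. Using arguments similar to the ones in the proof of
Proposition 4.1 [see (A.1)–(A.2)], it is easy to see that
\[ P[\sup\{\hat G_{TN}[S_{T0} \mid \theta] : \theta \in C_T\} \le \alpha_1] \le P[\inf\{\hat F_{TN}[S_{T0} \mid \theta] : \theta \in C_T\} \ge 1 - \alpha_1] = P[S_{T0} \ge \sup\{\hat F_{TN}^{-1}[1 - \alpha_1 \mid \theta] : \theta \in C_T\}]. \]
Further,
\[ P[\inf\{\hat F_{TN}[S_{T0} \mid \theta] : \theta \in C_T\} \ge 1 - \alpha_1] = P[\inf\{\hat F_{TN}[S_{T0} \mid \theta] : \theta \in C_T\} \ge 1 - \alpha_1 \text{ and } \theta_0 \in C_T] + P[\inf\{\hat F_{TN}[S_{T0} \mid \theta] : \theta \in C_T\} \ge 1 - \alpha_1 \text{ and } \theta_0 \notin C_T] \]
\[ \le P[\hat F_{TN}[S_{T0} \mid \theta_0] \ge 1 - \alpha_1] + P[\theta_0 \notin C_T] = \frac{I[\alpha_1 N] + 1}{N + 1} + P[\theta_0 \notin C_T], \]
where the last identity follows from Proposition 2.2, hence, since $\lim_{T \to \infty} P[\theta_0 \notin C_T] = 0$,
\[ \lim_{T \to \infty} P[\inf\{\hat F_{TN}[S_{T0} \mid \theta] : \theta \in C_T\} \ge 1 - \alpha_1] \le \frac{I[\alpha_1 N] + 1}{N + 1} + \lim_{T \to \infty} P[\theta_0 \notin C_T] = \frac{I[\alpha_1 N] + 1}{N + 1}, \]
from which (5.9) and (5.10) follow. □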
Proof of Proposition 5.2. The result follows from arguments similar to the ones used
in the proofs of Propositions 4.2 and 5.1 (with $\Omega_0$ replaced by $C_T$). □
In order to prove Proposition 6.1, it will be convenient to first demonstrate the
following two lemmas.
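Lemma A.1 (Continuity of p-value function). Under the assumptions and notations
(5.1), (5.2), (5.4), (5.5), (5.14)–(5.16) and (6.1), set
$$Q_{TN}(\theta, x, u_0, \alpha_1) = P[\tilde G_{TN}[x \mid \theta] \le \alpha_1 \mid U_0 = u_0], \tag{A.3}$$
$$\bar Q_{TN}(\theta, x, \alpha_1) = P[\hat G_{TN}[x \mid \theta] \le \alpha_1], \quad 0 \le \alpha_1 \le 1, \tag{A.4}$$
and suppose $U_0$ is independent of $S_T(N, \theta)$. For any $\theta, \theta_0 \in \Omega$, $x \in \mathbb{R}$ and $u_0, \alpha_1 \in [0, 1]$,
the inequality
$$|F_T[y \mid \theta] - F_T[y \mid \theta_0]| \le \varepsilon, \quad \forall y \in (x - \delta, x + \delta),$$
where $\delta > 0$, entails the inequalities:
$$|Q_{TN}(\theta, x, u_0, \alpha_1) - Q_{TN}(\theta_0, x, u_0, \alpha_1)| \le 3 C(N, \alpha_1)\, \varepsilon, \tag{A.5}$$
$$|\bar Q_{TN}(\theta, x, \alpha_1) - \bar Q_{TN}(\theta_0, x, \alpha_1)| \le 3 C(N, \alpha_1)\, \varepsilon, \tag{A.6}$$
where $C(N, \alpha_1) = N \sum_{k=0}^{I[\alpha_1 N]} \binom{N}{k}$.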
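Proof. It is easy to see [as in (3.3)] that
$$Q_{TN}(\theta, x, u_0, \alpha_1) = P[\tilde G_{TN}(x \mid \theta) \le \alpha_1 \mid U_0 = u_0] = \sum_{k=0}^{I[\alpha_1 N]} \binom{N}{k} \bar G_T(x, u_0 \mid \theta)^k\, [1 - \bar G_T(x, u_0 \mid \theta)]^{N-k},$$
where
$$\bar G_T(x, u_0 \mid \theta) = P\big(1[(S_{Ti}(\theta), U_i) \ge (x, u_0)] = 1\big) = P[S_{Ti}(\theta) > x] + P[S_{Ti}(\theta) = x]\, P[U_i \ge u_0] = 1 - F_T[x \mid \theta] + g_T(x \mid \theta)(1 - u_0), \quad 1 \le i \le N.$$
Note also that $g_T(x \mid \theta) = F_T[x \mid \theta] - \lim_{\delta_0 \to 0^+} F_T[x - \delta_0 \mid \theta]$. Then the inequality
$$|F_T[y \mid \theta] - F_T[y \mid \theta_0]| \le \varepsilon, \quad \forall y \in (x - \delta, x + \delta),$$
entails the following inequalities:
$$|1 - F_T[x \mid \theta] - 1 + F_T[x \mid \theta_0]| \le \varepsilon,$$
$$|g_T(x \mid \theta) - g_T(x \mid \theta_0)| = \Big| \{1 - F_T[x \mid \theta]\} - \{1 - F_T[x \mid \theta_0]\} + \lim_{\delta_0 \to 0^+} \{F_T[x - \delta_0 \mid \theta] - F_T[x - \delta_0 \mid \theta_0]\} \Big|$$
$$\le |F_T[x \mid \theta] - F_T[x \mid \theta_0]| + \lim_{\delta_0 \to 0^+} |F_T[x - \delta_0 \mid \theta] - F_T[x - \delta_0 \mid \theta_0]| \le 2\varepsilon,$$
hence, for all $u_0 \in [0, 1]$,
$$|\bar G_T(x, u_0 \mid \theta) - \bar G_T(x, u_0 \mid \theta_0)| \le |F_T[x \mid \theta] - F_T[x \mid \theta_0]| + |1 - u_0|\, |g_T(x \mid \theta) - g_T(x \mid \theta_0)| \le 3\varepsilon, \quad \forall u_0 \in [0, 1],$$
and
$$|Q_{TN}(\theta, x, u_0, \alpha_1) - Q_{TN}(\theta_0, x, u_0, \alpha_1)|$$
$$\le \Big| \sum_{k=0}^{I[\alpha_1 N]} \binom{N}{k} \Big\{ \bar G_T(x, u_0 \mid \theta)^k [1 - \bar G_T(x, u_0 \mid \theta)]^{N-k} - \bar G_T(x, u_0 \mid \theta_0)^k [1 - \bar G_T(x, u_0 \mid \theta_0)]^{N-k} \Big\} \Big|$$
$$\le \sum_{k=0}^{I[\alpha_1 N]} \binom{N}{k} \Big\{ |\bar G_T(x, u_0 \mid \theta)^k - \bar G_T(x, u_0 \mid \theta_0)^k| + |[1 - \bar G_T(x, u_0 \mid \theta)]^{N-k} - [1 - \bar G_T(x, u_0 \mid \theta_0)]^{N-k}| \Big\}$$
$$\le \sum_{k=0}^{I[\alpha_1 N]} \binom{N}{k} \Big\{ k\, |\bar G_T(x, u_0 \mid \theta) - \bar G_T(x, u_0 \mid \theta_0)| + (N - k)\, |\bar G_T(x, u_0 \mid \theta) - \bar G_T(x, u_0 \mid \theta_0)| \Big\}$$
$$= C(N, \alpha_1)\, |\bar G_T(x, u_0 \mid \theta) - \bar G_T(x, u_0 \mid \theta_0)| \le 3 C(N, \alpha_1)\, \varepsilon,$$
where $C(N, \alpha_1) = N \sum_{k=0}^{I[\alpha_1 N]} \binom{N}{k}$, from which (A.5) follows. The inequality (A.6)
follows in a similar way on noting that
$$\bar Q_{TN}(\theta, x, \alpha_1) = \sum_{k=0}^{I[\alpha_1 N]} \binom{N}{k} G_T(x \mid \theta)^k [1 - G_T(x \mid \theta)]^{N-k},$$
where $G_T(x \mid \theta) = P_\theta[S_{Ti}(\theta) \ge x]$, $1 \le i \le N$. □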
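The key step in the proof of Lemma A.1 is that a truncated binomial sum is Lipschitz in the success probability, with constant C(N, α₁). A small numerical check of that step can be sketched as follows. This is a sketch under stated assumptions rather than part of the paper: the function names are hypothetical, `m` stands for the integer part I[α₁N], and the bound is only verified on a finite grid of probabilities.

```python
from math import comb

def binom_cdf(p, n, m):
    """Q(p) = sum_{k=0}^{m} C(n, k) p^k (1 - p)^(n - k)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(m + 1))

def lipschitz_constant(n, m):
    """C(N, alpha_1) = N * sum_{k=0}^{m} C(N, k), with m = I[alpha_1 * N]."""
    return n * sum(comb(n, k) for k in range(m + 1))

def max_violation(n, m, grid=50):
    """Largest value of |Q(p) - Q(q)| - C * |p - q| over a grid of (p, q) pairs."""
    c = lipschitz_constant(n, m)
    pts = [i / grid for i in range(grid + 1)]
    return max(
        abs(binom_cdf(p, n, m) - binom_cdf(q, n, m)) - c * abs(p - q)
        for p in pts
        for q in pts
    )

# Hypothetical setting: N = 19 replications and m = I[alpha_1 * N] = 0 or 1.
for m in (0, 1):
    print(m, lipschitz_constant(19, m), max_violation(19, m) <= 1e-12)
```

The constant is crude (it is at most N·2^N), but the argument only needs it to be finite for a given N.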
Lemma A.2 (Convergence of bootstrap p-values). Under the assumptions and
notations of Lemma A.1, suppose that (6.2) and (6.3) also hold. If $\hat\theta_T \to \theta_0$ in
probability as $T \to \infty$ and condition (6.4) or (6.5)–(6.7) holds, then
$$\sup_{0 \le u_0 \le 1} |Q_{TN}(\hat\theta_T, S_{T0}, u_0, \alpha_1) - Q_{TN}(\theta_0, S_{T0}, u_0, \alpha_1)| \xrightarrow[T \to \infty]{p} 0, \tag{A.7}$$
$$\bar Q_{TN}(\hat\theta_T, S_{T0}, \alpha_1) - \bar Q_{TN}(\theta_0, S_{T0}, \alpha_1) \xrightarrow[T \to \infty]{p} 0. \tag{A.8}$$
We can now prove the following proposition.
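Proof. Let $\alpha_1 \in [0, 1]$, $\varepsilon > 0$ and $\varepsilon_0 > 0$, and suppose first that (6.4) holds. Then, using
Lemma A.1, we can find $\delta > 0$ and $T_1$ such that
$$x \in D_{T0}(\varepsilon_0),\ \|\theta - \theta_0\| \le \delta \text{ and } T > T_1$$
$$\Rightarrow\ |F_T[x \mid \theta] - F_T[x \mid \theta_0]| \le \varepsilon_1 \equiv \varepsilon / [3 C(N, \alpha_1)]$$
$$\Rightarrow\ |Q_{TN}(\theta, x, u_0, \alpha_1) - Q_{TN}(\theta_0, x, u_0, \alpha_1)| \le \varepsilon, \quad \forall u_0 \in [0, 1].$$
Thus
$$S_{T0} \in D_{T0}(\varepsilon_0) \text{ and } \|\hat\theta_T - \theta_0\| \le \delta\ \Rightarrow\ D_{TN}(\hat\theta_T, \theta_0, S_{T0}, \alpha_1) \le \varepsilon,$$
where
$$D_{TN}(\hat\theta_T, \theta_0, S_{T0}, \alpha_1) \equiv \sup_{0 \le u_0 \le 1} |Q_{TN}(\hat\theta_T, S_{T0}, u_0, \alpha_1) - Q_{TN}(\theta_0, S_{T0}, u_0, \alpha_1)|,$$
hence
$$P[D_{TN}(\hat\theta_T, \theta_0, S_{T0}, \alpha_1) \le \varepsilon] \ge P[S_{T0} \in D_{T0}(\varepsilon_0) \text{ and } \|\hat\theta_T - \theta_0\| \le \delta]$$
$$\ge 1 - P[S_{T0} \notin D_{T0}(\varepsilon_0)] - P[\|\hat\theta_T - \theta_0\| > \delta]$$
$$= P[S_{T0} \in D_{T0}(\varepsilon_0)] - P[\|\hat\theta_T - \theta_0\| > \delta].$$
Since $\hat\theta_T \xrightarrow{p} \theta_0$, it follows that
$$\liminf_{T \to \infty} P[D_{TN}(\hat\theta_T, \theta_0, S_{T0}, \alpha_1) \le \varepsilon] \ge \liminf_{T \to \infty} P[S_{T0} \in D_{T0}(\varepsilon_0)] \ge 1 - \varepsilon_0$$
for any $\varepsilon_0 > 0$; hence $\lim_{T \to \infty} P[D_{TN}(\hat\theta_T, \theta_0, S_{T0}, \alpha_1) \le \varepsilon] = 1$. Since the latter identity holds for any $\varepsilon > 0$, (A.7) is established. (A.8) follows in a similar way upon
using (A.6).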
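Suppose now (6.5)–(6.7) hold instead of (6.4). Then $(S_{T0}, \hat\theta_T) \xrightarrow{p} (S_0, \theta_0)$ as $T \to \infty$ and
$$(S_{\bar T_k 0}, \hat\theta_{\bar T_k}) \xrightarrow[k \to \infty]{p} (S_0, \theta_0), \tag{A.9}$$
for any subsequence $\{(S_{\bar T_k 0}, \hat\theta_{\bar T_k}) : k = 1, 2, \ldots\}$ of $\{(S_{T0}, \hat\theta_T) : T \ge I_0\}$. Since $S_{T0}$ and $\hat\theta_T$,
$T \ge I_0$, are random variables (or vectors) defined on $\mathcal{Z}$, we can write $S_{T0} = S_{T0}(\omega)$,
$\hat\theta_T = \hat\theta_T(\omega)$ and $S_0 = S_0(\omega)$, $\omega \in \mathcal{Z}$. By (6.6), the event
$$A_0 = \{\omega : S_0(\omega) \in D_0 \text{ and } S_{T0}(\omega) \in D_0, \text{ for } T \ge I_0\}$$
has probability one. Furthermore, by (A.9), the subsequence $(S_{\bar T_k 0}, \hat\theta_{\bar T_k}')'$ contains a
further subsequence $(S_{T_k 0}, \hat\theta_{T_k}')'$, $k \ge 1$, such that $(S_{T_k 0}, \hat\theta_{T_k}')' \to (S_0, \theta_0')'$ a.s. as $k \to \infty$ (where
$T_1 < T_2 < \cdots$); see Bierens (1994, pp. 22–23). Consequently, the set
$$C_0 = \{\omega \in \mathcal{Z} : S_0(\omega) \in D_0,\ \lim_{k \to \infty} S_{T_k 0}(\omega) = S_0(\omega) \text{ and } \lim_{k \to \infty} \hat\theta_{T_k}(\omega) = \theta_0\}$$
has probability one. Now, let $\varepsilon > 0$. By (6.7), for any $x \in D_0$, we can find $\delta(x, \varepsilon) > 0$,
$T(x, \varepsilon) > 0$ and an open neighborhood $B(x, \varepsilon)$ of $x$ such that
$$\|\theta - \theta_0\| \le \delta(x, \varepsilon) \text{ and } T > T(x, \varepsilon)\ \Rightarrow\ |F_T[y \mid \theta] - F_T[y \mid \theta_0]| \le \varepsilon, \quad \forall y \in B(x, \varepsilon) \cap D_0.$$
Furthermore, for $\omega \in C_0$, we can find $k_0$ such that
$$k \ge k_0\ \Rightarrow\ S_{T_k 0}(\omega) \in B(S_0(\omega), \varepsilon) \cap D_0 \text{ and } \|\hat\theta_{T_k} - \theta_0\| \le \delta(S_0(\omega), \varepsilon),$$
so that $T_k > \max\{T(S_0(\omega), \varepsilon), T_{k_0}\}$ entails $|F_{T_k}[S_{T_k 0}(\omega) \mid \hat\theta_{T_k}(\omega)] - F_{T_k}[S_{T_k 0}(\omega) \mid \theta_0]| \le \varepsilon$. Thus $\lim_{k \to \infty} \{F_{T_k}[S_{T_k 0}(\omega) \mid \hat\theta_{T_k}(\omega)] - F_{T_k}[S_{T_k 0}(\omega) \mid \theta_0]\} = 0$ for $\omega \in C_0$,
hence, using Lemma A.1, $\lim_{k \to \infty} D_{T_k N}(\hat\theta_{T_k}(\omega), \theta_0, S_{T_k 0}(\omega), \alpha_1) = 0$ and
$D_{T_k N}(\hat\theta_{T_k}, \theta_0, S_{T_k 0}, \alpha_1) \to 0$ a.s. as $k \to \infty$. This shows that any subsequence of the sequence
$D_{TN}(\hat\theta_T, \theta_0, S_{T0}, \alpha_1)$, $T \ge I_0$, contains a further subsequence which converges a.s. to
zero. It follows that $D_{TN}(\hat\theta_T, \theta_0, S_{T0}, \alpha_1) \xrightarrow{p} 0$ as $T \to \infty$, and (A.7) is established. The proof of
(A.8) under the condition (6.5)–(6.7) is similar.
□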
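Proof of Proposition 6.1. Using the fact that $\hat\theta_T$, $S_{T0}$ and $U_0$ are independent of
$S_T(N, \theta)$, we can write
$$P[\tilde G_{TN}[S_{T0} \mid \hat\theta_T] \le \alpha_1] - P[\tilde G_{TN}[S_{T0} \mid \theta_0] \le \alpha_1] = E\big\{ P[\tilde G_{TN}[S_{T0} \mid \hat\theta_T] \le \alpha_1 \mid (\hat\theta_T, S_{T0}, U_0)] - P[\tilde G_{TN}[S_{T0} \mid \theta_0] \le \alpha_1 \mid (\hat\theta_T, S_{T0}, U_0)] \big\}$$
$$= E[Q_{TN}(\hat\theta_T, S_{T0}, U_0, \alpha_1) - Q_{TN}(\theta_0, S_{T0}, U_0, \alpha_1)].$$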
From Lemma A.2 and using the Lebesgue dominated convergence theorem, we
then get
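$$\big| P[\tilde G_{TN}[S_{T0} \mid \hat\theta_T] \le \alpha_1] - P[\tilde G_{TN}[S_{T0} \mid \theta_0] \le \alpha_1] \big|$$
$$= \big| E[Q_{TN}(\hat\theta_T, S_{T0}, U_0, \alpha_1) - Q_{TN}(\theta_0, S_{T0}, U_0, \alpha_1)] \big|$$
$$\le E\big\{ |Q_{TN}(\hat\theta_T, S_{T0}, U_0, \alpha_1) - Q_{TN}(\theta_0, S_{T0}, U_0, \alpha_1)| \big\}$$
$$\le E\Big[ \sup_{0 \le u_0 \le 1} |Q_{TN}(\hat\theta_T, S_{T0}, u_0, \alpha_1) - Q_{TN}(\theta_0, S_{T0}, u_0, \alpha_1)| \Big] \xrightarrow[T \to \infty]{} 0.$$
We can show in a similar way that
$$\big| P[\hat G_{TN}[S_{T0} \mid \hat\theta_T] \le \alpha_1] - P[\hat G_{TN}[S_{T0} \mid \theta_0] \le \alpha_1] \big| \xrightarrow[T \to \infty]{} 0,$$
from which we get (6.8). Eq. (6.9) then follows from the definitions of $\tilde p_{TN}(x \mid \theta)$ and
$\hat p_{TN}(x \mid \theta)$. □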
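The object studied in Proposition 6.1 is an MC p-value computed with the nuisance parameter replaced by a consistent estimate. A minimal sketch of such a computation is given below. Everything specific in it is an illustrative assumption, not the paper's example: the Gaussian location model, the unstudentized statistic, the restricted estimator of σ and all function names are hypothetical, and the p-value is assumed to take the form (N Ĝ + 1)/(N + 1).

```python
import math
import random
from fractions import Fraction

def statistic(sample):
    """Unstudentized statistic sqrt(T) * |mean|; its null law depends on sigma."""
    t = len(sample)
    return math.sqrt(t) * abs(sum(sample) / t)

def sigma_null(sample):
    """Estimate of the nuisance parameter sigma, imposing the null mean of zero."""
    return math.sqrt(sum(x * x for x in sample) / len(sample))

def local_mc_p_value(sample, n_rep=99, seed=0):
    """MC p-value with the statistic simulated under the null at the estimated sigma."""
    rng = random.Random(seed)
    s0 = statistic(sample)
    sigma_hat = sigma_null(sample)
    t = len(sample)
    sims = [
        statistic([rng.gauss(0.0, sigma_hat) for _ in range(t)])
        for _ in range(n_rep)
    ]
    exceed = sum(s >= s0 for s in sims)  # N times the empirical survival function at s0
    return Fraction(exceed + 1, n_rep + 1)

# Hypothetical data: 50 observations drawn under the null with sigma = 2.
data_rng = random.Random(123)
data = [data_rng.gauss(0.0, 2.0) for _ in range(50)]
print(local_mc_p_value(data, n_rep=99, seed=1))
```

Unlike the maximized version, this p-value is evaluated at a single parameter value, so its validity rests on the asymptotic argument of Proposition 6.1 rather than on a finite-sample bound.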
References
Andrews, D.W.K., 2000. Inconsistency of the bootstrap when a parameter is on the boundary of the
parameter space. Econometrica 68, 399–405.
Athreya, K.B., 1987. Bootstrap of the mean in the infinite variance case. The Annals of Statistics 15,
724–731.
Barnard, G.A., 1963. Comment on ‘The spectral analysis of point processes’ by M.S. Bartlett. Journal of
the Royal Statistical Society, Series B 25, 294.
Basawa, I.V., Mallik, A.K., McCormick, W.P., Reeves, J.H., Taylor, R.L., 1991. Bootstrapping unstable
first-order autoregressive processes. The Annals of Statistics 19, 1098–1101.
Benkwitz, A., Lütkepohl, H., Neumann, M.H., 2000. Problems related to confidence intervals for impulse
responses of autoregressive processes. Econometric Reviews 19, 69–103.
Beran, R., Ducharme, G.R., 1991. Asymptotic Theory for Bootstrap Methods in Statistics. Centre de
Recherches Mathématiques, Université de Montréal, Montréal, Canada.
Besag, J., Diggle, P.J., 1977. Simple Monte Carlo tests for spatial pattern. Applied Statistics 26,
327–333.
Bierens, H.J., 1994. Topics in Advanced Econometrics. Cambridge University Press, New York.
Birnbaum, Z.W., 1974. Computers and unconventional test-statistics. In: Proschan, F., Serfling, R.J.
(Eds.), Reliability and Biometry. SIAM, Philadelphia, PA, pp. 441–458.
Brown, L.D., Purves, R., 1973. Measurable selections of extrema. The Annals of Statistics 1,
902–912.
Campbell, B., Dufour, J.-M., 1997. Exact nonparametric tests of orthogonality and random walk in the
presence of a drift parameter. International Economic Review 38, 151–173.
Chernick, M.R., 1999. Bootstrap Methods: A Practitioner’s Guide. Wiley, New York.
Davison, A., Hinkley, D., 1997. Bootstrap Methods and Their Application. Cambridge University Press,
Cambridge, UK.
Debreu, G., 1967. Integration of correspondences. In: LeCam, L., Neyman, J. (Eds.), Proceedings of the
Fifth Berkeley Symposium on Mathematical Statistics and Probability, vol. II-1. The University of
California Press, Berkeley and Los Angeles, California, pp. 351–372.
Dufour, J.-M., 1989. Nonlinear hypotheses, inequality restrictions, and non-nested hypotheses: Exact
simultaneous tests in linear regressions. Econometrica 57, 335–355.
Dufour, J.-M., 1990. Exact tests and confidence sets in linear regressions with autocorrelated errors.
Econometrica 58, 475–494.
Dufour, J.-M., Jasiak, J., 2001. Finite sample limited information inference methods for structural
equations and models with generated regressors. International Economic Review 42, 815–843.
Dufour, J.-M., Jouini, T., 2005. Finite-sample simulation-based tests in VAR models with applications to
causality testing. Journal of Econometrics, forthcoming.
Dufour, J.-M., Khalaf, L., 2003a. Finite sample tests in seemingly unrelated regressions. In: Giles, D.E.A.
(Ed.), Computer-Aided Econometrics. Marcel Dekker, New York, pp. 11–35 (Chapter 2).
Dufour, J.-M., Khalaf, L., 2003b. Simulation-based finite-sample inference in simultaneous equations.
Technical report, Centre interuniversitaire de recherche en analyse des organisations (CIRANO) and
Centre interuniversitaire de recherche en économie quantitative (CIREQ), Université de Montréal.
Dufour, J.-M., Kiviet, J.F., 1996. Exact tests for structural change in first-order dynamic models. Journal
of Econometrics 70, 39–68.
Dufour, J.-M., Kiviet, J.F., 1998. Exact inference methods for first-order autoregressive distributed lag
models. Econometrica 66, 79–104.
Dufour, J.-M., Valéry, P., 2005. Exact and asymptotic tests for possibly non-regular hypotheses on
stochastic volatility models. Technical report, Centre interuniversitaire de recherche en analyse des
organisations (CIRANO) and Centre interuniversitaire de recherche en économie quantitative
(CIREQ), Université de Montréal.
Dufour, J.-M., Hallin, M., Mizera, I., 1998. Generalized runs tests for heteroskedastic time series. Journal
of Nonparametric Statistics 9, 39–86.
Dufour, J.-M., Farhat, A., Gardiol, L., Khalaf, L., 1998. Simulation-based finite sample normality tests in
linear regressions. The Econometrics Journal 1, 154–173.
Dwass, M., 1957. Modified randomization tests for nonparametric hypotheses. Annals of Mathematical
Statistics 28, 181–187.
Edgington, E.S., 1980. Randomization Tests. Marcel Dekker, New York.
Edgington, E.S., 1995. Randomization Tests, third ed. Marcel Dekker, New York.
Edwards, D., 1985. Exact simulation-based inference: A survey, with additions. Journal of Statistical
Computation and Simulation 22, 307–326.
Edwards, D., Berry, J.J., 1987. The efficiency of simulation-based multiple comparisons. Biometrics 43,
913–928.
Efron, B., 1982. The Jackknife, the Bootstrap and Other Resampling Plans. CBMS-NSF Regional Conference
Series in Applied Mathematics, Monograph No. 38. Society for Industrial and Applied Mathematics,
Philadelphia, PA.
Efron, B., Tibshirani, R.J., 1993. An Introduction to the Bootstrap. Monographs on Statistics and
Applied Probability, vol. 57. Chapman & Hall, New York.
Foutz, R.V., 1980. A method for constructing exact tests from test statistics that have unknown null
distributions. Journal of Statistical Computation and Simulation 10, 187–193.
Gallant, A.R., Tauchen, G., 1996. Which moments to match? Econometric Theory 12, 657–681.
Goffe, W.L., Ferrier, G.D., Rogers, J., 1994. Global optimization of statistical functions with simulated
annealing. Journal of Econometrics 60, 65–99.
Good, P., 1994. Permutation Tests. A Practical Guide to Resampling Methods for Testing Hypotheses,
second ed. Springer, New York.
Gouriéroux, C., Monfort, A., 1996. Simulation-Based Econometric Methods. Oxford University Press,
Oxford, UK.
Green, B.F., 1977. A practical interactive program for randomization tests of location. The American
Statistician 31, 37–39.
Hájek, J., 1969. A Course in Nonparametric Statistics. Holden-Day, San Francisco.
Hajivassiliou, V.A., 1993. Simulation estimation methods for limited dependent variables. In: Maddala,
et al. (Eds.), Handbook of Statistics 11: Econometrics. North-Holland, Amsterdam, pp. 519–543.
Hall, P., 1992. The Bootstrap and Edgeworth Expansion. Springer, New York.
Hall, P., Titterington, D.M., 1989. The effect of simulation order on level accuracy and power of Monte
Carlo tests. Journal of the Royal Statistical Society, Series B 51, 459–467.
Hope, A.C.A., 1968. A simplified Monte Carlo test procedure. Journal of the Royal Statistical Society,
Series B 30, 582–598.
Horowitz, J.L., 1997. Bootstrap methods in econometrics: Theory and numerical performance. In: Kreps,
D., Wallis, K.W. (Eds.), Advances in Economics and Econometrics, vol. 3. Cambridge University
Press, Cambridge, UK, pp. 188–222.
Inoue, A., Kilian, L., 2002. Bootstrapping autoregressive processes with possible unit roots. Econometrica
70 (1), 377–391.
Inoue, A., Kilian, L., 2003. The continuity of the limit distribution in the parameter of interest is not
essential for the validity of the bootstrap. Econometric Theory 19 (6), 944–961.
Jeong, J., Maddala, G.S., 1993. A perspective on application of bootstrap methods in econometrics. In:
Maddala, et al. (Eds.), Handbook of Statistics 11: Econometrics. North-Holland, Amsterdam,
pp. 573–610.
Jöckel, K.-H., 1986. Finite sample properties and asymptotic efficiency of Monte Carlo tests. The Annals
of Statistics 14, 336–347.
Jouneau-Sion, F., Torrés, O., 2005. MMC techniques for limited dependent variables models:
Implementation by the branch and bound algorithm. Journal of Econometrics, this issue,
doi:10.1016/j.jeconom.2005.06.006.
Keane, M.P., 1993. Simulation estimation for panel data models with limited dependent variables. In:
Maddala, et al. (Eds.), Handbook of Statistics 11: Econometrics. North-Holland, Amsterdam,
pp. 545–571.
Keller-McNulty, S., Higgins, J.J., 1987. Effect of tail weight and outliers and type-I error of robust
permutation tests for location. Communications in Statistics, Simulation and Computation 16 (1),
17–35.
Kiviet, J.F., Dufour, J.-M., 1997. Exact tests in single equation autoregressive distributed lag models.
Journal of Econometrics 80, 325–353.
Lehmann, E.L., 1986. Testing Statistical Hypotheses, second ed. Wiley, New York.
Lock, R.H., 1991. A sequential approximation to a permutation test. Communications in Statistics,
Simulation and Computation 20 (1), 341–363.
Manly, B.F.J., 1997. Randomization, Bootstrap and Monte Carlo Methods in Biology; Permutation Test,
second ed. Chapman & Hall, London.
Mariano, R.S., Brown, B.W., 1993. Stochastic simulation for inference in nonlinear errors-in-variables
models. In: Maddala, et al. (Eds.), Handbook of Statistics 11: Econometrics. North-Holland,
Amsterdam, pp. 611–627.
Marriott, F.H.C., 1979. Barnard’s Monte Carlo tests: How many simulations? Applied Statistics 28,
75–77.
McFadden, D., 1989. A method of simulated moments for estimation of discrete response models without
numerical integration. Econometrica 57, 995–1026.
Noreen, E.W., 1989. Computer Intensive Methods for Testing Hypotheses: An Introduction. Wiley,
New York.
Reiss, H.D., 1989. Approximate Distributions of Order Statistics with Applications to Nonparametric
Statistics. Springer Series in Statistics. Springer, New York.
Ripley, B.D., 1981. Spatial Statistics. Wiley, New York.
Shao, J., Tu, D., 1995. The Jackknife and Bootstrap. Springer, New York.
Shorack, G.R., Wellner, J.A., 1986. Empirical Processes with Applications to Statistics. Wiley, New York.
Sriram, T.N., 1994. Invalidity of bootstrap for critical branching processes with immigration. The Annals
of Statistics 22, 1013–1023.
Stinchcombe, M.B., White, H., 1992. Some measurability results for extrema of random functions over
random sets. Review of Economic Studies 59, 495–512.
Vadiveloo, J., 1983. On the theory of modified randomization tests for nonparametric hypotheses.
Communications in Statistics, Theory and Methods 12 (14), 1581–1596.
Vinod, H.D., 1993. Bootstrap methods: Applications in econometrics. In: Maddala, et al. (Eds.),
Handbook of Statistics 11: Econometrics. North-Holland, Amsterdam, pp. 629–661.