K-SAMPLE ANDERSON-DARLING TESTS OF FIT,
FOR CONTINUOUS AND DISCRETE CASES

by

F.W. SCHOLZ
M.A. STEPHENS

TECHNICAL REPORT No. 81
May 1986

Department of Statistics, GN-22
University of Washington
Seattle, Washington 98195 USA
K-Sample Anderson-Darling Tests of Fit, for
Continuous and Discrete Cases

F.W. Scholz (1) and M.A. Stephens (2)

(1) Boeing Computer Services, MS 9C-01, P.O. Box 24346,
Seattle, WA 98124-0346, USA
(2) Department of Mathematics and Statistics, Simon Fraser University,
Burnaby, B.C., V5A 1S6, Canada
ABSTRACT

Two k-sample versions of the Anderson-Darling (AD) test of fit are
proposed and their asymptotic null distributions are derived for the
continuous as well as the discrete case. In the continuous case the
asymptotic distributions coincide with the (k-1)-fold convolution of the
1-sample AD asymptotic distribution. Monte Carlo simulation is used to
investigate the small-sample behaviour of the null distributions of the
two versions under various degrees of data rounding and sample size
imbalance. Tables for carrying out these tests are provided, and their
use in combining independent 1- or k-sample AD tests is pointed out.

Some key words: Combining tests, Convolution, Empirical processes,
Midranks, Pearson curves, Simulation.
1. Introduction and Summary
Anderson and Darling (1952, 1954) introduce the goodness-of-fit statistic

$$A_m^2 = m \int_{-\infty}^{\infty} \frac{\{F_m(x) - F_0(x)\}^2}{F_0(x)\{1 - F_0(x)\}}\, dF_0(x)$$

to test the hypothesis that a random sample $X_1, \ldots, X_m$, with
empirical distribution function $F_m(x)$, comes from a continuous
population with distribution function $F(x) = F_0(x)$ for some completely
specified distribution function $F_0(x)$. Here $F_m(x)$ is defined as the
proportion of the sample $X_1, \ldots, X_m$ which is not greater than $x$.
The corresponding two-sample version

$$A_{mn}^2 = \frac{mn}{N} \int_{-\infty}^{\infty} \frac{\{F_m(x) - G_n(x)\}^2}{H_N(x)\{1 - H_N(x)\}}\, dH_N(x) \tag{1}$$

was proposed by Darling (1957) and studied in detail by Pettitt (1976).
Here $G_n(x)$ is the empirical distribution function of the second
(independent) sample $Y_1, \ldots, Y_n$, obtained from a continuous
population with distribution function $G(x)$, and
$H_N(x) = \{m F_m(x) + n G_n(x)\}/N$, with $N = m + n$, is the empirical
distribution function of the pooled sample. The above integrand is
appropriately defined to be zero whenever $H_N(x) = 1$.
In this two-sample case the hypothesis to be tested is that the two
samples come from a common continuous distribution function.
In Sections 2 and 3, two k-sample versions of the Anderson-Darling test
are proposed for the continuous as well as the discrete case, and
computational formulae are given. Section 4 discusses the finite sample
distribution of these statistics and gives a variance formula for one of
the two statistics. Section 5 derives the asymptotic null distribution of
both versions, which in the continuous case is the (k-1)-fold convolution
of the 1-sample Anderson-Darling asymptotic distribution. Section 6
describes the proposed test procedures and gives a table for carrying out
the tests. Section 7 reports the results of a Monte Carlo simulation
testing the adequacy of the table for the continuous and discrete case.
Section 8 presents two examples, and Section 9 points out another use of
the above table for combining independent 1- and k-sample
Anderson-Darling tests of fit.
2. The K-Sample Anderson-Darling Test
On the surface it is not immediately obvious how to extend the two-sample
test to the k-sample situation. There are several reasonable
possibilities, but not all are mathematically tractable as far as
asymptotic theory is concerned. Kiefer's (1959) treatment of the k-sample
analogue of the Cramer-v. Mises test shows the appropriate path. To set
the stage the following notation is introduced. Let $X_{ij}$ be the
$j$-th observation in the $i$-th sample, $j = 1, \ldots, n_i$,
$i = 1, \ldots, k$. All observations are independent. Suppose the $i$-th
sample has distribution function $F_i$. We wish to test the hypothesis

$$H_0 : F_1 = \cdots = F_k$$

without specifying the common distribution $F$. Since rounding of data
occurs routinely in practice we will not necessarily assume that the
$F_i$, and hence the common $F$ under $H_0$, are continuous. Denote the
empirical distribution function of the $i$-th sample by $F_{n_i}(x)$ and
that of the pooled sample of all $N = n_1 + \cdots + n_k$ observations by
$H_N(x)$. The k-sample Anderson-Darling test statistic is then defined as

$$A_{kN}^2 = \sum_{i=1}^{k} n_i \int_{B_N} \frac{\{F_{n_i}(x) - H_N(x)\}^2}{H_N(x)\{1 - H_N(x)\}}\, dH_N(x), \tag{2}$$

where $B_N = \{x \in \mathbb{R} : H_N(x) < 1\}$. For $k = 2$, (2) reduces
to (1). In the case of untied observations, i.e. when the pooled ordered
sample is $Z_1 < \cdots < Z_N$, a computational formula for $A_{kN}^2$ is

$$A_{kN}^2 = \frac{1}{N} \sum_{i=1}^{k} \frac{1}{n_i} \sum_{j=1}^{N-1} \frac{(N M_{ij} - j n_i)^2}{j(N - j)}, \tag{3}$$

where $M_{ij}$ is the number of observations in the $i$-th sample which
are not greater than $Z_j$.
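As an illustration, (3) translates directly into code. The following is a
minimal Python sketch (the function name and all identifiers are ours,
not from the paper), assuming untied observations:

```python
import numpy as np

def ad_ksample(samples):
    """k-sample Anderson-Darling statistic, computational formula (3);
    samples is a list of 1-D arrays whose pooled values contain no ties."""
    n = [len(s) for s in samples]
    pooled = np.sort(np.concatenate(samples))
    N = len(pooled)
    j = np.arange(1, N)              # j = 1, ..., N-1
    z = pooled[:-1]                  # Z_1, ..., Z_{N-1}
    total = 0.0
    for s, ni in zip(samples, n):
        # M_ij = number of observations in sample i not greater than Z_j
        M = np.sort(np.asarray(s, dtype=float)).searchsorted(z, side='right')
        total += ((N * M - j * ni) ** 2 / (j * (N - j))).sum() / ni
    return total / N
```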
3. Discrete Parent Population
If continuous data are grouped, or if the parent populations are
discrete, tied observations can occur. To give the computational formula
in the case of tied observations we introduce the following notation. Let
$Z_1^* < \cdots < Z_L^*$ denote the $L$ ($> 1$) distinct ordered
observations in the pooled sample. Further let $f_{ij}$ be the number of
observations in the $i$-th sample coinciding with $Z_j^*$, and let
$l_j = \sum_{i=1}^{k} f_{ij}$ denote the multiplicity of $Z_j^*$. Using
(2) as the definition of $A_{kN}^2$, the computing formula in the case of
ties becomes

$$A_{kN}^2 = \sum_{i=1}^{k} \frac{1}{n_i} \sum_{j=1}^{L-1} \frac{l_j}{N}\, \frac{(N M_{ij} - n_i B_j)^2}{B_j(N - B_j)}, \tag{4}$$

where $M_{ij} = f_{i1} + \cdots + f_{ij}$ and $B_j = l_1 + \cdots + l_j$.
An alternative way of dealing with ties is to change the definition of
the empirical distribution function to the average of the left and right
limits of the ordinary empirical distribution function, i.e.

$$F_{an_i}(x) := \frac{1}{2}\{F_{n_i}(x) + F_{n_i}(x-)\},$$

and similarly $H_{aN}(x)$. Using these modified distribution functions we
modify (2) slightly to

$$A_{akN}^2 = \frac{N-1}{N} \sum_{i=1}^{k} n_i \int_{-\infty}^{\infty} \frac{\{F_{an_i}(x) - H_{aN}(x)\}^2}{H_{aN}(x)\{1 - H_{aN}(x)\} - \{H_N(x) - H_N(x-)\}/4}\, dH_N(x)$$

for (nondegenerate) samples whose observations do not all coincide;
otherwise let $A_{akN}^2 = 0$. The denominator of the integrand of
$A_{akN}^2$ is chosen to simplify the mean of $A_{akN}^2$. For
nondegenerate samples the computational formula for $A_{akN}^2$ becomes

$$A_{akN}^2 = \frac{N-1}{N} \sum_{i=1}^{k} \frac{1}{n_i} \sum_{j=1}^{L} \frac{l_j}{N}\, \frac{(N M_{aij} - n_i B_{aj})^2}{B_{aj}(N - B_{aj}) - N l_j/4}, \tag{5}$$

where $M_{aij} = f_{i1} + \cdots + f_{i,j-1} + f_{ij}/2$ and
$B_{aj} = l_1 + \cdots + l_{j-1} + l_j/2$. The formula (5) is valid
provided not all the $X_{ij}$ are the same. This latter way of dealing
with ties corresponds to the treatment of ties through midranks in the
case of the Wilcoxon two-sample and the Kruskal-Wallis k-sample tests,
see Lehmann (1975).
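The tie formulas (4) and (5) can be computed along the following lines.
This is a minimal Python sketch (identifiers are ours; degenerate
samples, where all observations coincide, are assumed away):

```python
import numpy as np

def ad_ksample_ties(samples, midrank=True):
    """A^2_akN of (5) (midrank=True) or A^2_kN of (4) (midrank=False);
    samples is a list of 1-D arrays, ties allowed."""
    samples = [np.sort(np.asarray(s, dtype=float)) for s in samples]
    n = [len(s) for s in samples]
    pooled = np.sort(np.concatenate(samples))
    N = len(pooled)
    zstar = np.unique(pooled)                           # Z*_1 < ... < Z*_L
    left = pooled.searchsorted(zstar, side='left')      # l_1+...+l_{j-1}
    l = pooled.searchsorted(zstar, side='right') - left # multiplicities l_j
    total = 0.0
    for s, ni in zip(samples, n):
        right = s.searchsorted(zstar, side='right')
        fij = right - s.searchsorted(zstar, side='left')
        if midrank:
            M = right - fij / 2.0                       # M_aij
            B = left + l / 2.0                          # B_aj
            inner = l / N * (N * M - ni * B) ** 2 / (B * (N - B) - N * l / 4.0)
        else:
            M = right[:-1]                              # M_ij, j = 1..L-1
            B = l.cumsum()[:-1].astype(float)           # B_j
            inner = l[:-1] / N * (N * M - ni * B) ** 2 / (B * (N - B))
        total += inner.sum() / ni
    return total * (N - 1) / N if midrank else total
```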
4. The Finite Sample Distribution under $H_0$

Under $H_0$ the expected value of $A_{kN}^2$ is

$$E(A_{kN}^2) = (k-1)\, \frac{N}{N-1} \left[ 1 - \int_0^1 \{\psi(u)\}^{N-1}\, du \right],$$

where $\psi(u) := F(F^{-1}(u))$, with $\psi(u) \ge u$ and
$\psi(u) \equiv u$ if and only if $F$ is continuous; see Section 5 for
some details. In the continuous case the expected value becomes $k - 1$,
since $\int_0^1 u^{N-1}\, du = 1/N$. In general, as $N \to \infty$, the
expected value converges to $(k-1)\Pr(\psi(U) < 1)$, where
$U \sim U(0,1)$ is uniform.

The expected value of $A_{akN}^2$ under $H_0$ is

$$E(A_{akN}^2) = (k-1)\{1 - \Pr(X_{11} = \cdots = X_{kn_k})\},$$

which is $k - 1$ for continuous $F$ and otherwise becomes $k - 1$ in the
nondegenerate case as $N \to \infty$.

Higher moments of $A_{kN}^2$ and $A_{akN}^2$ are very difficult to
compute. Pettitt (1976) gives an approximate variance formula for
$A_{mn}^2$ as $\mathrm{var}(A_{mn}^2) \approx \sigma^2(1 - 3.1/N)$, where
$\sigma^2 = 2(\pi^2 - 9)/3$ is the variance of the 1-sample limiting
distribution.
For the continuous case an exact variance formula for $A_{kN}^2$ was
derived (with the partial aid of MACSYMA; see the Acknowledgements):

$$\sigma_N^2 := \mathrm{var}(A_{kN}^2) = \frac{a N^3 + b N^2 + c N + d}{(N-1)(N-2)(N-3)}, \tag{6}$$
with

$$a = (4g - 6)k + (10 - 6g)H - 4g + 6$$
$$b = (2g - 4)k^2 + 8hk + (2g - 14h - 4)H - 8h + 4g - 6$$
$$c = (6h + 2g - 2)k^2 + (4h - 4g + 6)k + (2h - 6)H + 4h$$
$$d = (2h + 6)k^2 - 4hk,$$

where

$$H = \sum_{i=1}^{k} \frac{1}{n_i}, \qquad h = \sum_{i=1}^{N-1} \frac{1}{i}, \qquad g = \sum_{i=1}^{N-2} \sum_{j=i+1}^{N-1} \frac{1}{(N-i)\,j}.$$

Note that

$$g \to \int_0^1 \int_0^y \frac{1}{(1-x)\,y}\, dx\, dy = \frac{\pi^2}{6}$$

as $N \to \infty$, and thus $\mathrm{var}(A_{kN}^2) \to (k-1)\sigma^2$ as
$\min(n_1, \ldots, n_k) \to \infty$. The effect of the individual sample
sizes is reflected through $H$ and is not negligible to order $1/N$. A
variance formula for $A_{akN}^2$ was not derived. However, simulations
showed that the variances of $A_{kN}^2$ and $A_{akN}^2$ are very close to
each other in the continuous case; here closeness was judged by the
discrepancy between the simulated variances and the value obtained from
(6).
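For reference, formula (6) can be evaluated as follows (a small Python
sketch in the same notation; the O(N^2) loop for g is kept for
transparency rather than speed):

```python
import numpy as np

def sigmaN_squared(sizes):
    """Variance sigma_N^2 of A^2_kN under H_0 in the continuous case,
    formula (6); sizes = (n_1, ..., n_k)."""
    n = np.asarray(sizes, dtype=float)
    k, N = len(n), int(n.sum())
    H = (1.0 / n).sum()
    h = (1.0 / np.arange(1, N)).sum()
    g = sum(1.0 / ((N - i) * j)
            for i in range(1, N - 1) for j in range(i + 1, N))
    a = (4 * g - 6) * k + (10 - 6 * g) * H - 4 * g + 6
    b = (2 * g - 4) * k**2 + 8 * h * k + (2 * g - 14 * h - 4) * H - 8 * h + 4 * g - 6
    c = (6 * h + 2 * g - 2) * k**2 + (4 * h - 4 * g + 6) * k + (2 * h - 6) * H + 4 * h
    d = (2 * h + 6) * k**2 - 4 * h * k
    return (a * N**3 + b * N**2 + c * N + d) / ((N - 1) * (N - 2) * (N - 3))

# Cross-check: sigmaN_squared([8, 8, 8, 8]) ** 0.5 should be close to the
# value sigma_N = 1.2038 quoted in the first example of Section 8.
```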
In principle it is possible to derive the conditional null distribution
(under $H_0$) of (3), (4) or (5) given the pooled (ordered) vector
$Z = (Z_1, \ldots, Z_N)$ of observations $Z_1 \le \cdots \le Z_N$, by
recording the distribution of (3), (4) or (5) as one traverses all
possible ways of splitting $Z$ into $k$ samples of sizes
$n_1, \ldots, n_k$ respectively. Thus the test is truly nonparametric in
this sense. For small sample sizes it may be feasible to do so, but this
becomes rapidly impractical as $k$ gets larger. Not only will the null
distribution be required for all possible combinations
$(n_1, \ldots, n_k)$ but also for all combinations of ties.

A more pragmatic approach is to record the relative frequency $p$ with
which the observed value $a^2$ of (3), (4) or (5) is matched or exceeded
when computing (3), (4) or (5) for a large number $Q$ of random
partitions of $Z$ into $k$ samples of sizes $n_1, \ldots, n_k$
respectively. This was done, for example, to get the distribution of the
two-sample Watson statistic $U^2$ in Watson (1962). This bootstrap-like
method is applicable equally well in small and large samples. $p$ is an
unbiased estimator of the true conditional as well as unconditional
P-value of $a^2$, and the variance of $p$ can be controlled by the choice
of $Q$. In the next section the unconditional asymptotic distribution of
(3), (4) and (5) will be derived.
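A sketch of this resampling scheme (our own illustration, using
ad_ksample_ties from Section 3 as the statistic):

```python
import numpy as np

def permutation_pvalue(samples, statistic, Q=1000, seed=None):
    """Relative frequency p with which the observed statistic is matched
    or exceeded over Q random partitions of the pooled sample into
    samples of the original sizes."""
    rng = np.random.default_rng(seed)
    observed = statistic(samples)
    cuts = np.cumsum([len(s) for s in samples])[:-1]
    pooled = np.concatenate(samples)
    hits = sum(statistic(np.split(rng.permutation(pooled), cuts)) >= observed
               for _ in range(Q))
    return hits / Q
```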
5. Asymptotic Distribution of $A_{kN}^2$ under $H_0$
Since the asymptotic distribution of (4) reduces to that of (3) in the
case that the common distribution function $F$ is continuous, only the
case of (4) will be treated in detail; the result in the case of (5) will
only be stated, the proof following similar lines. In deriving the
asymptotic distribution of (4) we combine the techniques of Kiefer (1959)
and Pettitt (1976), with a slight shortening in the argument of the
latter, and track the effect of a discontinuous $F$.
Using the special construction of Pyke and Shorack (1968), see also
Shorack and Wellner (1986), we can assume that on a common probability
space $\Omega$ there exist, for each $N$ and corresponding
$n_1, \ldots, n_k$, independent uniform samples $U_{isN} \sim U(0,1)$,
$s = 1, \ldots, n_i$, $i = 1, \ldots, k$, and independent Brownian
bridges $U_1, \ldots, U_k$ such that

$$\|U_{iN} - U_i\| := \sup_{t \in [0,1]} |U_{iN}(t) - U_i(t)| \to 0$$

for every $\omega \in \Omega$ as $n_i \to \infty$. Here

$$U_{iN}(t) := n_i^{1/2} \left\{ \frac{1}{n_i} \sum_{s=1}^{n_i} I_{[U_{isN} \le t]} - t \right\}$$

is the empirical process corresponding to the $i$-th uniform sample.
Let $X_{isN} := F^{-1}(U_{isN})$ and

$$U_{iN}\{F(x)\} = n_i^{1/2}\{F_{iN}(x) - F(x)\} \quad \text{with} \quad F_{iN}(x) = \frac{1}{n_i} \sum_{s=1}^{n_i} I_{[X_{isN} \le x]},$$

so that $F_{iN}(x)$ is equal in distribution to $F_{n_i}(x)$. The
empirical distribution function of the pooled sample of the $X_{isN}$ is
also denoted by $H_N(x)$ and that of the pooled uniform sample of the
$U_{isN}$ is denoted by $K_N(t)$, so that $H_N(x) = K_N(F(x))$. This
double use of $H_N$ as empirical distribution of the $X_{isN}$ and of the
$X_{is}$ should cause no confusion as long as only distributional
conclusions concerning (4) are drawn.
Following Kiefer (1959), let $C = (c_{ij})$ denote a $k \times k$
orthonormal matrix with $c_{1j} = (n_j/N)^{1/2}$, $j = 1, \ldots, k$. If
$U = (U_1, \ldots, U_k)^t$, then the components of
$V = (V_1, \ldots, V_k)^t = CU$ are again independent Brownian bridges.
Further, if $U_N = (U_{1N}, \ldots, U_{kN})^t$ and
$V_N = (V_{1N}, \ldots, V_{kN})^t = C U_N$, then
$\|V_{iN} - V_i\| \to 0$ for all $\omega \in \Omega$, $i = 1, \ldots, k$,
and

$$\sum_{i=1}^{k} n_i \{F_{iN}(x) - H_N(x)\}^2 = \sum_{i=1}^{k} \left[ U_{iN}\{F(x)\} - (n_i/N)^{1/2}\, V_{1N}\{F(x)\} \right]^2 = \sum_{i=2}^{k} V_{iN}^2\{F(x)\}$$

for all $x \in \mathbb{R}$.
This suggests that $A_{kN}^2$, which is equal in distribution to

$$\int_{B_N} \frac{\sum_{i=2}^{k} V_{iN}^2\{F(x)\}}{H_N(x)\{1 - H_N(x)\}}\, dH_N(x) = \int_{A_N} \frac{\sum_{i=2}^{k} V_{iN}^2\{\psi(u)\}}{K_N\{\psi(u)\}\left[1 - K_N\{\psi(u)\}\right]}\, dK_N(u),$$

where $A_N := \{u \in (0,1) : K_N\{\psi(u)\} < 1\}$, converges in
distribution to

$$A_{k-1}^2 := \int_0^1 \frac{\sum_{i=2}^{k} V_i^2\{\psi(u)\}}{\psi(u)\{1 - \psi(u)\}}\, du.$$
To make this rigorous we follow Pettitt and claim that for each
$\delta \in (0, 1/2)$ and
$S(\delta) = \{u \in [0,1] : \delta \le u,\ \psi(u) \le 1 - \delta\}$ we
have (see Billingsley (1968), Theorem 5.2)

$$\int_{A_N \cap S(\delta)} \frac{\sum_{i=2}^{k} V_{iN}^2\{\psi(u)\}}{K_N\{\psi(u)\}\left[1 - K_N\{\psi(u)\}\right]}\, dK_N(u) - \int_{S(\delta)} \frac{\sum_{i=2}^{k} V_i^2\{\psi(u)\}}{\psi(u)\{1 - \psi(u)\}}\, dK_N(u) + \int_{S(\delta)} \frac{\sum_{i=2}^{k} V_i^2\{\psi(u)\}}{\psi(u)\{1 - \psi(u)\}}\, d\{K_N(u) - u\} \to 0$$

for all $\omega \in \Omega$ as $N \to \infty$.
If $W = (W_1, \ldots, W_N)$ with $W_1 < \cdots < W_N$ denote the order
statistics of the pooled sample of the $U_{isN}$, then the conditional
expectation of $\sum_{i=2}^{k} V_{iN}^2\{\psi(W_j)\}$ given $W$ is
$K_N\{\psi(W_j)\}\left[1 - K_N\{\psi(W_j)\}\right](k-1)N/(N-1)$. Thus the
unconditional expectation of

$$D_{1N} = \int I_{[0,\delta)}(u)\, I_{A_N}(u)\, \frac{\sum_{i=2}^{k} V_{iN}^2\{\psi(u)\}}{K_N\{\psi(u)\}\left[1 - K_N\{\psi(u)\}\right]}\, dK_N(u)$$

is at most $\{(k-1)N/(N-1)\}\,\delta$.
Similarly, the unconditional expectation of

$$D_{2N} = \int I_{(1-\delta,\,1]}\{\psi(u)\}\, I_{A_N}(u)\, \frac{\sum_{i=2}^{k} V_{iN}^2\{\psi(u)\}}{K_N\{\psi(u)\}\left[1 - K_N\{\psi(u)\}\right]}\, dK_N(u)$$

is

$$E(D_{2N}) = \frac{(k-1)N}{N-1}\, \Pr\left[\psi(U_{11N}) > 1 - \delta,\ K_N\{\psi(U_{11N})\} < 1\right].$$

If $\psi(t) = 1$ for some $t < 1$ then $E(D_{2N}) = 0$ for $\delta$
sufficiently small, and if $\psi(t) < 1$ for all $t < 1$ then

$$E(D_{2N}) \le \frac{(k-1)N}{N-1}\, \{1 - \psi^{-1}(1 - \delta)\}.$$

In either case $E(D_{1N} + D_{2N}) \to 0$ as $\delta \to 0$, uniformly in
$N$. Thus by Markov's inequality
$\Pr(D_{1N} + D_{2N} \ge \varepsilon) \to 0$ as $N \to \infty$ and then
$\delta \to 0$. Theorem 4.2 of Billingsley (1968) and the fact that for
all $\omega \in \Omega$

$$A_{k-1}^2(\delta) = \int_{S(\delta)} \frac{\sum_{i=2}^{k} V_i^2\{\psi(u)\}}{\psi(u)\{1 - \psi(u)\}}\, du \to A_{k-1}^2$$

as $\delta \to 0$ (monotone convergence theorem) prove the claim that
$A_{kN}^2$ converges in distribution to $A_{k-1}^2$. The integral
defining $A_{k-1}^2$ exists and is finite for almost all
$\omega \in \Omega$ by Fubini's theorem, upon taking the expectation of
$A_{k-1}^2$.
Similarly one can show that under $H_0$ the modified version $A_{akN}^2$
converges in distribution to

$$A_{a(k-1)}^2 := \int_0^1 \frac{\sum_{i=2}^{k} \left[ V_i\{\psi(u)\} + V_i\{\psi_-(u)\} \right]^2}{4\,\bar\psi(u)\{1 - \bar\psi(u)\} - \{\psi(u) - \psi_-(u)\}}\, du,$$

where $\psi_-(u) = F(F^{-1}(u)-)$ and
$\bar\psi(u) = \{\psi(u) + \psi_-(u)\}/2$, and
$V_2, \ldots, V_k$ are the same independent Brownian bridges as before.
Note that $A_{k-1}^2$ and $A_{a(k-1)}^2$ coincide when $F$ is continuous.
Thus the limiting distribution in the continuous case can be considered
an approximation to the limiting distributions of $A_{kN}^2$ and
$A_{akN}^2$ under rounding of data, provided the rounding is not too
severe. Analytically it appears difficult to compare the closeness of
these two limiting distributions to the one for the continuous case. A
heuristic argument, namely that $\bar\psi$ tracks the diagonal better
than $\psi$, appears to point to $A_{akN}^2$ as the better approximand.
Only a simulation can bear this out.
6. Table of Critical Points
Since for the 1- and 2-sample Anderson-Darling tests of fit the use of
asymptotic percentiles works very well even in small samples, see
Stephens (1974) and Pettitt (1976), the use of the asymptotic percentiles
is suggested here as well. To obtain somewhat better accuracy in the
approximation we follow Pettitt (1976) and reject $H_0$ at significance
level $\alpha$ whenever

$$\frac{A_{kN}^2 - (k-1)}{\sigma_N} \ge z_{k-1}(1 - \alpha),$$

where $z_{k-1}(1-\alpha)$ is the $(1-\alpha)$-percentile of the
distribution of the standardized asymptotic statistic
$Z_{k-1} = \{A_{k-1}^2 - (k-1)\}/\sigma$, and $\sigma_N$ is given by (6).
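In code, the left side of the rejection rule is simply (a sketch, using
sigmaN_squared from Section 4):

```python
import math

def standardized_statistic(a2, sizes):
    """(A^2_kN - (k-1)) / sigma_N, to be compared with z_{k-1}(1-alpha)."""
    k = len(sizes)
    return (a2 - (k - 1)) / math.sqrt(sigmaN_squared(sizes))

# E.g. for the first example of Section 8: (8.3559 - 3)/1.2038 = 4.449.
```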
If $Y_1, Y_2, \ldots$ are independent chi-square random variables with
$k-1$ degrees of freedom, then $A_{k-1}^2$ has the same distribution as

$$\sum_{i=1}^{\infty} \frac{Y_i}{i(i+1)}.$$

Its cumulants and first four moments are easily calculated, and
approximate percentiles of this distribution were obtained by fitting
Pearson curves as in Stephens (1976) and Solomon and Stephens (1978).
This approximation works very well in the case $k - 1 = 1$ and can be
expected to improve as $k$ increases. A limited number of standardized
percentiles $z_m(1-\alpha)$ of $Z_m$ are given in Table 1. The test using
$A_{akN}^2$ is carried out in the same way, by just replacing $A_{kN}^2$
with $A_{akN}^2$ above.
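The series representation also gives a quick Monte Carlo check of
Table 1. The sketch below is our own; the truncation point and its mean
correction are implementation choices:

```python
import numpy as np

def simulate_Zm(m, draws=20000, terms=200, seed=0):
    """Draws of Z_m = (A^2_m - m)/sigma_m from the truncated series
    A^2_m = sum_i Y_i/{i(i+1)}, Y_i iid chi-square with m df."""
    rng = np.random.default_rng(seed)
    i = np.arange(1, terms + 1)
    w = 1.0 / (i * (i + 1))
    a2 = rng.chisquare(m, size=(draws, terms)) @ w
    a2 += m / (terms + 1.0)  # mean of dropped tail: sum_{i>T} w_i = 1/(T+1)
    sigma = np.sqrt(m * 2 * (np.pi ** 2 - 9) / 3)   # sd of A^2_m
    return (a2 - m) / sigma

# np.quantile(simulate_Zm(3), 0.95) should come out near the Table 1
# entry z_3(.95) = 1.915, up to Monte Carlo error.
```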
Table 1
Percentiles z_m(γ) of the Z_m-Distribution

   m     γ = .75    .90      .95      .975     .99
   1      .326     1.225    1.960    2.719    3.752
   2      .449     1.309    1.945    2.576    3.414
   3      .498     1.324    1.915    2.493    3.246
   4      .525     1.329    1.894    2.438    3.139
   6      .557     1.332    1.859    2.365    3.005
   8      .576     1.330    1.839    2.318    2.920
  10      .590     1.329    1.823    2.284    2.862
   ∞      .674     1.282    1.645    1.960    2.326
For values of m not covered by Table 1 the following interpolation
formula should give satisfactory percentiles. It reproduces the entries
in Table 1 to within half a percent relative error. The general form of
the interpolation formula is

$$z_m(\gamma) = b_0 + \frac{b_1}{\sqrt{m}} + \frac{b_2}{m},$$

where the coefficients for each γ may be found in Table 2.
Table 2
Interpolation Coefficients

   γ       b_0      b_1      b_2
  .75      .675    -.245    -.105
  .90     1.281     .250    -.305
  .95     1.645     .678    -.362
  .975    1.960    1.149    -.391
The table values may also be interpolated with respect to γ in order to
establish an approximate P-value for the observed Anderson-Darling
statistic; see Section 8 for examples.
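In code (a sketch; the dictionary just transcribes Table 2):

```python
import math

# Coefficients (b0, b1, b2) from Table 2, indexed by gamma.
TABLE2 = {0.75:  (0.675, -0.245, -0.105),
          0.90:  (1.281,  0.250, -0.305),
          0.95:  (1.645,  0.678, -0.362),
          0.975: (1.960,  1.149, -0.391)}

def zm(gamma, m):
    """Interpolated percentile z_m(gamma) = b0 + b1/sqrt(m) + b2/m."""
    b0, b1, b2 = TABLE2[gamma]
    return b0 + b1 / math.sqrt(m) + b2 / m

# E.g. zm(0.95, 2) = 1.943, close to the Table 1 entry 1.945.
```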
7. Monte Carlo Simulation
To see how well the percentiles given in Table 1 perform in small
samples a number of Monte Carlo simulations were performed. Samples
were generated from a Weibull distribution, with scale parameter a = 1 and
shape b = 3.6, to approximate a normal distribution reasonably well. The
underlying uniform random numbers were generated using Schrage's (1979)
portable random number generator. The results of the simulations are
summarized in Tables 3-17 of the Appendix. For each of these tables 5000
pooled samples were generated. Each pooled sample was then broken down
into the indicated number of subsamples with the given sample sizes. The
observed false alarm rates are recorded in columns 2 and 3 for the two
versions of the statistic. Next, for each pooled sample created above the
scale was changed to a = 150, a = 100 and a = 30 and the sample values
were rounded to the nearest integer. The observed false alarm rates are given
respectively in columns (4,5), (6,7) and (8,9). On top of these columns the
degree of rounding is expressed in terms of the average proportion of
distinct observations in the pooled sample.
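One cell of this study can be reproduced along the following lines (our
own reconstruction; it relies on the sketches ad_ksample_ties,
standardized_statistic and zm given earlier):

```python
import numpy as np

def rejection_rate(sizes, gamma=0.95, scale=150.0, shape=3.6,
                   reps=1000, seed=0):
    """Observed level of the A^2_akN test at nominal level 1-gamma for
    Weibull(shape) data multiplied by `scale` and rounded to integers."""
    rng = np.random.default_rng(seed)
    crit = zm(gamma, len(sizes) - 1)
    hits = 0
    for _ in range(reps):
        samples = [np.round(scale * rng.weibull(shape, ni)) for ni in sizes]
        a2 = ad_ksample_ties(samples, midrank=True)
        if standardized_statistic(a2, sizes) >= crit:
            hits += 1
    return hits / reps
```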
It appears that the proposed tests maintain their levels quite well even
for samples as small as $n_i = 5$. Another simulation implementing the
tests without the finite sample variance adjustment did not perform quite
as well, although the results were good once the individual sample sizes
reached 30.

It is not clear whether $A_{akN}^2$ has any clear advantage over
$A_{kN}^2$ as far as data rounding is concerned. At level .01 $A_{akN}^2$
seems to perform better than $A_{kN}^2$, although that is somewhat offset
at level .25. Although the power behavior of these tests has not been
studied here, one can expect that the good power behavior of the 2-sample
test $A_{mn}^2$, as demonstrated by Pettitt (1976), carries over to the
k-sample case as well.
8. Two Examples
As a first example consider the paper smoothness data used by Lehmann
(1975, p. 209, Example 3, and reproduced in Table 18 of the Appendix) as
an illustration of the Kruskal-Wallis test adjusted for ties. According
to this test the four sets of eight laboratory measurements show
significant differences, with P-value < .005.

Applying the two versions of the Anderson-Darling k-sample test to this
set of data yields $A_{kN}^2 = 8.3559$ and $A_{akN}^2 = 8.3926$. Together
with $\sigma_N = 1.2038$ this yields standardized Z-scores of 4.449 and
4.480 respectively, which are outside the range of Table 1. A plot of the
log-odds of γ versus $z_3(\gamma)$ shows a strong linear pattern,
indicating that simple linear extrapolation should give good approximate
P-values. They are .0023 and .0022 respectively, somewhat smaller than
that of the Kruskal-Wallis test.
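For checking purposes, the first example can be run through the sketches
given earlier (the data are those of Table 18; the quoted targets are the
paper's own numbers):

```python
smoothness = [
    [38.7, 41.5, 43.8, 44.5, 45.5, 46.0, 47.7, 58.0],  # laboratory A
    [39.2, 39.3, 39.7, 41.4, 41.8, 42.9, 43.3, 45.8],  # laboratory B
    [34.0, 35.0, 39.0, 40.0, 43.0, 43.0, 44.0, 45.0],  # laboratory C
    [34.0, 34.8, 34.8, 35.4, 37.2, 37.8, 41.2, 42.8],  # laboratory D
]
a2 = ad_ksample_ties(smoothness, midrank=False)  # should be near 8.3559
aa2 = ad_ksample_ties(smoothness, midrank=True)  # should be near 8.3926
z = standardized_statistic(a2, [8, 8, 8, 8])     # should be near 4.449
```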
As a second example consider the air conditioning failure data analyzed
by Proschan (1963), who showed that the data sets are significantly
different (P-value of approximately .007) under the assumption that the
data sets are individually exponentially distributed. The data consist of
operating hours between failures of the air conditioning system on a
fleet of Boeing 720 jet airplanes. For some of the airplanes the sequence
of intervals between failures is interrupted by a major overhaul. For
this reason segments of data from separate airplanes, which are not
separated by a major overhaul, are treated as separate samples. Also,
only segments of length at least 5 are considered. These are reproduced
in Table 19 of the Appendix.

Applying the Anderson-Darling tests to these 14 data sets yields
$A_{kN}^2 = 21.6948$ and $A_{akN}^2 = 21.7116$. Together with
$\sigma_N = 2.6448$ this yields standardized Z-scores of 3.288 and 3.294
respectively, which are outside the range of Table 1. Using Table 2, the
interpolation formula for $z_m(\gamma)$ yields appropriate percentiles
for $m = 13$. Plotting the log-odds of γ versus $z_{13}(\gamma)$ suggests
that a cubic extrapolation should provide good approximate P-values. They
are .0042 and .0043 respectively. Hence the data sets differ
significantly even without the assumption of exponentiality.
9. Combining Independent Anderson-Darling Tests
Due to the convolution nature of the asymptotic distribution of the
k-sample Anderson-Darling tests of fit, the following additional use of
Table 1 is possible. If m independent 1-sample Anderson-Darling tests of
fit, see Section 1, are performed for various hypotheses, then the joint
statement of these hypotheses may be tested by using the sum S of the m
1-sample test statistics as the new test statistic and by comparing the
appropriately standardized S against the row corresponding to m in
Table 1. To standardize S, note that the variance of a 1-sample
Anderson-Darling test statistic based on $n_i$ observations can either be
computed directly or can be deduced from the variance formula (6) for
$k = 2$, by letting the other sample size go to infinity:

$$\mathrm{var}(A_{n_i}^2) = \frac{2(\pi^2 - 9)}{3} + \frac{10 - \pi^2}{n_i}.$$

It should be noted that these 1-sample tests can only be combined this
way if no unknown parameters are estimated. In that case different tables
would be required. This problem is discussed further by Stephens (1986),
where tables are given for combining tests of normality or of
exponentiality.

Similarly, independent k-sample Anderson-Darling tests can be combined.
Here the value of k may change from one group of samples to the next, and
the common distribution function may also be different from group to
group.
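A sketch of this combination procedure (identifiers are ours; each
1-sample statistic has mean 1, so S has mean m):

```python
import math

def combined_zscore(a2_stats, sizes):
    """Standardized sum S of m independent 1-sample AD statistics, to be
    compared with row m of Table 1; variances from the formula above."""
    m = len(a2_stats)
    S = sum(a2_stats)
    var_S = sum(2 * (math.pi ** 2 - 9) / 3 + (10 - math.pi ** 2) / ni
                for ni in sizes)
    return (S - m) / math.sqrt(var_S)   # compare with z_m(1 - alpha)
```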
Acknowledgements
We would like to thank Steve Rust for drawing our attention to this
problem and Galen Shorack and Jon Wellner for several helpful
conversations and the use of galleys of their book. The variance formula (6)
was derived with the partial aid of MACSYMA (1984), a large symbolic
manipulation program. This research was supported in part by the Natural
Sciences and Engineering Research Council of Canada, and by the U.S.
Office of Naval Research.
References

Anderson, T.W. and Darling, D.A. (1952), Asymptotic theory of certain
"goodness of fit" criteria based on stochastic processes, Ann. Math.
Statist., 23, 193-212.

Anderson, T.W. and Darling, D.A. (1954), A test of goodness of fit,
J. Amer. Stat. Assoc., 49, 765-769.

Billingsley, P. (1968), Convergence of Probability Measures, Wiley;
New York.

Darling, D.A. (1957), The Kolmogorov-Smirnov, Cramer-von Mises tests,
Ann. Math. Statist., 28, 823-838.

Kiefer, J. (1959), K-sample analogues of the Kolmogorov-Smirnov and
Cramer-v. Mises tests, Ann. Math. Statist., 30, 420-447.

Lehmann, E.L. (1975), Nonparametrics: Statistical Methods Based on Ranks,
Holden-Day; San Francisco.

MACSYMA (1984), Symbolics Inc., Eleven Cambridge Center, Cambridge,
MA 02142.

Pettitt, A.N. (1976), A two-sample Anderson-Darling rank statistic,
Biometrika, 63, 161-168.

Pyke, R. and Shorack, G.R. (1968), Weak convergence of a two-sample
empirical process and a new approach to Chernoff-Savage theorems, Ann.
Math. Statist., 39, 755-771.

Schrage, L. (1979), A more portable Fortran random number generator,
ACM Trans. Math. Software, 5, 132-138.

Shorack, G.R. and Wellner, J.A. (1986), Empirical Processes with
Applications to Statistics, Wiley; New York.

Solomon, H. and Stephens, M.A. (1978), Approximations to density
functions using Pearson curves, J. Amer. Stat. Assoc., 73, 153-160.

Stephens, M.A. (1974), EDF statistics for goodness of fit and some
comparisons, J. Amer. Stat. Assoc., 69, 730-737.

Stephens, M.A. (1976), Asymptotic results for goodness-of-fit statistics
with unknown parameters, Ann. Statist., 4, 357-369.

Stephens, M.A. (1986), Tests based on EDF statistics, Chap. 4 of
Goodness-of-Fit Techniques (eds. R.B. d'Agostino and M.A. Stephens),
Marcel Dekker; New York.

Watson, G.S. (1962), Goodness-of-fit tests on a circle II, Biometrika,
49, 57-63.
Appendix
Results of the Monte Carlo Simulations

In Tables 3-17, A2 denotes the observed level of the test based on
$A_{kN}^2$ and Aa2 that of the test based on $A_{akN}^2$.
Table 3
Observed Significance Levels of A2 and Aa2
Number of Replications = 5000; Sample Sizes: 5 5 5

                 average proportion of distinct observations
nominal      1.0000         .9555          .9346          .8034
level α     A2    Aa2      A2    Aa2      A2    Aa2      A2    Aa2
 .250     .2654  .2656   .2656  .2714   .2614  .2702   .2632  .2770
 .100     .1000  .1040   .0998  .1062   .0994  .1046   .1034  .1146
 .050     .0476  .0502   .0488  .0526   .0486  .0532   .0500  .0586
 .025     .0228  .0252   .0230  .0256   .0224  .0262   .0220  .0298
 .010     .0070  .0086   .0058  .0086   .0062  .0084   .0054  .0096
Table 4
Observed Significance Levels of A2 and Aa2
Number of Replications = 5000; Sample Sizes: 10 10 10

                 average proportion of distinct observations
nominal      1.0000         .9099          .8700          .6520
level α     A2    Aa2      A2    Aa2      A2    Aa2      A2    Aa2
 .250     .2416  .2438   .2418  .2464   .2438  .2482   .2432  .2534
 .100     .1044  .1066   .1040  .1086   .1050  .1078   .1026  .1126
 .050     .0496  .0524   .0502  .0542   .0490  .0530   .0512  .0588
 .025     .0238  .0244   .0240  .0256   .0242  .0254   .0236  .0314
 .010     .0102  .0108   .0102  .0112   .0100  .0114   .0110  .0128
Table 5
Observed Significance Levels of A2 and Aa2
Number of Replications = 5000; Sample Sizes: 30 30 30

                 average proportion of distinct observations
nominal      1.0000         .7594          .6719          .3552
level α     A2    Aa2      A2    Aa2      A2    Aa2      A2    Aa2
 .250     .2504  .2508   .2488  .2532   .2502  .2556   .2478  .2572
 .100     .0964  .0972   .0946  .0978   .0950  .0998   .0956  .1048
 .050     .0466  .0474   .0474  .0482   .0480  .0500   .0462  .0532
 .025     .0228  .0236   .0238  .0244   .0242  .0252   .0228  .0268
 .010     .0092  .0092   .0094  .0096   .0090  .0094   .0086  .0104
Table 6
Observed Significance Levels of A2 and Aa2
Number of Replications = 5000; Sample Sizes: 5 5 5 5 5

                 average proportion of distinct observations
nominal      1.0000         .9249          .8904          .6971
level α     A2    Aa2      A2    Aa2      A2    Aa2      A2    Aa2
 .250     .2576  .2618   .2580  .2636   .2600  .2646   .2568  .2704
 .100     .1068  .1100   .1062  .1130   .1044  .1128   .1036  .1210
 .050     .0498  .0528   .0492  .0544   .0490  .0550   .0476  .0592
 .025     .0202  .0226   .0188  .0234   .0196  .0232   .0214  .0280
 .010     .0068  .0074   .0064  .0074   .0070  .0078   .0060  .0094
Table 7
Observed Significance Levels of A2 and Aa2
Number of Replications = 5000; Sample Sizes: 10 10 10 10 10
average proportion of distinct observations: 1.0000  .8551  .7951  .5134

Only the .250, .100 and .050 rows survive, in six of the eight A2/Aa2
column fragments (listed in source order):
  .2526 .1042 .0512 | .2524 .1034 .0498 | .2548 .1070 .0524 |
  .2562 .1066 .0528 | .2522 .1018 .0478 | .2604 .1114 .0566
Table 8
Observed Significance Levels of A2 and Aa2
Number of Replications = 5000; Sample Sizes: 30 30 30 30 30

                 average proportion of distinct observations
nominal      1.0000         .6461          .5402          .2417
level α     A2    Aa2      A2    Aa2      A2    Aa2      A2    Aa2
 .250     .2524  .2532   .2516  .2542   .2526  .2552   .2522  .2596
 .100     .1036  .1042   .1022  .1056   .1034  .1068   .1028  .1138
 .050     .0520  .0522   .0518  .0532   .0514  .0550   .0512  .0588
 .025     .0256  .0260   .0266  .0266   .0262  .0266   .0264  .0306
 .010     .0114  .0114   .0116  .0120   .0112  .0122   .0108  .0136
Table 9
Observed Significance Levels of A2 and Aa2
Number of Replications = 5000; Sample Sizes: 5 5 5 5 5 5 5 5 5

                 average proportion of distinct observations
nominal      1.0000         .8678          .8126          .5438
level α     A2    Aa2      A2    Aa2      A2    Aa2      A2    Aa2
 .250     .2610  .2650   .2618  .2674   .2616  .2692   .2584  .2722
 .100     .1108  .1128   .1114  .1142   .1096  .1150   .1056  .1234
 .050     .0470  .0486   .0460  .0504   .0462  .0510   .0470  .0562
 .025     .0206  .0226   .0204  .0242   .0198  .0246   .0204  .0290
 .010     .0064  .0070   .0064  .0076   .0068  .0078   .0058  .0080
Table 10
Observed Significance Levels of A2 and Aa2
Number of Replications = 5000; Sample Sizes: 10 10 10 10 10 10 10 10 10
average proportion of distinct observations: 1.0000  .7598  .6721  .3551
(the table entries themselves are not legible in the source)
Table 11
Observed Significance Levels of A2 and Aa2
Number of Replications = 5000; Sample Sizes: 30 30 30 30 30 30 30 30 30

                 average proportion of distinct observations
nominal      1.0000         .4900          .3816          .1486
level α     A2    Aa2      A2    Aa2      A2    Aa2      A2    Aa2
 .250     .2494  .2498   .2522  .2534   .2520  .2542   .2512  .2612
 .100     .0956  .0958   .0964  .0978   .0956  .0994   .0968  .1058
 .050     .0454  .0456   .0450  .0462   .0448  .0468   .0438  .0504
 .025     .0222  .0224   .0220  .0230   .0224  .0230   .0214  .0260
 .010     .0076  .0076   .0078  .0080   .0076  .0078   .0078  .0098
Table 12
Observed Significance Levels of A2 and Aa2
Number of Replications = 5000; Sample Sizes: 5 10 15 20 25

                 average proportion of distinct observations
nominal      1.0000         .7938          .7152          .4035
level α     A2    Aa2      A2    Aa2      A2    Aa2      A2    Aa2
 .250     .2522  .2552   .2538  .2582   .2528  .2572   .2544  .2632
 .100     .1016  .1026   .1020  .1044   .1024  .1056   .1016  .1090
 .050     .0494  .0510   .0496  .0516   .0496  .0526   .0492  .0566
 .025     .0228  .0234   .0222  .0248   .0234  .0252   .0230  .0302
 .010     .0110  .0110   .0114  .0114   .0114  .0118   .0108  .0142
Table 13
Observed Significance Levels of A2 and Aa2
Number of Replications = 5000; Sample Sizes: 5 5 5 5 25
average proportion of distinct observations: 1.0000  .8688  .8124  .5432

Only the .250, .100 and .050 rows survive, in four of the eight A2/Aa2
column fragments (listed in source order):
  .2512 .0998 .0488 | .2550 .1022 .0506 | .2554 .1024 .0502 |
  .2608 .1076 .0544
Table 14
Observed Significance Levels of A2 and Aa2
Number of Replications = 5000; Sample Sizes: 5 25 25 25 25

                 average proportion of distinct observations
nominal      1.0000         .7286          .6347          .3180
level α     A2    Aa2      A2    Aa2      A2    Aa2      A2    Aa2
 .250     .2620  .2626   .2606  .2640   .2618  .2662   .2580    -
 .100     .1040  .1056   .1028  .1068   .1028  .1084   .1042    -
 .050     .0512  .0520   .0514  .0530   .0522  .0538   .0494    -
 .025     .0248  .0252   .0252  .0260   .0250  .0268   .0242    -
 .010     .0090  .0090   .0090  .0092   .0090  .0096   .0088    -
(the last Aa2 column is not legible in the source)
Table 15
Observed Significance Levels of A2 and Aa2
Number of Replications = 5000; Sample Sizes: 10 20 30 40 50

                 average proportion of distinct observations
nominal      1.0000         .6465          .5407          .2417
level α     A2    Aa2      A2    Aa2      A2    Aa2      A2    Aa2
 .250     .2596  .2610   .2592  .2644   .2590  .2638   .2602  .2682
 .100     .1052  .1062   .1054  .1072   .1056  .1078   .1040  .1136
 .050     .0526  .0528   .0528  .0554   .0530  .0566   .0540  .0606
 .025     .0266  .0270   .0268  .0280   .0258  .0282   .0240  .0312
 .010     .0074  .0078   .0078  .0084   .0078  .0086   .0076  .0098
Table 16
Observed Significance Levels of A2 and Aa2
Number of Replications = 5000; Sample Sizes: 10 10 10 10 50
average proportion of distinct observations: 1.0000  .7607  .6733  .3561

Only fragments of this table survive (listed in source order; the first
three cover levels .250, .100, .050, the last all five levels):
  .2634 .1104 .0534 | .2630 .1098 .0530 | .2678 .1124 .0560 |
  .2686 .1144 .0588 .0290 .0102
Table 17
Observed Significance Levels of A2 and Aa2
Number of Replications = 5000; Sample Sizes: 10 50 50 50 50

                 average proportion of distinct observations
nominal      1.0000         .5594          .4483          .1837
level α     A2    Aa2      A2    Aa2      A2    Aa2      A2    Aa2
 .250     .2480  .2490   .2478  .2486   .2468  .2500   .2490  .2544
 .100     .1016  .1012   .1010  .1030   .1018  .1040   .1010  .1074
 .050     .0506  .0510   .0508  .0532   .0514  .0540   .0500  .0582
 .025     .0254  .0258   .0254  .0266   .0262  .0272   .0268  .0290
 .010     .0110  .0110   .0104  .0110   .0102  .0112   .0108  .0126
Data Sets
Table 18
Four sets of eight measurements each of the smoothness
of a certain type of paper, obtained in four laboratories

laboratory                     smoothness
    A       38.7  41.5  43.8  44.5  45.5  46.0  47.7  58.0
    B       39.2  39.3  39.7  41.4  41.8  42.9  43.3  45.8
    C       34.0  35.0  39.0  40.0  43.0  43.0  44.0  45.0
    D       34.0  34.8  34.8  35.4  37.2  37.8  41.2  42.8
Table 19
Operating hours between failures of air conditioning systems
for separate airplanes and major overhaul (MO) segments

airplane   operating hours
7907       194  15  41  29  33  181
7908       413  14  58  37  100  65  9  169  447  184  36  201  118
7908MO     34  31  18  18  67  57  62  7  22  34
7909       90  10  60  186  61  49  14  24  56  20  79  84  44  59  29
           118  25  156  310  76  26  44  23  62
7909MO     130  208  70  101  208
7910       74  57  48  29  502  12  70  21  29  386  59  27
7911       55  320  56  104  220  239  47  246  176  182  33
7912       23  261  87  7  120  14  62  47  225  71  246  21  42  20  5
           12  120  11  3  14  71  11  14  11  16  90  1  16  52  95
7913       97  51  11  4  141  18  142  68  77  80  1  16  106  206  82
           54  31  216  46  111  39  63  18  191  18  163  24
7914       50  44  102  72  22  39  3  15  197  188  79  88  46  5  5
           36  22  139  210  97  30  23  13  14
7915       359  9  12  270  603  3  104  2  438
7916       50  254  5  283  35  12
8044       487  18  100  7  98  5  85  91  43  230  3  130
8045       102  209  14  57  54  32  67  59  134  152  27  14  230  66
           61  34