
A survey about the efficiency of
two-sample survival tests for
randomly censored data
Arnold Janssen and Wiebke Werft
Heinrich-Heine-Universität Düsseldorf
Universitätsstraße 1, D-40225 Düsseldorf
Abstract. The present paper summarizes recent developments about two-sample tests for randomly right censored data. The tests are derived for hazard
oriented models. In particular, it is reviewed how asymptotically efficient tests
can be constructed and how the asymptotic efficiency of a large class
of linear survival tests can be proved. The quality of competing tests can be compared by the
local asymptotic relative Pitman efficiency ARE. Special attention is devoted to the
martingale approach, which corresponds to the sequential nature of survival
data. Typically, central limit theorems for continuous martingales are applied.
A new discrete martingale approach is presented in section 5. This point of
view has the advantage that common central limit theorems for partial sums
of martingale difference arrays may replace more technical continuous martingale arguments. This approach is not only of interest for teaching courses.
Contents
1 Introduction
2 The two-sample hazard model
2.1 Semiparametric hazard model
2.2 The score function
3 The model for censored data
4 Towards two-sample survival tests
5 Martingale representations of survival test statistics
6 Conditional survival tests
6.1 The homogeneous null hypothesis $H_0^{I}$
6.2 The heterogeneous null hypotheses $H_0^{II}$ and $H_0^{III}$
7 The efficiency of semiparametric survival tests
7.1 The null hypothesis $H_0^{I}$ with equal censorship G1 = G2
7.2 The null hypothesis $H_0^{II}$ (unequal censorship with G1 ≠ G2)
8 Omnibus tests and related tests
8.1 Two-sided goodness of fit tests
8.2 Projection tests in survival analysis
Bibliography
Chapter 1
Introduction
In this paper we give an overview of semiparametric and
nonparametric two-sample survival tests for two groups of lifetime data with
incomplete observations. We will restrict ourselves to randomly right censored
data (X, ∆) where a survival time T is censored by some random variable C
and where X = min(T, C) and the censoring status ∆ = 1(T ≤ C) are observable.
It is not the aim of this article to present the mathematical details with complete proofs. For this subject we will mostly refer to the literature. An excellent
reference is chapters III.2, V and VIII of the fundamental work of Andersen, Borgan, Gill and Keiding (1993), briefly cited as ABGK [3]. Good sources
are also Gill [9], chapt. 3-5, Andersen et al. [2] and Fleming and Harrington
[8], chapt. 7. An overview can also be found in the recent handbook [4].
Below we would like to explain the motivation for survival models, their meaning, the mathematics and the technical difficulties behind the construction of
survival tests. We start with a risk model in section 2 which may explain
why survival in a test group (with disease) is more risky than survival in
a control group given by a standard population. The modelling is done via
hazard ratios and so-called risk component models explaining how additional
risk factors work.
Next, survival tests are derived step by step, see sections 4 and 5. We begin
with purely parametric models and their score tests. It is shown how nonparametric nuisance parameters like the baseline hazards and their distributions
can be eliminated. We would like to stress the sequential nature of survival data.
Typically, patients enter a clinical study and the data are collected sequentially
until the end of the study, which may cause censoring. The sequential behavior
leads to well motivated weighted test statistics. They are based on increments
given by observed minus expected (with respect to the past) quantities. Their
connection to the martingale approach is outlined.
Observe that the classical field of rank tests is contained as a special case in survival analysis if censoring is absent. The reader should take this into account
and ask what the survival quantities mean in the rank test set-up.
Recall that in the case of uncensored data classical two-sample rank tests and permutation tests work well. Since they are finite sample distribution free under
the null hypothesis, the statistician has complete control over the error probability of the first kind for each finite sample size. It is reported in section
6 how the permutation principle can be used for survival tests under random
censorship and how it differs from the classical case.
The quality of survival tests is studied in the last sections. We summarize recent results about asymptotically efficient tests. These tests reach their asymptotic
envelope power functions under local alternatives. The asymptotic relative Pitman efficiency ARE can be calculated and used to compare competing
tests.
We proceed with some historical remarks and comments about the literature.
A review about the martingale and counting process approach is given by Andersen and Borgan [1]. A good source for the presentation of the construction
principles concerning efficient survival tests is Neuhaus [36]. For teaching purposes and
for beginners the introduction of Lan and Wittes [26] can be recommended. However, it is impossible to review all papers with applications to survival tests.
There are thousands of them, often published in applied journals like
Biometrika or Statistics in Medicine.
The history of survival tests is long. The well-known log-rank test has been used for a long time to test for proportional hazard differences. Recall that the
two-sample testing problem with constant, time-independent hazard ratios is a
special case of the famous Cox model, see ABGK [3], chapt. VII, or Fleming
and Harrington [8], sect. 4.2.
The log-rank test is the censored data extension of the Savage rank test, see
Hájek et al. [12]. The log-rank test is well understood. The question is now
how different competing tests can be compared under semiparametric models. The comparison is typically done within the asymptotic set-up of power
functions via local alternatives. This leads to the asymptotic relative Pitman efficiency
ARE, 0 ≤ ARE ≤ 1, as a measure of performance. A test with ARE = 1 is
asymptotically efficient. A sequence of further tests with 0 < ARE < 1 needs
1/ARE times as many observations to obtain the same asymptotic power as an efficient
sequence of tests (or 100(1 − ARE) percent of the observations are wasted).
To give an example consider again the proportional hazard two-sample Cox
model. It was shown by ABGK [3], VIII 2.3 and 4.2, that in this case the
log-rank test is asymptotically efficient.
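The sample size interpretation of ARE can be illustrated by a few lines of arithmetic. The following is only a sketch: the function names and the numbers 90 and 0.75 are hypothetical and chosen for illustration, and the statement is an asymptotic one, so the result should be read as a rough guide rather than an exact finite sample rule.

```python
import math

def observations_needed(n_efficient: int, are: float) -> int:
    """Observations a test with efficiency `are` needs to match an
    efficient test that uses `n_efficient` observations (asymptotically)."""
    if not 0.0 < are <= 1.0:
        raise ValueError("ARE must lie in (0, 1]")
    return math.ceil(n_efficient / are)

def wasted_percent(are: float) -> float:
    """Share of observations, in percent, that is wasted: 100 (1 - ARE)."""
    return 100.0 * (1.0 - are)

# Hypothetical example: an efficient test needs 90 observations, ARE = 0.75.
print(observations_needed(90, 0.75), wasted_percent(0.75))
```

In this hypothetical setting the less efficient test needs 120 observations, of which 25 percent (30 observations) are wasted.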
The last section 8 is concerned with goodness of fit tests. It is reported that
hazard oriented Kolmogorov-Smirnov type tests apply in survival analysis, leading to Rényi tests. Personally, we prefer another class of tests, the so-called
projection tests for cones or subspaces of alternatives. The reason is that the
statistician can select a region of alternatives where high power is needed. They
can be motivated as asymptotic likelihood ratio tests and were first introduced
by Behnen and Neuhaus [5] for uncensored data.
The state of the art for uncensored two-sample data can be found in Hájek and
Šidák [11], the new edition Hájek et al. [12] and Behnen and Neuhaus [5]. A
good source for the general asymptotic theory of tests is Witting and Müller-Funk [43], chapter 6. For the general methodology we refer to the monographs
about asymptotic statistics of Strasser [40] and LeCam and Yang [28]. The
mathematical treatment of survival tests is presented in ABGK [3] and Fleming and Harrington [8]. A good review about martingale methods in survival
analysis can be found in Shorack and Wellner [39], sect. 7 and their appendix
B. More applied text books are Klein and Moeschberger [25] and Schumacher
and Schulgen [38], sect. 5 and 6. Here the reader will find many examples,
and the books can be used for consulting and statistical courses about survival
analysis in medicine.
Let Φ denote the standard normal distribution function.
Chapter 2
The two-sample hazard model
Throughout, let the random variable
T : (Ω, A, P ) → [0, ∞]
denote a survival time. For convenience we allow positive mass P (T = ∞) > 0
at infinity which is used to study risk factors later on which do not occur with
a certain probability. Let
F (x) := P (T ≤ x), S(x) = 1 − F (x), x ∈ [0, ∞]
(2.1)
denote the distribution function F and the survival function S. The hazard
measure Λ on [0, ∞) is defined by
$$ \frac{d\Lambda}{dF}(x) = \frac{1}{S(x-)}\,\mathbf{1}_{[0,\infty)}(x). \tag{2.2} $$
The cumulative hazard function Λ is the measure generating function
Λ(t) := Λ([0, t]), t ∈ [−∞, ∞)
(2.3)
of the hazard measure. Without further comments we will add further indices,
like S0 , Λ0 . . ., which are the quantities given by F0 .
Remark 2.1
Let F|[0,∞) be continuous.
(a) Then
S(x) = exp(−Λ(x))
(2.4)
holds for each x ∈ [0, ∞).
(b) Let Λ0 denote the hazard measure of the uniform distribution λ|(0,1) on
the unit interval (0,1). Then Λ0 (u) = − log(1 − u) holds for 0 < u < 1. If
in addition P (T = ∞) = 0 holds then the time scale may be transformed by
F (or its left continuous inverse $F^{-1}$ on the unit interval (0, 1)). The image
measures are then given by
$$ \Lambda = \Lambda_0^{F^{-1}}, \quad \Lambda_0 = \Lambda^{F} \quad\text{and}\quad \Lambda(t) = \Lambda_0(F(t)), \quad \Lambda_0(u) = \Lambda(F^{-1}(u)). \tag{2.5} $$
The hazard measure Λ is a risk measure which serves as a meaningful parameter of our survival model. At x ∈ [0, ∞) it describes the risk of T to
fail if {T ≥ x} has already been observed. According to (2.5) it is possible to
reduce the measure Λ to the unit interval by a time scale transformation via F .
A fruitful two-sample model is the so-called risk component model for
differences of survival times. It is introduced as follows: Let
$$ T_{01}, T_{02}, T_{11}, T_{12} : \Omega \to [0,\infty] \tag{2.6} $$
denote four mutually independent survival times (with continuous subdistributions on [0, ∞)) where the second index indicates the two-sample group 1 or 2.
Suppose that $T_{01} \overset{D}{=} T_{02}$ are real survival times which may stand for variables
with baseline risks of healthy members of a homogeneous population. The
survival times T1 , T2 of members of our groups 1 and 2 are given by
$$ T_1 = \min(T_{01}, T_{11}), \quad T_2 = \min(T_{02}, T_{12}). \tag{2.7} $$
In this model Ti admits an additional risk factor with fictitious survival
time T1i which may reduce the life time. That model has an additive structure
of the hazards Λi and Λik of Ti and Tik , i.e.
$$ \Lambda_1 = \Lambda_{01} + \Lambda_{11}, \quad \Lambda_2 = \Lambda_{01} + \Lambda_{12} \tag{2.8} $$
where Λ01 = Λ02 by our assumption, confer (2.4).
Remark 2.2
Each pair T1 , T2 of real continuous survival times admits a representation (2.7)
via a risk component model. The hazard measures Λ1i are not unique but the
difference of the measures
Λ1 − Λ2 = Λ11 − Λ12
(2.9)
is uniquely determined by the distribution of (T1 , T2 ).
Example 2.3
(a) By definition T1 is said to be stochastically larger than T2 if S1 ≥ S2 .
This property is equivalent to the ordering of the cumulative hazards Λ1 (t) ≤
Λ2 (t) for all t ∈ {S2 > 0}.
(b) A stronger condition with practical interpretation as overall risk superiority
of S1 can be introduced by
Λ2 (A) − Λ1 (A) ≥ 0
(2.10)
for all A ∈ B([0, ∞)).
From now on let F1 and F2 be always continuous distributions on [0, ∞). We
will indicate how likelihood based inference can completely be expressed by
the hazard measures for the two-sample testing problem.
Let $\frac{dF_2}{dF_1}$ denote the likelihood ratio of the F1 -absolutely continuous part of
F2 . Then the hazard ratio of Λ2 w.r.t. Λ1 is
$$ \frac{d\Lambda_2}{d\Lambda_1}(t) = \frac{S_1(t)}{S_2(t)}\,\frac{dF_2}{dF_1}(t). \tag{2.11} $$
The relative logarithmic risk ρ : [0, ∞) → [−∞, ∞] is then determined
by
$$ \frac{d\Lambda_2}{d\Lambda_1}(t) =: \exp(\rho(t)). \tag{2.12} $$
If hazard rates λi of Λi with $\Lambda_i(t) = \int_0^t \lambda_i(u)\,du$ exist for
i = 1, 2 then ρ = log(λ2 /λ1 ) holds. The function ρ is our parameter of interest.
It describes the time dependent relative risk of failure of F2 w.r.t. F1 on
the logarithmic scale.
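As a small illustration of ρ = log(λ2/λ1), the following sketch evaluates the relative logarithmic risk for Weibull hazard rates. The Weibull family, the helper names and the parameter values are assumptions made only for this example and do not come from the text above.

```python
import math

def weibull_hazard(shape: float, scale: float):
    """Hazard rate t -> (shape / scale) * (t / scale)**(shape - 1), t > 0."""
    def hazard(t: float) -> float:
        return (shape / scale) * (t / scale) ** (shape - 1.0)
    return hazard

def log_relative_risk(hazard2, hazard1):
    """rho(t) = log(lambda_2(t) / lambda_1(t)), compare (2.12)."""
    def rho(t: float) -> float:
        return math.log(hazard2(t) / hazard1(t))
    return rho

baseline = weibull_hazard(1.0, 1.0)                               # lambda_1 = 1
rho_prop = log_relative_risk(weibull_hazard(1.0, 0.5), baseline)  # lambda_2 = 2
rho_late = log_relative_risk(weibull_hazard(2.0, 1.0), baseline)  # lambda_2(t) = 2t

print(rho_prop(0.3), rho_prop(2.0))   # constant in t (proportional hazards)
print(rho_late(0.25), rho_late(1.0))  # sign changes at t = 1/2
```

In the first case ρ is constant, which is the proportional hazards situation. In the second case group 2 carries the lower risk early and the higher risk late.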
Remark 2.4
Under the condition F2 ≪ F1 the likelihood ratio can be expressed in terms of ρ by
$$ \frac{dF_2}{dF_1}(t) = \exp(\rho(t)) \exp\left( \int_0^t (1 - \exp(\rho(u)))\, d\Lambda_1(u) \right), \tag{2.13} $$
since $\Lambda_2(t) = \int_0^t \exp(\rho)\, d\Lambda_1$ and $\frac{dF_2}{dF_1}(t) = \exp(\rho(t)) \exp(\Lambda_1(t) - \Lambda_2(t))$ hold.
Example 2.5 (risk component model)
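Suppose that the risk components Tij have hazard measures Λij with hazard
rates λij ≥ 0. Then
$$ \frac{d\Lambda_2}{d\Lambda_1} = \frac{\lambda_{01} + \lambda_{12}}{\lambda_{01} + \lambda_{11}} = 1 + \frac{\lambda_{12} - \lambda_{11}}{\lambda_{01} + \lambda_{11}} =: 1 + \gamma \tag{2.14} $$
holds and γ may serve as the parameter of interest. Formula (2.13) can be
rewritten as
$$ \frac{dF_2}{dF_1}(t) = (1 + \gamma(t)) \exp\left( - \int_0^t \gamma(u)\, d\Lambda_1(u) \right). \tag{2.15} $$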
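Formula (2.15) can be checked directly in the simplest case of constant hazard rates, where all risk components are exponential and the density ratio is known in closed form. The rates below are hypothetical values chosen for illustration only.

```python
import math

# Hypothetical constant hazard rates of the risk components.
lam01, lam11, lam12 = 1.0, 0.5, 1.5

lam1 = lam01 + lam11                       # hazard rate of T1, see (2.8)
lam2 = lam01 + lam12                       # hazard rate of T2, see (2.8)
gamma = (lam12 - lam11) / (lam01 + lam11)  # constant gamma from (2.14)

def density_ratio_direct(t: float) -> float:
    """dF2/dF1 as the ratio of the two exponential densities."""
    return (lam2 * math.exp(-lam2 * t)) / (lam1 * math.exp(-lam1 * t))

def density_ratio_formula(t: float) -> float:
    """Right-hand side of (2.15); here int_0^t gamma dLambda_1 = gamma * lam1 * t."""
    return (1.0 + gamma) * math.exp(-gamma * lam1 * t)

for t in (0.0, 0.3, 1.0, 2.5):
    print(t, density_ratio_direct(t), density_ratio_formula(t))
```

Both columns agree up to rounding, as they should, since 1 + γ = λ2/λ1 and γλ1 = λ2 − λ1 in this special case.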
Below we will see that under mild regularity assumptions (with $\int \gamma^2\, dF_1 < \infty$)
each perturbation γ (or ρ) of Λ1 defines a path of hazards via a parameter
ϑ ∈ R with
$$ \Lambda_2(t, \vartheta) := \int_0^t \exp(\vartheta\rho(u))\, d\Lambda_1(u) \ (+o(\vartheta)) \tag{2.16} $$
or
$$ \Lambda_2(t, \vartheta) := \int_0^t (1 + \vartheta\gamma(u))\, d\Lambda_1(u) \ (+o(\vartheta)). \tag{2.17} $$
The term o(ϑ) (as ϑ → 0 for fixed t) is often only needed to guarantee that
t ↦ Λ2 (t, ϑ) is really increasing.
2.1 Semiparametric hazard model
In practice the baseline survival function and its baseline hazard are typically
unknown. For these reasons tests and estimators are preferable which are invariant under strictly monotone scale transformations of the
time axis. In this connection it is very natural that extended rank procedures
are derived. A specific example of an invariant estimator is the famous Kaplan-Meier estimator of the survival function for censored data.
On the side of the hazards we will now turn to semiparametric survival models.
The main idea is simple and relies on the fact that the meaning of the relative
risks (2.12) is preserved under a monotone time scale transformation.
For instance, the practical statement "the risk under Λ2 is twice as high as under Λ1 for certain late survival times (say for the 3/4-quantile $t = F^{-1}(3/4)$)"
does not depend on the actual shape of F . In our model it is
expressed by $\exp(\rho(F^{-1}(3/4))) = 2$. Observe that the meaning of traditional
statistical location or scale models is lost under monotone transformations
whereas the meaning of risk models via hazards is preserved.
As mentioned in Remark 2.1 the survival time T (given by our model Λ2 (·, ϑ))
will now be transformed via x ↦ F (x), with F := F1 , to the unit interval
which is a pivoting procedure. Notice that
$$ \mathcal{L}(F(T) \mid \vartheta = 0) = \lambda_{|(0,1)} \tag{2.18} $$
yields the uniform distribution under ϑ = 0. The local quantities (2.12) and
(2.14) then define new pivoted parameters
$$ \rho_0, \gamma_0 : (0, 1) \to \mathbb{R} \tag{2.19} $$
with $\rho_0(u) := \rho(F^{-1}(u))$ and $\gamma_0(u) := \gamma(F^{-1}(u))$ given by the baseline distribution function F . For instance, the model (2.17) reads as
$$ \Lambda_2(t, \vartheta) = \int_0^{F(t)} (1 + \vartheta\gamma_0(v))\, \frac{dv}{1 - v}. \tag{2.20} $$
This observation motivates a new parametrization
ρ = ρ0 ◦ F and γ = γ0 ◦ F
(2.21)
where F is the baseline distribution function and the pivoted quantities ρ0 and
γ0 of (2.19) serve as new parameters.
Example 2.6
The choice γ0 = 1 leads to the proportional hazard model with time independent relative risk (for ϑ > −1). Within this semiparametric model the
proportional hazards are attached to each baseline distribution function F .
In order to give an impression of the hazard model the shape of various
pivoted hazard derivatives γ0 : [0, 1] → R is plotted. The methodology is
presented in Fleming and Harrington [8], see p. 258, see also [13]. In Figure
2.1 semiparametric models are introduced for proportional, late, early, central
and linear crossing differences of the hazards. The statistician has to decide
which type of alternatives (given by a suitable direction γ0 ) is appropriate and
should be separated from the null hypothesis with priority. If the statistician
has no preference for one of these directions a projection test would be helpful
to spread out the power over a whole cone, see section 8.2.
Figure 2.1: Pivoted hazard derivatives γ0 (u) on [0, 1] in five panels: proportional hazards γ1 (u) = c; late hazards γ2 (u) = cu; early hazards γ3 (u) = c(1 − u); central hazards γ4 (u) = cu(1 − u); linear crossing hazards γ5 (u) = −c(u − 1/2).
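The five directions of Figure 2.1 are easy to write down as functions on the unit interval. The sketch below only collects them in a dictionary so that they can be evaluated or plotted; the function name `pivoted_directions` and the default c = 1 are hypothetical choices for this illustration.

```python
def pivoted_directions(c: float = 1.0):
    """Pivoted hazard derivatives gamma_0 on [0, 1] as listed in Figure 2.1."""
    return {
        "proportional": lambda u: c,
        "late": lambda u: c * u,
        "early": lambda u: c * (1.0 - u),
        "central": lambda u: c * u * (1.0 - u),
        "linear crossing": lambda u: -c * (u - 0.5),
    }

directions = pivoted_directions(c=2.0)
for name, gamma0 in directions.items():
    # Evaluate each direction at the quartiles of the pivoted time scale.
    print(name, [gamma0(u) for u in (0.25, 0.5, 0.75)])
```

The late direction puts its weight near u = 1, the early one near u = 0, and the linear crossing direction changes sign at the median u = 1/2.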
2.2 The score function
Consider a family of survival distributions Fϑ , ϑ ∈ U ⊂ R, 0 ∈ U , which admit
at ϑ = 0 a (one- or two-sided) stochastic derivative g of the likelihood ratio
w.r.t. F0 , i.e.
$$ \frac{d}{d\vartheta} \log \frac{dF_\vartheta}{dF_0}(t) \Big|_{\vartheta=0} = g(t) \tag{2.22} $$
where the tangent g (or score function) is F0 -square integrable with the tangent condition $\int g\, dF_0 = 0$. The tangent space at F is denoted by
$L_2^0(F) := \{ g \in L_2(F) : \int g\, dF = 0 \}$.
Under the model (2.17) (with Λ = Λ1 and F = F1 ) we arrive via (2.13) at
$$ g(t) = \gamma(t) - \int_0^t \gamma(s)\, d\Lambda(s) =: L_F(\gamma)(t) \tag{2.23} $$
and similarly from (2.12) at
$$ \gamma(t) = g(t) - \frac{\int_t^\infty g(s)\, dF(s)}{1 - F(t)} =: R_F(g)(t), \tag{2.24} $$
with R = RF , L = LF briefly. From Ritov and Wellner [37] and Efron and
Johnstone [7] it is known that $L = R^{-1}$ is the inverse of R and that the
operator
$$ R : L_2^0(F) \to L_2(F) \tag{2.25} $$
is an isometry of the Hilbert spaces where R maps the tangent g into the
hazard rate derivative γ. A rigorous description for L2 -differentiable paths
ϑ ↦ Fϑ can be found in Janssen [17]. Let us mention that local (or asymptotic)
properties of the models and related tests do not rely on the specific path but
only on the tangent g and γ = R(g). Thus the parametrization is introduced
via the γ's. However, all probabilistic calculations will be done on the side
of distributions and tangents. Again we may turn as in (2.21) to pivoted
quantities g0 , γ0 on (0, 1). If L0 and R0 denote the operators for λ|(0,1) on
(0, 1) then obviously $g_0 = g \circ F^{-1}$ is the score function of ϑ ↦ L(F (T )|Fϑ ) on
(0, 1) and γ0 = R0 (g0 ), g0 = L0 (γ0 ) hold. The equations (2.23) and (2.24) can
then be expressed by
$$ \gamma = R_0(g_0) \circ F, \quad g = L_0(\gamma_0) \circ F \tag{2.26} $$
where γ0 (and g0 ) are the parameters of interest and F is a nuisance parameter.
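The pivoted operators L0 and R0 can be explored numerically. The sketch below is an illustration under stated assumptions: it uses the uniform baseline on (0, 1) with dΛ0(v) = dv/(1 − v), takes the denominator in (2.24) as 1 − F(t) (which equals 1 − u after pivoting), and replaces the integrals by a simple midpoint rule. The helper `integrate` and the direction γ0(u) = u are hypothetical choices. For this direction (2.23) gives the closed form g0(u) = 2u + log(1 − u), which is used to check that R0 recovers γ0.

```python
import math

def integrate(f, a, b, n=2000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def L0(gamma0):
    """g0 = L0(gamma0): pivoted form of (2.23) with dLambda0(v) = dv / (1 - v)."""
    def g0(u):
        return gamma0(u) - integrate(lambda v: gamma0(v) / (1.0 - v), 0.0, u)
    return g0

def R0(g0):
    """gamma0 = R0(g0): pivoted form of (2.24) for the uniform distribution."""
    def gamma0(u):
        return g0(u) - integrate(g0, u, 1.0) / (1.0 - u)
    return gamma0

late = lambda u: u                                  # direction gamma0(u) = u
g0_closed = lambda u: 2.0 * u + math.log(1.0 - u)   # closed form of L0(late)

print(L0(late)(0.5), g0_closed(0.5))   # numerical and closed form score
print(R0(g0_closed)(0.5))              # close to late(0.5) = 0.5
```

The midpoint rule never evaluates the integrand at v = 1, so the logarithmic singularity of g0 causes no difficulty beyond a small discretization error.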
The present methodology follows the ideas of LeCam and Yang [28]. In order
to study the structure of testing problems at a null hypothesis LeCam proposed
the Hilbert space embedding of an experiment {P : P ∈ P}
$$ P \mapsto \left( \frac{dP}{dP_0} \right)^{1/2} \quad\text{in } L_2(P_0). \tag{2.27} $$
The local geometry of that experiment is given by the tangents g, i.e. the
L2 (P0 ) derivatives of all paths $\vartheta \mapsto 2\left( \frac{dP_\vartheta}{dP_0} \right)^{1/2}$ at ϑ = 0. They coincide with our
score functions, see (2.22).
Chapter 3
The model for censored data
Throughout, randomly right censored data models are studied. Let T and C
denote two independent r.v. where T is a continuous survival time and C is a
continuous censoring variable with distribution functions
L(T ) = F, L(C) = G .
(3.1)
The statistician is only able to observe the pair (X, ∆)
X := min(T, C) and ∆ = 1{T≤C}
(3.2)
where ∆ is the censoring status. If ∆ = 1 holds, then the observation is called
uncensored. For (x, δ) ∈ [0, ∞) × {0, 1} Fubini's theorem gives us the joint
distribution
$$ F \otimes G(X \le x, \Delta = \delta) = \delta \int_0^x (1 - G)\, dF + (1 - \delta) \int_0^x (1 - F)\, dG. \tag{3.3} $$
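A quick Monte Carlo experiment makes (3.3) concrete. The sketch assumes exponential survival and censoring times, for which the right-hand side of (3.3) has a closed form; the rates, the sample size and the function names are hypothetical choices for illustration.

```python
import math
import random

def simulate(n, rate_t, rate_c, seed=0):
    """Draw n censored observations (X, Delta) with exponential T and C."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        t = rng.expovariate(rate_t)
        c = rng.expovariate(rate_c)
        sample.append((min(t, c), 1 if t <= c else 0))
    return sample

def joint_probability(x, delta, rate_t, rate_c):
    """Right-hand side of (3.3) for exponential F and G."""
    total = rate_t + rate_c
    share = rate_t if delta == 1 else rate_c
    return share / total * (1.0 - math.exp(-total * x))

def empirical_probability(sample, x, delta):
    """Relative frequency of the event {X <= x, Delta = delta}."""
    hits = sum(1 for (xi, di) in sample if xi <= x and di == delta)
    return hits / len(sample)

sample = simulate(100000, rate_t=1.0, rate_c=0.5)
for x in (0.5, 2.0):
    for delta in (0, 1):
        print(x, delta, empirical_probability(sample, x, delta),
              joint_probability(x, delta, 1.0, 0.5))
```

The empirical frequencies and the values of (3.3) should agree up to simulation error of order n^(−1/2).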
Consider now two L2 -differentiable paths ϑ ↦ Fϑ and ϑ ↦ Gϑ of survival and
censoring distributions, respectively, with score function g of the F 's and score
function gc of the G's. It is well known that the distributions
L((X, ∆)|Fϑ ⊗ Gϑ ) remain L2 -differentiable under the loss of information, see
LeCam and Yang [27], Witting [42], Satz 1.193. If F = F0 and G = G0 hold
then their score function at ϑ = 0 is
$$ (x, \delta) \mapsto h(x, \delta) = \delta R_F(g)(x) + \frac{\int_x^\infty g\, dF}{1 - F(x)} + (1 - \delta) R_G(g_c)(x) + \frac{\int_x^\infty g_c\, dG}{1 - G(x)} \tag{3.4} $$
for the censored model which can be deduced from (3.3), see Janssen [15] for
details. At the level of the score functions the form (3.4) seems to be very
unpleasant. However, if we turn to hazards the function h is treatable. The
reason is that in terms of the hazard measures we have an additive model,
formally ΛX = ΛT + ΛC . Let here gc = 0.
Lemma 3.1
Suppose that γ := R(g) ∈ L2 (F ) denotes the hazard rate derivative of the
survival times T . Then the score function (3.4) can be completely expressed in
terms of hazard quantities
$$ h(x, \delta) = \delta R(g)(x) - \int_0^x \gamma\, d\Lambda = \delta\gamma(x) - \int_0^x \gamma\, d\Lambda =: h(g, x, \delta). \tag{3.5} $$
Proof. The isometry (2.25) implies
$$ \frac{\int_x^\infty g\, dF}{1 - F(x)} = - \int_0^x R(g)\, d\Lambda, $$
see also Janssen [17], equation (3.8).
This simple observation leads to the extension of the isometry (2.25) to censored score functions. We will see that the essential informative part of h
is its uncensored part (x, δ) ↦ δR(g)(x) where the index F is suppressed. For
convenience let γ∆ : [0, ∞) × {0, 1} → R denote the function (x, δ) ↦ γ(x)δ
and let
$$ L_{haz}(\mathcal{L}(X, \Delta)) := \{ \gamma\Delta : \gamma \in L_2(F) \} \subset L_2(\mathcal{L}(X, \Delta)) \tag{3.6} $$
denote the space of censored hazard derivatives.
Theorem 3.2 (Janssen [17], Lemma 3.1)
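The operator (acting on (3.5))
$$ \tilde{R} : \{ h(g, \cdot, \cdot) : g \in L_2^0(F) \} \to L_{haz}, \quad h(g, \cdot, \cdot) \mapsto R(g)\Delta $$
is an isometry of the Hilbert spaces (as subsets of L2 (L(X, ∆))) with inverse
$$ \tilde{L}(\gamma\Delta)(x, \delta) = \delta\gamma(x) - \int_0^x \gamma\, d\Lambda. \tag{3.7} $$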
This result yields further insight for censored models and their score functions. In particular, it is related to the martingale connection and the basic
martingale of survival analysis.
Remark 3.3
(a) In a hazard based model with hazard rate derivative γ of the T 's the
censored tangent h is given by
$$ h = \tilde{L}(\gamma\Delta), \tag{3.8} $$
which only depends implicitly on the censoring distribution G via the joint
distribution L((X, ∆)|F ⊗ G).
(b) Censored models are often sequential models where the time t is increasing
with t ∈ I := {s ≥ 0 : F (s) < 1}. Suppose that the variables are only
observable up to a time t ∈ I, i.e. the observable events are restricted to the
σ-field
$$ \mathcal{F}_t = \sigma(X \mathbf{1}_{\{X \le t\}}, \mathbf{1}_{\{X \le t\}}, \Delta \mathbf{1}_{\{X \le t\}}). \tag{3.9} $$
It is well known that the score function relative to Ft is the conditional expectation E(h(g, ·, ·)|Ft ). For the case γ = R(g) it was shown in Th. 3.2 of
Janssen [17] that the score function has the form
$$ E(h(g, \cdot, \cdot) \mid \mathcal{F}_t) = \tilde{L}((\gamma \mathbf{1}_{[0,t]})\Delta) \tag{3.10} $$
which means that restricting to Ft is the same as restricting the hazard rate
derivative γ to γ 1[0,t] .
(c) The restriction (3.10) of the score function yields of course a martingale
for t ∈ I, more precisely
$$ M_\gamma(X, \Delta, t) := \tilde{L}((\gamma \mathbf{1}_{[0,t]})\Delta) = \Delta\gamma(X)\mathbf{1}_{[0,t]}(X) - \int_0^t \gamma(s)\mathbf{1}_{\{X \ge s\}}\, d\Lambda(s) \tag{3.11} $$
is a martingale. For γ = 1 we arrive at the basic martingale M := M1 of
survival analysis for sample size n = 1, see Shorack and Wellner [39], sect.
7.5. One observes that
$$ M_\gamma(X, \Delta, t) = \int_0^t \gamma(s)\, M(X, \Delta, ds) \tag{3.12} $$
is just our sequential score function (3.10) and it is the martingale part of the
Doob-Meyer decomposition of the process
$$ t \mapsto \Delta\gamma(X)\mathbf{1}_{[0,t]}(X) \tag{3.13} $$
which is given by
$$ \Delta\gamma(X)\mathbf{1}_{[0,t]}(X) = M_\gamma(X, \Delta, t) + \int_0^t \gamma(s)\mathbf{1}_{\{X \ge s\}}\, d\Lambda(s). \tag{3.14} $$
(d) As a consequence the score function (3.5) and (3.8) admits a martingale
representation via the basic martingale M = M1 . If γ = γ0 ◦ F is parametrized
as in (2.21) the score function of a semiparametric model in direction γ0 is just
$$ (x, \delta) \mapsto \int_0^\infty \gamma_0 \circ F(s)\, M(x, \delta, ds). \tag{3.15} $$
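The martingale property in Remark 3.3 implies in particular that Mγ(X, ∆, t) from (3.11) has mean zero for every t. The following simulation sketch checks this in one special case. All concrete choices are assumptions for illustration: T is standard exponential, so that Λ(s) = s, the censoring time is exponential with rate 0.5, and γ(s) = s, which makes the compensator equal to min(X, t)²/2.

```python
import random

def m_gamma(x, delta, t):
    """M_gamma(X, Delta, t) from (3.11) for gamma(s) = s and Lambda(s) = s."""
    upper = min(x, t)
    jump = delta * x if x <= t else 0.0   # Delta * gamma(X) * 1_[0,t](X)
    compensator = 0.5 * upper * upper     # int_0^t s 1{X >= s} ds
    return jump - compensator

def mean_m_gamma(t, n=200000, rate_c=0.5, seed=1):
    """Monte Carlo estimate of E M_gamma(X, Delta, t)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        life = rng.expovariate(1.0)       # survival time T
        cens = rng.expovariate(rate_c)    # censoring time C
        total += m_gamma(min(life, cens), 1 if life <= cens else 0, t)
    return total / n

print(mean_m_gamma(1.0), mean_m_gamma(3.0))   # both close to zero
```

The estimated means fluctuate around zero regardless of the censoring rate, which reflects that the score function depends on the censoring distribution only implicitly.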
In parametric models the score function can be used as test statistic which
yields the so called score tests. However, for semiparametric models and
composite null hypotheses with nuisance parameters the score statistics have
to be modified, see chapter 7. The score functions have to be projected to so
called effective score functions and unknown parameters must be eliminated
or estimated.
Chapter 4
Towards two-sample survival tests
Throughout consider the common two-sample testing problem for randomly
right censored independent continuous survival times T1 , . . . , Tn on [0, ∞) which
are censored by continuous censoring random variables C1 , . . . , Cn , again mutually independent of the T 's. Suppose that we have two groups with sample
sizes n1 and n2 , n = n1 + n2 , such that the groups of survival times have the
i.i.d. structure
$$ \mathcal{L}(T_i) = F_1, \ 1 \le i \le n_1, \quad\text{and}\quad \mathcal{L}(T_i) = F_2, \ n_1 + 1 \le i \le n. \tag{4.1} $$
Under censoring we only observe as in (3.2)
(Xi , ∆i ) := (min(Ti , Ci ), 1{Ti ≤Ci } ), 1 ≤ i ≤ n .
(4.2)
Different nonparametric hypotheses are of interest, namely tests for stochastic ordering
$$ H_0 : F_2 \le F_1 \quad\text{against}\quad F_2 \ge F_1,\ F_1 \ne F_2. \tag{4.3} $$
Sometimes this alternative is substituted by the alternative of strict superiority
$$ \Lambda_2 - \Lambda_1 \ge 0 \quad\text{(as measures)}. \tag{4.4} $$
The two-sided testing problem is given by
$$ \tilde{H}_0 : F_1 = F_2 \quad\text{against}\quad F_1 \ne F_2. \tag{4.5} $$
At this stage we should remember that in the absence of censoring
the present testing problems are well studied and (4.5) is the traditional two-sample goodness of fit testing problem. For these reasons the survival tests
will also be compared with classical tests, say rank tests.
Up to now the distributions of the censoring random variables Ci are not
specified. We will treat them as nuisance parameters. We will mention three
different models I - III for the C's under the null hypothesis H0 . At the
boundary {F1 = F2 } of the null (4.3) we will introduce different conditions
which lead to different hypotheses.
$$ H_0^{I} : T_1, \dots, T_n \ \text{i.i.d.}, \quad C_1, \dots, C_n \ \text{i.i.d.} \tag{4.6} $$
The second null hypothesis is given by
$$ H_0^{II} : T_1, \dots, T_n \ \text{i.i.d.} \quad\text{and}\quad \mathcal{L}(C_i) =: G_1, \ 1 \le i \le n_1, \quad \mathcal{L}(C_i) =: G_2, \ n_1 + 1 \le i \le n. \tag{4.7} $$
The censoring distributions may be completely unknown (but continuous in
our set up). In practice the model (4.6) is often too restrictive and must be
avoided. To give an example consider two-sample survival times Ti which are
mainly censored by the end of the study. Under random entrance in our study
the variable Ci is then the time between the beginning of the observation of
patient i until the end of the study. This patient is censored when he is still
alive at the end of our study. In practice typically one group (for instance the
control group) is studied earlier than the other group (which may be here a
test group with a new drug). Then the choice of the model (4.7) with G1 ≠ G2
would be appropriate.
The most general model consists of the case when every Ti has a censoring
variable Ci with its own censoring distribution for each i.
$$ H_0^{III} : T_1, \dots, T_n \ \text{i.i.d.}, \quad C_1, \dots, C_n \ \text{independent}. \tag{4.8} $$
One observes that under $H_0^{II}$ or $H_0^{III}$ the i.i.d. structure of the observations
(X1 , ∆1 ), . . . , (Xn , ∆n ) is lost although F1 = F2 holds.
In the next step we will turn our attention to semiparametric alternatives for
the T 's, see section 2. It is our aim to derive test statistics such that $H_0^{I}$ to $H_0^{III}$
can be rejected.
Consider now semiparametric alternatives Λ2 of Λ1 =: Λ (and F1 =: F ) of that
type discussed in section 2. For instance ordering F2 ≥ F1 may be modelled
by an additional risk factor, see Ex. 2.5. Our choice is the model (2.15)
and (2.17) with the parametrization γ = γ0 ◦ F motivated in (2.21). For the
semiparametric submodel we are now going to test
Λ2 (·, 0) (= Λ(·)) against Λ2 (·, ϑ), ϑ > 0 ,
(4.9)
where T1 may have the hazard Λ and Tn the hazard Λ2 (·, ϑ) for some ϑ > 0.
Let 1 − Fϑ (x) = exp(−Λ2 ([0, x], ϑ)) and F = F0 denote their distributions.
Suppose that cni , 1 ≤ i ≤ n, is an array of regression coefficients. Then a
parametric path for the two-sample testing problem (4.1) - (4.8) is given by
$$ \mathcal{L}(T_1, \dots, T_n) = \bigotimes_{i=1}^n F_{\vartheta c_{ni}}. \tag{4.10} $$
The score function at ϑ = 0 of that model is by (3.15) just
$$ \sum_{i=1}^n c_{ni} M_\gamma(X_i, \Delta_i, \infty) \tag{4.11} $$
which again only depends implicitly on the censoring distributions of the C's.
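When parametric hypotheses are specified by the parameters ϑ then tests based
on test statistics (4.11) are called score tests. Introduce the non-centered
two-sample regression coefficients
$$ c_{ni} = \left( \frac{n}{n_1 n_2} \right)^{1/2} \mathbf{1}_{\{n_1+1,\dots,n\}}(i) \tag{4.12} $$
with $\sum_{i=1}^n (c_{ni} - \bar{c}_n)^2 = 1$, $\bar{c}_n := \frac{1}{n} \sum_{i=1}^n c_{ni}$. Of special importance are the
centered versions of (4.12) as we will see later, namely
$$ c_{ni} = \left( \frac{n}{n_1 n_2} \right)^{1/2} \mathbf{1}_{\{n_1+1,\dots,n\}}(i) - \left( \frac{n}{n_1 n_2} \right)^{1/2} \frac{n_2}{n} = \left( \frac{n_1 n_2}{n} \right)^{1/2} \begin{cases} -\frac{1}{n_1}, & 1 \le i \le n_1 \\ \frac{1}{n_2}, & n_1 + 1 \le i \le n \end{cases} \tag{4.13} $$
with $\bar{c}_n = 0$.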
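The normalization of the regression coefficients can be verified in a few lines. The following sketch builds the vectors (4.12) and (4.13) for hypothetical sample sizes n1 = 3 and n2 = 5; the function name is an assumption made for this illustration.

```python
import math

def two_sample_coefficients(n1: int, n2: int, centered: bool = True):
    """Regression coefficients (4.12) (non-centered) or (4.13) (centered)."""
    n = n1 + n2
    scale = math.sqrt(n / (n1 * n2))
    c = [0.0] * n1 + [scale] * n2            # (4.12): indicator of group 2
    if centered:
        mean = sum(c) / n
        c = [ci - mean for ci in c]          # (4.13): subtract the mean
    return c

c_raw = two_sample_coefficients(3, 5, centered=False)
c_cen = two_sample_coefficients(3, 5, centered=True)
mean_raw = sum(c_raw) / len(c_raw)

print(sum((ci - mean_raw) ** 2 for ci in c_raw))   # close to 1
print(sum(c_cen), sum(ci ** 2 for ci in c_cen))    # close to 0 and 1
```

The centered values are −(n1 n2 /n)^{1/2} /n1 in group 1 and (n1 n2 /n)^{1/2} /n2 in group 2, in agreement with the case distinction in (4.13).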
In the next step the unknown baseline nuisance parameters F and Λ of (4.11)
will be eliminated. Also introduce the order statistics X1:n ≤ X2:n ≤ · · · ≤ Xn:n
of the X's, see (4.2), and let ∆i:n be the censoring status of Xi:n and ∆n =
(∆1:n , . . . , ∆n:n ). They are sometimes called the concomitant order statistics.
The antirank vector Dn = (Dni )1≤i≤n of the X's is defined by $X_{i:n} = X_{D_{ni}}$,
which is actually a random permutation of {1, ..., n}. Using this notation the
score function (4.11) can be rewritten as
$$ \sum_{i=1}^n c_{ni} M_\gamma(X_i, \Delta_i, \infty) = \sum_{i=1}^n c_{nD_{ni}} \left\{ \Delta_{i:n}\gamma(X_{i:n}) - \int_{[0, X_{i:n}]} \gamma\, d\Lambda \right\}. \tag{4.14} $$
In this formula Λ is now replaced by its nonparametric estimator, the Nelson-Aalen estimator $\hat{\Lambda}_n$ (see ABGK [3], sect. IV.1.)
$$ \hat{\Lambda}_n = \sum_{j=1}^n \frac{\Delta_{j:n}}{n + 1 - j}\, \delta_{X_{j:n}}, \tag{4.15} $$
of the pooled sample under the null hypothesis which yields
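$$ \sum_{i=1}^n c_{nD_{ni}} \left\{ \Delta_{i:n}\gamma(X_{i:n}) - \sum_{j=1}^i \frac{\Delta_{j:n}\gamma(X_{j:n})}{n + 1 - j} \right\} = \sum_{i=1}^n \gamma(X_{i:n})\Delta_{i:n} \left\{ c_{nD_{ni}} - \frac{\sum_{j=i}^n c_{nD_{nj}}}{n + 1 - i} \right\} \tag{4.16} $$
if summation by parts is used. This consideration suggests using two-sample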
test statistics
$$ T_n = \sum_{i=1}^n w_n(i)\Delta_{i:n} \left\{ c_{nD_{ni}} - \frac{\sum_{j=i}^n c_{nD_{nj}}}{n + 1 - i} \right\}, \tag{4.17} $$
where wn (i) are random weights. If the form γ = γ0 ◦ F is used then F may
be estimated by the Kaplan-Meier estimator $\hat{F}_n$ of the pooled sample
and (4.17) turns into (4.16) by the choice of weights
$$ w_n(i) = \gamma_0(\hat{F}_n(X_{i:n}-)) \quad\text{or}\quad \gamma_0(\hat{F}_n(X_{i:n})). \tag{4.18} $$
The Kaplan-Meier estimator is given by
$$ 1 - \hat{F}_n(X_{i:n}) = \prod_{j \le i} \left( 1 - \frac{\Delta_{j:n}}{n + 1 - j} \right). \tag{4.19} $$
Observe, that here wn (i) only depends via the Kaplan-Meier estimator on
(∆j:n )j≥1 and not on the metric values of the order statistics (Xj:n )j≥1 .
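The statistic (4.17) with the weights (4.18) and the pooled Kaplan-Meier estimator (4.19) can be computed in one pass over the ordered sample. The following is a minimal sketch under the assumption of untied observations, using the left-limit version of the weights; the function name and the toy data are hypothetical.

```python
def survival_statistic(times, status, c, gamma0):
    """T_n from (4.17) with weights w_n(i) = gamma0(F_hat(X_{i:n}-)), see (4.18).

    times, status: observations X_i and censoring indicators Delta_i (1 = uncensored)
    c: regression coefficients c_{ni};  gamma0: direction on [0, 1]
    Assumes untied observation times.
    """
    n = len(times)
    order = sorted(range(n), key=lambda i: times[i])  # antiranks (0-based)
    tail_sum = sum(c)   # sum_{j >= i} c_{n D_nj}, starting with i = 1
    surv = 1.0          # 1 - F_hat(X_{i:n}-), pooled Kaplan-Meier (4.19)
    stat = 0.0
    for pos, idx in enumerate(order):
        at_risk = n - pos                             # n + 1 - i
        if status[idx] == 1:
            weight = gamma0(1.0 - surv)
            stat += weight * (c[idx] - tail_sum / at_risk)
            surv *= 1.0 - 1.0 / at_risk               # Kaplan-Meier update
        tail_sum -= c[idx]
    return stat

times = [1.0, 2.0, 3.0, 4.0]
status = [1, 1, 1, 1]
c = [0.0, 0.0, 1.0, 1.0]    # (4.12) with n1 = n2 = 2
print(survival_statistic(times, status, c, lambda u: 1.0))  # log-rank weights
print(survival_statistic(times, status, c, lambda u: u))    # late weights
```

With constant weights the statistic reduces to a log-rank type statistic. Replacing c by its centered version (4.13) leaves the value unchanged, because the centering constant cancels in each increment.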
Remark 4.1
(a) In the uncensored case $\hat{F}_n(X_{i:n}) = i/n$ holds since the Kaplan-Meier estimator coincides with the empirical distribution function. The choice of the
weights (4.18) then naturally leads to rank statistics (4.17) which are of course
appropriate for our semiparametric model.
(b) Note that the test statistics Tn are the same for our uncentered and centered two-sample regression coefficients (4.12) and (4.13). The corresponding
parametric score statistics are, however, definitely not the same.
Lemma 4.2
Under uniformly distributed antiranks the increments
$$ c_{nD_{ni}} - \frac{\sum_{j=i}^n c_{nD_{nj}}}{n + 1 - i}, \quad 1 \le i \le n, \tag{4.20} $$
are the increments of a discrete martingale w.r.t. the filtration
$\mathcal{F}_j = \sigma(D_{n1}, \dots, D_{nj})$, j ≤ n. In particular, this holds under $H_0^{I}$,
(4.6).
Proof. The conditional distribution of Dni given $\mathcal{F}_{i-1}$ is the uniform distribution on the set {1, . . . , n} \ {Dn1 , . . . , Dn,i−1 }. Thus
$$ E(c_{nD_{ni}} \mid \mathcal{F}_{i-1}) = \frac{\sum_{j=i}^n c_{nD_{nj}}}{n + 1 - i} \tag{4.21} $$
holds.
In the special case of the two-sample regression coefficients the increments
(4.20) have a nice sequential interpretation. They are equal to
$$ \left( \frac{n}{n_1 n_2} \right)^{1/2} \left\{ \mathbf{1}_{\{n_1+1,\dots,n\}}(D_{ni}) - \frac{n_2 - \sum_{j=1}^{i-1} \mathbf{1}_{\{n_1+1,\dots,n\}}(D_{nj})}{n + 1 - i} \right\} =: \left( \frac{n}{n_1 n_2} \right)^{1/2} \{O_i - E_i\}. \tag{4.22} $$
The observed part Oi is centered at the expected quantity Ei (w.r.t. $\mathcal{F}_{i-1}$).
Note that Ei is just the ratio of the number of individuals at risk in group
2 to the number of individuals at risk in the whole population given $\mathcal{F}_{i-1}$.
As a conclusion we see that (4.17) is a weighted sum of the martingale increments (4.20). Thus Tn is based on a sequential construction where the increments (4.20) have an interpretation as observed minus expected quantities.
The "status" cnDni is observed and its conditional expectation given Fi−1 is
subtracted.
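For small n the martingale property of Lemma 4.2 can be confirmed by brute force. The sketch below enumerates all permutations as equally likely antirank vectors and computes, for every history (Dn1, ..., Dn,i−1), the conditional mean of the i-th increment (4.20) with exact rational arithmetic. The coefficient vector is an arbitrary hypothetical choice, and the function names are assumptions for this illustration.

```python
from fractions import Fraction
from itertools import permutations

def increments(c, antiranks):
    """Increments (4.20) for a given antirank vector (0-based positions)."""
    n = len(c)
    out = []
    for i in range(n):                       # n - i individuals still at risk
        tail = sum(c[antiranks[j]] for j in range(i, n))
        out.append(c[antiranks[i]] - tail / (n - i))
    return out

def max_conditional_mean(c):
    """Largest |E(increment_i | D_n1, ..., D_n,i-1)| under uniform antiranks."""
    n = len(c)
    sums, counts = {}, {}
    for perm in permutations(range(n)):
        inc = increments(c, perm)
        for i in range(n):
            history = perm[:i]               # the tuple length encodes i
            sums[history] = sums.get(history, Fraction(0)) + inc[i]
            counts[history] = counts.get(history, 0) + 1
    return max(abs(sums[h] / counts[h]) for h in sums)

c = [Fraction(3), Fraction(-1), Fraction(2), Fraction(5)]
print(max_conditional_mean(c))   # exactly zero for every history
```

The enumeration only covers the uniform antirank distribution of the lemma. Under unequal censoring the antiranks are in general no longer uniform, which is why a different argument is needed there.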
Chapter 5
Martingale representations of survival test statistics
Unfortunately, Lemma 4.2 is no longer valid under $H_0^{II}$, (4.7), which is the
most interesting hypothesis. In this case the counting process approach leads
to a martingale representation of the test statistics. The main object is the
so-called log rank process $L_n$, which replaces the adjusted rank process (4.20).
As we will see, the counting process approach is very convenient.
We refer to Gill [9], Fleming and Harrington [8], ABGK [3] and Shorack and
Wellner [39]. For $t \ge 0$ let
\[
  N_1(t) = \sum_{i=1}^{n_1} \Delta_i 1_{[0,t]}(X_i), \qquad
  N_2(t) = \sum_{i=n_1+1}^{n} \Delta_i 1_{[0,t]}(X_i) \tag{5.1}
\]
denote the counting processes of uncensored events of group 1 and group 2,
and let $N(t) = N_1(t) + N_2(t)$ be the total number of uncensored events up to
time $t$. The number of individuals at risk at time $t$ is given by
\[
  Y_1(t) = \sum_{i=1}^{n_1} 1_{[t,\infty)}(X_i), \qquad
  Y_2(t) = \sum_{i=n_1+1}^{n} 1_{[t,\infty)}(X_i). \tag{5.2}
\]
Set Y (t) = Y1 (t) + Y2 (t). Let
\[
  \mathcal{F}_t = \sigma\bigl(X_i 1_{\{X_i \le t\}},\ 1_{\{X_i \le t\}},\ \Delta_i 1_{\{X_i \le t\}},\ 1 \le i \le n\bigr),
  \qquad 0 \le t < \infty, \tag{5.3}
\]
denote the two-sample analogue of the filtration (3.9). Define the log-rank process
\[
  L_n(t) = \int_0^t \frac{Y_1 Y_2}{Y} \left( \frac{dN_2}{Y_2} - \frac{dN_1}{Y_1} \right)
         = N_2(t) - \int_0^t \frac{Y_2}{Y}\, dN, \tag{5.4}
\]
which is evaluated as a trivial pathwise stochastic integral.
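Since the integrals in (5.4) are finite sums over the uncensored observations, the log-rank process is easy to compute. The sketch below is a minimal illustration, assuming continuous data without ties; the function names and the toy data in the usage note are hypothetical. It evaluates both forms of (5.4) with exact fractions so that their equality can be checked.

```python
from fractions import Fraction


def risk_sets(times, group, x):
    """Return (Y_1(x), Y_2(x)): numbers at risk in each group on [x, inf)."""
    y1 = sum(1 for u, g in zip(times, group) if u >= x and g == 1)
    y2 = sum(1 for u, g in zip(times, group) if u >= x and g == 2)
    return y1, y2


def log_rank_process(times, status, group, t):
    """Second form of (5.4): N_2(t) - int_0^t (Y_2 / Y) dN."""
    value = Fraction(0)
    for x, delta, g in zip(times, status, group):
        if delta == 1 and x <= t:          # jump of N at an uncensored time
            y1, y2 = risk_sets(times, group, x)
            value += (1 if g == 2 else 0) - Fraction(y2, y1 + y2)
    return value


def log_rank_process_first_form(times, status, group, t):
    """First form of (5.4): int_0^t (Y_1 Y_2 / Y) (dN_2 / Y_2 - dN_1 / Y_1)."""
    value = Fraction(0)
    for x, delta, g in zip(times, status, group):
        if delta == 1 and x <= t:
            y1, y2 = risk_sets(times, group, x)
            dn1, dn2 = (1, 0) if g == 1 else (0, 1)
            part2 = Fraction(dn2, y2) if y2 else Fraction(0)
            part1 = Fraction(dn1, y1) if y1 else Fraction(0)
            value += Fraction(y1 * y2, y1 + y2) * (part2 - part1)
    return value
```

With the toy data `times = [2, 5, 1, 4, 3]`, `status = [1, 0, 1, 1, 1]` and `group = [1, 1, 2, 2, 2]`, both functions return the same value at every `t`; the quadratic running time is accepted here for the sake of readability.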
Theorem 5.1
Consider as in (4.1) and (4.2) two-sample survival times with arbitrary C's.
Then
\[
  L_n(t) + \int_0^t \frac{Y_1 Y_2}{Y}\, d\Lambda_1 - \int_0^t \frac{Y_1 Y_2}{Y}\, d\Lambda_2 \tag{5.5}
\]
is a martingale w.r.t. $(\mathcal{F}_t)_{t \ge 0}$.
In particular, $L_n$ is a martingale under the null hypothesis $H_0^{III}$, (4.8).
Proof. For each observation, formula (3.11) with $\gamma = 1$ gives the Doob-Meyer
decomposition of $\Delta_i 1_{[0,t]}(X_i)$. Summing over the first (and second)
group we obtain that
\[
  N_1(t) - \int_0^t Y_1(s)\, d\Lambda_1(s) =: M_1(t) \tag{5.6}
\]
and
\[
  N_2(t) - \int_0^t Y_2(s)\, d\Lambda_2(s) =: M_2(t) \tag{5.7}
\]
are independent martingales. Since the processes $Y_1/Y$ and $Y_2/Y$ are predictable we
conclude that
\[
  \int_0^t \frac{Y_1}{Y}\, dM_2 - \int_0^t \frac{Y_2}{Y}\, dM_1
  = L_n(t) + \int_0^t \frac{Y_1 Y_2}{Y}\, d\Lambda_1 - \int_0^t \frac{Y_1 Y_2}{Y}\, d\Lambda_2 \tag{5.8}
\]
is a martingale, see also Gill [9], sect. 3.3, and Fleming and Harrington [8],
chapt. 7.
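The signs of the compensator terms can be checked by a short computation that uses only the definition (5.4) of $L_n$ and the decompositions (5.6) and (5.7). The following lines are a sketch of that bookkeeping in differential notation, not a replacement for the cited proofs.

```latex
\begin{align*}
\frac{Y_1}{Y}\,dM_2 - \frac{Y_2}{Y}\,dM_1
 &= \frac{Y_1}{Y}\,\bigl(dN_2 - Y_2\,d\Lambda_2\bigr)
  - \frac{Y_2}{Y}\,\bigl(dN_1 - Y_1\,d\Lambda_1\bigr) \\
 &= \frac{Y_1 Y_2}{Y}\Bigl(\frac{dN_2}{Y_2} - \frac{dN_1}{Y_1}\Bigr)
  - \frac{Y_1 Y_2}{Y}\,\bigl(d\Lambda_2 - d\Lambda_1\bigr) \\
 &= dL_n + \frac{Y_1 Y_2}{Y}\,d\Lambda_1 - \frac{Y_1 Y_2}{Y}\,d\Lambda_2 .
\end{align*}
```

In particular, the drift of $L_n$ is $\int (Y_1 Y_2 / Y)\,(d\Lambda_2 - d\Lambda_1)$, which vanishes whenever $\Lambda_1 = \Lambda_2$ and is positive when group 2 has the larger hazard.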
If we make use of the two sample regression coefficients (4.12) then the test
statistic Tn given in (4.17) can be expressed by the log rank process. Observe
that the increments of Ln are just
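\[
  \Delta L_n(X_{i:n}) = \left(\frac{n_1 n_2}{n}\right)^{1/2} \Delta_{i:n}
  \left( c_{nD_{ni}} - \frac{\sum_{j=i}^{n} c_{nD_{nj}}}{n+1-i} \right) \tag{5.9}
\]
and (4.17) reads as
\[
  T_n = \left(\frac{n}{n_1 n_2}\right)^{1/2} \sum_{i=1}^{n} w_n(i)\, \Delta L_n(X_{i:n}). \tag{5.10}
\]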
Here the symbol $\Delta$ in front of $L_n$ has nothing to do with censoring. As usual,
$\Delta f(x) := f(x) - f(x-)$ denotes the jump of a function $f$ at $x$.
If (w(t))t≥0 denotes a predictable weight function with w(Xi:n ) = wn (i) then
\[
  T_n(t) = \left(\frac{n}{n_1 n_2}\right)^{1/2} \int_0^t w(s)\, dL_n(s) \tag{5.11}
\]
is a martingale with $T_n(\infty) = T_n$. Thus weights (4.18) based on $\hat F_n(X_{i:n}-)$
yield martingales (5.11). The asymptotics of $T_n$ can then be established by
limit theorems for continuous martingales, see Gill [9], ABGK [3] and Fleming
and Harrington [8].
Janssen and Neuhaus [21] pointed out that another martingale approach of discrete type is of interest. It is closely connected to statistics based on "observed
minus expected" sums and it is mainly needed when ties occur. However,
it is also worthwhile for continuous distributions since non-predictable weights
given by functions of Fˆn (Xi:n ) can be treated. As main tool the filtration is
modified. In the continuous case the main assertion of Lemma 2.1 of Janssen
and Neuhaus [21] can be summarized as follows.
By a scale transformation the order statistics $X_{1:n} < X_{2:n} < \dots < X_{n:n}$ are
transformed to the grid $\{1/n, 2/n, \dots, n/n\}$. The transformed log rank process is
denoted by
\[
  M_n(i/n) := L_n(X_{i:n}), \quad 1 \le i \le n, \quad \text{and} \quad M_n(0) = 0. \tag{5.12}
\]
Note that the increment $\Delta L_n(X_{n:n})$ at the endpoint always vanishes for continuous distributions. Define now the filtration $\mathcal{G}_i$, $i = 0, \dots, n-1$, by
\[
  \mathcal{G}_i = \sigma\bigl(D_{n1}, \dots, D_{ni},\ \Delta_{1:n}, \dots, \Delta_{i:n},\ \Delta N(X_{i+1:n})\bigr) \tag{5.13}
\]
where the antiranks and the status of the order statistics are as in (4.14).
Let Gn contain all information. In contrast to the ordinary filtration (3.9) the
increment of the (joint) counting process N at the next time Xi+1:n is added
to Gi .
Lemma 5.2
Under the null hypothesis H0III , see (4.8), the process
\[
  i \mapsto M_n(i/n), \qquad 0 \le i \le n, \tag{5.14}
\]
is a martingale w.r.t. the filtration (5.13).
Proof. Our counting processes (5.1) are now transformed to the grid, where
the bar indicates their counterparts on $\{0, 1/n, \dots, 1\}$. Define for $i = 1, 2$
\[
  \bar N_i(k/n) := N_i(X_{k:n}), \qquad \bar N(k/n) := N(X_{k:n})
\]
and
\[
  \bar Y_i(k/n) := Y_i(X_{k:n}), \qquad \bar Y(k/n) := Y(X_{k:n}) = n + 1 - k,
\]
where $X_{0:n} := -\infty$ is used for $k = 0$.
The crucial step of the proof is the following Lemma 5.3: Lemma 5.2 is an
immediate consequence of (5.4) and (5.15) below. The proof is carried out for
continuously distributed $T_i$ and $C_i$ within our context. A general version which
also works when ties are present is proved in Janssen and Neuhaus [21].
Lemma 5.3
Let $T_1, \dots, T_n$ be i.i.d. survival times and let $C_1, \dots, C_n$ denote independent censoring times. For each $0 \le k \le n$ we have
\[
  E\bigl(\Delta \bar N_2(k/n) \mid \mathcal{G}_{k-1}\bigr)
  = \frac{\bar Y_2(k/n)}{\bar Y(k/n)}\, \Delta \bar N(k/n). \tag{5.15}
\]
Remark 5.4
Although the censoring distributions may be inhomogeneous, the expected number of events $\Delta \bar N_2(k/n)$ of group 2 given $\mathcal{G}_{k-1}$ can be deduced from a uniform
distribution on the set of individuals at risk, i.e. those belonging to $\{X_{k:n}, \dots, X_{n:n}\}$. Formula
(5.15) admits an interpretation as the number of individuals at
risk in group 2 divided by the number of all individuals at risk at $k/n$.
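One consequence of (5.15) is that the log-rank process stays centered even when the two groups are censored differently. The following Monte Carlo sketch illustrates this for $L_n(\infty)$; it is not part of the proof. The exponential survival and censoring distributions, the rates and the function names are assumptions chosen only for illustration.

```python
import random


def log_rank_total(times, status, group):
    """L_n(infinity): sum over uncensored events of 1{group 2} - Y_2 / Y."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    y = len(times)                          # all individuals at risk
    y2 = sum(1 for g in group if g == 2)    # group-2 individuals at risk
    total = 0.0
    for i in order:
        if status[i] == 1:
            total += (1.0 if group[i] == 2 else 0.0) - y2 / y
        y -= 1                              # individual i leaves the risk set
        if group[i] == 2:
            y2 -= 1
    return total


def simulate_mean(n1, n2, rate_c1, rate_c2, reps, seed):
    """Average of L_n(infinity) over simulated samples with F_1 = F_2."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(reps):
        times, status, group = [], [], []
        for g, n_g, rate_c in ((1, n1, rate_c1), (2, n2, rate_c2)):
            for _ in range(n_g):
                t = rng.expovariate(1.0)      # common survival distribution F
                c = rng.expovariate(rate_c)   # group-specific censoring G_g
                times.append(min(t, c))
                status.append(1 if t <= c else 0)
                group.append(g)
        acc += log_rank_total(times, status, group)
    return acc / reps
```

A call such as `simulate_mean(10, 15, 0.2, 2.0, 4000, seed=1)` uses light censoring in group 1 and heavy censoring in group 2; the average should be close to zero up to Monte Carlo error.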
Proof. Statement (5.15) is obvious for $\Delta \bar N(k/n) = 0$ since then $\Delta \bar N_2(k/n) = 0$
holds. Assume now that $\Delta \bar N(k/n) = 1$ is valid, which means that $X_{k:n}$ is
uncensored, i.e. $\Delta_{k:n} = 1$. Let
\[
  Z_i(k/n) := \Delta_i 1_{(-\infty, X_{k:n}]}(X_i) \tag{5.16}
\]
denote the uncensored event process of the i-th individual.
The restriction to the grid has the advantage that the conditional expectations
w.r.t. $\mathcal{G}_{k-1}$ can be calculated as discrete conditional expectations given the
antiranks
\[
  (D_{n1}, \dots, D_{n(k-1)}) = (j_1, \dots, j_{k-1}), \tag{5.17}
\]
their censoring status
\[
  (\Delta_{1:n}, \dots, \Delta_{k-1:n}) = (\delta_1, \dots, \delta_{k-1}), \tag{5.18}
\]
and $\Delta \bar N(k/n) = 1$. Define
\[
  I := \{1, \dots, n\} \setminus \{j_1, \dots, j_{k-1}\},
\]
which corresponds to the individuals at risk at $k/n$ given (5.17).
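We will prove below that for each $j \in I$ the conditional expectation
\[
  E\Bigl( Z_j(k/n) \Bigm| (D_{ni})_{i \le k-1} = (j_i)_{i \le k-1},\
     (\Delta_{i:n})_{i \le k-1} = (\delta_i)_{i \le k-1},\ \Delta \bar N(k/n) = 1 \Bigr) \tag{5.19}
\]
is independent of the index $j$. Note that for $j \in I$ the individual $j$ has no event
before $X_{k:n}$ under the conditioning event, so that $Z_j(k/n)$ coincides with the
increment $\Delta Z_j(k/n) = Z_j(k/n) - Z_j((k-1)/n)$; for $j \notin I$ this increment obviously vanishes.
For $j \in I$ the expression (5.19) equals
\[
  \frac{P\bigl( Z_j(k/n) = 1,\ (D_{ni})_{i \le k-1} = (j_i)_{i \le k-1},\
        (\Delta_{i:n})_{i \le k-1} = (\delta_i)_{i \le k-1},\ \Delta \bar N(k/n) = 1 \bigr)}
       {P\bigl( (D_{ni})_{i \le k-1} = (j_i)_{i \le k-1},\
        (\Delta_{i:n})_{i \le k-1} = (\delta_i)_{i \le k-1},\ \Delta \bar N(k/n) = 1 \bigr)}. \tag{5.20}
\]
It suffices to prove that the numerator
\[
  P\bigl( Z_j(k/n) = 1,\ (D_{ni})_{i \le k-1} = (j_i)_{i \le k-1},\
          (\Delta_{i:n})_{i \le k-1} = (\delta_i)_{i \le k-1},\ \Delta \bar N(k/n) = 1 \bigr) \tag{5.21}
\]
is independent of $j \in I$.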
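In order to use the conditional distribution given $X_j = x$, the event under
consideration in (5.21) is decomposed as
\[
  \bigcup_{x \in \mathbb{R}} \Bigl( B_x \cap \{X_j = x,\ \Delta_j = 1\}
     \cap \bigcap_{m \in I \setminus \{j\}} \{T_m > x,\ C_m > x\} \Bigr), \tag{5.22}
\]
where $B_x$ denotes the set
\[
  B_x = \bigl\{ X_{j_1} < X_{j_2} < \dots < X_{j_{k-1}} < x,\
        (\Delta_{j_i})_{i \le k-1} = (\delta_i)_{i \le k-1} \bigr\}. \tag{5.23}
\]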
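Since the individuals are independent the probability (5.21) can be calculated
by Fubini's theorem as
\[
  \int \delta\, P\Bigl( B_x \cap \bigcap_{m \in I \setminus \{j\}} \{T_m > x,\ C_m > x\} \Bigr)\,
       d\mathcal{L}(X_j, \Delta_j)(x, \delta)
  = \int \delta\, P(B_x) \prod_{m \in I \setminus \{j\}} P(T_m > x)\, P(C_m > x)\,
       d\mathcal{L}(X_j, \Delta_j)(x, \delta). \tag{5.24}
\]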
Let $F$ be the common distribution function of the $T_i$. By (3.3) the integration w.r.t.
the uncensored subdistribution $\delta\, d\mathcal{L}(X_j, \Delta_j)(x, \delta)$ is given by $P(C_j > x)\, dF(x)$.
Thus (5.24) is equal to
\[
  \int P(B_x)\, (1 - F(x))^{|I \setminus \{j\}|} \prod_{m \in I} P(C_m > x)\, dF(x), \tag{5.25}
\]
which is the same for all $j \in I$.
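Consider now a fixed member $m \in I$. By (5.19)
\[
  E\bigl(\Delta \bar N_1(k/n) \mid \mathcal{G}_{k-1}\bigr)
  = \sum_{i=1}^{n_1} E\bigl(\Delta Z_i(k/n) \mid \mathcal{G}_{k-1}\bigr)
  = \bar Y_1(k/n)\, E\bigl(\Delta Z_m(k/n) \mid \mathcal{G}_{k-1}\bigr) \tag{5.26}
\]
follows since $|\{1, \dots, n_1\} \cap I| = \bar Y_1(k/n)$ is the number of individuals at risk within
group 1. Similarly we have
\[
  E\bigl(\Delta \bar N_2(k/n) \mid \mathcal{G}_{k-1}\bigr)
  = \bar Y_2(k/n)\, E\bigl(\Delta Z_m(k/n) \mid \mathcal{G}_{k-1}\bigr). \tag{5.27}
\]
Since $\bar N = \bar N_1 + \bar N_2$ holds we obtain by (5.26) and (5.27)
\[
  \Delta \bar N(k/n) = E\bigl(\Delta \bar N(k/n) \mid \mathcal{G}_{k-1}\bigr)
  = \bar Y(k/n)\, E\bigl(\Delta Z_m(k/n) \mid \mathcal{G}_{k-1}\bigr). \tag{5.28}
\]
Solving (5.28) for $E(\Delta Z_m(k/n) \mid \mathcal{G}_{k-1})$ and inserting the result into (5.27) yields (5.15).
Thus the statements (5.26)-(5.28) prove our lemma.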
The discrete martingale approach of Lemma 5.2 can be used to establish results
similar to (5.11) for a larger class of weight functions. For this purpose
introduce the transformed version of (5.9) by
\[
  S_n(t) := \left(\frac{n}{n_1 n_2}\right)^{1/2} \sum_{i=1}^{[nt]} w_n(i)\, \Delta M_n(i/n),
  \qquad 0 \le t \le 1, \tag{5.29}
\]
on $[0, 1]$ with $S_n(1) = T_n$, where $\Delta M_n(i/n) := M_n(i/n) - M_n((i-1)/n) = \Delta L_n(X_{i:n})$.
Note that whenever the weights $i \mapsto w_n(i)$ are
predictable w.r.t. $(\mathcal{G}_i)_{i \le n}$ then (5.29) is a martingale under $H_0$. For martingales
of this type, arising from a discrete martingale difference array, functional
central limit theorems on $D[0, 1]$ are well known.
Remark 5.5
Suppose that the weights $w_n(i)$ of the test statistic $T_n$ (5.10) are $\mathcal{G}_i$-predictable.
Note that weights of the form $w_n(i) = \hat F_n(X_{i:n})$ given by the pooled Kaplan-Meier estimator are now allowed. Then $T_n$ is the partial sum of a martingale
difference array. Under regularity assumptions the discrete martingale central
limit theorem can be applied to prove that $T_n$ is asymptotically normally distributed. Invariance principles can also be established, see McLeish [31]. For
further information we refer to Hall and Heyde [10], chapt. 3, and in particular
to Corollary 3.1 of Hall and Heyde [10]. There it is pointed out that
the convergence of the predictable quadratic variation $V_n$ of $T_n$ (see (5.32)) is
an important ingredient of the central limit theorem.
Functional central limit theorems can be applied to the continuous time process $(T_n(t))_{t \ge 0}$ given in (5.11) to prove convergence of the process on the
Skorokhod space $D[0, \infty)$. The arguments are based on Rebolledo's central
limit theorem, see Gill [9], Fleming and Harrington [8] and ABGK [3].
Under regularity assumptions these CLTs establish distributional convergence
of the test statistics $T_n = T_n(\infty)$ and $T_n = S_n(1)$ to a centered normal
random variable. Since the asymptotic variance depends on the unknown
censoring distribution, this variance has to be estimated.
Recall that the predictable quadratic variation process of $M_n$ is given by
\[
  \langle M_n \rangle(i/n) = \sum_{j=1}^{i} E\bigl( \Delta M_n(j/n)^2 \mid \mathcal{G}_{j-1} \bigr), \tag{5.30}
\]
where $\Delta M_n(j/n) = M_n(j/n) - M_n((j-1)/n)$.
It is the compensator of $M_n^2$, which means that $M_n^2 - \langle M_n \rangle$ is a martingale.
The increments of that process can be calculated via a conditionally binomial
distributed random variable with one trial and parameter $p = Y_2(X_{j:n})/Y(X_{j:n})$, which gives
\[
  E\bigl( \Delta M_n(j/n)^2 \mid \mathcal{G}_{j-1} \bigr)
  = \Delta_{j:n}\, \frac{Y_1(X_{j:n})\, Y_2(X_{j:n})}{Y(X_{j:n})^2}, \tag{5.31}
\]
see Fleming and Harrington [8], p. 8 and sect. 7. The predictable quadratic
variation of $T_n$ is then
\[
  V_n = \frac{n}{n_1 n_2} \sum_{i=1}^{n} w_n(i)^2\, \Delta_{i:n}\,
        \frac{Y_1(X_{i:n})\, Y_2(X_{i:n})}{Y(X_{i:n})^2}
      = \frac{n}{n_1 n_2} \int_0^\infty w(s)^2\, \frac{Y_1(s)\, Y_2(s)}{Y^2(s)}\, dN(s). \tag{5.32}
\]
The latter form is used for the continuous martingale approach, see Gill [9]
and Fleming and Harrington [8].
Together with a conditional Lindeberg condition, the convergence $V_n \to \sigma^2 > 0$
in probability implies $\mathcal{L}(T_n) \to N(0, \sigma^2)$ in law. Thus $V_n$ may serve as a
consistent variance estimator and $T_n / V_n^{1/2}$ can be taken as studentized test
statistic. Under regularity assumptions
\[
  \varphi_n = \begin{cases}
     1 & \text{if } T_n / V_n^{1/2} > \Phi^{-1}(1 - \alpha), \\
     0 & \text{if } T_n / V_n^{1/2} \le \Phi^{-1}(1 - \alpha)
  \end{cases} \tag{5.33}
\]
establishes a sequence of tests with asymptotic nominal level $E(\varphi_n) \to \alpha$ under
$H_0^{III}$. Sufficient conditions can be found in Neuhaus [35], Th. 3.2.
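The statistic $T_n$, the variance estimator $V_n$ and the decision rule (5.33) can be computed in a single pass over the ordered data. The sketch below is a minimal illustration and makes several assumptions: there are no ties, the function name `survival_test` is hypothetical, and the weight is supplied as a function `weight(i, n)` of the rank only (the default 1.0 gives log-rank weights), which is simpler than the Kaplan-Meier based weights (4.18). The variance term uses the conditional variance $Y_1 Y_2 / Y^2$ of a single event.

```python
from statistics import NormalDist


def survival_test(times, status, group, weight=lambda i, n: 1.0, alpha=0.05):
    """Return (T_n, V_n, reject) for the one-sided asymptotic test.

    times[i]  : observed time X_i
    status[i] : censoring indicator Delta_i (1 = uncensored)
    group[i]  : 1 or 2
    """
    n = len(times)
    n1 = sum(1 for g in group if g == 1)
    n2 = n - n1
    order = sorted(range(n), key=lambda i: times[i])
    y, y2 = n, n2                      # total and group-2 numbers at risk
    t_sum, v_sum = 0.0, 0.0
    for rank, idx in enumerate(order, start=1):
        y1 = y - y2
        if status[idx] == 1:
            w = weight(rank, n)
            # observed minus expected for group 2 at this event time
            t_sum += w * ((1.0 if group[idx] == 2 else 0.0) - y2 / y)
            # conditional variance of the increment
            v_sum += w * w * y1 * y2 / (y * y)
        y -= 1
        if group[idx] == 2:
            y2 -= 1
    scale = n / (n1 * n2)
    t_n = scale ** 0.5 * t_sum
    v_n = scale * v_sum
    critical = NormalDist().inv_cdf(1.0 - alpha)
    reject = v_n > 0 and t_n / v_n ** 0.5 > critical
    return t_n, v_n, reject
```

For a toy sample such as `times = [2, 5, 1, 4, 3]`, `status = [1, 0, 1, 1, 1]`, `group = [1, 1, 2, 2, 2]` the studentized value stays below the normal quantile, so the sketch does not reject; with so few observations the normal approximation is of course only indicative.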
This program can be carried out completely for the statistics $T_n$ and $V_n$ under
$H_0^{II}$, see (4.7), which are given by the weights (4.18). For this purpose suppose
that
\[
  \lim_{n \to \infty} \frac{n_1}{n} = \eta \in (0, 1) \tag{5.34}
\]
exists and let $\gamma_0$ be continuous on $[0, 1]$. To motivate the asymptotic variance
formula (5.38) below suppose that $i(n)$ is a sequence of integers with $X_{i(n):n} \to s$.
Under $H_0^{II}$ the Kaplan-Meier estimator is consistent and
\[
  \gamma_0\bigl(\hat F_n(X_{i(n):n}-)\bigr) \longrightarrow \gamma_0(F(s)) \tag{5.35}
\]
follows. The strong law of large numbers implies
\[
  \frac{Y_k(X_{i(n):n})}{n_k} \longrightarrow (1 - G_k(s))(1 - F(s)) \quad \text{for } k = 1, 2
  \quad \text{and} \quad
  \frac{Y(X_{i(n):n})}{n} \longrightarrow \bigl[\eta (1 - G_1(s)) + (1 - \eta)(1 - G_2(s))\bigr](1 - F(s)) \tag{5.36}
\]
almost surely, where $G_1$ and $G_2$ are the censoring distributions of (4.7). By
(3.3) we have in addition for $k = 1, 2$
\[
  \frac{N_k(X_{i(n):n})}{n_k} \longrightarrow \int_0^s (1 - G_k(v))\, dF(v). \tag{5.37}
\]
Altogether (5.35)-(5.37) suggest that $V_n$ converges in probability under
$H_0^{II}$ with
\[
  \lim_{n \to \infty} V_n = \int_0^\infty \gamma_0(F(s))^2\,
     \frac{(1 - G_1(s))(1 - G_2(s))}{\eta (1 - G_1(s)) + (1 - \eta)(1 - G_2(s))}\, dF(s). \tag{5.38}
\]
The technical details are discussed in Gill [9], sect. 4 and 5.1, and Fleming
and Harrington [8], sect. 7.
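The heuristic behind the limit of $V_n$ can be written out in a few lines. The following is only a sketch of the bookkeeping, not a proof; it uses the integral form of $V_n$ with weights $w(s) = \gamma_0(\hat F_n(s-))$, the limits (5.35)-(5.37), and the abbreviation $\bar G_\eta$, which is introduced here for convenience and does not appear elsewhere in the text.

```latex
\begin{align*}
V_n &= \frac{n}{n_1 n_2} \int_0^\infty w(s)^2\, \frac{Y_1(s)\, Y_2(s)}{Y(s)^2}\, dN(s)
     = \int_0^\infty w(s)^2\,
       \frac{\dfrac{Y_1(s)}{n_1}\, \dfrac{Y_2(s)}{n_2}}{\bigl(Y(s)/n\bigr)^2}\,
       d\Bigl(\frac{N(s)}{n}\Bigr), \\
\frac{N(s)}{n} &= \frac{n_1}{n}\, \frac{N_1(s)}{n_1} + \frac{n_2}{n}\, \frac{N_2(s)}{n_2}
     \longrightarrow \int_0^s \bar G_\eta(v)\, dF(v),
     \qquad \bar G_\eta := \eta (1 - G_1) + (1 - \eta)(1 - G_2), \\
V_n &\longrightarrow \int_0^\infty \gamma_0(F)^2\,
     \frac{(1 - G_1)(1 - G_2)(1 - F)^2}{\bar G_\eta^{\,2}\, (1 - F)^2}\; \bar G_\eta\, dF
     = \int_0^\infty \gamma_0(F)^2\, \frac{(1 - G_1)(1 - G_2)}{\bar G_\eta}\, dF .
\end{align*}
```

For equal censoring $G_1 = G_2 = G$ the last expression reduces to $\int \gamma_0(F)^2 (1 - G)\, dF$.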
Example 5.6
Consider the two-sample proportional hazard model of Ex. 2.6. The test ϕn
(5.33) given by constant weights wn (i) = 1 is called the log rank test which
arises from risk component models with γ0 = 1.
Chapter 6
Conditional survival tests
Monte Carlo experiments show that the actual level of the asymptotic tests
$\varphi_n$ (5.33) may exceed the given level $\alpha$ at small and intermediate finite
sample sizes. Problems of this type may occur for an unbalanced design when
the sample size $n_2$ of the test group is much smaller than the sample size
$n_1$ of the control group or vice versa. Thus the question arises whether $\varphi_n$
can be made finite sample distribution free with exact level $\alpha$. Recall that the
survival tests are extended rank tests which can be made distribution free
when all observations are uncensored. The answer to that question depends on
whether the censoring distributions are homogeneous or not.
6.1 The homogeneous null hypothesis $H_0^{I}$
Under the null hypothesis (4.6) the observations (X1 , ∆1 ), . . . , (Xn , ∆n ) are
i.i.d. Thus, the test ϕn can be carried out as a permutation test which works
as follows.
• In a first step the data $(X_i(w), \Delta_i(w))_{i \le n}$ are kept fixed.
• The permutation step consists in choosing a random permutation $\tau =
(\tau_1, \dots, \tau_n)$ of $\{1, \dots, n\}$ which is independent of the data. Let
$F_{n,w}(\cdot)$ denote the conditional distribution function for fixed $w$ of the
permutation distribution given by
\[
  \tau \mapsto T_n\bigl((X_{\tau_1}(w), \Delta_{\tau_1}(w)), \dots, (X_{\tau_n}(w), \Delta_{\tau_n}(w))\bigr) \tag{6.1}
\]
where $T_n = T_n((X_i, \Delta_i)_{i \le n})$ denotes the test statistic (5.10).
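The permutation version of the survival test $\varphi_n$ (5.33) works with the conditional
critical value $c_n^* = c_n^*(w) := F_{n,w}^{-1}(1 - \alpha)$, i.e.
\[
  \varphi_{n,perm} = \begin{cases}
     1 & \text{if } T_n > c_n^*, \\
     \gamma_\alpha & \text{if } T_n = c_n^*, \\
     0 & \text{if } T_n < c_n^*,
  \end{cases} \tag{6.2}
\]
where the random variable $\gamma_\alpha$ is determined by
\[
  E\bigl(\varphi_{n,perm} \mid (X_i, \Delta_i)_{i \le n}\bigr) = \alpha. \tag{6.3}
\]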
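For small samples the permutation distribution (6.1) can be enumerated exactly. The sketch below is an illustration under several assumptions: the data are already sorted by time, ties are absent, log-rank weights $w_n(i) = 1$ are used, and the constant factor $(n/(n_1 n_2))^{1/2}$ is dropped because it does not change under permutations. Permuting the pairs $(X_i, \Delta_i)$ among the positions is replaced by enumerating all assignments of group labels, which yields the same distribution. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def log_rank_sum(sorted_status, labels):
    """Sum of Delta_{i:n} (O_i - E_i); labels[i] = 1 marks group 2."""
    y, y2 = len(labels), sum(labels)
    total = Fraction(0)
    for delta, g2 in zip(sorted_status, labels):
        if delta:
            total += g2 - Fraction(y2, y)
        y -= 1
        y2 -= g2
    return total


def permutation_critical_value(sorted_status, n2, alpha):
    """Return (c_star, gamma, values) for the randomized test (6.2)."""
    n = len(sorted_status)
    values = [
        log_rank_sum(sorted_status, [1 if i in pos else 0 for i in range(n)])
        for pos in combinations(range(n), n2)
    ]
    m = len(values)
    # smallest value whose permutation distribution function is >= 1 - alpha
    c_star = min(v for v in set(values)
                 if sum(1 for u in values if u <= v) >= (1 - alpha) * m)
    p_greater = Fraction(sum(1 for u in values if u > c_star), m)
    p_equal = Fraction(sum(1 for u in values if u == c_star), m)
    gamma = (alpha - p_greater) / p_equal   # randomization constant, cf. (6.3)
    return c_star, gamma, values


def phi_perm(t_value, c_star, gamma):
    """Value of the randomized permutation test for an observed statistic."""
    if t_value > c_star:
        return Fraction(1)
    if t_value == c_star:
        return gamma
    return Fraction(0)
```

Passing `alpha` as a `Fraction` keeps all computations exact, so that averaging `phi_perm` over the enumerated values reproduces the level `alpha` without rounding error. For larger samples the enumeration would have to be replaced by random permutations.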
Actually, ϕn,perm is a conditional test (test with Neyman structure) given by
the order statistics and their concomitants
(Xi:n , ∆i:n )i≤n .
(6.4)
Note that under H0I the statistic (6.4) is sufficient and ϕn,perm has nominal
level α. Within this special submodel every test with exact nominal level α is
already a permutation test.
Lemma 6.1 (Moser 1994, Völker 2003)
Within the full nonparametric model with arbitrary continuous distribution
functions F and G the statistic (6.4) is complete under H0I .
Thus every exact test ψ with E(ψ) = α under H0I is then a conditional test
with E(ψ|(Xi:n , ∆i:n )i≤n ) = α.
Moser [32] proved the completeness of (∆i:n )i≤n which is enough to treat extended rank statistics Tn ((Dni , ∆i:n )i≤n ). A proof of the general completeness
result can be found in Völker [41], p.93.
Remark 6.2
(a) Under H0I the antiranks (Dn1 , . . . , Dnn ) of the X’s, see section 4, are independent of the sufficient statistic (6.4) and they are uniformly distributed
permutations, see Neuhaus [33]. By Basu’s theorem the independence also
follows from Lemma 6.1.
(b) For fixed observations of (6.4) the antiranks of the statistic (4.17) are formally substituted by our random permutations (τ1 , . . . , τn ), see (6.1), in order
to get the permutation distribution.
(c) Suppose that, as in (4.18), the random weights $w_n(i)$ only depend on the
Kaplan-Meier estimator. Then the test statistic $T_n = T_n((D_{ni}, \Delta_{i:n})_{i \le n})$ is a
generalized rank statistic which does not depend on the values of the order statistics $(X_{i:n})_{i \le n}$. This reflects the semiparametric nature of the testing problem.
It is not immediately clear that $\varphi_{n,perm}$ behaves like its unconditional counterpart
$\varphi_n$ (5.33).
Theorem 6.3 (Janssen (1991), Theorem 2.1)
Consider the two-sample test (5.33) with test statistics (4.17) and (5.32). Under mild regularity assumptions (with L2 -convergent weights) the tests ϕn,perm
and ϕn are asymptotically equivalent under H0I , i.e.
ϕn − ϕn,perm −→ 0 in probability .
(6.5)
Whenever γ0 is continuous on [0,1] the regularity assumptions hold for the
natural weights (4.18) and (6.5) follows. The permutation version has the
advantage that the error probability of the first kind can be controlled for
each finite sample size n. Power functions under alternatives are discussed in
chapter 7.
6.2 The heterogeneous null hypotheses $H_0^{II}$ and $H_0^{III}$
Under the null hypothesis $H_0^{II}$ the i.i.d. structure of the data $(X_i, \Delta_i)_{i \le n}$ is lost.
In comparison with $T_n$ the conditional distribution of (6.1) has the wrong permutation variance and the ordinary permutation test $\varphi_{n,perm}$ no longer
works. Moreover, the equivalence (6.5) will fail in general and $E(\varphi_{n,perm})$ may
exceed the level $\alpha$ under $H_0^{II}$. In view of Lemma 6.1 it is therefore hopeless to
look for tests with exact nominal level $\alpha$ under $H_0^{II}$.
To overcome these difficulties Neuhaus [35] introduced very interesting studentized permutation tests for $H_0^{II}$ which naturally extend the permutation tests
(6.2) valid for $H_0^{I}$. His procedure is based on a correction of the
permutation variance of (6.1) for $G_1 \neq G_2$. This is done by taking
the permutation distribution of the studentized statistic $S_n := T_n / V_n^{1/2}$ of
(5.33), i.e.
\[
  \tau \mapsto S_n\bigl((X_{\tau_1}, \Delta_{\tau_1}), \dots, (X_{\tau_n}, \Delta_{\tau_n})\bigr) \tag{6.6}
\]
for fixed data similarly as in (6.1), where the denominator $V_n^{1/2}$ is taken into
account and is part of the permutation procedure. If $d_n^*$ denotes the conditional
$(1 - \alpha)$-quantile of the permutation distribution of (6.6) then
\[
  \varphi^S_{n,perm} = \begin{cases}
     1 & \text{if } T_n / V_n^{1/2} > d_n^*, \\
     0 & \text{if } T_n / V_n^{1/2} \le d_n^*
  \end{cases} \tag{6.7}
\]
denotes the studentized permutation version of the survival test $\varphi_n$ (5.33).
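The studentized permutation test differs from the ordinary one only in that the variance estimator is recomputed for every permutation. The following sketch illustrates this by exact enumeration for a small sample. It assumes data sorted by time without ties and log-rank weights; the normalizing factor cancels in the ratio $T_n / V_n^{1/2}$, and setting the ratio to zero when the variance estimator vanishes is a convention chosen here, not one taken from the text. The function names are hypothetical.

```python
from itertools import combinations
from math import ceil, sqrt


def studentized_statistic(sorted_status, labels):
    """T_n / V_n^{1/2} for log-rank weights; labels[i] = 1 marks group 2."""
    y, y2 = len(labels), sum(labels)
    t_sum = v_sum = 0.0
    for delta, g2 in zip(sorted_status, labels):
        if delta:
            p = y2 / y                 # share of group 2 in the risk set
            t_sum += g2 - p
            v_sum += p * (1.0 - p)     # equals Y_1 Y_2 / Y^2
        y -= 1
        y2 -= g2
    return t_sum / sqrt(v_sum) if v_sum > 0 else 0.0


def permutation_values(sorted_status, n2):
    """Sorted values of the studentized statistic over all label assignments."""
    n = len(sorted_status)
    return sorted(
        studentized_statistic(sorted_status, [1 if i in pos else 0 for i in range(n)])
        for pos in combinations(range(n), n2)
    )


def studentized_permutation_test(sorted_status, labels, alpha):
    """Return (observed S_n, quantile d_star, rejection decision)."""
    observed = studentized_statistic(sorted_status, labels)
    values = permutation_values(sorted_status, sum(labels))
    # conditional (1 - alpha)-quantile: smallest value with cdf >= 1 - alpha
    k = max(1, ceil((1.0 - alpha) * len(values) - 1e-9))
    d_star = values[k - 1]
    return observed, d_star, observed > d_star
```

The non-randomized rule used here rejects with conditional probability at most `alpha`; a randomized version analogous to (6.2) could be added in the same way as for the ordinary permutation test.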
Theorem 6.4 (Neuhaus 1993)
Under mild regularity assumptions (with $L_2$-convergent weights) the survival
tests $\varphi_n$ given by the extended rank statistics $T_n = T_n((D_{ni})_{i \le n}, (\Delta_{i:n})_{i \le n})$ and
the studentized permutation tests $\varphi^S_{n,perm}$ are asymptotically equivalent under
$H_0^{II}$, i.e.
\[
  \varphi_n - \varphi^S_{n,perm} \longrightarrow 0 \quad \text{in probability.} \tag{6.8}
\]
In particular, $E(\varphi^S_{n,perm}) \to \alpha$ holds as $n \to \infty$.
Again the theorem applies for continuous $\gamma_0$ and weights (4.18) under the
two-sample regime (5.34). The new permutation test has several advantages:
• Under $H_0^{I}$ the tests $\varphi^S_{n,perm}$ are of exact level $\alpha$ for each sample size $n$.
• Within the asymptotic set-up it shares the same good properties as $\varphi_n$.
The proof relies on a conditional central limit theorem for the studentized
statistic (6.6). A general conditional central limit theorem can be found in
Janssen and Mayer [22]. In that paper the condition (5.34) is weakened and
the equivalence (6.8) can be extended to $H_0^{III}$ (4.8) for various cases. Also the
weights $w_n(i)$ of the numerator $T_n$ may depend on the whole set of ordered
quantities (6.4).
Monte Carlo experiments of Neuhaus [35] and Heller and Venkatraman [14]
support the studentized permutation tests. Their nominal level is acceptable
also under different censorship with $G_1 \neq G_2$.
Chapter 7
The efficiency of semiparametric
survival tests
The present survival tests were introduced in section 4 as semiparametric versions of parametric score tests. Recall that under regularity assumptions parametric one-sided score tests are locally most powerful tests along the underlying
path of alternatives in the sense that they maximize the slope of the power
function at the origin (null hypothesis), Witting [42], sect. 2.2.4.
In general it is hopeless to get explicit finite sample formulas for nonparametric
power functions of competing tests. For this reason the statistician turns to
the comparison of asymptotic power functions under local alternatives given
by $L_2$-differentiable paths. Local alternatives are rescaled alternatives (typically
of order $1/\sqrt{n}$), which can be motivated as follows. Under this type
of alternatives the interesting asymptotic envelope power functions (serving
as benchmark) are non-trivial with power between the level $\alpha$ and power 1.
Alternatives which tend to the null hypothesis more slowly than $1/\sqrt{n}$ lead to
asymptotic envelope power 1, and a power comparison is then doubtful. The asymptotic envelope power
function is evaluated at an arbitrary sequence of local alternatives and it is
given by the best attainable asymptotic power within the underlying class of tests. It is obvious that different classes of tests may have different envelope power functions.
In our case the comparison is done via parametric paths of alternatives,
given by the direction $\gamma_0 \in L_2(0, 1)$, which are still present but the statistician
does not know in advance which path would be adequate. A typical two-sample
path of alternatives of order $1/\sqrt{n}$ was introduced in (4.10) via (2.17).
A sequence of tests at asymptotic level $\alpha$ for some null hypothesis is called
asymptotically efficient w.r.t. a given class of tests along a path of alternatives if its asymptotic local power function reaches the underlying asymptotic
envelope power function. As a consequence of the three famous Lemmas of
LeCam, see Hájek and Šidák [12], the following result is well-known for one-sample and two-sample testing problems.
Theorem 7.1
Consider a parametric $L_2$-differentiable path of distributions and one-sided local
alternatives of order $1/\sqrt{n}$ for a simple null hypothesis. Then the finite
sample locally most powerful score tests of asymptotic level $\alpha$ are asymptotically efficient in the sense that their power reaches the asymptotic envelope
power function given by the Neyman Pearson tests of level $\alpha$ for local alternatives.
The same result holds for two-sided testing problems and asymptotically unbiased tests w.r.t. unbiased asymptotic envelope power functions.
The question of efficiency is slightly different for a composite null hypothesis. We do not go into details but explain the principle
of efficient score functions for a composite null hypothesis. Suppose that we
consider a curve $P_\vartheta$ which hits a composite null hypothesis $H(0)$ at $P_0$, see
figure 7.1. When testing $H(0)$ against $P_\vartheta$ the Neyman Pearson test for $P_0$
against $P_\vartheta$ is not adequate since additional information concerning the path is
used. For this reason $P_\vartheta$ is projected on some $Q_\vartheta \in H(0)$ where $Q_\vartheta$ and $P_\vartheta$
may stand for a least favorable pair, i.e. they correspond to the hardest binary
testing problem arising from $H(0)$ against $P_\vartheta$.
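To explain this approach let
\[
  \beta_\alpha(P_\vartheta) = \sup_{\varphi} E_{P_\vartheta}(\varphi) \tag{7.1}
\]
denote the envelope power function at level $\alpha$ for testing $H(0)$ against $\{P_\vartheta\}$
where the supremum is taken over all tests $\varphi$ of level $\alpha$ for $H(0)$. Let $\varphi_{NP(\alpha, Q, P_\vartheta)}$
be the Neyman Pearson test for a simple null hypothesis $\{Q\} \subset H(0)$ against
$P_\vartheta$ with nominal level $\alpha$. Obviously, the upper bound
\[
  \beta_\alpha(P_\vartheta) \le \inf_{Q \in H(0)} E_{P_\vartheta}\bigl(\varphi_{NP(\alpha, Q, P_\vartheta)}\bigr) \tag{7.2}
\]
of the envelope power function can be derived. Thus a level $\alpha$ test $\psi$ for $H(0)$ is
efficient for testing $H(0)$ against $\{P_\vartheta\}$ if there exists a (projection) $Q_\vartheta \in H(0)$
with $E_{P_\vartheta}(\psi) = E_{P_\vartheta}(\varphi_{NP(\alpha, Q_\vartheta, P_\vartheta)})$. For an increasing sample size $n$ we attach
the index $n$ to $H_n(0)$, $P_{\vartheta n}$ and $\psi_n$. A sequence of tests $\psi_n$ of asymptotic level
$\alpha$ for testing $H_n(0)$ against $\{P_{\vartheta n}\}$ is then efficient if there exists a sequence
$Q_{\vartheta n} \in H_n(0)$ with
\[
  E_{P_{\vartheta n}}(\psi_n) - E_{P_{\vartheta n}}\bigl(\varphi_{NP(\alpha, Q_{\vartheta n}, P_{\vartheta n})}\bigr) \longrightarrow 0 \tag{7.3}
\]
as $n \to \infty$.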
The projection procedure is visualized in figure 7.1.
[Figure 7.1: the path $P_\vartheta$ hits the composite null hypothesis $H(0)$ at $P_0$; $Q_\vartheta \in H(0)$ denotes the projection of $P_\vartheta$ on $H(0)$.]
There is a nice analytic way to describe the projection procedure of Pϑ on Qϑ
in H(0) via score functions. Let
\[
  l = \frac{d}{d\vartheta} \log \frac{dP_\vartheta}{dP_0} \Big|_{\vartheta = 0} \tag{7.4}
\]
be the score function of the underlying path. Introduce the set M ⊂ L2 (P0 )
which is the closed linear subspace generated by all possible score functions
at $P_0$ arising from paths in $H(0)$. Then $l = l_1 + l_2$ can be decomposed into a
score function $l_2 \in M$ and an $M$-orthogonal part $l_1$. The function $l_1$ is called
the efficient score function, see Bickel et al. [6]. The members of the path
$Q_\vartheta$, given by the projection of $P_\vartheta$ on $H(0)$, usually have the score function $l_2$.
In this case the efficient score function is equal to
$l_1 = \frac{d}{d\vartheta} \log \frac{dP_\vartheta}{dQ_\vartheta} \big|_{\vartheta = 0}$. On
the other hand, if we start with the decomposition $l = l_1 + l_2$ then typically
a path in $H(0)$ with score function $l_2$ can be constructed. Its members
are then candidates for the projection $Q_\vartheta$. For an $L_2$-differentiable path we
now summarize the results for testing a composite null hypothesis of product
measures against local alternatives derived from our path $P_\vartheta$. The next theorem
is in the spirit of Theorem 7.1.
Theorem 7.2
Score tests based on the efficient score function $l_1$ (of the underlying score
function $l$) are asymptotically efficient for composite hypotheses against one-sided local alternatives given by the underlying path with score function $l$.
The details about efficient score functions can be found in Bickel et al. [6]. We
refer to Witting and Müller-Funk [43], sect. 6.4., for applications to tests.
The meaning of the projection method can easily be understood if we turn
to the limit experiment, see Witting and Müller-Funk [43], sect. 6.4. Under L2 -differentiability local asymptotic normality (LAN) holds and the limit
experiment is a Gaussian shift with a Gaussian loglikelihood process. The
asymptotic envelope power function is then the envelope power function for
the limit experiment. It is easy to see that for Gaussian shifts the envelope
power function is given by the Neyman Pearson power of the projection Qϑ of
Pϑ on H(0) against Pϑ . Under LAN the same holds for the asymptotic power
function within the asymptotic set-up.
These general principles will now be applied and explained for the different
null hypotheses $H_0^{I}$ and $H_0^{II}$ where the distributions $F_1 = F_2$ and the $G$'s are
nuisance parameters. The paths of alternatives are constructed as in (4.10) via
the hazard rate models of chapter 2.
In his fundamental work Gill [9], chapt. 5, calculated the asymptotic power
function of our unconditional survival tests (5.33) under local hazard alternatives. As a main tool he used the martingale approach summarized in Th. 5.1.
Let $w(\cdot)$ be an $\mathcal{F}$-predictable weight function. Then
\[
  \int_0^t w(s)\, dL_n(s) - \int_0^t w\, \frac{Y_1 Y_2}{Y}\, [d\Lambda_2 - d\Lambda_1] \tag{7.5}
\]
is a martingale. Under local alternatives of type (4.10), continuous time
martingale central limit theorems then prove asymptotic normality of our test statistic $T_n$. Notice that contiguity implies that the variance estimator $V_n$ of $T_n$
remains consistent under local alternatives whenever (5.38) holds under the
null hypothesis. These results were used by Gill [9], sect. 5.1 and 5.3, to compare competing weight functions and tests for the underlying hazard models,
see also Fleming and Harrington [8]. The general question about the efficiency
in comparison with envelope power functions remained open at that time. Below the efficiency of tests is discussed for the different null hypotheses (4.6)-(4.8). We restrict ourselves to one-sided tests. Two-sided alternatives
can be handled similarly.
7.1 The null hypothesis $H_0^{I}$ with equal censorship $G_1 = G_2$
For the parametric set-up the score tests based on the statistic (4.14) are
asymptotically efficient for testing {ϑ = 0} against {ϑ > 0} (asymptotically equivalent to Neyman Pearson tests) in direction γ = γ0 ◦ F under
the parametrization (4.10) with arbitrary centered regression coefficients if
maxi |cni | → 0 holds.
At this stage we point out how the parametrization (4.13) given by the
centered regression coefficients fits into the efficiency concept of Theorem 7.2.
Let us start with an uncentered two-sample parametrization
\[
  P_\vartheta := \mathcal{L}(T_1, \dots, T_n)
  = F^{n_1} \otimes F^{n_2}_{\vartheta (n/(n_1 n_2))^{1/2}}, \tag{7.6}
\]
see (4.10) and (4.12), with score function
\[
  l = \left(\frac{n}{n_1 n_2}\right)^{1/2} \sum_{i=n_1+1}^{n} M_\gamma(X_i, \Delta_i, \infty). \tag{7.7}
\]
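We will see that (at least in the asymptotic set-up) the projection of $P_\vartheta$ on $H_0^{I}$
is given by the product measure
\[
  Q_\vartheta = F^{n}_{\vartheta \frac{n_2}{n} (n/(n_1 n_2))^{1/2}}. \tag{7.8}
\]
Observe that the family $\vartheta \mapsto Q_\vartheta$ has the score function
\[
  l_2 = \frac{n_2}{n} \left(\frac{n}{n_1 n_2}\right)^{1/2} \sum_{i=1}^{n} M_\gamma(X_i, \Delta_i, \infty). \tag{7.9}
\]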
It is easy to see that the efficient score function of $l$ is just
\[
  l_1 = l - l_2 = \sum_{i=1}^{n} c_{ni}\, M_\gamma(X_i, \Delta_i, \infty), \tag{7.10}
\]
where cni denote the centered regression coefficients (4.13). We remark that
the projection step can be avoided when we are starting with the centered
regression model (4.13). This discussion is not primarily a matter of censored
data. In case of ordinary rank tests Hájek et al. [12] already worked with
centered regression coefficients.
For simplicity only centered regression coefficients (4.13) are used throughout.
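On the level of regression coefficients the projection step amounts to subtracting the mean. The following sketch illustrates this under the assumption that the uncentered coefficients are $(n/(n_1 n_2))^{1/2}$ on group 2 and zero on group 1, as suggested by (7.7); the function name `regression_coefficients` is hypothetical.

```python
from math import isclose, sqrt


def regression_coefficients(n1, n2):
    """Return (uncentered, mean, centered) two-sample regression coefficients."""
    n = n1 + n2
    a = sqrt(n / (n1 * n2))
    uncentered = [0.0] * n1 + [a] * n2          # weights of l, cf. (7.7)
    mean = sum(uncentered) / n                  # common weight of l_2, cf. (7.9)
    centered = [c - mean for c in uncentered]   # weights of l_1, cf. (7.10)
    return uncentered, mean, centered
```

For `n1 = 4` and `n2 = 6` the centered coefficients sum to zero and their squares sum to one, which is the normalization usually imposed on centered regression coefficients.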
Remark 7.3
In Janssen [16] the elimination of the nuisance parameters $F$ and $\Lambda$, see (4.15)-(4.17), was made rigorous. For weights of type (4.18) with $w_n(i) = \gamma_0(\hat F_n(X_{i:n}-))$
(or more generally for $L_2$-convergent weights) our test statistic $T_n$ (4.17) is
asymptotically equivalent to the score function (4.14), i.e. $T_n - l_1 \to 0$ under
$H_0^{I}$. Together with the consistency of $V_n$, see (5.38), this equivalence implies
the efficiency of $\varphi_n$ (and its conditional counterpart $\varphi_{n,perm}$ (6.2)) in direction
of all semiparametric alternatives given by $\gamma_0 \circ F$ where $F$ is the nuisance
parameter and $\gamma_0$ is the parameter of interest.
The next theorem answers the question about the power of $\varphi_n$ (constructed
for direction $\gamma_0 \circ F$) under local alternatives specified by a hazard derivative
$\gamma_1 \circ F$.
We assume that $\int_0^\infty (\gamma_0 \circ F)(\gamma_1 \circ F)(1 - G)\, dF \ge 0$ holds. Otherwise, a minus
sign has to be added before $(ARE)^{1/2}$ in formula (7.11).
Theorem 7.4
Let $\varphi_n$ be the sequence of efficient survival tests in direction $\gamma_0 \circ F$. Under
local two-sample alternatives given by the hazard rate derivatives in direction
$\gamma_1 \circ F$ of type (4.10) with (4.13) the asymptotic power function is
\[
  \Phi\bigl((ARE)^{1/2} \sigma - u_{1-\alpha}\bigr) \tag{7.11}
\]
where $\sigma^2 = \int (\gamma_1 \circ F)^2 (1 - G)\, dF$ and ARE is the asymptotic relative Pitman
efficiency
\[
  ARE = \frac{\left( \int_0^\infty (\gamma_0 \circ F)(\gamma_1 \circ F)(1 - G)\, dF \right)^2}
             {\int_0^\infty (\gamma_0 \circ F)^2 (1 - G)\, dF \ \int_0^\infty (\gamma_1 \circ F)^2 (1 - G)\, dF}. \tag{7.12}
\]
Proof. See Janssen [16], Th. 3.1 and (3.17).
More information about the Pitman efficiency can be found in Hájek and Šidák
[12], Witting and Müller-Funk [43] and the lecture notes Janssen [19].
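Formula (7.12) can be evaluated numerically once the directions and the censoring are specified. The sketch below assumes that $F$ is continuous and strictly increasing, so that the substitution $u = F(s)$ turns all three integrals into integrals over $(0, 1)$; the function name `pitman_are`, the midpoint rule and the example directions are choices made here for illustration only.

```python
def pitman_are(gamma0, gamma1, censoring_survival=lambda u: 1.0, grid=20000):
    """Evaluate the ARE after the substitution u = F(s) by the midpoint rule.

    gamma0, gamma1        : directions as functions on (0, 1)
    censoring_survival(u) : stands for 1 - G(F^{-1}(u)); 1.0 means no censoring
    """
    h = 1.0 / grid
    cross = norm0 = norm1 = 0.0
    for k in range(grid):
        u = (k + 0.5) * h
        w = censoring_survival(u) * h
        cross += gamma0(u) * gamma1(u) * w
        norm0 += gamma0(u) ** 2 * w
        norm1 += gamma1(u) ** 2 * w
    return cross ** 2 / (norm0 * norm1)
```

Without censoring, the constant direction `lambda u: 1.0` (log-rank weights) against the direction `lambda u: 1.0 - u` gives $(1/2)^2 / (1 \cdot 1/3) = 3/4$, and a test constructed for the true direction has ARE equal to 1. By the Cauchy-Schwarz inequality the value never exceeds 1.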
7.2 The null hypothesis $H_0^{II}$ (unequal censorship with $G_1 \neq G_2$)
Under unequal censorship the null hypothesis is much larger than above. Observe first that then the asymptotic variances (5.38) of $T_n$ (as well as the
asymptotic power function (7.11)) depend on the pair $G_1, G_2$. However, the
efficiency of extended survival rank tests can be obtained. This was first done
by Gill in ABGK [3], VIII 2.3, under parametric models and in VIII 4.2 for
semiparametric transformation models. On page 624 the authors speak of
"our informal discussion here". For an easy proof of the general asymptotic efficiency we refer to the beautiful article of Neuhaus [36], who followed
the lines of ABGK [3]. He calculated the efficient score function and then
proved the efficiency of the underlying survival tests for the semiparametric
direction $\gamma \circ F$. We will indicate how his construction fits in the methodology
of Theorem 7.2, which gives us further insight into this result.
Under centered regression coefficients the parametric score function $l$ (4.11)
can be expressed by the martingales $M_i$ of the counting processes $N_i$, see (5.6)
and (5.7). Combining the results of chapters 4 and 5 we arrive at
\[
  l = \sum_{i=1}^{n} c_{ni}\, M_\gamma(X_i, \Delta_i, \infty)
    = \left(\frac{n_1 n_2}{n}\right)^{1/2} \int_0^\infty \gamma \circ F
      \left( \frac{dM_2}{n_2} - \frac{dM_1}{n_1} \right), \tag{7.13}
\]
which differs from Neuhaus [36], (3.35), by a minus sign since groups 1 and 2 are interchanged. Again the score function $l$ can be projected onto a set of score functions arising from paths in $H_0^{II}$. In contrast to section 7.1 it is necessary to vary the direction $\gamma$, and the projection procedure is not merely a manipulation of regression coefficients.
Observe that $L_2$-differentiable paths (of i.i.d. survival times) within $H_0^{II}$ admit score functions
$$
Z(h) := \frac{1}{\sqrt{n}} \sum_{i=1}^{n} M_h(X_i, \Delta_i, \infty) = \frac{1}{\sqrt{n}} \int_0^\infty h\, [dM_1 + dM_2] \tag{7.14}
$$
at $F^n$ where a further hazard derivative $h$ serves as a nuisance parameter. The calculation is similar to (7.13).
Neuhaus calculated the projection $l_2$ of $l$ onto the space generated by (7.14) (at least in the asymptotic set-up). This projection is given by $l_2 = Z(\bar h)$ for the special choice
$$
\bar h = \left( \frac{n_1 n_2}{n^2} \right)^{1/2} \frac{\bar G_2 - \bar G_1}{\bar G}\; \gamma \circ F. \tag{7.15}
$$
The function $\bar G_i$ is defined by $\bar G_i = 1 - G_i$ and $\bar G = \frac{n_1}{n} \bar G_1 + \frac{n_2}{n} \bar G_2$.
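The efficient score function of (7.13) is then
$$
l_1 = l - l_2 = \frac{1}{\sqrt{n}} \int_0^\infty \gamma_0 \circ F \left[ \left( \frac{n_1}{n_2} \right)^{1/2} \frac{\bar G_1}{\bar G}\, dM_2 - \left( \frac{n_2}{n_1} \right)^{1/2} \frac{\bar G_2}{\bar G}\, dM_1 \right]. \tag{7.16}
$$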
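The identity $l_1 = l - Z(\bar h)$ can be checked by hand. The following short computation is a sketch that uses only (7.13), (7.14), (7.15) and the definition of $\bar G$, under the assumption $n = n_1 + n_2$; it is not quoted from Neuhaus [36]. It is written for a generic direction $\gamma$, and the choice $\gamma = \gamma_0$ gives the integrand of (7.16).

```latex
\begin{align*}
l &= \frac{1}{\sqrt{n}} \int_0^\infty \gamma \circ F
     \left[ \left(\tfrac{n_1}{n_2}\right)^{1/2} dM_2
          - \left(\tfrac{n_2}{n_1}\right)^{1/2} dM_1 \right],
  \\
\bar G - \bar G_1 &= \tfrac{n_2}{n}\,(\bar G_2 - \bar G_1),
  \qquad
  \bar G_2 - \bar G = \tfrac{n_1}{n}\,(\bar G_2 - \bar G_1),
  \\
\bar h &= \left(\tfrac{n_1}{n_2}\right)^{1/2} \frac{\bar G - \bar G_1}{\bar G}\, \gamma \circ F
        = \left(\tfrac{n_2}{n_1}\right)^{1/2} \frac{\bar G_2 - \bar G}{\bar G}\, \gamma \circ F,
  \\
\left(\tfrac{n_1}{n_2}\right)^{1/2} \gamma \circ F - \bar h
       &= \left(\tfrac{n_1}{n_2}\right)^{1/2} \frac{\bar G_1}{\bar G}\, \gamma \circ F,
  \\
\left(\tfrac{n_2}{n_1}\right)^{1/2} \gamma \circ F + \bar h
       &= \left(\tfrac{n_2}{n_1}\right)^{1/2} \frac{\bar G_2}{\bar G}\, \gamma \circ F,
  \\
l - Z(\bar h) &= \frac{1}{\sqrt{n}} \int_0^\infty
     \left[ \Big( \left(\tfrac{n_1}{n_2}\right)^{1/2} \gamma \circ F - \bar h \Big)\, dM_2
          - \Big( \left(\tfrac{n_2}{n_1}\right)^{1/2} \gamma \circ F + \bar h \Big)\, dM_1 \right].
\end{align*}
```

Inserting the fourth and fifth lines into the last one yields the weights $\bar G_1 / \bar G$ in front of $dM_2$ and $\bar G_2 / \bar G$ in front of $dM_1$.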
The efficient score function l1 is of parametric nature. The semiparametric
version is
Z ∞
γ0 ◦ Fˆn (t−) dLn (t)
(7.17)
Tn =
0
given by the log-rank process Ln and with the weight function (4.18) based on
the Kaplan-Meier estimator. Gill [9], Theorem 4.2.1, proves that
$$
l_1 - T_n \longrightarrow 0 \tag{7.18}
$$
holds in probability under $H_0^{II}$ and mild assumptions, see also Fleming and Harrington [8]. Recall from section 5 that the variance estimator $V_n$ is still consistent under $H_0^{II}$. Together with (7.18), the asymptotic efficiency of the semiparametric survival tests $\varphi_n$, see (5.33), based on (7.17) follows for one-sided hazard alternatives in direction $\gamma_0 \circ F$. According to Theorem 6.4 the same holds for their conditional counterparts of general studentized permutation test type.
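A statistic of the type (7.17) can be computed in a few lines. The sketch below uses the standard textbook form of a weighted log-rank statistic (observed minus expected events in group 2, with hypergeometric variance) and the left-continuous pooled Kaplan-Meier estimator as argument of the weight. The normalisation of the log-rank process $L_n$, the exact weights (4.18) and the variance estimator $V_n$ of the paper are not reproduced here, so the function `weighted_logrank` and its conventions should be read as assumptions rather than as the authors' definitions.

```python
import numpy as np

def weighted_logrank(time, event, group, gamma=lambda u: np.ones_like(u)):
    """Return (statistic, variance) of a weighted log-rank statistic.

    time  : observed times X_i
    event : 1 if uncensored, 0 if censored (Delta_i)
    group : 1 or 2
    gamma : weight function on [0, 1], evaluated at the pooled
            Kaplan-Meier estimate of F just before each event time
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    group = np.asarray(group, dtype=int)
    surv_left = 1.0                       # pooled Kaplan-Meier survival at t-
    stat, var = 0.0, 0.0
    for t in np.unique(time[event == 1]):
        at_risk = time >= t
        y = at_risk.sum()                             # number at risk, both groups
        y2 = (at_risk & (group == 2)).sum()           # number at risk, group 2
        dead = (time == t) & (event == 1)
        d = dead.sum()                                # events at t, both groups
        d2 = (dead & (group == 2)).sum()              # events at t, group 2
        w = float(gamma(np.asarray(1.0 - surv_left)))
        stat += w * (d2 - d * y2 / y)                 # observed minus expected
        if y > 1:
            var += w**2 * d * (y2 / y) * (1 - y2 / y) * (y - d) / (y - 1)
        surv_left *= 1.0 - d / y                      # Kaplan-Meier update
    return stat, var

# Tiny artificial data set without censoring (illustration only).
times, events, groups = [1, 2, 3, 4], [1, 1, 1, 1], [1, 2, 1, 2]
t_logrank, v_logrank = weighted_logrank(times, events, groups)
t_early, _ = weighted_logrank(times, events, groups, gamma=lambda u: 1.0 - u)
```

The constant weight gives the ordinary log-rank statistic; the hypothetical weight `1 - u` puts more mass on early event times. A studentized one-sided test would compare `stat / sqrt(var)` with a normal quantile.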
For $G_1 \neq G_2$ the form (7.12) of ARE has to be modified. Note that the discussion above proves that the ARE coincides with the efficacies of Gill [9], sect. 5, who was already able to compare the power functions of competing survival tests with linear statistics. By the results of Neuhaus we are now able to compare the asymptotic power of $\varphi_n$ with the asymptotic envelope power function w.r.t. level $\alpha$ tests for $H_0^{II}$, which is of course more general.
Chapter 8
Omnibus tests and related tests
Similarly to section 7, asymptotically efficient two-sided survival tests can be obtained for the testing problem (4.5) with $F_1 = F_2$ against two-sided alternatives. The tests given by the statistics $|T_n|$ are then asymptotically efficient within the class of asymptotically unbiased tests for the underlying semiparametric model given by direction $\gamma \circ F$. Also the formula (7.12) for the asymptotic relative Pitman efficiency ARE remains unchanged under $G_1 = G_2$. These tests have the disadvantage that they have poor power when ARE is small. For those directions with ARE = 0 they are not consistent. In order to overcome this difficulty, goodness of fit tests can be applied.
8.1 Two-sided goodness of fit tests
For the motivation of this section we refer to Gill [9], sect. 5.4, and ABGK [3]. Let us first consider the uncensored two-sample case. Classical goodness of fit test statistics are often given by
$$
\sup_{t} \left| \int_{-\infty}^{t} w\, (d\hat F_{n1} - d\hat F_{n2}) \right| \tag{8.1}
$$
which is based on the weighted difference of the underlying empirical processes $\hat F_{ni}$ of group $i = 1, 2$. Also the sup-norm can be substituted by other seminorms. It is already known from the work of Khmaladze [24] that the martingale part of the Doob-Meyer decomposition of the empirical processes may serve as a good ingredient of test statistics. In the case of uncensored data the details are outlined in Janssen and Milbrodt [23] for the two-sample problem. In conclusion (8.1) is substituted by
$$
\sup_{t \ge 0} |T_n(t)| \tag{8.2}
$$
where Tn (t) is as in (5.11) a sequential version of our survival test statistic
(4.16). Tests based on (8.2) are now appropriate for censored data. They are
called tests of Rényi type and have already been studied by Gill [9], sect.
5.4, see also Fleming and Harrington [8]. The asymptotics is fully understood
and we will only summarize the results and indicate how the test statistics
(8.2) can be treated.
• Under $H_0^{II}$ a functional central limit theorem holds for $t \mapsto T_n(t)$ on $D[0, \infty)$ where the limit is given by a time rescaled Brownian motion $t \mapsto B(V(t))$. Here $V(t)$ is given by the integrand of (5.32) on the domain $[0, t]$. Similarly to section 5, discrete martingale methods apply to the grid, confer Remark 5.5.
• Unconditional asymptotic level $\alpha$ tests can thus be introduced via the critical values calculated for the limit process.
• Similarly to section 6, conditional versions of Rényi tests (permutation tests) work well. Again studentized versions should be preferred. In this case Theorem 6.4 carries over to Rényi tests and the conditional versions remain asymptotically equivalent. This result is due to Neuhaus [34]. Conditional functional central limit theorems were also established by Janssen and Mayer [22].
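The first two items can be turned into a small computation. Since the limit is $B(V(t))$, Brownian scaling gives that $\sup_t |B(V(t))| / V(\infty)^{1/2}$ has the law of $\sup_{0 \le s \le 1} |B(s)|$, whose distribution function is given by a classical alternating series. The sketch below is written under these assumptions for the unweighted log-rank path with the usual hypergeometric variance; the names `logrank_path`, `sup_abs_bm_pvalue` and `renyi_test` are hypothetical, and the statistic is not claimed to coincide with the exact normalisation of (5.11) and (5.32).

```python
import math
import numpy as np

def logrank_path(time, event, group):
    """Cumulative log-rank sums T(t) at the event times and the total variance."""
    time, event, group = map(np.asarray, (time, event, group))
    path, total, var = [], 0.0, 0.0
    for t in np.unique(time[event == 1]):
        at_risk = time >= t
        y, y2 = at_risk.sum(), (at_risk & (group == 2)).sum()
        dead = (time == t) & (event == 1)
        d, d2 = dead.sum(), (dead & (group == 2)).sum()
        total += d2 - d * y2 / y                      # observed minus expected, group 2
        if y > 1:
            var += d * (y2 / y) * (1 - y2 / y) * (y - d) / (y - 1)
        path.append(total)
    return np.array(path), var

def sup_abs_bm_pvalue(x, terms=50):
    """P(sup_{0<=s<=1} |B(s)| > x) for a standard Brownian motion B."""
    if x <= 0:
        return 1.0
    s = sum((-1) ** k / (2 * k + 1)
            * math.exp(-math.pi ** 2 * (2 * k + 1) ** 2 / (8 * x * x))
            for k in range(terms))
    return min(1.0, max(0.0, 1.0 - 4.0 / math.pi * s))

def renyi_test(time, event, group):
    """Studentized sup-statistic and its asymptotic p-value."""
    path, var = logrank_path(time, event, group)
    q = float(np.max(np.abs(path)) / math.sqrt(var))
    return q, sup_abs_bm_pvalue(q)

# Tiny artificial data set (illustration only).
q_stat, p_val = renyi_test([1, 2, 3, 4], [1, 1, 1, 1], [1, 2, 1, 2])
```

The sketch does not guard against a vanishing variance, which would have to be handled separately for degenerate samples. A permutation version would recompute `q_stat` after random relabelling of the groups instead of using the series.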
8.2 Projection tests in survival analysis
Although goodness of fit tests are usually consistent under fixed alternatives, their quality is strongly influenced by the choice of weight functions or the underlying seminorm on the space of trajectories. The judgement of goodness of fit tests relies on a principal component decomposition of the asymptotic power function given by sequences of local alternatives, see Janssen and Milbrodt [23] and Janssen [18], [20]. However, already in the case of uncensored data the analytic calculation of the principal components may be difficult, see also Shorack and Wellner [39], chapter V.
At this stage another class of adaptive tests called projection tests is most promising. They have the advantage that the statistician can select a whole cone or subspace of alternatives (not only of dimension one as treated earlier in sections 4 and 7) which can be separated from the null hypothesis with sufficiently high probability. We will only present the methodology and refer to the literature for the details. The basic idea is due to Behnen and Neuhaus [5], section 3.2(C), who developed projection tests for models given by tangents (score functions).
The extension to hazard rate models and censored data was done by Mayer [29], [30]. As an example we will treat the one-sided two-sample testing problem with stochastically larger alternatives. Two-sided tests can be treated similarly. The idea is now to replace the semiparametric direction $\gamma \circ F$ of (2.17) and (4.10) by a cone of relevant alternatives. The scientist has to specify a
finite number $r$ (not too many) of semiparametric directions $\gamma_i \circ F$, $1 \le i \le r$, derived from the stochastically larger alternatives. The hazard rate derivatives $\gamma_i \circ F$ stand for directions of high preference. For instance, the statistician may wish to detect differences of the relative risk (hazard rates) for
• proportional hazards (constant over time)
• early survival times
• central survival times
• late survival times.
(The example for the γi , 1 ≤ i ≤ 4, may be taken from the list of figure 2.1.)
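For a fixed choice $\gamma_i : [0, 1] \to \mathbb{R}$ let us now denote by
$$
V^+ := \left\{ \gamma_\beta \circ F := \sum_{i=1}^{r} \beta_i\, \gamma_i \circ F,\ \beta_i \ge 0,\ \beta = (\beta_1, \dots, \beta_r) \right\} \tag{8.3}
$$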
the cone of relevant hazard rate derivatives. A concrete model is just (2.17) with $\gamma_\beta \circ F$ where now $\beta \in [0, \infty)^r$ (and also $F$) is an additional nuisance parameter. Testing $H_0^{II}$ can be expressed by testing the local coordinates
$$
\beta = 0 \quad \text{against} \quad \beta \in V^+ \setminus \{0\}. \tag{8.4}
$$
Let us attach an additional index $\gamma$, with the convention $T_n =: T_n(\gamma)$, to the ordinary survival statistic (5.10) with weights (4.18). We briefly summarize the solution of Mayer [29], [30] which works along the lines of Behnen and Neuhaus [5]. The projection test for (8.4) is an asymptotic likelihood ratio test within the asymptotic limit model under local alternatives. By the projection method of Behnen and Neuhaus [5] the vector $\beta$ is estimated by $\hat\beta = (\hat\beta_1, \dots, \hat\beta_r)$. The likelihood ratio type test statistic is then
$$
T_n\Big( \sum_{i=1}^{r} \hat\beta_i \gamma_i \Big) = \sum_{i=1}^{r} \hat\beta_i\, T_n(\gamma_i). \tag{8.5}
$$
This test statistic can be motivated as a likelihood ratio test statistic for (8.4) within the asymptotic Gaussian shift model. The choice of asymptotic or permutation based critical values now depends, as in (5.33), on a proper variance estimation $\hat V_n$ under $H_0^{II}$. This is solved in the work of Mayer [29], [30]. The test $\varphi_n$ where $T_n$ and $V_n$ are replaced by (8.5) and some estimator $\hat V_n$ is called projection test. The region with high power is then spread out over the cone $V^+$ of alternatives. In the case of two-sided alternatives (with a subspace instead of the cone $V^+$) the tests are Neyman smooth type tests for uncensored data.
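The projection step behind (8.5) can be illustrated in the limit model. The sketch below assumes a Gaussian shift model in which the vector $T = (T_n(\gamma_1), \dots, T_n(\gamma_r))$ is approximately $N(\Sigma\beta, \Sigma)$ with a known positive definite matrix $\Sigma$, so that the log-likelihood ratio is $\beta^\top T - \tfrac12 \beta^\top \Sigma \beta$ and its maximiser over $\beta \ge 0$ plays the role of $\hat\beta$. This is only one plausible reading of the projection method; the estimator and the variance estimation of Mayer [29], [30] are not reproduced, and the function name `cone_projection_statistic` is hypothetical. For small $r$ the maximiser is found by enumerating all candidate supports.

```python
from itertools import combinations
import numpy as np

def cone_projection_statistic(t, sigma):
    """Maximise b'T - b' Sigma b / 2 over b >= 0 and return (b_hat, b_hat'T).

    t     : vector of the r linear statistics T_n(gamma_i)
    sigma : assumed positive definite covariance matrix of t (r x r)
    """
    t = np.asarray(t, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    r = len(t)
    best_beta, best_val = np.zeros(r), 0.0        # b = 0 is always feasible
    for k in range(1, r + 1):
        for support in combinations(range(r), k):
            idx = list(support)
            # stationary point of the objective restricted to this support
            b = np.linalg.solve(sigma[np.ix_(idx, idx)], t[idx])
            if np.all(b >= 0):                    # feasible candidate only
                val = 0.5 * b @ t[idx]            # objective value at the candidate
                if val > best_val:
                    best_beta = np.zeros(r)
                    best_beta[idx] = b
                    best_val = val
    return best_beta, float(best_beta @ t)

# Illustration with r = 2 hypothetical directions.
beta_a, stat_a = cone_projection_statistic([1.0, -1.0], np.eye(2))
beta_b, stat_b = cone_projection_statistic([-1.0, -2.0], np.eye(2))
beta_c, stat_c = cone_projection_statistic([1.0, 1.0], [[1.0, 0.5], [0.5, 1.0]])
```

The returned statistic is of the form $\sum_i \hat\beta_i T_n(\gamma_i)$ as in (8.5); it vanishes when no direction of the cone is supported by the data. The enumeration visits $2^r - 1$ supports, which is acceptable because $r$ is meant to be small.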
Mayer showed that these projection tests are asymptotically admissible and
that Monte-Carlo simulations support them. He also proved (similarly to
section 6) that permutation versions of studentized statistics share the same
asymptotic properties as their unconditional counterparts.
Bibliography
[1] Andersen, P.K.; Borgan, Ø. (1985). Counting Process Models for Life History
Data: A Review. Scand. J. Statist. 12, 97-158.
[2] Andersen, P.K.; Borgan, Ø.; Gill, R.D.; Keiding, N. (1982). Linear nonparametric tests for comparison of counting processes with application to censored survival
data (with discussion). Int. Statist. Rev. 50, 219-258.
[3] Andersen, P.K.; Borgan, Ø.; Gill, R.D.; Keiding, N. (1993). Statistical models
based on counting processes. Springer, New York.
[4] Balakrishnan, N.; Rao, C.R. (2004). Advances in Survival Analysis. Handbook of
Statistics 23. Elsevier, Amsterdam, 251-262.
[5] Behnen, K.; Neuhaus, G. (1989). Rank tests with estimated scores and their application. Teubner-Skripten zur Mathematischen Stochastik. Stuttgart.
[6] Bickel, P.; Klaassen, C.; Ritov, Y.; Wellner, J. (1993). Efficient and adaptive
estimation for semiparametric models. Johns Hopkins Series in Math. Sciences, The
Johns Hopkins Univ. Press, Baltimore.
[7] Efron, B.; Johnstone, I. (1990). Fisher's information in terms of the hazard rate.
Ann. Statist. 18, 38-62.
[8] Fleming, T.R.; Harrington, D.P. (1991). Counting processes and survival analysis. Wiley, New York.
[9] Gill, R.D. (1980). Censoring and stochastic integrals. Math. Centre Tracts 124, Mathematisch Centrum, Amsterdam.
[10] Hall, P.; Heyde, C.C. (1980). Martingale limit theory and its application. Probability
and Mathematical Statistics. Academic Press, New York.
[11] Hájek, J.; Šidák, Z. (1967). Theory of rank tests. New York-London: Academic
Press, Prague.
[12] Hájek, J.; Šidák, Z.; Sen, P.K. (1999). Theory of rank tests. 2nd ed., Academic
Press. xiv, Orlando.
[13] Harrington, D.P.; Fleming, T.R. (1982). A class of rank test procedures for censored survival data. Biometrika 69, 553-566.
[14] Heller, G.; Venkatraman, E.S. (1996). Resampling procedures to compare two
survival distributions in the presence of right-censored data. Biometrics 52, 1204-1213.
[15] Janssen, A. (1989). Local asymptotic normality for randomly censored data with applications to rank tests. Statist. Neerlandica 43, 109-125.
[16] Janssen, A. (1991). Conditional rank tests for randomly censored data. Ann. Stat.
19, 1434-1456.
[17] Janssen, A. (1994). On local odds and hazard rate models in survival analysis. Statistics and Probability Letters 20, 355-365.
[18] Janssen, A. (1995). Principal component decomposition of non-parametric tests.
Probability Theory and Related Fields 101, 193-209.
[19] Janssen, A. (1998). Zur Asymptotik nichtparametrischer Tests. Lecture Notes.
Skripten zur Mathem. Statistik 29, Münster.
[20] Janssen, A. (2000). Global power functions of goodness of fit tests. Ann. Stat. 28,
239-253.
[21] Janssen, A.; Neuhaus, G. (1997). Two-sample rank tests for censored data with
non-predictable weights. J. Stat. Plann. Inference 60, 45-59.
[22] Janssen, A.; Mayer, C.-D. (2001). Conditional Studentized survival tests for randomly censored models. Scand. J. Stat. 28, 283-293.
[23] Janssen, A.; Milbrodt, H. (1993). Rényi type goodness of fit tests with adjusted
principal direction of alternatives. Scand. J. Stat. 20, 177-194.
[24] Khmaladze, E.V. (1981). Martingale approach in the theory of goodness-of-fit tests.
Theory Probab. Appl. 26, 240-257.
[25] Klein, J.P.; Moeschberger, M.L. (2003). Survival analysis. Techniques for censored
and truncated data. 2nd ed. Statistics for Biology and Health. New York.
[26] Lan, K.-K.G.; Wittes, J. (1990). Linear Rank Tests for Survival Data: Equivalence
of Two Formulations. The American Statistician (Teacher's Corner), Vol. 40, 23-26.
[27] LeCam, L.; Yang, L.G. (1988). On the preservation of local asymptotic normality
under information loss. Ann. Stat. 16, 483-520.
[28] LeCam, L.; Yang, L.G. (2000). Asymptotics in statistics. Some basic concepts. Second edition. Springer series in statistics, New York.
[29] Mayer, C.-D. (1996). Projektionstests für das Zweistichprobenproblem mit zensierten
Daten. Dissertation, Heinrich-Heine Universität Düsseldorf.
[30] Mayer, C.-D. (1998). Projection-type rank tests for randomly right censored data.
Technical Report, University of Düsseldorf.
[31] McLeish, D.L. (1974). Dependent central limit theorems and invariance principles.
Ann. Prob. 2, 620-628.
[32] Moser, M. (1994). Completeness of time-ordered indicators in censored data models.
Stat. Probab. Lett. 21, 163-166.
[33] Neuhaus, G. (1988). Asymptotically optimal rank tests for the two-sample problem
with randomly censored data. Comm. Statist. Theory Methods 17, 2037-2058.
[34] Neuhaus, G. (1991). Some linear and nonlinear rank tests for competing risks models.
Commun. Stat., Theory Methods 20, 667-701.
[35] Neuhaus, G. (1993). Conditional rank tests for the two-sample problem under random
censorship. Ann. Stat. 21, 1760-1779.
[36] Neuhaus, G. (2000). A method of constructing rank tests in survival analysis. Journal
of Statistical Planning and Inference 91, 481-497.
[37] Ritov, Y.; Wellner, J.A. (1988). Censoring, martingales, and the Cox model.
Statistical inference from stochastic processes, Contemp. Math. 80, 191-219.
[38] Schumacher, M.; Schulgen, G. (2002). Methodik klinischer Studien. Methodische
Grundlagen der Planung, Durchführung und Auswertung. Springer Verlag. Berlin.
[39] Shorack, G.R.; Wellner, J.A. (1986). Empirical Processes with Applications to
Statistics. Wiley, New York.
[40] Strasser, H. (1985). Mathematical theory of statistics. Statistical experiments and
asymptotic decision theory. De Gruyter, Berlin.
[41] Völker, D. (2003). Finit optimale nichtparametrische Tests für Lebensdauerzeiten.
Dissertation, Westfälische Wilhelms-Universität Münster.
[42] Witting, H. (1985). Mathematische Statistik I. Parametrische Verfahren bei festem
Stichprobenumfang. B.G. Teubner, Stuttgart.
[43] Witting, H.; Müller-Funk, U. (1995). Mathematische Statistik II. Asymptotische
Statistik: Parametrische Modelle und nichtparametrische Funktionale. B.G. Teubner,
Stuttgart.