A survey about the efficiency of two-sample survival tests for randomly censored data

Arnold Janssen and Wiebke Werft
Heinrich-Heine-Universität Düsseldorf
Universitätsstraße 1, D-40225 Düsseldorf
[email protected]

Abstract. The present paper summarizes recent developments about two-sample tests for randomly right censored data. The tests are derived for hazard oriented models. In particular, it is reviewed how asymptotically efficient tests can be constructed and how to prove the asymptotic efficiency of a large class of linear survival tests. The quality of competing tests can be compared by the local asymptotic Pitman efficiency ARE. Special attention is devoted to the martingale approach, which corresponds to the sequential nature of survival data. Typically, central limit theorems for continuous martingales are applied. A new discrete martingale approach is presented in section 5. This point of view has the advantage that common central limit theorems for partial sums of martingale difference arrays may substitute for more technical continuous martingale arguments. This approach is not only of interest for teaching courses.

Contents

1 Introduction
2 The two-sample hazard model
  2.1 Semiparametric hazard model
  2.2 The score function
3 The model for censored data
4 Towards two-sample survival tests
5 Martingale representations of survival test statistics
6 Conditional survival tests
  6.1 The homogeneous null hypothesis $H_0^{I}$
  6.2 The heterogeneous null hypotheses $H_0^{II}$ and $H_0^{III}$
7 The efficiency of semiparametric survival tests
  7.1 The null hypothesis $H_0^{I}$ with equal censorship $G_1 = G_2$
  7.2 The null hypothesis $H_0^{II}$ (unequal censorship with $G_1 \neq G_2$)
8 Omnibus tests and related tests
  8.1 Two-sided goodness of fit tests
  8.2 Projection tests in survival analysis
Bibliography

Chapter 1
Introduction

In this paper we give an overview of semiparametric and nonparametric two-sample survival tests for two groups of life time data with incomplete observations. We restrict ourselves to randomly right censored data $(X, \Delta)$, where a survival time $T$ is censored by some random variable $C$ and only $X = \min(T, C)$ and the censoring status $\Delta = 1(T \le C)$ are observable. It is not the aim of this article to present the mathematical details with complete proofs. For this subject we mostly refer to the literature. An excellent reference is given by chapters III.2, V and VIII of the fundamental work of Andersen, Borgan, Gill and Keiding (1993), briefly cited as ABGK [3]. Good sources are also Gill [9], chapt. 3-5, Andersen et al. [2] and Fleming and Harrington [8], chapt. 7. An overview can also be found in the recent handbook [4].

Below we would like to explain the motivation for the survival models, their meaning, the mathematics and the technical difficulties behind the construction of survival tests. We start with a risk model in section 2 which may explain that survival in a test group (with disease) is more risky than survival in a control group given by a standard population. The modelling is done via hazard ratios and so-called risk component models explaining how additional risk factors work. Next, survival tests are derived step by step, see sections 4 and 5. We begin with purely parametric models and their score tests. It is shown how nonparametric nuisance parameters like the baseline hazards and their distributions can be eliminated.

We would like to stress the sequential nature of survival data. Typically, patients enter a clinical study and the data are collected sequentially until the end of the study, which may cause censoring. The sequential behavior leads to well motivated weighted test statistics.
They are based on increments given by observed minus expected (with respect to the past) quantities. Their connection to the martingale approach is outlined. Observe that the classical field of rank tests is contained as a special case in survival analysis if censoring is absent. The reader should take this into account and ask what the survival quantities mean in the rank test set-up.

Recall that in the case of uncensored data classical two-sample rank tests and permutation tests work well. Since they are finite sample distribution free under the null hypothesis, the statistician has complete control over the error probability of the first kind for each finite sample size. It is reported in section 6 how the permutation principle can be used for survival tests under random censorship and how it differs from the classical case.

The quality of survival tests is studied in the last sections. We summarize recent results about asymptotically efficient tests. These tests reach their asymptotic envelope power functions under local alternatives. The asymptotic relative Pitman efficiency ARE can be calculated and is used to compare competing tests.

We proceed with some historical remarks and comments about the literature. A review of the martingale and counting process approach is given by Andersen and Borgan [1]. A good source for the construction principles concerning efficient survival tests is Neuhaus [36]. For teaching and for beginners the introduction of Lan and Wittes [26] can be recommended. However, it is impossible to review all papers with applications to survival tests. There are thousands of them, often published in applied journals like Biometrika or Statistics in Medicine.

The history of survival tests is very old. The well-known log-rank test has long been used to test proportional hazard differences.
Recall that the two-sample testing problem with constant, time independent hazard ratios is a special case of the famous Cox model, see ABGK [3], chapt. VII, or Fleming and Harrington [8], sect. 4.2. The log-rank test is the censored data extension of the Savage rank test, see Hájek et al. [12]. The log-rank test is well understood.

The question is now how different competing tests can be compared under semiparametric models. The comparison is typically done within the asymptotic set-up of power functions under local alternatives. This leads to the asymptotic Pitman efficiency ARE, $0 \le \mathrm{ARE} \le 1$, as a measure of performance. A test with $\mathrm{ARE} = 1$ is asymptotically efficient. A sequence of tests with $0 < \mathrm{ARE} < 1$ needs $1/\mathrm{ARE}$ times as many observations to obtain the same asymptotic power as an efficient sequence of tests (or, put differently, $100(1 - \mathrm{ARE})$ percent of the observations are wasted). To give an example, consider again the proportional hazard two-sample Cox model. It was shown by ABGK [3], VIII 2.3 and 4.2, that in this case the log-rank test is asymptotically efficient.

The last section 8 is concerned with goodness of fit tests. It is reported that hazard oriented Kolmogorov-Smirnov type tests apply in survival analysis, leading to Rényi tests. Personally, we prefer another class of tests, the so-called projection tests for cones or subspaces of alternatives. The reason is that the statistician can select a region of alternatives where high power is needed. They can be motivated as asymptotic likelihood ratio tests and were first introduced by Behnen and Neuhaus [5] for uncensored data.

The state of the art for uncensored two-sample data can be found in Hájek and Šidák [11], the new edition Hájek et al. [12] and Behnen and Neuhaus [5]. A good source for the general asymptotic theory of tests is Witting and Müller-Funk [43], chapter 6. For the general methodology we refer to the monographs about asymptotic statistics of Strasser [40] and LeCam and Yang [28].
The mathematical treatment of survival tests is presented in ABGK [3] and Fleming and Harrington [8]. A good review of martingale methods in survival analysis can be found in Shorack and Wellner [39], sect. 7 and their appendix B. More applied text books are Klein and Moeschberger [25] and Schumacher and Schulgen [38], sect. 5 and 6. There the reader will find many examples, and the books can be used for consulting and for statistical courses about survival analysis in medicine.

Let $\Phi$ denote the standard normal distribution function.

Chapter 2
The two-sample hazard model

Throughout, let the random variable $T : (\Omega, \mathcal{A}, P) \to [0, \infty]$ denote a survival time. For convenience we allow positive mass $P(T = \infty) > 0$ at infinity. This is used later on to study risk factors which, with a certain probability, do not occur at all. Let
\[ F(x) := P(T \le x), \qquad S(x) = 1 - F(x), \qquad x \in [0, \infty] \tag{2.1} \]
denote the distribution function $F$ and the survival function $S$. The hazard measure $\Lambda$ on $[0, \infty)$ is defined by
\[ \frac{d\Lambda}{dF}(x) = \frac{1}{S(x-)}\, 1_{[0,\infty)}(x). \tag{2.2} \]
The cumulative hazard function $\Lambda$ is the measure generating function
\[ \Lambda(t) := \Lambda([0, t]), \qquad t \in [-\infty, \infty) \tag{2.3} \]
of the hazard measure. Without further comment we will add indices, like $S_0, \Lambda_0, \ldots$, for the quantities given by $F_0$.

Remark 2.1 Let $F_{|[0,\infty)}$ be continuous.
(a) Then
\[ S(x) = \exp(-\Lambda(x)) \tag{2.4} \]
holds for each $x \in [0, \infty)$.
(b) Let $\Lambda_0$ denote the hazard measure of the uniform distribution $\lambda\!\!\lambda_{|(0,1)}$ on the unit interval $(0,1)$. Then $\Lambda_0(u) = -\log(1 - u)$ holds for $0 < u < 1$. If in addition $P(T = \infty) = 0$ holds, then the time scale may be transformed by $F$ (or its left continuous inverse $F^{-1}$ on the unit interval $(0, 1)$). The image measures are then given by
\[ \Lambda = \Lambda_0^{F^{-1}}, \quad \Lambda_0 = \Lambda^{F} \qquad \text{and} \qquad \Lambda(t) = \Lambda_0(F(t)), \quad \Lambda_0(u) = \Lambda(F^{-1}(u)). \tag{2.5} \]

The hazard measure $\Lambda$ is a risk measure which serves as a meaningful parameter of our survival model. At $x \in [0, \infty)$ it describes the risk of $T$ to fail given that $\{T \ge x\}$ has already been observed.
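The relations (2.2), (2.4) and (2.5) can be checked numerically. The following is a minimal sketch, assuming a Weibull baseline with shape 2 chosen purely for illustration; the helper names `cumulative_hazard` and `uniform_cumulative_hazard` are hypothetical and not part of the paper.

```python
import math

def cumulative_hazard(density, survival, t, steps=20_000):
    """Midpoint-rule approximation of Lambda(t) = int_0^t dF(x) / S(x-), cf. (2.2)-(2.3)."""
    h = t / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        total += density(x) / survival(x) * h
    return total

def uniform_cumulative_hazard(u):
    """Cumulative hazard of the uniform distribution on (0, 1), Remark 2.1(b)."""
    return -math.log(1.0 - u)

# Hypothetical baseline (an assumption for this sketch): Weibull with shape 2,
# i.e. S(x) = exp(-x^2), so that Lambda(t) = t^2.
survival = lambda x: math.exp(-x * x)
density = lambda x: 2.0 * x * math.exp(-x * x)
cdf = lambda x: 1.0 - survival(x)

t = 1.5
lam_numeric = cumulative_hazard(density, survival, t)   # integrates dF / S
lam_pivoted = uniform_cumulative_hazard(cdf(t))          # Lambda_0(F(t)), cf. (2.5)
surv_from_hazard = math.exp(-lam_numeric)                # exp(-Lambda(t)), cf. (2.4)
```

All three quantities describe the same baseline: `lam_numeric` and `lam_pivoted` should both be close to $t^2$, and `surv_from_hazard` should be close to $S(t)$.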
According to (2.5) it is possible to reduce the measure $\Lambda$ to the unit interval by a time scale transformation via $F$.

A fruitful two-sample model is the so-called risk component model for differences of survival times. It is introduced as follows. Let
\[ T_{01}, T_{02}, T_{11}, T_{12} : \Omega \longrightarrow [0, \infty] \tag{2.6} \]
denote four mutually independent survival times (with continuous subdistributions on $[0, \infty)$), where the second index indicates the two-sample group 1 or 2. Suppose that $T_{01} \overset{D}{=} T_{02}$ are real survival times which may stand for variables with baseline risks of healthy members of a homogeneous population. The survival times $T_1, T_2$ of members of our groups 1 and 2 are given by
\[ T_1 = \min(T_{01}, T_{11}), \qquad T_2 = \min(T_{02}, T_{12}). \tag{2.7} \]
In this model $T_i$ admits an additional risk factor with fictitious survival time $T_{1i}$ which may reduce the life time. The model has an additive structure of the hazards $\Lambda_i$ and $\Lambda_{ik}$ of $T_i$ and $T_{ik}$, i.e.
\[ \Lambda_1 = \Lambda_{01} + \Lambda_{11}, \qquad \Lambda_2 = \Lambda_{01} + \Lambda_{12}, \tag{2.8} \]
where $\Lambda_{01} = \Lambda_{02}$ by our assumption, confer (2.4).

Remark 2.2 Each pair $T_1, T_2$ of real continuous survival times admits a representation (2.7) via a risk component model. The hazard measures $\Lambda_{1i}$ are not unique, but the difference of the measures
\[ \Lambda_1 - \Lambda_2 = \Lambda_{11} - \Lambda_{12} \tag{2.9} \]
is uniquely determined by the distribution of $(T_1, T_2)$.

Example 2.3 (a) By definition $T_1$ is said to be stochastically larger than $T_2$ if $S_1 \ge S_2$. This property is equivalent to the ordering of the cumulative hazards $\Lambda_1(t) \le \Lambda_2(t)$ for all $t \in \{S_2 > 0\}$.
(b) A stronger condition, with a practical interpretation as overall risk superiority of $S_1$, can be introduced by
\[ \Lambda_2(A) - \Lambda_1(A) \ge 0 \tag{2.10} \]
for all $A \in \mathcal{B}([0, \infty))$.

From now on let $F_1$ and $F_2$ always be continuous distributions on $[0, \infty)$. We will indicate how likelihood based inference for the two-sample testing problem can be expressed completely by the hazard measures. Let $\frac{dF_2}{dF_1}$ denote the likelihood ratio of the $F_1$-absolutely continuous part of $F_2$. Then the hazard ratio of $\Lambda_2$ w.r.t. $\Lambda_1$ is
\[ \frac{d\Lambda_2}{d\Lambda_1}(t) = \frac{S_1(t)}{S_2(t)}\, \frac{dF_2}{dF_1}(t). \tag{2.11} \]
The relative logarithmic risk $\rho : [0, \infty) \longrightarrow [-\infty, \infty]$ is then determined by
\[ \frac{d\Lambda_2}{d\Lambda_1}(t) =: \exp(\rho(t)). \tag{2.12} \]
If hazard rates $\lambda_i$ of $\Lambda_i$ with $\Lambda_i(t) = \int_0^t \lambda_i(u)\, du$ exist for $i = 1, 2$, then $\rho = \log(\lambda_2 / \lambda_1)$ holds. The function $\rho$ is our parameter of interest. It describes, on the logarithmic scale, the time dependent relative risk of failure under $F_2$ w.r.t. $F_1$.

Remark 2.4 Under the condition $F_2 \ll F_1$ the likelihood can be expressed in terms of $\rho$ by
\[ \frac{dF_2}{dF_1}(t) = \exp(\rho(t)) \exp\Big( \int_0^t \big(1 - \exp(\rho(u))\big)\, d\Lambda_1(u) \Big), \tag{2.13} \]
since $\Lambda_2(t) = \int_0^t \exp(\rho)\, d\Lambda_1$ and $\frac{dF_2}{dF_1}(t) = \exp(\rho(t)) \exp(\Lambda_1(t) - \Lambda_2(t))$ hold.

Example 2.5 (risk component model) Suppose that the risk components $T_{ij}$ have hazard measures $\Lambda_{ij}$ with hazard rates $\lambda_{ij} \ge 0$. Then
\[ \frac{d\Lambda_2}{d\Lambda_1} = \frac{\lambda_{01} + \lambda_{12}}{\lambda_{01} + \lambda_{11}} = 1 + \frac{\lambda_{12} - \lambda_{11}}{\lambda_{01} + \lambda_{11}} =: 1 + \gamma \tag{2.14} \]
holds and $\gamma$ may serve as the parameter of interest. Formula (2.13) can be rewritten as
\[ \frac{dF_2}{dF_1}(t) = (1 + \gamma(t)) \exp\Big( - \int_0^t \gamma(u)\, d\Lambda_1(u) \Big). \tag{2.15} \]

Below we will see that under mild regularity assumptions (with $\int \gamma^2\, dF_1 < \infty$) each perturbation $\gamma$ (or $\rho$) of $\Lambda_1$ defines a path of hazards via a parameter $\vartheta \in \mathbb{R}$ with
\[ \Lambda_2(t, \vartheta) := \int_0^t \exp(\vartheta \rho(u))\, d\Lambda_1(u) \quad (+\, o(\vartheta)) \tag{2.16} \]
or
\[ \Lambda_2(t, \vartheta) := \int_0^t (1 + \vartheta \gamma(u))\, d\Lambda_1(u) \quad (+\, o(\vartheta)). \tag{2.17} \]
The term $o(\vartheta)$ (as $\vartheta \to 0$ for fixed $t$) is often only needed to guarantee that $t \longmapsto \Lambda_2(t, \vartheta)$ is really increasing.

2.1 Semiparametric hazard model

In practice the baseline survival function and its baseline hazard are typically unknown. For this reason tests and estimators are preferable which are invariant under strictly monotone scale transformations of the time axis. In this connection it is very natural that extended rank procedures are derived. A specific example of an invariant estimator is the famous Kaplan-Meier estimator of the survival function for censored data.
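The invariance of the Kaplan-Meier estimator under strictly increasing time transformations can be illustrated with a short sketch. It assumes the usual product-limit form without ties (the form stated later as (4.19)); the data set and the helper name `kaplan_meier` are hypothetical and serve only as an illustration.

```python
import math

def kaplan_meier(times, status):
    """Product-limit values 1 - F_hat(X_{i:n}) at the ordered observations, assuming no ties."""
    n = len(times)
    order = sorted(range(n), key=lambda i: times[i])
    surv = 1.0
    values = []
    for rank, idx in enumerate(order):
        if status[idx]:                      # uncensored observation
            surv *= 1.0 - 1.0 / (n - rank)   # n - rank individuals are still at risk
        values.append(surv)
    return values

# Hypothetical sample: observed times and censoring status (1 = uncensored).
times = [2.1, 5.3, 1.4, 7.9, 3.3, 6.0, 0.8, 4.7]
status = [1, 0, 1, 1, 1, 0, 1, 1]

km_raw = kaplan_meier(times, status)
km_log = kaplan_meier([math.log(x) for x in times], status)   # time axis transformed by log
km_cube = kaplan_meier([x ** 3 for x in times], status)       # time axis transformed by x^3
```

The three lists coincide because the estimator depends on the data only through the ordering of the observations and their censoring status, not on the metric values of the times.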
On the side of the hazards we will now turn to semiparametric survival models. The main idea is simple and relies on the fact that the meaning of the relative risks (2.12) is preserved under a monotone time scale transformation. For instance, the practical statement "the risk under $\Lambda_2$ is twice as high as under $\Lambda_1$ for certain late survival times (say for the 3/4-quantile $t = F^{-1}(3/4)$)" does not depend on $F$, whatever the actual shape of $F$ may be. In our model it is expressed by $\exp(\rho(F^{-1}(3/4))) = 2$. Observe that the meaning of traditional statistical location or scale models is lost under monotone transformations, whereas the meaning of risk models via hazards is preserved.

As mentioned in Remark 2.1, the survival time $T$ (given by our model $\Lambda_2(\cdot, \vartheta)$) will now be transformed via $x \mapsto F(x)$, with $F := F_1$, to the unit interval, which is a pivoting procedure. Notice that
\[ \mathcal{L}(F(T) \mid \vartheta = 0) = \lambda\!\!\lambda_{|(0,1)} \tag{2.18} \]
yields the uniform distribution under $\vartheta = 0$. The local quantities (2.12) and (2.14) then define new pivoted parameters
\[ \rho_0, \gamma_0 : (0, 1) \longrightarrow \mathbb{R} \tag{2.19} \]
with $\rho_0(u) := \rho(F^{-1}(u))$ and $\gamma_0(u) := \gamma(F^{-1}(u))$ given by the baseline distribution function $F$. For instance, the model (2.17) reads as
\[ \Lambda_2(t, \vartheta) = \int_0^{F(t)} (1 + \vartheta \gamma_0(v))\, \frac{dv}{1 - v}. \tag{2.20} \]
This observation motivates a new parametrization
\[ \rho = \rho_0 \circ F \quad \text{and} \quad \gamma = \gamma_0 \circ F, \tag{2.21} \]
where $F$ is the baseline distribution function and the pivoted quantities $\rho_0$ and $\gamma_0$ of (2.19) serve as new parameters.

Example 2.6 The choice $\gamma_0 = 1$ leads to the proportional hazard model with time independent relative risk (for $\vartheta > -1$). Within this semiparametric model the proportional hazards are attached to each baseline distribution function $F$.

In order to give an impression of the hazard model, the shapes of various pivoted hazard derivatives $\gamma_0 : [0, 1] \to \mathbb{R}$ are plotted. The methodology is presented in Fleming and Harrington [8], see p. 258, see also [13].
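The pivoted model (2.20) can be evaluated numerically as a function of $u = F(t)$ alone, which makes the role of the baseline as a nuisance parameter visible. The following is a minimal sketch; the value of `theta`, the evaluation point `u` and the helper name `pivoted_cumulative_hazard` are assumptions made for illustration only.

```python
import math

def pivoted_cumulative_hazard(u, theta, gamma0, steps=20_000):
    """Lambda_2(t, theta) from (2.20), written as a function of u = F(t) in [0, 1)."""
    h = u / steps
    total = 0.0
    for k in range(steps):
        v = (k + 0.5) * h                       # midpoint of the k-th cell
        total += (1.0 + theta * gamma0(v)) / (1.0 - v) * h
    return total

theta = 0.5      # hypothetical size of the perturbation
u = 0.75         # corresponds to the 3/4-quantile of the baseline F

baseline = -math.log(1.0 - u)                                   # Lambda_0(u), theta = 0
proportional = pivoted_cumulative_hazard(u, theta, lambda v: 1.0)
late = pivoted_cumulative_hazard(u, theta, lambda v: v)
```

For $\gamma_0 = 1$ the result is $(1 + \vartheta)$ times the baseline cumulative hazard, in line with Example 2.6. For $\gamma_0(v) = v$ an elementary integration gives $-(1 + \vartheta)\log(1 - u) - \vartheta u$, so the additional risk accumulates mainly at late quantiles.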
In figure 2.1 semiparametric models are introduced for proportional, late, early, central and linear crossing differences of the hazards. The statistician has to decide which type of alternatives (given by a suitable direction $\gamma_0$) is appropriate and should be separated from the null hypothesis with priority. If the statistician has no preference for one of these directions, a projection test would be helpful to spread out the power over a whole cone, see section 8.2.

Figure 2.1: Pivoted hazard directions as functions of $u \in [0, 1]$: $\gamma_1(u) = c$ (proportional hazards), $\gamma_2(u) = cu$ (late hazards), $\gamma_3(u) = c(1 - u)$ (early hazards), $\gamma_4(u) = cu(1 - u)$ (central hazards), $\gamma_5(u) = -c(u - 1/2)$ (linear crossing hazards).

2.2 The score function

Consider a family of survival distributions $F_\vartheta$, $\vartheta \in U \subset \mathbb{R}$, $0 \in U$, which admit at $\vartheta = 0$ a (one- or two-sided) stochastic derivative $g$ of the likelihood ratio w.r.t. $F_0$, i.e.
\[ \frac{d}{d\vartheta} \log \frac{dF_\vartheta}{dF_0}(t) \Big|_{\vartheta = 0} = g(t), \tag{2.22} \]
where the tangent $g$ (or score function) is $F_0$-square integrable with the tangent condition $\int g\, dF_0 = 0$. The tangent space at $F$ is denoted by $L_2^0(F) := \{ g \in L_2(F) : \int g\, dF = 0 \}$. Under the model (2.17) (with $\Lambda = \Lambda_1$ and $F = F_1$) we arrive via (2.13) at
\[ g(t) = \gamma(t) - \int_0^t \gamma(s)\, d\Lambda(s) =: L_F(\gamma)(t) \tag{2.23} \]
and similarly from (2.12) at
\[ \gamma(t) = g(t) - \frac{\int_t^\infty g(s)\, dF(s)}{1 - F(t)} =: R_F(g)(t), \tag{2.24} \]
with $R = R_F$, $L = L_F$ for short. From Ritov and Wellner [37] and Efron and Johnstone [7] it is known that $L = R^{-1}$ is the inverse of $R$ and that the operator
\[ R : L_2^0(F) \longrightarrow L_2(F) \tag{2.25} \]
is an isometry of the Hilbert spaces, where $R$ maps the tangent $g$ into the hazard rate derivative $\gamma$. A rigorous description for $L_2$-differentiable paths $\vartheta \mapsto F_\vartheta$ can be found in Janssen [17]. Let us mention that local (or asymptotic) properties of the models and related tests do not rely on the specific path but only on the tangent $g$ and $\gamma = R(g)$.
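The operators (2.23) and (2.24) and the isometry (2.25) can be checked numerically for the uniform baseline on $(0, 1)$, where $d\Lambda_0(v) = dv / (1 - v)$. The sketch below uses the late hazards direction $\gamma_0(u) = u$; the closed form $g_0(u) = 2u + \log(1 - u)$ follows from an elementary integration, and all function names are hypothetical.

```python
import math

def midpoint(f, a, b, n):
    """Composite midpoint rule for int_a^b f(x) dx."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def L0(gamma0, u, n=20_000):
    """L_0(gamma_0)(u) = gamma_0(u) - int_0^u gamma_0(v) / (1 - v) dv, cf. (2.23)."""
    return gamma0(u) - midpoint(lambda v: gamma0(v) / (1.0 - v), 0.0, u, n)

def R0(g0, u, n=200_000):
    """R_0(g_0)(u) = g_0(u) - int_u^1 g_0(s) ds / (1 - u), cf. (2.24)."""
    return g0(u) - midpoint(g0, u, 1.0, n) / (1.0 - u)

gamma_late = lambda u: u                           # late hazards direction
g_late = lambda u: 2.0 * u + math.log(1.0 - u)     # closed form of L_0(gamma_late)

u = 0.3
score_numeric = L0(gamma_late, u)                  # should match g_late(u)
gamma_recovered = R0(g_late, u)                    # should return gamma_late(u) = u
mean_g = midpoint(g_late, 0.0, 1.0, 200_000)       # tangent condition: integral of g is 0
norm_g = midpoint(lambda s: g_late(s) ** 2, 0.0, 1.0, 200_000)
norm_gamma = midpoint(lambda s: gamma_late(s) ** 2, 0.0, 1.0, 200_000)
```

The two squared norms agree up to quadrature error (both are close to 1/3), which is the isometry property for this particular direction. The midpoint rule is used because it never evaluates the logarithmic singularity at $u = 1$.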
For this reason the parametrization is introduced via the $\gamma$'s. However, all probabilistic calculations will be done on the side of distributions and tangents.

Again we may turn, as in (2.21), to pivoted quantities $g_0, \gamma_0$ on $(0, 1)$. If $L_0$ and $R_0$ denote the operators for $\lambda\!\!\lambda_{|(0,1)}$ on $(0, 1)$, then obviously $g_0 = g \circ F^{-1}$ is the score function of $\vartheta \mapsto \mathcal{L}(F(T) \mid F_\vartheta)$ on $(0, 1)$, and $\gamma_0 = R_0(g_0)$, $g_0 = L_0(\gamma_0)$ hold. The equations (2.23) and (2.24) can then be expressed by
\[ \gamma = R_0(g_0) \circ F, \qquad g = L_0(\gamma_0) \circ F, \tag{2.26} \]
where $\gamma_0$ (and $g_0$) are the parameters of interest and $F$ is a nuisance parameter.

The present methodology follows the ideas of LeCam and Yang [28]. In order to study the structure of testing problems at a null hypothesis, LeCam proposed the Hilbert space embedding of an experiment $\{P : P \in \mathcal{P}\}$,
\[ P \mapsto \Big( \frac{dP}{dP_0} \Big)^{1/2} \quad \text{in } L_2(P_0). \tag{2.27} \]
The local geometry of that experiment is given by the tangents $g$, i.e. the $L_2(P_0)$ derivatives of all paths $\vartheta \mapsto 2 \big( \frac{dP_\vartheta}{dP_0} \big)^{1/2}$ at $\vartheta = 0$. They coincide with our score functions, see (2.22).

Chapter 3
The model for censored data

Throughout, randomly right censored data models are studied. Let $T$ and $C$ denote two independent random variables, where $T$ is a continuous survival time and $C$ is a continuous censoring variable, with distribution functions
\[ \mathcal{L}(T) = F, \qquad \mathcal{L}(C) = G. \tag{3.1} \]
The statistician is only able to observe the pair $(X, \Delta)$,
\[ X := \min(T, C) \quad \text{and} \quad \Delta = 1_{\{T \le C\}}, \tag{3.2} \]
where $\Delta$ is the censoring status. If $\Delta = 1$ holds, then the observation is called uncensored. For $(x, \delta) \in [0, \infty) \times \{0, 1\}$ Fubini's theorem gives us the joint distribution
\[ F \otimes G(X \le x, \Delta = \delta) = \delta \int_0^x (1 - G)\, dF + (1 - \delta) \int_0^x (1 - F)\, dG. \tag{3.3} \]
Consider now two $L_2$-differentiable paths $\vartheta \mapsto F_\vartheta$ and $\vartheta \mapsto G_\vartheta$ of survival and censoring distributions, respectively, with score function $g$ of the $F$'s and score function $g_c$ of the $G$'s. It is well known that the distributions $\mathcal{L}((X, \Delta) \mid F_\vartheta \otimes G_\vartheta)$ remain $L_2$-differentiable under the loss of information, see LeCam and Yang [27], Witting [42], Satz 1.193. If $F = F_0$ and $G = G_0$ hold, then their score function at $\vartheta = 0$ is
\[ (x, \delta) \mapsto h(x, \delta) = \delta R_F(g)(x) + \frac{\int_x^\infty g\, dF}{1 - F(x)} + (1 - \delta) R_G(g_c)(x) + \frac{\int_x^\infty g_c\, dG}{1 - G(x)} \tag{3.4} \]
for the censored model, which can be deduced from (3.3), see Janssen [15] for details. At the level of the score functions the form (3.4) seems to be very unpleasant. However, if we turn to hazards the function $h$ becomes tractable. The reason is that in terms of the hazard measures we have an additive model, formally $\Lambda_X = \Lambda_T + \Lambda_C$. Let here $g_c = 0$.

Lemma 3.1 Suppose that $\gamma := R(g) \in L_2(F)$ denotes the hazard rate derivative of the survival times $T$. Then the score function (3.4) can be completely expressed in terms of hazard quantities,
\[ h(x, \delta) = \delta R(g)(x) - \int_0^x \gamma\, d\Lambda = \delta \gamma(x) - \int_0^x \gamma\, d\Lambda =: h(g, x, \delta). \tag{3.5} \]
Proof. The isometry (2.25) implies
\[ \frac{\int_x^\infty g\, dF}{1 - F(x)} = - \int_0^x R(g)\, d\Lambda, \]
see also Janssen [17], equation (3.8).

This simple observation leads to the extension of the isometry (2.25) to censored score functions. We will see that the essential informative part of $h$ is its uncensored part $(x, \delta) \mapsto \delta R(g)(x)$, where the index $F$ is suppressed. For convenience let $\gamma\Delta : [0, \infty) \times \{0, 1\} \longrightarrow \mathbb{R}$ denote the function $(x, \delta) \longmapsto \gamma(x)\delta$ and let
\[ L_{haz}(\mathcal{L}(X, \Delta)) := \{ \gamma\Delta : \gamma \in L_2(F) \} \subset L_2(\mathcal{L}(X, \Delta)) \tag{3.6} \]
denote the space of censored hazard derivatives.

Theorem 3.2 (Janssen [17], Lemma 3.1) The operator (acting on (3.5))
\[ \tilde{R} : \{ h(g, \cdot, \cdot) : g \in L_2^0(F) \} \longrightarrow L_{haz}, \qquad h(g, \cdot, \cdot) \longmapsto R(g)\Delta \]
is an isometry of the Hilbert spaces (as subsets of $L_2(\mathcal{L}(X, \Delta))$) with inverse
\[ \tilde{L}(\gamma\Delta)(x, \delta) = \delta \gamma(x) - \int_0^x \gamma\, d\Lambda. \tag{3.7} \]
This result yields further insight into censored models and their score functions. In particular, it is related to the martingale connection and the basic martingale of survival analysis.

Remark 3.3 (a) In a hazard based model with hazard rate derivative $\gamma$ of the $T$'s the censored tangent $h$ is given by
\[ h = \tilde{L}(\gamma\Delta), \tag{3.8} \]
which depends on the censoring distribution $G$ only implicitly via the joint distribution $\mathcal{L}((X, \Delta) \mid F \otimes G)$.
(b) Censored models are often sequential models where the time $t$ is increasing with $t \in I := \{ s \ge 0 : F(s) < 1 \}$. Suppose that the variables are only observable up to a time $t \in I$, i.e. the observable events are restricted to the $\sigma$-field
\[ \mathcal{F}_t = \sigma( X 1_{\{X \le t\}}, 1_{\{X \le t\}}, \Delta 1_{\{X \le t\}} ). \tag{3.9} \]
It is well known that the score function relative to $\mathcal{F}_t$ is the conditional expectation $E(h(g, \cdot, \cdot) \mid \mathcal{F}_t)$. For the case $\gamma = R(g)$ it was shown in Th. 3.2 of Janssen [17] that the score function has the form
\[ E(h(g, \cdot, \cdot) \mid \mathcal{F}_t) = \tilde{L}((\gamma 1_{[0,t]})\Delta), \tag{3.10} \]
which means that restricting to $\mathcal{F}_t$ is the same as restricting the hazard rate derivative $\gamma$ to $\gamma 1_{[0,t]}$.
(c) The restriction (3.10) of the score function of course yields a martingale for $t \in I$. More precisely,
\[ M_\gamma(X, \Delta, t) := \tilde{L}((\gamma 1_{[0,t]})\Delta) = \Delta \gamma(X) 1_{[0,t]}(X) - \int_0^t \gamma(s) 1_{\{X \ge s\}}\, d\Lambda(s) \tag{3.11} \]
is a martingale. For $\gamma = 1$ we arrive at the basic martingale $M := M_1$ of survival analysis for sample size $n = 1$, see Shorack and Wellner [39], sect. 7.5. One observes that
\[ M_\gamma(X, \Delta, t) = \int_0^t \gamma(s)\, M(X, \Delta, ds) \tag{3.12} \]
is just our sequential score function (3.10), and it is the martingale part of the Doob-Meyer decomposition of the process
\[ t \longmapsto \Delta \gamma(X) 1_{[0,t]}(X), \tag{3.13} \]
which is given by
\[ \Delta \gamma(X) 1_{[0,t]}(X) = M_\gamma(X, \Delta, t) + \int_0^t \gamma(s) 1_{\{X \ge s\}}\, d\Lambda(s). \tag{3.14} \]
(d) As a consequence the score functions (3.5) and (3.8) admit a martingale representation via the basic martingale $M = M_1$.
If $\gamma = \gamma_0 \circ F$ is parametrized as in (2.21), the score function of a semiparametric model in direction $\gamma_0$ is just
\[ (x, \delta) \mapsto \int_0^\infty \gamma_0 \circ F(s)\, M(x, \delta, ds). \tag{3.15} \]

In parametric models the score function can be used as a test statistic, which yields the so-called score tests. However, for semiparametric models and composite null hypotheses with nuisance parameters the score statistics have to be modified, see chapter 7. The score functions have to be projected onto so-called effective score functions, and unknown parameters must be eliminated or estimated.

Chapter 4
Towards two-sample survival tests

Throughout, consider the common two-sample testing problem for randomly right censored independent continuous survival times $T_1, \ldots, T_n$ on $[0, \infty)$ which are censored by continuous censoring random variables $C_1, \ldots, C_n$, again mutually independent of the $T$'s. Suppose that we have two groups with sample sizes $n_1$ and $n_2$, $n = n_1 + n_2$, such that the groups of survival times have the i.i.d. structure
\[ \mathcal{L}(T_i) = F_1, \quad 1 \le i \le n_1, \qquad \text{and} \qquad \mathcal{L}(T_i) = F_2, \quad n_1 + 1 \le i \le n. \tag{4.1} \]
Under censoring we only observe, as in (3.2),
\[ (X_i, \Delta_i) := (\min(T_i, C_i), 1_{\{T_i \le C_i\}}), \qquad 1 \le i \le n. \tag{4.2} \]
Different nonparametric hypotheses are of interest, namely tests for stochastic ordering
\[ H_0 : F_2 \le F_1 \quad \text{against} \quad F_2 \ge F_1,\ F_1 \ne F_2. \tag{4.3} \]
Sometimes this alternative is replaced by the alternative of strict superiority
\[ \Lambda_2 - \Lambda_1 \ge 0 \quad \text{(as measures)}. \tag{4.4} \]
The two-sided testing problem is given by
\[ \tilde{H}_0 : F_1 = F_2 \quad \text{against} \quad F_1 \ne F_2. \tag{4.5} \]
At this stage we should remember that in the absence of censoring the present testing problems are well studied and (4.5) is the traditional two-sample goodness of fit testing problem. For this reason the survival tests will also be compared with classical tests, say rank tests.

Up to now the distributions of the censoring random variables $C_i$ are not specified.
We will treat them as nuisance parameters. We will mention three different models I - III for the $C$'s under the null hypothesis $H_0$. At the boundary $\{F_1 = F_2\}$ of the null hypothesis (4.3) we will introduce different conditions which lead to different hypotheses. The first null hypothesis is
\[ H_0^{I} : \quad T_1, \ldots, T_n \text{ i.i.d.}, \quad C_1, \ldots, C_n \text{ i.i.d.} \tag{4.6} \]
The second null hypothesis is given by
\[ H_0^{II} : \quad T_1, \ldots, T_n \text{ i.i.d.}, \quad \text{and} \quad \mathcal{L}(C_i) =: G_1, \ 1 \le i \le n_1, \qquad \mathcal{L}(C_i) =: G_2, \ n_1 + 1 \le i \le n. \tag{4.7} \]
The censoring distributions may be completely unknown (but continuous in our set-up). In practice the model (4.6) is often too restrictive and must be avoided. To give an example, consider two-sample survival times $T_i$ which are mainly censored by the end of the study. Under random entrance into the study the variable $C_i$ is then the time between the beginning of the observation of patient $i$ and the end of the study. This patient is censored when still alive at the end of the study. In practice, typically one group (for instance the control group) is studied earlier than the other group (which may be a test group with a new drug). Then the choice of the model (4.7) with $G_1 \ne G_2$ would be appropriate.

The most general model consists of the case when every $T_i$ has a censoring variable $C_i$ with its own censoring distribution for each $i$,
\[ H_0^{III} : \quad T_1, \ldots, T_n \text{ i.i.d.}, \quad C_1, \ldots, C_n \text{ independent}. \tag{4.8} \]
One observes that under $H_0^{II}$ or $H_0^{III}$ the i.i.d. structure of the observations $(X_1, \Delta_1), \ldots, (X_n, \Delta_n)$ is lost although $F_1 = F_2$ holds.

In the next step we turn our attention to semiparametric alternatives for the $T$'s, see section 2. It is our aim to derive test statistics such that $H_0^{I}$ - $H_0^{III}$ can be rejected. Consider now semiparametric alternatives $\Lambda_2$ of $\Lambda_1 =: \Lambda$ (and $F_1 =: F$) of the type discussed in section 2. For instance, the ordering $F_2 \ge F_1$ may be modelled by an additional risk factor, see Ex. 2.5. Our choice is the model (2.15) and (2.17) with the parametrization $\gamma = \gamma_0 \circ F$ motivated in (2.21).
For the semiparametric submodel we are now going to test
\[ \Lambda_2(\cdot, 0)\ (= \Lambda(\cdot)) \quad \text{against} \quad \Lambda_2(\cdot, \vartheta), \ \vartheta > 0, \tag{4.9} \]
where $T_1$ may have the hazard $\Lambda$ and $T_n$ the hazard $\Lambda_2(\cdot, \vartheta)$ for some $\vartheta > 0$. Let $1 - F_\vartheta(x) = \exp(-\Lambda_2([0, x], \vartheta))$ and $F = F_0$ denote their distributions.

Suppose that $c_{ni}$, $1 \le i \le n$, is an array of regression coefficients. Then a parametric path for the two-sample testing problem (4.1) - (4.8) is given by
\[ \mathcal{L}(T_1, \ldots, T_n) = \bigotimes_{i=1}^n F_{\vartheta c_{ni}}. \tag{4.10} \]
The score function at $\vartheta = 0$ of that model is by (3.15) just
\[ \sum_{i=1}^n c_{ni} M_\gamma(X_i, \Delta_i, \infty), \tag{4.11} \]
which again depends only implicitly on the censoring distributions of the $C$'s. When parametric hypotheses are specified by the parameter $\vartheta$, tests based on the test statistics (4.11) are called score tests. Introduce the non-centered two-sample regression coefficients
\[ c_{ni} = \Big( \frac{n}{n_1 n_2} \Big)^{1/2} 1_{\{n_1 + 1, \ldots, n\}}(i) \tag{4.12} \]
with $\sum_{i=1}^n (c_{ni} - \bar{c}_n)^2 = 1$, $\bar{c}_n := \frac{1}{n} \sum_{i=1}^n c_{ni}$. Of special importance are the centered versions of (4.12), as we will see later, namely
\[ c_{ni} = \Big( \frac{n}{n_1 n_2} \Big)^{1/2} 1_{\{n_1 + 1, \ldots, n\}}(i) - \Big( \frac{n}{n_1 n_2} \Big)^{1/2} \frac{n_2}{n} = \begin{cases} - \big( \frac{n_2}{n\, n_1} \big)^{1/2}, & 1 \le i \le n_1, \\ \big( \frac{n_1}{n\, n_2} \big)^{1/2}, & n_1 + 1 \le i \le n, \end{cases} \tag{4.13} \]
with $\bar{c}_n = 0$. In the next step the unknown baseline nuisance parameters $F$ and $\Lambda$ of (4.11) will be eliminated. Also introduce the order statistics $X_{1:n} \le X_{2:n} \le \cdots \le X_{n:n}$ of the $X$'s, see (4.2), and let $\Delta_{i:n}$ be the censoring status of $X_{i:n}$ and $\Delta_n = (\Delta_{1:n}, \ldots, \Delta_{n:n})$. They are sometimes called the concomitant order statistics. The antirank vector $D_n = (D_{ni})_{1 \le i \le n}$ of the $X$'s is defined by $X_{i:n} = X_{D_{ni}}$, which is actually a random permutation of $\{1, \ldots, n\}$. Using this notation the score function (4.11) can be rewritten as
\[ \sum_{i=1}^n c_{ni} M_\gamma(X_i, \Delta_i, \infty) = \sum_{i=1}^n c_{n D_{ni}} \Big( \Delta_{i:n} \gamma(X_{i:n}) - \int_{[0, X_{i:n}]} \gamma\, d\Lambda \Big). \tag{4.14} \]
In this formula $\Lambda$ is now replaced by its nonparametric estimator, the Nelson-Aalen estimator $\hat{\Lambda}_n$ (see ABGK [3], sect. IV.1)
\[ \hat{\Lambda}_n = \sum_{j=1}^n \frac{\Delta_{j:n}}{n + 1 - j}\, \delta_{X_{j:n}} \tag{4.15} \]
of the pooled sample under the null hypothesis, where $\delta_x$ denotes the point mass at $x$. This yields
\[ \sum_{i=1}^n c_{n D_{ni}} \Big\{ \Delta_{i:n} \gamma(X_{i:n}) - \sum_{j=1}^i \frac{\Delta_{j:n} \gamma(X_{j:n})}{n + 1 - j} \Big\} = \sum_{i=1}^n \gamma(X_{i:n}) \Delta_{i:n} \Big\{ c_{n D_{ni}} - \frac{\sum_{j=i}^n c_{n D_{nj}}}{n + 1 - i} \Big\} \tag{4.16} \]
if summation by parts is used. This consideration suggests the use of two-sample test statistics
\[ T_n = \sum_{i=1}^n w_n(i) \Delta_{i:n} \Big\{ c_{n D_{ni}} - \frac{\sum_{j=i}^n c_{n D_{nj}}}{n + 1 - i} \Big\}, \tag{4.17} \]
where the $w_n(i)$ are random weights. If the form $\gamma = \gamma_0 \circ F$ is used, then $F$ may be estimated by the Kaplan-Meier estimator $\hat{F}_n$ of the pooled sample, and (4.17) turns into (4.16) by the choice of weights
\[ w_n(i) = \gamma_0(\hat{F}_n(X_{i:n}-)) \quad \text{or} \quad \gamma_0(\hat{F}_n(X_{i:n})). \tag{4.18} \]
The Kaplan-Meier estimator is given by
\[ 1 - \hat{F}_n(X_{i:n}) = \prod_{j \le i} \Big( 1 - \frac{\Delta_{j:n}}{n + 1 - j} \Big). \tag{4.19} \]
Observe that here $w_n(i)$ depends, via the Kaplan-Meier estimator, only on $(\Delta_{j:n})_{j \ge 1}$ and not on the metric values of the order statistics $(X_{j:n})_{j \ge 1}$.

Remark 4.1 (a) In the uncensored case $\hat{F}_n(X_{i:n}) = \frac{i}{n}$ holds since the Kaplan-Meier estimator coincides with the empirical distribution function. The choice of the weights (4.18) then naturally leads to rank statistics (4.17), which are of course appropriate for our semiparametric model.
(b) Note that the test statistics $T_n$ are the same for our uncentered and centered two-sample regression coefficients (4.12) and (4.13). The corresponding parametric score statistics are, however, definitely not the same.

Lemma 4.2 Under uniformly distributed antiranks the increments
\[ c_{n D_{ni}} - \frac{\sum_{j=i}^n c_{n D_{nj}}}{n + 1 - i}, \qquad 1 \le i \le n, \tag{4.20} \]
are the increments of a discrete martingale w.r.t. the filtration $\mathcal{F}_j = \sigma(D_{n1}, \ldots, D_{nj})$, $j \le n$. In particular, this holds under $H_0^{I}$, (4.6).

Proof. The conditional distribution of $D_{ni}$ given $\mathcal{F}_{i-1}$ is the uniform distribution on the set $\{1, \ldots, n\} \setminus \{D_{n1}, \ldots, D_{n,i-1}\}$. Thus
\[ E(c_{n D_{ni}} \mid \mathcal{F}_{i-1}) = \frac{\sum_{j=i}^n c_{n D_{nj}}}{n + 1 - i} \tag{4.21} \]
holds.
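The construction (4.12) - (4.19) can be traced on a small data set. The following is a minimal sketch, assuming no ties and the predictable weights $w_n(i) = \gamma_0(\hat{F}_n(X_{i:n}-))$ from (4.18); the sample, the group labels and the function name `survival_test_statistic` are hypothetical.

```python
import math

def survival_test_statistic(times, status, group2, gamma0, centered=True):
    """T_n from (4.17) with weights w_n(i) = gamma0(F_hat_n(X_{i:n}-)), assuming no ties."""
    n = len(times)
    n2 = sum(group2)
    n1 = n - n2
    scale = math.sqrt(n / (n1 * n2))
    shift = scale * n2 / n if centered else 0.0
    c = [scale * g - shift for g in group2]            # (4.13) if centered, else (4.12)
    order = sorted(range(n), key=lambda i: times[i])   # antiranks D_n1, ..., D_nn (0-based)
    tail = sum(c)                                      # sum over j >= i of c_{n D_nj}
    km_surv_left = 1.0                                 # 1 - F_hat_n(X_{i:n}-), cf. (4.19)
    stat = 0.0
    for rank, idx in enumerate(order):
        at_risk = n - rank                             # n + 1 - i for i = rank + 1
        if status[idx]:
            weight = gamma0(1.0 - km_surv_left)
            stat += weight * (c[idx] - tail / at_risk)
            km_surv_left *= 1.0 - 1.0 / at_risk
        tail -= c[idx]
    return stat

# Hypothetical sample: observed times, status (1 = uncensored), group 2 indicator.
times = [2.1, 5.3, 1.4, 7.9, 3.3, 6.0, 0.8, 4.7]
status = [1, 0, 1, 1, 1, 0, 1, 1]
group2 = [0, 0, 0, 0, 1, 1, 1, 1]

t_centered = survival_test_statistic(times, status, group2, lambda u: 1.0, centered=True)
t_uncentered = survival_test_statistic(times, status, group2, lambda u: 1.0, centered=False)
t_late = survival_test_statistic(times, status, group2, lambda u: u)
```

With $\gamma_0 = 1$ the statistic is a multiple of the familiar observed minus expected sum of the log-rank test. The centered and uncentered coefficients give the same value, as stated in Remark 4.1(b), because the constant shift cancels inside the braces of (4.17).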
In the special case of the two-sample regression coefficients the increments (4.20) have a nice sequential interpretation. They are equal to
\[ \Big( \frac{n}{n_1 n_2} \Big)^{1/2} \Big\{ 1_{\{n_1 + 1, \ldots, n\}}(D_{ni}) - \frac{n_2 - \sum_{j=1}^{i-1} 1_{\{n_1 + 1, \ldots, n\}}(D_{nj})}{n + 1 - i} \Big\} =: \Big( \frac{n}{n_1 n_2} \Big)^{1/2} \{ O_i - E_i \}. \tag{4.22} \]
The observed part $O_i$ is centered at the expected quantity $E_i$ (w.r.t. $\mathcal{F}_{i-1}$). Note that $E_i$ is just the ratio of the number of individuals of group 2 under risk to the number of individuals of the whole population under risk, given $\mathcal{F}_{i-1}$.

As a conclusion we see that (4.17) is a weighted sum of the martingale increments (4.20). Thus $T_n$ is based on a sequential construction where the increments (4.20) have an interpretation as observed minus expected quantities. The "status" $c_{n D_{ni}}$ is observed and its conditional expectation given $\mathcal{F}_{i-1}$ is subtracted.

Chapter 5
Martingale representations of survival test statistics

Unfortunately, Lemma 4.2 is no longer valid under $H_0^{II}$, (4.7), which is the most interesting hypothesis. In this case the counting process approach leads to a martingale representation of test statistics. The main object is the so-called log-rank process $L_n$, which replaces the adjusted rank process (4.20). As we will see, it is very convenient to use the counting process approach. We refer to Gill [9], Fleming and Harrington [8], ABGK [3] and Shorack and Wellner [39]. Introduce by
\[ N_1(t) = \sum_{i=1}^{n_1} \Delta_i 1_{[0,t]}(X_i), \qquad N_2(t) = \sum_{i=n_1+1}^{n} \Delta_i 1_{[0,t]}(X_i) \tag{5.1} \]
for $t \ge 0$ the counting processes of uncensored events of group 1 and group 2, and let $N(t) = N_1(t) + N_2(t)$ be the total number of uncensored events until $t$. The number of individuals under risk at time $t$ is given by
\[ Y_1(t) = \sum_{i=1}^{n_1} 1_{[t,\infty)}(X_i), \qquad Y_2(t) = \sum_{i=n_1+1}^{n} 1_{[t,\infty)}(X_i). \tag{5.2} \]
Set $Y(t) = Y_1(t) + Y_2(t)$. Let
\[ \mathcal{F}_t = \sigma( X_i 1_{\{X_i \le t\}}, 1_{\{X_i \le t\}}, \Delta_i 1_{\{X_i \le t\}}, \ 1 \le i \le n ), \qquad 0 \le t < \infty, \tag{5.3} \]
denote the two-sample analogue of the filtration (3.9).
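The counting processes (5.1) and the numbers at risk (5.2) are straightforward to evaluate at a fixed time point. The sketch below is only an illustration; the data and the function name `counting_processes` are hypothetical, and group membership is passed as an indicator rather than through the index ranges of (4.1).

```python
def counting_processes(times, status, group2, t):
    """Evaluate N_1, N_2 of (5.1) and Y_1, Y_2 of (5.2) at time t."""
    result = {"N1": 0, "N2": 0, "Y1": 0, "Y2": 0}
    for x, d, g in zip(times, status, group2):
        suffix = "2" if g else "1"
        if d and x <= t:          # uncensored event observed in [0, t]
            result["N" + suffix] += 1
        if x >= t:                # individual still under risk at time t
            result["Y" + suffix] += 1
    return result

# Hypothetical sample: observed times, status (1 = uncensored), group 2 indicator.
times = [2.1, 5.3, 1.4, 7.9, 3.3, 6.0, 0.8, 4.7]
status = [1, 0, 1, 1, 1, 0, 1, 1]
group2 = [0, 0, 0, 0, 1, 1, 1, 1]

at_start = counting_processes(times, status, group2, 0.0)
at_event = counting_processes(times, status, group2, 3.3)
```

Note that an individual with $X_i = t$ is counted both in $N$ (if uncensored) and in $Y$ at time $t$, in agreement with the closed intervals $[0, t]$ and $[t, \infty)$ in (5.1) and (5.2).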
Define the log-rank process
\[ L_n(t) = \int_0^t \frac{Y_1 Y_2}{Y} \Big( \frac{dN_2}{Y_2} - \frac{dN_1}{Y_1} \Big) = N_2(t) - \int_0^t \frac{Y_2}{Y}\, dN, \tag{5.4} \]
which is carried out as a trivial pathwise stochastic integral.

Theorem 5.1 Consider as in (4.1) and (4.2) two-sample survival times with arbitrary $C$'s. Then
\[ L_n(t) - \int_0^t \frac{Y_1 Y_2}{Y}\, d\Lambda_2 + \int_0^t \frac{Y_1 Y_2}{Y}\, d\Lambda_1 \tag{5.5} \]
is a martingale w.r.t. $(\mathcal F_t)_{t\ge 0}$. In particular, $L_n$ is a martingale under the null hypothesis $H_0^{III}$, (4.8).

Proof. For each observation formula (3.11) gives us for $\gamma = 1$ the Doob-Meyer decomposition of $\Delta_i 1_{[0,t]}(X_i)$. If we take the sum over the first (and second) group we obtain that
\[ N_1(t) - \int_0^t Y_1(s)\, d\Lambda_1(s) =: M_1(t) \tag{5.6} \]
and
\[ N_2(t) - \int_0^t Y_2(s)\, d\Lambda_2(s) =: M_2(t) \tag{5.7} \]
are independent martingales. Since the processes $Y_1/Y$ and $Y_2/Y$ are predictable we conclude that
\[ \int_0^t \frac{Y_1}{Y}\, dM_2 - \int_0^t \frac{Y_2}{Y}\, dM_1 = L_n(t) - \int_0^t \frac{Y_1 Y_2}{Y}\, d\Lambda_2 + \int_0^t \frac{Y_1 Y_2}{Y}\, d\Lambda_1 \tag{5.8} \]
is a martingale, see also Gill [9], sect. 3.3, and Fleming and Harrington [8], chapt. 7.

If we make use of the two-sample regression coefficients (4.12) then the test statistic $T_n$ given in (4.17) can be expressed by the log rank process. Observe that the increments of $L_n$ are just
\[ \Delta L_n(X_{i:n}) = \Big(\frac{n_1 n_2}{n}\Big)^{1/2} \Delta_{i:n} \Big( c_{nD_{ni}} - \frac{\sum_{j=i}^{n} c_{nD_{nj}}}{n+1-i} \Big) \tag{5.9} \]
and (4.17) reads as
\[ T_n = \Big(\frac{n}{n_1 n_2}\Big)^{1/2} \sum_{i=1}^{n} w_n(i)\, \Delta L_n(X_{i:n}). \tag{5.10} \]
Here the symbol $\Delta$ in $\Delta L_n$ has nothing to do with censoring. As usual, $\Delta f(x) := f(x) - f(x-)$ denotes the jump of a function $f$ at $x$. If $(w(t))_{t\ge 0}$ denotes a predictable weight function with $w(X_{i:n}) = w_n(i)$ then
\[ T_n(t) = \Big(\frac{n}{n_1 n_2}\Big)^{1/2} \int_0^t w(s)\, dL_n(s) \tag{5.11} \]
is a martingale with $T_n(\infty) = T_n$. Thus weights (4.18) based on $\hat F_n(X_{i:n}-)$ yield martingales (5.11). The asymptotics of $T_n$ can then be established by limit theorems for continuous martingales, see Gill [9], ABGK [3] and Fleming and Harrington [8]. Janssen and Neuhaus [21] pointed out that another martingale approach of discrete type is of interest.
It is closely connected to statistics based on "observed minus expected" sums and it is mainly needed when ties occur. However, it is also worthwhile for continuous distributions since non-predictable weights given by functions of $\hat F_n(X_{i:n})$ can be treated. As the main tool the filtration is modified. In the continuous case the main assertion of Lemma 2.1 of Janssen and Neuhaus [21] can be summarized as follows. By a scale transformation the order statistics $X_{1:n} < X_{2:n} < \ldots < X_{n:n}$ are transformed to the grid $\{\frac{1}{n}, \frac{2}{n}, \ldots, \frac{n}{n}\}$. The transformed log rank process is denoted by
\[ M_n\big(\tfrac{i}{n}\big) := L_n(X_{i:n}),\ 1 \le i \le n, \quad\text{and}\quad M_n(0) = 0. \tag{5.12} \]
Check that the increment $\Delta L_n(X_{n:n}) = 0$ always vanishes for continuous distributions at the endpoint. Define now the filtration $\mathcal G_i$, $i = 0, \ldots, n-1$, by
\[ \mathcal G_i = \sigma\big(D_{n1}, \ldots, D_{ni},\ \Delta_{1:n}, \ldots, \Delta_{i:n},\ \Delta N(X_{i+1:n})\big) \tag{5.13} \]
where the antiranks and the status of the order statistics are as in (4.14). Let $\mathcal G_n$ contain all information. In contrast to the ordinary filtration (3.9) the increment of the (joint) counting process $N$ at the next time $X_{i+1:n}$ is added to $\mathcal G_i$.

Lemma 5.2 Under the null hypothesis $H_0^{III}$, see (4.8), the process
\[ i \mapsto M_n\big(\tfrac{i}{n}\big), \quad 0 \le i \le n, \tag{5.14} \]
is a martingale w.r.t. the filtration (5.13).

Proof. Our counting processes (5.1) are now transformed to the grid where the bar indicates their counterparts on $\{0, \frac{1}{n}, \ldots, 1\}$. Define for $i = 1, 2$
\[ \bar N_i\big(\tfrac{k}{n}\big) := N_i(X_{k:n}), \qquad \bar N\big(\tfrac{k}{n}\big) := N(X_{k:n}) \]
and
\[ \bar Y_i\big(\tfrac{k}{n}\big) := Y_i(X_{k:n}), \qquad \bar Y\big(\tfrac{k}{n}\big) := Y(X_{k:n}) = n + 1 - k \]
where $X_{0:n} := -\infty$ is used for $k = 0$. The crucial step of the proof is based on the following Lemma. Lemma 5.2 is an immediate consequence of (5.4) and (5.15) below. The proof is carried out for continuously distributed $T_i$ and $C_i$ within our context. A general version which also works when ties are present is proved in Janssen and Neuhaus [21].
Lemma 5.3 Let $T_1, \ldots, T_n$ be i.i.d. survival times and let $C_1, \ldots, C_n$ denote independent censoring times. For each $1 \le k \le n$ we have
\[ E\Big( \Delta \bar N_2\big(\tfrac{k}{n}\big) \,\Big|\, \mathcal G_{k-1} \Big) = \frac{\bar Y_2(\frac{k}{n})}{\bar Y(\frac{k}{n})}\, \Delta \bar N\big(\tfrac{k}{n}\big). \tag{5.15} \]

Remark 5.4 Although the censoring distributions may be inhomogeneous, the expected number of events $\Delta \bar N_2(\frac{k}{n})$ of group 2 given $\mathcal G_{k-1}$ can be deduced from a uniform distribution on the set of individuals under risk, i.e. those with observations $\{X_{k:n}, \ldots, X_{n:n}\}$. Formula (5.15) admits an interpretation as the ratio of the number of individuals under risk in group 2 divided by the number of all individuals under risk at $\frac{k}{n}$.

Proof. Statement (5.15) is obvious for $\Delta \bar N(\frac{k}{n}) = 0$ since then $\Delta \bar N_2(\frac{k}{n}) = 0$ holds. Assume now that $\Delta \bar N(\frac{k}{n}) = 1$ is valid which means that $X_{k:n}$ is uncensored, i.e. $\Delta_{k:n} = 1$. Let
\[ Z_i\big(\tfrac{k}{n}\big) := \Delta_i 1_{(-\infty, X_{k:n}]}(X_i) \tag{5.16} \]
denote the uncensored event process of the $i$-th individual, with increments $\Delta Z_i(\frac{k}{n}) := Z_i(\frac{k}{n}) - Z_i(\frac{k-1}{n})$. The restriction to the grid has the advantage that the conditional expectations w.r.t. $\mathcal G_{k-1}$ can be calculated as discrete conditional expectations given the antiranks
\[ (D_{n1}, \ldots, D_{n(k-1)}) = (j_1, \ldots, j_{k-1}), \tag{5.17} \]
their censoring status
\[ (\Delta_{1:n}, \ldots, \Delta_{k-1:n}) = (\delta_1, \ldots, \delta_{k-1}), \tag{5.18} \]
and $\Delta \bar N(\frac{k}{n}) = 1$. Define $I := \{1, \ldots, n\} \setminus \{j_1, \ldots, j_{k-1}\}$ which corresponds to the individuals under risk at $\frac{k}{n}$ given (5.17). We will prove below that for each $j \in I$ the conditional expectation
\[ E\Big( \Delta Z_j\big(\tfrac{k}{n}\big) \,\Big|\, (D_{ni})_{i\le k-1} = (j_i)_{i\le k-1},\ (\Delta_{i:n})_{i\le k-1} = (\delta_i)_{i\le k-1},\ \Delta \bar N\big(\tfrac{k}{n}\big) = 1 \Big) \tag{5.19} \]
is independent of the index $j$. It is obvious that (5.19) vanishes for all $j \notin I$. For $j \in I$ we have $\Delta Z_j(\frac{k}{n}) = Z_j(\frac{k}{n})$ on the conditioning event, and the expression (5.19) equals
\[ \frac{P\big( Z_j(\frac{k}{n}) = 1,\ (D_{ni})_{i\le k-1} = (j_i)_{i\le k-1},\ (\Delta_{i:n})_{i\le k-1} = (\delta_i)_{i\le k-1},\ \Delta \bar N(\frac{k}{n}) = 1 \big)}{P\big( (D_{ni})_{i\le k-1} = (j_i)_{i\le k-1},\ (\Delta_{i:n})_{i\le k-1} = (\delta_i)_{i\le k-1},\ \Delta \bar N(\frac{k}{n}) = 1 \big)}. \tag{5.20} \]
It suffices to prove that the numerator
\[ P\Big( Z_j\big(\tfrac{k}{n}\big) = 1,\ (D_{ni})_{i\le k-1} = (j_i)_{i\le k-1},\ (\Delta_{i:n})_{i\le k-1} = (\delta_i)_{i\le k-1},\ \Delta \bar N\big(\tfrac{k}{n}\big) = 1 \Big) \tag{5.21} \]
is independent of $j \in I$.
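In order to use the conditional distribution given $X_j = x$ the event under consideration in (5.21) is decomposed as
\[ \bigcup_{x \in \mathbb R} \Big[ B_x \cap \{X_j = x, \Delta_j = 1\} \cap \bigcap_{m \in I \setminus \{j\}} \{T_m > x, C_m > x\} \Big] \tag{5.22} \]
where $B_x$ denotes the set
\[ B_x = \big\{ X_{j_1} < X_{j_2} < \ldots < X_{j_{k-1}} < x,\ (\Delta_{j_i})_{i\le k-1} = (\delta_i)_{i\le k-1} \big\}. \tag{5.23} \]
Since the individuals are independent the probability (5.21) can be calculated by Fubini's theorem as
\[ \int \delta\, P\Big( B_x \cap \bigcap_{m \in I\setminus\{j\}} \{T_m > x, C_m > x\} \Big)\, d\mathcal L(X_j, \Delta_j)(x, \delta) = \int \delta\, P(B_x) \prod_{m \in I\setminus\{j\}} P(T_m > x) P(C_m > x)\, d\mathcal L(X_j, \Delta_j)(x, \delta). \tag{5.24} \]
Let $F$ be the common distribution function of the $T_i$. By (3.3) the integration via the uncensored subdistribution $\delta\, d\mathcal L(X_j, \Delta_j)(x, \delta)$ is given by $P(C_j > x)\, dF(x)$. Thus (5.24) is equal to
\[ \int P(B_x)\, (1 - F(x))^{|I \setminus \{j\}|} \prod_{m \in I} P(C_m > x)\, dF(x) \tag{5.25} \]
which is the same for all $j \in I$. Consider now a fixed member $m \in I$. By (5.19)
\[ E\Big( \Delta \bar N_1\big(\tfrac{k}{n}\big) \,\Big|\, \mathcal G_{k-1} \Big) = \sum_{i=1}^{n_1} E\Big( \Delta Z_i\big(\tfrac{k}{n}\big) \,\Big|\, \mathcal G_{k-1} \Big) = \bar Y_1\big(\tfrac{k}{n}\big)\, E\Big( \Delta Z_m\big(\tfrac{k}{n}\big) \,\Big|\, \mathcal G_{k-1} \Big) \tag{5.26} \]
follows since $|\{1, \ldots, n_1\} \cap I| = \bar Y_1(\frac{k}{n})$ is the number of individuals at risk within group 1. Similarly we have
\[ E\Big( \Delta \bar N_2\big(\tfrac{k}{n}\big) \,\Big|\, \mathcal G_{k-1} \Big) = \bar Y_2\big(\tfrac{k}{n}\big)\, E\Big( \Delta Z_m\big(\tfrac{k}{n}\big) \,\Big|\, \mathcal G_{k-1} \Big). \tag{5.27} \]
Since $\bar N = \bar N_1 + \bar N_2$ holds we obtain by (5.26) and (5.27)
\[ \Delta \bar N\big(\tfrac{k}{n}\big) = E\Big( \Delta \bar N\big(\tfrac{k}{n}\big) \,\Big|\, \mathcal G_{k-1} \Big) = \bar Y\big(\tfrac{k}{n}\big)\, E\Big( \Delta Z_m\big(\tfrac{k}{n}\big) \,\Big|\, \mathcal G_{k-1} \Big). \tag{5.28} \]
The statements (5.26)-(5.28) prove our lemma.

The discrete martingale approach of Lemma 5.2 can be used to establish similar results as in (5.11) for a larger class of weight functions. For this purpose introduce the transformed version of (5.9) by
\[ S_n(t) := \Big(\frac{n}{n_1 n_2}\Big)^{1/2} \sum_{i=1}^{[nt]} w_n(i)\, \Delta M_n\big(\tfrac{i}{n}\big), \quad 0 \le t \le 1, \tag{5.29} \]
on $[0, 1]$ with $S_n(1) = T_n$, where $\Delta M_n(\frac{i}{n}) := M_n(\frac{i}{n}) - M_n(\frac{i-1}{n}) = \Delta L_n(X_{i:n})$. Note that whenever the weights $i \mapsto w_n(i)$ are predictable w.r.t. $(\mathcal G_i)_{i\le n}$ then (5.29) is a martingale under $H_0$. For martingales of that type, arising from a discrete martingale difference array, functional central limit theorems on $D[0, 1]$ are well known.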
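The grid version of the log rank process and the partial sum process (5.29) can be sketched in a few lines of Python. The summands are read here as weighted increments $\Delta M_n(\frac{i}{n}) = \Delta L_n(X_{i:n})$, so that the last value of the path equals $T_n$ of (5.10). The function names and the group labels 1 and 2 are hypothetical, and continuous data without ties are assumed.

```python
import numpy as np

def logrank_increments(x, delta, group):
    """Increments Delta L_n(X_{i:n}) = Delta_{i:n} * (1{group 2} - Y_2/Y), i = 1..n."""
    x, delta, group = map(np.asarray, (x, delta, group))
    order = np.argsort(x, kind="stable")
    delta_ord = delta[order].astype(float)
    in_group2 = (group[order] == 2).astype(float)
    n = len(x)
    y = n - np.arange(n)                       # Y(X_{i:n}) = n + 1 - i
    y2 = np.cumsum(in_group2[::-1])[::-1]      # Y_2(X_{i:n})
    return delta_ord * (in_group2 - y2 / y)

def partial_sum_process(x, delta, group, weights):
    """Values S_n(i/n), i = 0..n, for weights w_n(1), ..., w_n(n) given in time order."""
    n = len(x)
    n2 = int(np.sum(np.asarray(group) == 2))
    n1 = n - n2
    scale = np.sqrt(n / (n1 * n2))
    steps = np.asarray(weights, dtype=float) * logrank_increments(x, delta, group)
    return scale * np.concatenate(([0.0], np.cumsum(steps)))
```

For example, with times 1, 2, 3, 4, status 1, 0, 1, 1 and groups 1, 2, 1, 2 the increments are -0.5, 0, -0.5, 0; the last increment vanishes, as noted after (5.12).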
Remark 5.5 Suppose that the weights $w_n(i)$ of the test statistic $T_n$ (5.10) are $\mathcal G_i$-predictable. Note that weights of the form $w_n(i) = \hat F_n(X_{i:n})$ given by the pooled Kaplan-Meier estimator are now allowed. Then $T_n$ is the partial sum of a martingale difference array. Under regularity assumptions the discrete martingale central limit theorem can be applied to prove that $T_n$ is asymptotically normally distributed. Invariance principles can also be established, see McLeish [31]. For further information we refer to Hall and Heyde [10], chapt. 3, in particular to Corollary 3.1 of Hall and Heyde [10]. There it is pointed out that the convergence of the predictable quadratic variation $V_n$ of $T_n$ (see (5.32)) is an important ingredient of the central limit theorem.

Functional central limit theorems can be applied to the continuous time process $(T_n(t))_{t\ge 0}$ given in (5.11) to prove convergence of the process on the Skorokhod space $D[0, \infty)$. The arguments are based on Rebolledo's central limit theorem, see Gill [9], Fleming and Harrington [8] and ABGK [3]. Under regularity assumptions these CLTs establish distributional convergence of the test statistics $T_n = T_n(\infty)$ and $T_n = S_n(1)$ to a centered normal random variable. Since the asymptotic variance depends on the unknown censoring distribution this variance has to be estimated. Recall that the predictable quadratic variation process of $M_n$ is given by
\[ \langle M_n \rangle\big(\tfrac{i}{n}\big) = \sum_{j=1}^{i} E\Big( \Delta M_n\big(\tfrac{j}{n}\big)^2 \,\Big|\, \mathcal G_{j-1} \Big). \tag{5.30} \]
It is the compensator of $M_n^2$ which means that $M_n^2 - \langle M_n \rangle$ is a martingale. The increments of that process can be calculated via a conditionally binomially distributed random variable with parameter $p = Y_2(X_{j:n})/Y(X_{j:n})$, which gives
\[ E\Big( \Delta M_n\big(\tfrac{j}{n}\big)^2 \,\Big|\, \mathcal G_{j-1} \Big) = \Delta_{j:n}\, \frac{Y_1(X_{j:n})\, Y_2(X_{j:n})}{Y(X_{j:n})^2}, \tag{5.31} \]
see Fleming and Harrington [8], p. 8 and sect. 7.
The predictable quadratic variation of $T_n$ is then
\[ V_n = \frac{n}{n_1 n_2} \sum_{i=1}^{n} w_n(i)^2\, \Delta_{i:n}\, \frac{Y_1(X_{i:n})\, Y_2(X_{i:n})}{Y(X_{i:n})^2} = \frac{n}{n_1 n_2} \int_0^\infty w(s)^2\, \frac{Y_1(s) Y_2(s)}{Y^2(s)}\, dN(s), \tag{5.32} \]
where $w(X_{i:n}) = w_n(i)$ as in (5.11). The latter form is used for the continuous martingale approach, see Gill [9] and Fleming and Harrington [8]. Together with a conditional Lindeberg condition the convergence $V_n \to \sigma^2 > 0$ in probability implies $\mathcal L(T_n) \to N(0, \sigma^2)$ in law. Thus $V_n$ may serve as a consistent variance estimator and $T_n / V_n^{1/2}$ can be taken as studentized test statistic. Under regularity assumptions
\[ \varphi_n = \begin{cases} 1 & \text{if } T_n / V_n^{1/2} > \Phi^{-1}(1-\alpha), \\ 0 & \text{if } T_n / V_n^{1/2} \le \Phi^{-1}(1-\alpha), \end{cases} \tag{5.33} \]
establishes a sequence of tests with asymptotic nominal level $E(\varphi_n) \to \alpha$ under $H_0^{III}$. Sufficient conditions can be found in Neuhaus [35], Th. 3.2.

The program can be carried out completely for the statistics $T_n$ and $V_n$ under $H_0^{II}$, see (4.7), which are given by the weights (4.18). For this purpose suppose that
\[ \lim_{n\to\infty} \frac{n_1}{n} = \eta \in (0, 1) \tag{5.34} \]
exists and let $\gamma_0$ be continuous on $[0, 1]$. To motivate the asymptotic variance formula (5.38) below suppose that $i(n)$ is a sequence of integers with $X_{i(n):n} \to s$. Under $H_0^{II}$ the Kaplan-Meier estimator is consistent and
\[ \gamma_0(\hat F_n(X_{i(n):n}-)) \to \gamma_0(F(s)) \tag{5.35} \]
follows. The strong law of large numbers implies
\[ \frac{Y_k(X_{i(n):n})}{n_k} \to (1 - G_k(s))(1 - F(s)) \tag{5.36} \]
for $k = 1, 2$ and
\[ \frac{Y(X_{i(n):n})}{n} \to [\eta(1 - G_1(s)) + (1-\eta)(1 - G_2(s))](1 - F(s)) \]
almost surely, where $G_1$ and $G_2$ are the censoring distributions of (4.7). By (3.3) we have in addition for $k = 1, 2$
\[ \frac{N_k(X_{i(n):n})}{n_k} \to \int_0^s (1 - G_k(v))\, dF(v). \tag{5.37} \]
Altogether (5.35)-(5.37) motivate that $V_n$ converges in probability under $H_0^{II}$ with
\[ \lim_{n\to\infty} V_n = \int_0^\infty \gamma_0(F(s))^2\, \frac{(1 - G_1(s))(1 - G_2(s))}{\eta(1 - G_1(s)) + (1-\eta)(1 - G_2(s))}\, dF(s). \tag{5.38} \]
The technical details are discussed in Gill [9], sect. 4 and 5.1, and Fleming and Harrington [8], sect. 7.

Example 5.6 Consider the two-sample proportional hazard model of Ex. 2.6.
The test $\varphi_n$ (5.33) given by constant weights $w_n(i) = 1$ is called the log rank test which arises from risk component models with $\gamma_0 = 1$.

Chapter 6 Conditional survival tests

Monte Carlo experiments show that the actual level of the asymptotic tests $\varphi_n$ (5.33) may exceed the given nominal level $\alpha$ at small and intermediate finite sample sizes. Problems of this type may occur for an unbalanced design when the sample size $n_2$ of the test group is much smaller than the sample size $n_1$ in the control group or vice versa. Thus, the question arises whether $\varphi_n$ can be made distribution free with exact level $\alpha$ for finite sample sizes. Recall that the survival tests are extended rank tests which can be made distribution free when all observations are uncensored. The answer to that question depends on whether the censoring distributions are homogeneous or not.

6.1 The homogeneous null hypothesis $H_0^{I}$

Under the null hypothesis (4.6) the observations $(X_1, \Delta_1), \ldots, (X_n, \Delta_n)$ are i.i.d. Thus, the test $\varphi_n$ can be carried out as a permutation test which works as follows.
• In a first step the data $(X_i(\omega), \Delta_i(\omega))_{i\le n}$ are kept fixed.
• The permutation step consists in choosing a random permutation $\tau = (\tau_1, \ldots, \tau_n)$ of $\{1, \ldots, n\}$ which is independent of the data.
Let $F_{n,\omega}(\cdot)$ denote the conditional distribution function for fixed $\omega$ of the permutation distribution given by
\[ \tau \mapsto T_n\big( (X_{\tau_1}(\omega), \Delta_{\tau_1}(\omega)), \ldots, (X_{\tau_n}(\omega), \Delta_{\tau_n}(\omega)) \big) \tag{6.1} \]
where $T_n = T_n((X_i, \Delta_i)_{i\le n})$ denotes the test statistic (5.10). The permutation version of the survival test $\varphi_n$ (5.33) works with the conditional critical value $c_n^* = c_n^*(\omega) := F_{n,\omega}^{-1}(1 - \alpha)$, i.e.
\[ \varphi_{n,\mathrm{perm}} = \begin{cases} 1 & \text{if } T_n > c_n^*, \\ \gamma_\alpha & \text{if } T_n = c_n^*, \\ 0 & \text{if } T_n < c_n^*, \end{cases} \tag{6.2} \]
where the random variable $\gamma_\alpha$ is determined by
\[ E(\varphi_{n,\mathrm{perm}} \mid (X_i, \Delta_i)_{i\le n}) = \alpha. \tag{6.3} \]
Actually, $\varphi_{n,\mathrm{perm}}$ is a conditional test (test with Neyman structure) given the order statistics and their concomitants
\[ (X_{i:n}, \Delta_{i:n})_{i\le n}. \tag{6.4} \]
Note that under $H_0^{I}$ the statistic (6.4) is sufficient and $\varphi_{n,\mathrm{perm}}$ has nominal level $\alpha$. Within this special submodel every test with exact nominal level $\alpha$ is already a permutation test.

Lemma 6.1 (Moser 1994, Völker 2003) Within the full nonparametric model with arbitrary continuous distribution functions $F$ and $G$ the statistic (6.4) is complete under $H_0^{I}$. Thus every exact test $\psi$ with $E(\psi) = \alpha$ under $H_0^{I}$ is then a conditional test with $E(\psi \mid (X_{i:n}, \Delta_{i:n})_{i\le n}) = \alpha$.

Moser [32] proved the completeness of $(\Delta_{i:n})_{i\le n}$ which is enough to treat extended rank statistics $T_n((D_{ni}, \Delta_{i:n})_{i\le n})$. A proof of the general completeness result can be found in Völker [41], p. 93.

Remark 6.2 (a) Under $H_0^{I}$ the antiranks $(D_{n1}, \ldots, D_{nn})$ of the $X$'s, see section 4, are independent of the sufficient statistic (6.4) and they are uniformly distributed permutations, see Neuhaus [33]. By Basu's theorem the independence also follows from Lemma 6.1. (b) For fixed observations of (6.4) the antiranks of the statistic (4.17) are formally substituted by our random permutations $(\tau_1, \ldots, \tau_n)$, see (6.1), in order to get the permutation distribution. (c) Suppose that as in (4.18) the random weights $w_n(i)$ only depend on the Kaplan-Meier estimator. Then the test statistic $T_n = T_n((D_{ni}, \Delta_{i:n})_{i\le n})$ is a generalized rank statistic which does not depend on the values of the order statistics $(X_{i:n})_{i\le n}$. This reflects the semiparametric nature of the testing problem.

It is not immediately clear that $\varphi_{n,\mathrm{perm}}$ behaves like its unconditional counterpart $\varphi_n$ (5.33).

Theorem 6.3 (Janssen (1991), Theorem 2.1) Consider the two-sample test (5.33) with test statistics (4.17) and (5.32). Under mild regularity assumptions (with $L_2$-convergent weights) the tests $\varphi_{n,\mathrm{perm}}$ and $\varphi_n$ are asymptotically equivalent under $H_0^{I}$, i.e.
\[ \varphi_n - \varphi_{n,\mathrm{perm}} \to 0 \quad\text{in probability.} \tag{6.5} \]
Whenever $\gamma_0$ is continuous on $[0, 1]$ the regularity assumptions hold for the natural weights (4.18) and (6.5) follows.
The permutation version has the advantage that the error probability of the first kind can be controlled for each finite sample size $n$. Power functions under alternatives are discussed in chapter 7.

6.2 The heterogeneous null hypotheses $H_0^{II}$ and $H_0^{III}$

Under the null hypothesis $H_0^{II}$ the i.i.d. structure of the data $(X_i, \Delta_i)_{i\le n}$ is lost. In comparison with $T_n$ the conditional distribution of (6.1) has the wrong permutation variance and the ordinary permutation test $\varphi_{n,\mathrm{perm}}$ no longer works. Moreover, the equivalence (6.5) will fail in general and $E(\varphi_{n,\mathrm{perm}})$ may exceed the level $\alpha$ under $H_0^{II}$. In view of Lemma 6.1 it is therefore hopeless to look for tests with exact nominal level $\alpha$ under $H_0^{II}$. To overcome these difficulties Neuhaus [35] introduced very interesting studentized permutation tests for $H_0^{II}$ which naturally extend the permutation tests (6.2) being valid for $H_0^{I}$. His procedure is based on a variance correction of the permutation variance of (6.1) for different $G_1 \ne G_2$. This is done by taking the permutation distribution of the studentized statistic $S_n := T_n / V_n^{1/2}$ of (5.33), i.e.
\[ \tau \mapsto S_n\big( (X_{\tau_1}, \Delta_{\tau_1}), \ldots, (X_{\tau_n}, \Delta_{\tau_n}) \big) \tag{6.6} \]
for fixed data similar as in (6.1), where the denominator $V_n^{1/2}$ is taken into account and is part of the permutation procedure. If $d_n^*$ denotes the conditional $(1-\alpha)$-quantile of the permutation distribution of (6.6) then
\[ \varphi^S_{n,\mathrm{perm}} = \begin{cases} 1 & \text{if } T_n / V_n^{1/2} > d_n^*, \\ 0 & \text{if } T_n / V_n^{1/2} \le d_n^*, \end{cases} \tag{6.7} \]
denotes the studentized permutation version of the survival test $\varphi_n$ (5.33).

Theorem 6.4 (Neuhaus 1993) Under mild regularity assumptions (with $L_2$-convergent weights) the survival tests $\varphi_n$ given by the extended rank statistics $T_n = T_n((D_{ni})_{i\le n}, (\Delta_{i:n})_{i\le n})$ and the studentized permutation tests $\varphi^S_{n,\mathrm{perm}}$ are asymptotically equivalent under $H_0^{II}$, i.e.
\[ \varphi_n - \varphi^S_{n,\mathrm{perm}} \to 0 \quad\text{in probability.} \tag{6.8} \]
In particular, $E(\varphi^S_{n,\mathrm{perm}}) \to \alpha$ holds as $n \to \infty$.

Again the theorem applies for continuous $\gamma_0$ and weights (4.18) under the two-sample regime (5.34).
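The studentized permutation procedure (6.6) and (6.7) can be approximated by Monte Carlo sampling of permutations. The Python sketch below uses constant weights $w_n(i) = 1$ (log rank case), assumes data without ties, ignores the randomization on the boundary and replaces the exact permutation quantile $d_n^*$ by an empirical quantile over `n_perm` random permutations. All function names, the parameter `n_perm` and the group labels 1 and 2 are assumptions of this illustration, not part of the procedure of Neuhaus [35].

```python
import numpy as np

def logrank_t_and_v(x, delta, group):
    """Return (T_n, V_n) from (5.10) and (5.32) with weights w_n(i) = 1; no ties assumed."""
    x, delta, group = map(np.asarray, (x, delta, group))
    order = np.argsort(x, kind="stable")
    d = delta[order].astype(float)
    g2 = (group[order] == 2).astype(float)
    n = len(x)
    n2 = g2.sum()
    n1 = n - n2
    y = (n - np.arange(n)).astype(float)       # Y(X_{i:n})
    y2 = np.cumsum(g2[::-1])[::-1]             # Y_2(X_{i:n})
    y1 = y - y2                                # Y_1(X_{i:n})
    t_n = np.sqrt(n / (n1 * n2)) * np.sum(d * (g2 - y2 / y))
    v_n = n / (n1 * n2) * np.sum(d * y1 * y2 / y**2)
    return float(t_n), float(v_n)

def studentized(x, delta, group):
    """S_n = T_n / V_n^{1/2}; set to 0 when V_n = 0 (a convention of this sketch)."""
    t_n, v_n = logrank_t_and_v(x, delta, group)
    return t_n / np.sqrt(v_n) if v_n > 0 else 0.0

def studentized_permutation_test(x, delta, group, alpha=0.05, n_perm=2000, seed=0):
    """Monte Carlo approximation of (6.7): returns (S_n, d_n^*, reject)."""
    rng = np.random.default_rng(seed)
    x, delta, group = map(np.asarray, (x, delta, group))
    observed = studentized(x, delta, group)
    perm_stats = np.empty(n_perm)
    for b in range(n_perm):
        tau = rng.permutation(len(x))          # permute the pairs (X_i, Delta_i) jointly
        perm_stats[b] = studentized(x[tau], delta[tau], group)
    d_star = float(np.quantile(perm_stats, 1.0 - alpha))
    return observed, d_star, bool(observed > d_star)
```

Permuting the pairs $(X_i, \Delta_i)$ while keeping the group labels fixed is equivalent to permuting the antiranks, so both the numerator $T_n$ and the denominator $V_n^{1/2}$ are recomputed for every permutation.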
The new permutation test has several advantages:
• Under $H_0^{I}$ the tests $\varphi^S_{n,\mathrm{perm}}$ are of exact level $\alpha$ for each sample size $n$.
• Within the asymptotic set-up it shares the same good properties as $\varphi_n$.
The proof relies on a conditional central limit theorem for the studentized statistic (6.6). A general conditional central limit theorem can be found in Janssen and Mayer [22]. In that paper the condition (5.34) is weakened and the equivalence (6.8) can be extended to $H_0^{III}$ (4.8) for various cases. Also the weights $w_n(i)$ of the numerator $T_n$ may depend on the whole set of ordered quantities (6.4). Monte Carlo experiments of Neuhaus [35] and Heller and Venkatraman [14] support the studentized permutation tests. Their actual level is acceptable also under different censorship with $G_1 \ne G_2$.

Chapter 7 The efficiency of semiparametric survival tests

The present survival tests were introduced in section 4 as semiparametric versions of parametric score tests. Recall that under regularity assumptions parametric one-sided score tests are locally most powerful tests along the underlying path of alternatives in the sense that they maximize the slope of the power function at the origin (null hypothesis), see Witting [42], sect. 2.2.4. In general it is hopeless to get explicit finite sample formulas for nonparametric power functions of competing tests. For these reasons the statistician turns to the comparison of asymptotic power functions under local alternatives given by $L_2$-differentiable paths. Local alternatives are rescaled alternatives (typically of order $1/\sqrt{n}$), which can be motivated as follows. Under this type of alternatives the interesting asymptotic envelope power functions (serving as benchmark) are non-trivial with power between the level $\alpha$ and power 1. Alternatives which are rescaled more slowly than $1/\sqrt{n}$ lead to asymptotic envelope power 1, and a power comparison is doubtful in that case.
The asymptotic envelope power function is evaluated at an arbitrary sequence of local alternatives and it is given by the asymptotic power within the underlying class of tests. It is obvious that different classes of tests may have different envelope power functions. In our case the comparison is now done via parametric paths of alternatives, given by the direction $\gamma_0 \in L_2(0, 1)$, which are still present but the statistician does not know in advance which path would be adequate. A typical two-sample path of alternatives of order $1/\sqrt{n}$ was introduced in (4.10) via (2.17). A sequence of tests at asymptotic level $\alpha$ for some null hypothesis is called asymptotically efficient w.r.t. a given class of tests along a path of alternatives if its asymptotic local power function reaches the underlying asymptotic envelope power function. As a consequence of the three famous Lemmas of LeCam, see Hájek and Šidák [12], the following result is well-known for one-sample and two-sample testing problems.

Theorem 7.1 Consider a parametric $L_2$-differentiable path of distributions and one-sided local alternatives of order $1/\sqrt{n}$ for a simple null hypothesis. Then the finite sample locally most powerful score tests of asymptotic level $\alpha$ are asymptotically efficient in the sense that their power reaches the asymptotic envelope power function given by the Neyman Pearson tests of level $\alpha$ for local alternatives. The same result holds for two-sided testing problems and asymptotically unbiased tests w.r.t. unbiased asymptotic envelope power functions.

The question of efficiency is slightly different for a composite null hypothesis. We do not go into details but we explain the principle of efficient score functions for a composite null hypothesis. Suppose that we consider a curve $P_\vartheta$ which hits a composite null hypothesis $H(0)$ at $P_0$, see figure 7.1.
When testing $H(0)$ against $P_\vartheta$ the Neyman Pearson test for $P_0$ against $P_\vartheta$ is not adequate since additional information concerning the path is used. For this reason $P_\vartheta$ is projected on some $Q_\vartheta \in H(0)$ where $Q_\vartheta$ and $P_\vartheta$ may stand for a least favorable pair, i.e. they correspond to the hardest binary testing problem arising from $H(0)$ against $P_\vartheta$. To explain this approach let
\[ \beta_\alpha(P_\vartheta) = \sup_{\varphi} E_{P_\vartheta}(\varphi) \tag{7.1} \]
denote the envelope power function at level $\alpha$ for testing $H(0)$ against $\{P_\vartheta\}$ where the supremum is taken over all tests $\varphi$ of level $\alpha$ for $H(0)$. Let $\varphi_{NP(\alpha, Q, P_\vartheta)}$ be the Neyman Pearson test for a simple null hypothesis $\{Q\} \subset H(0)$ against $P_\vartheta$ with nominal level $\alpha$. Obviously, the upper bound
\[ \beta_\alpha(P_\vartheta) \le \inf_{Q \in H(0)} E_{P_\vartheta}\big(\varphi_{NP(\alpha, Q, P_\vartheta)}\big) \tag{7.2} \]
of the envelope power function can be derived. Thus a level $\alpha$ test $\psi$ for $H(0)$ is efficient for testing $H(0)$ against $\{P_\vartheta\}$ if there exists a (projection) $Q_\vartheta \in H(0)$ with $E_{P_\vartheta}(\psi) = E_{P_\vartheta}(\varphi_{NP(\alpha, Q_\vartheta, P_\vartheta)})$. For an increasing sample size $n$ we attach the index $n$ to $H_n(0)$, $P_{\vartheta n}$ and $\psi_n$. A sequence of tests $\psi_n$ of asymptotic level $\alpha$ for testing $H_n(0)$ against $\{P_{\vartheta n}\}$ is then efficient if there exists a sequence $Q_{\vartheta n} \in H_n(0)$ with
\[ E_{P_{\vartheta n}}(\psi_n) - E_{P_{\vartheta n}}\big(\varphi_{NP(\alpha, Q_{\vartheta n}, P_{\vartheta n})}\big) \to 0 \tag{7.3} \]
as $n \to \infty$. The projection procedure is visualized in figure 7.1.

Figure 7.1: The path $P_\vartheta$, which hits $H(0)$ at $P_0$, and its projection $Q_\vartheta$ on $H(0)$.

There is a nice analytic way to describe the projection procedure of $P_\vartheta$ on $Q_\vartheta$ in $H(0)$ via score functions. Let
\[ l = \frac{d}{d\vartheta} \log \frac{dP_\vartheta}{dP_0} \Big|_{\vartheta = 0} \tag{7.4} \]
be the score function of the underlying path. Introduce the set $M \subset L_2(P_0)$ which is the closed linear subspace generated by all possible score functions at $P_0$ arising from paths in $H(0)$. Then $l = l_1 + l_2$ can be decomposed into a score function $l_2 \in M$ and an $M$-orthogonal part $l_1$. The function $l_1$ is called the efficient score function, see Bickel et al. [6].
The members of the path $Q_\vartheta$, given by the projection of $P_\vartheta$ on $H(0)$, usually have the score function $l_2$. In this case the efficient score function is equal to $l_1 = \frac{d}{d\vartheta} \log \frac{dP_\vartheta}{dQ_\vartheta} \big|_{\vartheta = 0}$. On the other hand, if we start with the decomposition $l = l_1 + l_2$ then typically a path in $H(0)$ with score function $l_2$ can be constructed. Its members are then candidates for the projection $Q_\vartheta$. For an $L_2$-differentiable path we now summarize the results for testing a composite null hypothesis of product measures against local alternatives derived by our path $P_\vartheta$. The next theorem is in the spirit of Theorem 7.1.

Theorem 7.2 Score tests based on the efficient score function $l_1$ (of the underlying score function $l$) are asymptotically efficient for composite hypotheses against one-sided local alternatives given by the underlying path with score function $l$.

The details about efficient score functions can be found in Bickel et al. [6]. We refer to Witting and Müller-Funk [43], sect. 6.4, for applications to tests. The meaning of the projection method can easily be understood if we turn to the limit experiment, see Witting and Müller-Funk [43], sect. 6.4. Under $L_2$-differentiability local asymptotic normality (LAN) holds and the limit experiment is a Gaussian shift with a Gaussian loglikelihood process. The asymptotic envelope power function is then the envelope power function for the limit experiment. It is easy to see that for Gaussian shifts the envelope power function is given by the Neyman Pearson power of the projection $Q_\vartheta$ of $P_\vartheta$ on $H(0)$ against $P_\vartheta$. Under LAN the same holds for the asymptotic power function within the asymptotic set-up.

These general principles will now be applied and explained for the different null hypotheses $H_0^{I}$ and $H_0^{II}$ where the distributions $F_1 = F_2$ and the $G$'s are nuisance parameters. The paths of alternatives are constructed as in (4.10) via the hazard rate models of chapter 2. In his fundamental work Gill [9], chapt.
5, calculated the asymptotic power function of our unconditional survival tests (5.33) under local hazard alternatives. As a main tool he used the martingale approach summarized in Th. 5.1. Let $w(\cdot)$ be an $\mathcal F$-predictable weight function. Then
\[ \int_0^t w(s)\, dL_n(s) - \int_0^t w\, \frac{Y_1 Y_2}{Y}\, [d\Lambda_2 - d\Lambda_1] \tag{7.5} \]
is a martingale. Under local alternatives of type (4.10) continuous time martingale central limit theorems then prove asymptotic normality of our test statistic $T_n$. Notice that contiguity implies that the variance estimator $V_n$ of $T_n$ remains consistent under local alternatives whenever (5.38) holds under the null hypothesis. These results were used by Gill [9], sect. 5.1 and 5.3, to compare competing weight functions and tests for the underlying hazard models, see also Fleming and Harrington [8]. The general question about the efficiency in comparison with envelope power functions remained open at that time. Below the efficiency of tests is discussed for the different null hypotheses (4.6)-(4.8). We restrict ourselves to one-sided tests. Two-sided alternatives can be handled similarly.

7.1 The null hypothesis $H_0^{I}$ with equal censorship $G_1 = G_2$

For the parametric set-up the score tests based on the statistic (4.14) are asymptotically efficient for testing $\{\vartheta = 0\}$ against $\{\vartheta > 0\}$ (asymptotically equivalent to Neyman Pearson tests) in direction $\gamma = \gamma_0 \circ F$ under the parametrization (4.10) with arbitrary centered regression coefficients if $\max_i |c_{ni}| \to 0$ holds. At this stage we point out how the parametrization (4.13) given by the centered regression coefficients fits into the efficiency concept of Theorem 7.2. Let us start with an uncentered two-sample parametrization
\[ P_\vartheta := \mathcal L(T_1, \ldots, T_n) = F^{n_1} \otimes F^{n_2}_{\vartheta (\frac{n}{n_1 n_2})^{1/2}}, \tag{7.6} \]
see (4.10) and (4.12), with score function
\[ l = \Big(\frac{n}{n_1 n_2}\Big)^{1/2} \sum_{i=n_1+1}^{n} M_\gamma(X_i, \Delta_i, \infty). \tag{7.7} \]
We will see that (at least in the asymptotic set-up) the projection of $P_\vartheta$ in $H_0^{I}$ is given by the product measure
\[ Q_\vartheta = F^{n}_{\vartheta \frac{n_2}{n} (\frac{n}{n_1 n_2})^{1/2}}. \tag{7.8} \]
Observe that the family $\vartheta \mapsto Q_\vartheta$ has the score function
\[ l_2 = \frac{n_2}{n} \Big(\frac{n}{n_1 n_2}\Big)^{1/2} \sum_{i=1}^{n} M_\gamma(X_i, \Delta_i, \infty). \tag{7.9} \]
It is easy to see that the efficient score function of $l$ is just
\[ l_1 = l - l_2 = \sum_{i=1}^{n} c_{ni}\, M_\gamma(X_i, \Delta_i, \infty) \tag{7.10} \]
where $c_{ni}$ denote the centered regression coefficients (4.13). We remark that the projection step can be avoided when we start with the centered regression model (4.13). This discussion is not primarily a matter of censored data. In the case of ordinary rank tests Hájek et al. [12] already worked with centered regression coefficients. For simplicity only centered regression coefficients (4.13) are used throughout.

Remark 7.3 In Janssen [16] the elimination of the nuisance parameters $F$ and $\Lambda$, see (4.15)-(4.17), was made rigorous. For weights of type (4.18) with $w_n(i) = \gamma_0(\hat F_n(X_{i:n}-))$ (or more generally for $L_2$-convergent weights) our test statistic $T_n$ (4.17) is asymptotically equivalent to the score function (4.14), i.e. $T_n - l_1 \to 0$ under $H_0^{I}$. Together with the consistency of $V_n$, see (5.38), this equivalence implies the efficiency of $\varphi_n$ (and its conditional counterpart $\varphi_{n,\mathrm{perm}}$ (6.2)) in direction of all semiparametric alternatives given by $\gamma_0 \circ F$ where $F$ is the nuisance parameter and $\gamma_0$ is the parameter of interest.

The next Theorem answers the question about the power of $\varphi_n$ (constructed for direction $\gamma_0 \circ F$) under local alternatives specified by a hazard derivative $\gamma_1 \circ F$. We assume that $\int_0^\infty (\gamma_0 \circ F)(\gamma_1 \circ F)(1 - G)\, dF \ge 0$ holds. Otherwise, a minus sign has to be added before $(\mathrm{ARE})^{1/2}$ in formula (7.11).

Theorem 7.4 Let $\varphi_n$ be the sequence of efficient survival tests in direction $\gamma_0 \circ F$.
Under local two-sample alternatives given by the hazard rate derivatives in direction $\gamma_1 \circ F$ of type (4.10) with (4.13) the asymptotic power function is
\[ \Phi\big( (\mathrm{ARE})^{1/2}\, \sigma - u_{1-\alpha} \big) \tag{7.11} \]
where $\sigma^2 = \int_0^\infty (\gamma_1 \circ F)^2 (1 - G)\, dF$ and ARE is the asymptotic relative Pitman efficiency
\[ \mathrm{ARE} = \frac{\Big( \int_0^\infty (\gamma_0 \circ F)(\gamma_1 \circ F)(1 - G)\, dF \Big)^2}{\int_0^\infty (\gamma_0 \circ F)^2 (1 - G)\, dF \ \int_0^\infty (\gamma_1 \circ F)^2 (1 - G)\, dF}. \tag{7.12} \]

Proof. See Janssen [16], Th. 3.1 and (3.17).

More information about the Pitman efficiency can be found in Hájek and Šidák [12], Witting and Müller-Funk [43] and the lecture notes Janssen [19].

7.2 The null hypothesis $H_0^{II}$ (unequal censorship with $G_1 \ne G_2$)

Under unequal censorship the null hypothesis is much larger than above. Observe first that then the asymptotic variances (5.38) of $T_n$ (as well as the asymptotic power function (7.11)) depend on the pair $G_1, G_2$. However, the efficiency of extended survival rank tests can be obtained. This was first done by Gill in ABGK [3], VIII 2.3 under parametric models and in VIII 4.2 for semiparametric transformation models. On page 624 the authors speak of "our informal discussion here". For an easy proof of the general asymptotic efficiency we refer to the beautiful article of Neuhaus [36], who followed the lines of ABGK [3]. He calculated the efficient score function and then proved the efficiency of the underlying survival tests for the semiparametric direction $\gamma_0 \circ F$. We will indicate how his construction fits in the methodology of Theorem 7.2 which gives us further insight in this result. Under centered regression coefficients the parametric score function $l$ (4.11) can be expressed by the martingales $M_i$ of the counting processes $N_i$, see (5.6) and (5.7). Combining the results of chapters 4 and 5 we arrive at
\[ l = \sum_{i=1}^{n} c_{ni}\, M_\gamma(X_i, \Delta_i, \infty) = \Big(\frac{n_1 n_2}{n}\Big)^{1/2} \int_0^\infty \gamma_0 \circ F\, \Big( \frac{dM_2}{n_2} - \frac{dM_1}{n_1} \Big) \tag{7.13} \]
which differs from Neuhaus [36], (3.35), by a minus sign since groups 1 and 2 are interchanged.
Again the score function $l$ can be projected on a set of score functions arising from paths in $H_0^{II}$. In contrast to part 7.1 it is necessary to vary the direction $\gamma$ and the projection procedure is not merely a manipulation of regression coefficients. Observe that $L_2$-differentiable paths (of i.i.d. survival times) within $H_0^{II}$ admit score functions
\[ Z(h) := \frac{1}{\sqrt{n}} \sum_{i=1}^{n} M_h(X_i, \Delta_i, \infty) = \frac{1}{\sqrt{n}} \int_0^\infty h\, [dM_1 + dM_2] \tag{7.14} \]
at $F^n$ where a further hazard derivative $h$ serves as a nuisance parameter. The calculation is similar to (7.13). Neuhaus calculated the projection $l_2$ of $l$ on the space generated by (7.14) (at least in the asymptotic set-up). This projection is given by $l_2 = Z(\bar h)$ for the special choice
\[ \bar h = \Big(\frac{n_1 n_2}{n^2}\Big)^{1/2}\, \frac{\bar G_2 - \bar G_1}{\bar G}\, \gamma_0 \circ F. \tag{7.15} \]
The function $\bar G_i$ is defined by $\bar G_i = 1 - G_i$ and $\bar G = \frac{n_1}{n} \bar G_1 + \frac{n_2}{n} \bar G_2$. The efficient score function of (7.13) is then
\[ l_1 = l - l_2 = \frac{1}{\sqrt{n}} \int_0^\infty \gamma_0 \circ F\, \Big[ \Big(\frac{n_1}{n_2}\Big)^{1/2} \frac{\bar G_1}{\bar G}\, dM_2 - \Big(\frac{n_2}{n_1}\Big)^{1/2} \frac{\bar G_2}{\bar G}\, dM_1 \Big]. \tag{7.16} \]
The efficient score function $l_1$ is of parametric nature. The semiparametric version is
\[ T_n = \Big(\frac{n}{n_1 n_2}\Big)^{1/2} \int_0^\infty \gamma_0 \circ \hat F_n(t-)\, dL_n(t), \tag{7.17} \]
compare (5.11), given by the log-rank process $L_n$ and with the weight function (4.18) based on the Kaplan-Meier estimator. Gill [9], Theorem 4.2.1, proves that
\[ l_1 - T_n \to 0 \tag{7.18} \]
holds in probability under $H_0^{II}$ and mild assumptions, see also Fleming and Harrington [8]. Recall from section 5 that the variance estimator $V_n$ is still consistent under $H_0^{II}$. Together with (7.18) the asymptotic efficiency of the semiparametric survival tests $\varphi_n$, see (5.33), based on (7.17) follows for one-sided hazard alternatives in direction $\gamma_0 \circ F$. According to Theorem 6.4 the same holds for their conditional counterparts of general studentized permutation test type. For $G_1 \ne G_2$ the form (7.12) of ARE has to be modified. Note that the discussion above proves that the ARE coincides with the efficacies of Gill [9], sect.
5, who was already able to compare the power functions of competing underlying survival tests with linear statistics. By the results of Neuhaus we are now able to compare the asymptotic power of $\varphi_n$ with the asymptotic envelope power function of level $\alpha$ tests for $H_0^{II}$, which is of course more general.

Chapter 8

Omnibus tests and related tests

Similar to section 7, asymptotically efficient two-sided survival tests can be obtained for the testing problem (4.5) with $F_1 = F_2$ against two-sided alternatives. The tests given by the statistics $|T_n|$ are then asymptotically efficient within the class of asymptotically unbiased tests for the underlying semiparametric model given by direction $\gamma \circ F$. Also the formula (7.12) for the asymptotic relative Pitman efficiency ARE remains unchanged under $G_1 = G_2$. These tests have the disadvantage that they have poor power when ARE is small. For those directions with ARE = 0 they are not consistent. In order to overcome this difficulty goodness of fit tests can be applied.

8.1 Two-sided goodness of fit tests

For the motivation of this section we refer to Gill [9], sect. 5.4, and ABGK [3]. Let us first consider the uncensored two-sample case. Often classical goodness of fit test statistics are given by

\[ \sup_t \Bigl|\int_{-\infty}^{t} w \,\bigl(d\hat F_{n1} - d\hat F_{n2}\bigr)\Bigr| \tag{8.1} \]

which is based on the weighted difference of the underlying empirical processes $\hat F_{ni}$ of group $i = 1, 2$. The sup-norm can also be replaced by other seminorms. It is already known from the work of Khmaladze [24] that the martingale part of the Doob-Meyer decomposition of the empirical processes may serve as a good ingredient of test statistics. In the case of uncensored data the details are outlined in Janssen and Milbrodt [23] for the two-sample problem. In conclusion (8.1) is replaced by

\[ \sup_{t \ge 0} |T_n(t)| \tag{8.2} \]

where $T_n(t)$ is, as in (5.11), a sequential version of our survival test statistic (4.16). Tests based on (8.2) are now appropriate for censored data.
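A statistic of the form (8.2) is straightforward to compute from right censored data. The following Python sketch is only an illustration under stated assumptions: it uses a common textbook form of the weighted log-rank increments (observed minus expected events in group 2) together with a hypergeometric-type variance estimate without tie correction, and it studentizes the supremum by the final variance. This need not coincide with the normalization of (5.11) and (5.32). The names `renyi_statistic` and `gamma` are hypothetical and not taken from the paper.

```python
import numpy as np


def renyi_statistic(x, delta, group, gamma=lambda u: 1.0):
    """Studentized sup_t |Z(t)| for two-sample right censored data.

    x     : observed times min(T, C)
    delta : censoring status, 1 = event observed, 0 = censored
    group : group labels 1 or 2
    gamma : weight evaluated at the pooled Kaplan-Meier value F_hat(t-)
            (constant weight = log-rank; an assumption of this sketch)
    """
    x, delta, group = (np.asarray(a) for a in (x, delta, group))
    z, var, sup_abs = 0.0, 0.0, 0.0
    surv = 1.0  # pooled Kaplan-Meier survival just before the current time
    for t in np.unique(x[delta == 1]):
        at_risk = x >= t
        events = (x == t) & (delta == 1)
        y, y2 = at_risk.sum(), (at_risk & (group == 2)).sum()
        d, d2 = events.sum(), (events & (group == 2)).sum()
        w = gamma(1.0 - surv)                   # predictable weight at t-
        z += w * (d2 - y2 * d / y)              # observed minus expected in group 2
        var += w ** 2 * (y2 / y) * (1.0 - y2 / y) * d
        sup_abs = max(sup_abs, abs(z))          # running supremum of |Z(t)|
        surv *= 1.0 - d / y                     # Kaplan-Meier update
    if var == 0.0:
        raise ValueError("variance estimate is zero; groups never share a risk set")
    return float(sup_abs / np.sqrt(var))
```

For the four uncensored observations x = (1, 2, 3, 4) with group labels (1, 2, 1, 2) and constant weight, the running sums are -1/2, -1/6, -2/3, -2/3 and the variance estimate is 13/18, so the sketch returns $\sqrt{8/13} \approx 0.784$.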
Such tests are called tests of Rényi type and have already been studied by Gill [9], sect. 5.4, see also Fleming and Harrington [8]. The asymptotics are fully understood and we will only summarize the results and indicate how the test statistics (8.2) can be treated.

• Under $H_0^{II}$ a functional central limit theorem holds for $t \mapsto T_n(t)$ on $D[0, \infty)$ where the limit is given by a time rescaled Brownian motion $t \mapsto B(V(t))$. Here $V(t)$ is obtained by integrating the integrand of (5.32) over the domain $[0, t]$. Similarly to section 5, discrete martingale methods apply to the grid, confer Remark 5.5.

• Unconditional asymptotic level $\alpha$ tests can thus be introduced via the critical values calculated for the limit process.

• Similarly to section 6, conditional versions of Rényi tests (permutation tests) work well. Again studentized versions should be preferred. In this case Theorem 6.4 carries over to Rényi tests and the conditional versions remain asymptotically equivalent. This result is due to Neuhaus [34]. Conditional functional central limit theorems were also established by Janssen and Mayer [22].

8.2 Projection tests in survival analysis

Although goodness of fit tests are usually consistent under fixed alternatives, their quality is strongly influenced by the choice of weight functions or the underlying seminorm on the space of trajectories. The judgement of goodness of fit tests relies on a principal component decomposition of the asymptotic power function given by sequences of local alternatives, see Janssen and Milbrodt [23] and Janssen [18], [20]. However, already in the case of uncensored data the analytic calculation of the principal components may be difficult, see also Shorack and Wellner [39], chapter V. At this stage another class of adaptive tests called projection tests is most promising.
They have the advantage that the statistician can select a whole cone or subspace of alternatives (not only of dimension one as treated earlier in sections 4 and 7) which can be separated from the null hypothesis with sufficiently high probability. We will only present the methodology and refer to the literature for the details. The basic idea is due to Behnen and Neuhaus [5], section 3.2(C), who developed projection tests for models given by tangents (score functions). The extension to hazard rate models and censored data was done by Mayer [29], [30].

As an example we will treat the one-sided two-sample testing problem with stochastically larger alternatives. Two-sided tests can be treated similarly. The idea is now to replace the semiparametric direction $\gamma \circ F$ of (2.17) and (4.10) by a cone of relevant alternatives. The scientist has to specify a finite number $r$ (not too many) of semiparametric directions $\gamma_i \circ F$, $1 \le i \le r$, derived from the stochastically larger alternatives. The hazard rate derivatives $\gamma_i \circ F$ stand for directions of high preference. For instance the statistician likes to discover differences of the relative risk (hazard rates) for

• proportional hazards (constant over time)
• early survival times
• central survival times
• late survival times.

(The example for the $\gamma_i$, $1 \le i \le 4$, may be taken from the list of figure 2.1.) For a fixed choice $\gamma_i : [0, 1] \to \mathbb{R}$ let us now denote by

\[ V^+ := \Bigl\{ \gamma_\beta \circ F := \sum_{i=1}^{r} \beta_i\, \gamma_i \circ F, \;\; \beta_i \ge 0, \;\; \beta = (\beta_1, \ldots, \beta_r) \Bigr\} \tag{8.3} \]

the cone of relevant hazard rate derivatives. A concrete model is just (2.17) with $\gamma_\beta \circ F$ where now $\beta \in [0, \infty)^r$ (and also $F$) is an additional nuisance parameter. Testing $H_0^{II}$ can be expressed by testing the local coordinates

\[ \beta = 0 \quad \text{against} \quad \beta \in V^+ \setminus \{0\}. \tag{8.4} \]

Let us attach an additional index $\gamma$ to the ordinary survival statistic (5.10) with weights (4.18), with the convention $T_n =: T_n(\gamma)$.
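Since the weight $\gamma \circ \hat F_n$ enters the statistic linearly, the statistic of a cone element $\gamma_\beta$ is the $\beta$-weighted sum of the statistics $T_n(\gamma_i)$. The following Python sketch illustrates this under stated assumptions: the four direction functions in `GAMMAS` are hypothetical stand-ins (they are not the list of figure 2.1), the statistic is a textbook weighted log-rank sum with the pooled Kaplan-Meier estimator in the weight and without the normalizing constants of (5.10), and the names `weighted_logrank` and `cone_direction` are illustrative only.

```python
import numpy as np


def weighted_logrank(x, delta, group, gamma):
    """Sum over event times of gamma(F_hat(t-)) * (observed - expected events in group 2).

    F_hat is the pooled Kaplan-Meier estimator; normalizing constants are omitted.
    """
    x, delta, group = (np.asarray(a) for a in (x, delta, group))
    stat, surv = 0.0, 1.0  # surv = pooled Kaplan-Meier survival just before t
    for t in np.unique(x[delta == 1]):
        at_risk = x >= t
        events = (x == t) & (delta == 1)
        y, y2 = at_risk.sum(), (at_risk & (group == 2)).sum()
        d, d2 = events.sum(), (events & (group == 2)).sum()
        stat += gamma(1.0 - surv) * (d2 - y2 * d / y)
        surv *= 1.0 - d / y
    return float(stat)


# Hypothetical directions on [0, 1]; NOT taken from figure 2.1 of the paper.
GAMMAS = [
    lambda u: 1.0,                  # proportional hazards
    lambda u: 1.0 - u,              # early survival times
    lambda u: 4.0 * u * (1.0 - u),  # central survival times
    lambda u: u,                    # late survival times
]


def cone_direction(beta, gammas=GAMMAS):
    """Return gamma_beta = sum_i beta_i * gamma_i for beta in [0, inf)^r, cf. (8.3)."""
    beta = np.asarray(beta, dtype=float)
    if beta.shape != (len(gammas),) or np.any(beta < 0):
        raise ValueError("beta needs one nonnegative entry per direction")
    return lambda u: sum(b * g(u) for b, g in zip(beta, gammas))


# Simulated right censored two-sample data (illustrative only).
rng = np.random.default_rng(0)
n = 40
group = np.repeat([1, 2], n // 2)
t_surv, c_cens = rng.exponential(1.0, n), rng.exponential(2.0, n)
x, delta = np.minimum(t_surv, c_cens), (t_surv <= c_cens).astype(int)

beta = [0.5, 0.0, 2.0, 1.0]
lhs = weighted_logrank(x, delta, group, cone_direction(beta))
rhs = sum(b * weighted_logrank(x, delta, group, g) for b, g in zip(beta, GAMMAS))
print(np.isclose(lhs, rhs))  # linearity in the direction, up to rounding
```

The sketch takes $\beta$ as given; estimating $\beta$ under the cone restriction is the actual projection step and is not attempted here.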
We briefly summarize the solution of Mayer [29], [30] which works along the lines of Behnen and Neuhaus [5]. The projection test for (8.4) is an asymptotic likelihood ratio test within the asymptotic limit model under local alternatives. By the projection method of Behnen and Neuhaus [5] the vector $\beta$ is estimated by $\hat\beta = (\hat\beta_1, \ldots, \hat\beta_r)$. The likelihood ratio type test statistic is then

\[ T_n\Bigl(\sum_{i=1}^{r} \hat\beta_i \gamma_i\Bigr) = \sum_{i=1}^{r} \hat\beta_i\, T_n(\gamma_i). \tag{8.5} \]

This test statistic can be motivated as the likelihood ratio test statistic for (8.4) within the asymptotic Gaussian shift model. The choice of asymptotic or permutation based critical values now depends, as in (5.33), on a proper variance estimation $\hat V_n$ under $H_0^{II}$. This is solved in the work of Mayer [29], [30]. The test $\varphi_n$ where $T_n$ and $V_n$ are replaced by (8.5) and some estimator $\hat V_n$ is called projection test. The region with high power is then spread out over the cone $V^+$ of alternatives. In the case of two-sided alternatives (with a subspace instead of the cone $V^+$) the tests are Neyman smooth type tests for uncensored data.

Mayer showed that these projection tests are asymptotically admissible and that Monte-Carlo simulations support them. He also proved (similarly to section 6) that permutation versions of studentized statistics share the same asymptotic properties as their unconditional counterparts.

Bibliography

[1] Andersen, P.K.; Borgan, Ø. (1985). Counting Process Models for Life History Data: A Review. Scand. J. Statist. 12, 97-158.
[2] Andersen, P.K.; Borgan, Ø.; Gill, R.D.; Keiding, N. (1982). Linear nonparametric tests for comparison of counting processes with application to censored survival data (with discussion). Int. Statist. Rev. 50, 219-258.
[3] Andersen, P.K.; Borgan, Ø.; Gill, R.D.; Keiding, N. (1993). Statistical models based on counting processes. Springer, New York.
[4] Balakrishnan, N.; Rao, C.R. (2004). Advances in Survival Analysis. Handbook of Statistics 23.
Elsevier, Amsterdam, 251-262.
[5] Behnen, K.; Neuhaus, G. (1989). Rank tests with estimated scores and their application. Teubner-Skripten zur Mathematischen Stochastik. Stuttgart.
[6] Bickel, P.; Klaassen, C.; Ritov, Y.; Wellner, J. (1993). Efficient and adaptive estimation for semiparametric models. Johns Hopkins Series in Math. Sciences, The Johns Hopkins Univ. Press, Baltimore.
[7] Efron, B.; Johnstone, I. (1990). Fisher's information in terms of the hazard rate. Ann. Statist. 18, 38-62.
[8] Fleming, T.R.; Harrington, D.P. (1991). Counting processes and survival analysis. Wiley, New York.
[9] Gill, R.D. (1980). Censoring and stochastic integrals. Math. Centre Tracts 124, Mathematisch Centrum, Amsterdam.
[10] Hall, P.; Heyde, C.C. (1980). Martingale limit theory and its application. Probability and Mathematical Statistics. Academic Press, New York.
[11] Hájek, J.; Šidák, Z. (1967). Theory of rank tests. New York-London: Academic Press, Prague.
[12] Hájek, J.; Šidák, Z.; Sen, P.K. (1999). Theory of rank tests. 2nd ed., Academic Press, Orlando.
[13] Harrington, D.P.; Fleming, T.R. (1982). A class of rank test procedures for censored survival data. Biometrika 69, 553-566.
[14] Heller, G.; Venkatraman, E.S. (1996). Resampling procedures to compare two survival distributions in the presence of right-censored data. Biometrics 52, 1204-1213.
[15] Janssen, A. (1989). Local asymptotic normality for randomly censored data with applications to rank tests. Statist. Neerlandica 43, 109-125.
[16] Janssen, A. (1991). Conditional rank tests for randomly censored data. Ann. Stat. 19, 1434-1456.
[17] Janssen, A. (1994). On local odds and hazard rate models in survival analysis. Statistics and Probability Letters 20, 355-365.
[18] Janssen, A. (1995). Principal component decomposition of non-parametric tests. Probability Theory and Related Fields 101, 193-209.
[19] Janssen, A. (1998). Zur Asymptotik nichtparametrischer Tests. Lecture Notes. Skripten zur Mathem.
Statistik 29, Münster.
[20] Janssen, A. (2000). Global power functions of goodness of fit tests. Ann. Stat. 28, 239-253.
[21] Janssen, A.; Neuhaus, G. (1997). Two-sample rank tests for censored data with non-predictable weights. J. Stat. Plann. Inference 60, 45-59.
[22] Janssen, A.; Mayer, C.-D. (2001). Conditional Studentized survival tests for randomly censored models. Scand. J. Stat. 28, 283-293.
[23] Janssen, A.; Milbrodt, H. (1993). Rényi type goodness of fit tests with adjusted principal direction of alternatives. Scand. J. Stat. 20, 177-194.
[24] Khmaladze, E.V. (1981). Martingale approach in the theory of goodness-of-fit tests. Theory Probab. Appl. 26, 240-257.
[25] Klein, J.P.; Moeschberger, M.L. (2003). Survival analysis. Techniques for censored and truncated data. 2nd ed. Statistics for Biology and Health. New York.
[26] Lan, K.-K.G.; Wittes, J. (1990). Linear Rank Tests for Survival Data: Equivalence of Two Formulations. The American Statistician (Teacher's Corner), Vol. 40, 23-26.
[27] LeCam, L.; Yang, L.G. (1988). On the preservation of local asymptotic normality under information loss. Ann. Stat. 16, 483-520.
[28] LeCam, L.; Yang, L.G. (2000). Asymptotics in statistics. Some basic concepts. Second edition. Springer series in statistics, New York.
[29] Mayer, C.-D. (1996). Projektionstests für das Zweistichprobenproblem mit zensierten Daten. Dissertation, Heinrich-Heine Universität Düsseldorf.
[30] Mayer, C.-D. (1998). Projection-type rank tests for randomly right censored data. Technical Report, University of Düsseldorf.
[31] McLeish, D.L. (1974). Dependent central limit theorems and invariance principles. Ann. Prob. 2, 620-628.
[32] Moser, M. (1994). Completeness of time-ordered indicators in censored data models. Stat. Probab. Lett. 21, 163-166.
[33] Neuhaus, G. (1988). Asymptotically optimal rank tests for the two-sample problem with randomly censored data. Comm. Statist. Theory Methods 17, 2037-2058.
[34] Neuhaus, G. (1991).
Some linear and nonlinear rank tests for competing risks models. Commun. Stat., Theory Methods 20, 667-701.
[35] Neuhaus, G. (1993). Conditional rank tests for the two-sample problem under random censorship. Ann. Stat. 21, 1760-1779.
[36] Neuhaus, G. (2000). A method of constructing rank tests in survival analysis. Journal of Statistical Planning and Inference 91, 481-497.
[37] Ritov, Y.; Wellner, J.A. (1988). Censoring, martingales, and the Cox model. Statistical inference from stochastic processes, Contemp. Math. 80, 191-219.
[38] Schumacher, M.; Schulgen, G. (2002). Methodik klinischer Studien. Methodische Grundlagen der Planung, Durchführung und Auswertung. Springer Verlag. Berlin.
[39] Shorack, G.R.; Wellner, J.A. (1986). Empirical Processes with Applications to Statistics. Wiley, New York.
[40] Strasser, H. (1985). Mathematical theory of statistics. Statistical experiments and asymptotic decision theory. De Gruyter, Berlin.
[41] Völker, D. (2003). Finit optimale nichtparametrische Tests für Lebensdauerzeiten. Dissertation, Westfälische Wilhelms-Universität Münster.
[42] Witting, H. (1985). Mathematische Statistik I. Parametrische Verfahren bei festem Stichprobenumfang. B.G. Teubner, Stuttgart.
[43] Witting, H.; Müller-Funk, U. (1995). Mathematische Statistik II. Asymptotische Statistik: Parametrische Modelle und nichtparametrische Funktionale. B.G. Teubner, Stuttgart.