Addendum to Integrated Conditional Moment Tests for Parametric Conditional Distributions

Herman J. Bierens

June 7, 2015

Abstract

In this addendum to Bierens, H. J., and L. Wang (2012): "Integrated Conditional Moment Tests for Parametric Conditional Distributions", Econometric Theory 28, 328-362 [BW hereafter], I will provide the proof of Mercer's theorem for the complex case (Lemma 3 in BW), including the Hilbert-Schmidt theorem regarding the existence of eigenvalues and eigenfunctions. Moreover, Lemma 4 in BW appears to be incorrect. Therefore, a revised version of Lemma 4 will be provided as well. Furthermore, on the basis of this revised Lemma 4 I will derive upper bounds of the asymptotic critical values of the SICM test.

1. Introduction

The main purpose of this addendum to Bierens and Wang (2012) [BW hereafter] is to provide a complete proof of Mercer's theorem (Lemma 3 in BW), including the underlying Hilbert-Schmidt theorem regarding the existence of eigenvalues and eigenfunctions of complex-valued continuous symmetric positive definite kernels.

For the proof of Lemma 3, BW referred to an unpublished paper, Hadinejad-Mahram et al. (2002), which has disappeared from the internet, and a published paper by Krein (1998). The latter author derived the complex Hilbert-Schmidt and Mercer theorems as by-products of linear operator theory, which I am not familiar with, and likely the same applies to most of my fellow econometricians. Therefore, in this addendum I will provide my own proofs of the complex Hilbert-Schmidt and Mercer theorems.

The same problem occurred with the (real) version of Mercer's theorem in Bierens and Ploberger (1997), for which the proof was a reference to an exercise in a textbook. Therefore, I decided to derive this proof myself. See Bierens (2014b). The proofs in the current addendum are complex adaptations of the ones in Bierens (2014b).

Also, it appears that Lemma 4 is incorrect.
Therefore, in this addendum I will provide a revised version of Lemma 4. On the basis of the results in BW and in this addendum I have been able to update my first consistent model specification testing paper, Bierens (1982), by deriving the asymptotic null distribution of the test, which is similar to BW, and upper bounds of the critical values on the basis of the revised Lemma 4. See Bierens (2015) and section 7 below.

In this addendum I will use the same notation as in BW, except that in integrals with respect to a probability measure $\mu$ on a set $B$ I will use the notation $d\mu(\beta)$ instead of $\mu(d\beta)$, because $\mu(\beta)$ can be interpreted as a distribution function.

The proofs in this addendum employ Hilbert space theory at the level of Bierens (2014a), linear algebra at the level of Bierens (2004, Appendix I), complex calculus at the level of Bierens (2004, Appendix III), and measure and probability theory at the level of Bierens (2004, Ch. 1-3).

2. Complex covariance functions

In Lemma 3 in BW the covariance function is of the form
\[
\begin{aligned}
\Gamma(\beta_1,\beta_2) &= E\left[Z(\beta_1)\overline{Z(\beta_2)}\right] \\
&= E\left[\left(\operatorname{Re}[Z(\beta_1)] + i\operatorname{Im}[Z(\beta_1)]\right)\left(\operatorname{Re}[Z(\beta_2)] - i\operatorname{Im}[Z(\beta_2)]\right)\right] \\
&= E\left[\operatorname{Re}[Z(\beta_1)]\operatorname{Re}[Z(\beta_2)]\right] + E\left[\operatorname{Im}[Z(\beta_1)]\operatorname{Im}[Z(\beta_2)]\right] \\
&\quad + i\,E\left[\operatorname{Im}[Z(\beta_1)]\operatorname{Re}[Z(\beta_2)]\right] - i\,E\left[\operatorname{Re}[Z(\beta_1)]\operatorname{Im}[Z(\beta_2)]\right],
\end{aligned}
\tag{2.1}
\]
where $Z(\beta)$ is a complex-valued zero-mean continuous Gaussian process on a compact subset $B$ of a Euclidean space. This covariance function is symmetric positive semidefinite in the following sense.

First, symmetry in the complex case means that
\[
\Gamma(\beta_1,\beta_2) = \overline{\Gamma(\beta_2,\beta_1)}, \tag{2.2}
\]
which follows straightforwardly from (2.1). In particular, writing
\[
\Gamma(\beta_1,\beta_2) = \operatorname{Re}[\Gamma(\beta_1,\beta_2)] + i\operatorname{Im}[\Gamma(\beta_1,\beta_2)], \tag{2.3}
\]
the symmetry condition (2.2) implies that
\[
\operatorname{Re}[\Gamma(\beta_1,\beta_2)] = \operatorname{Re}[\Gamma(\beta_2,\beta_1)], \qquad \operatorname{Im}[\Gamma(\beta_1,\beta_2)] = -\operatorname{Im}[\Gamma(\beta_2,\beta_1)], \tag{2.4}
\]
and thus
\[
\operatorname{Im}[\Gamma(\beta,\beta)] = 0.
\]
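(2.5)

Second, positive semidefiniteness with respect to a probability measure $\mu$ on $B$ means that
\[
\iint \overline{\varphi(\beta_1)}\,\Gamma(\beta_1,\beta_2)\,\varphi(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2) \ge 0 \tag{2.6}
\]
for all $\varphi \in L^2_{\mathbb{C}}(\mu)$, where:

Definition 2.1. $L^2_{\mathbb{C}}(\mu)$ denotes the Hilbert space of all Borel measurable complex-valued functions $\varphi$ on $B$ satisfying $\int |\varphi(\beta)|^2\,d\mu(\beta) < \infty$, endowed with the inner product $\langle \varphi_1, \varphi_2 \rangle = \int \varphi_1(\beta)\overline{\varphi_2(\beta)}\,d\mu(\beta)$ and associated norm
\[
\|\varphi\| = \sqrt{\langle \varphi, \varphi \rangle} = \sqrt{\int \varphi(\beta)\overline{\varphi(\beta)}\,d\mu(\beta)} = \sqrt{\int |\varphi(\beta)|^2\,d\mu(\beta)}
\]
and metric $\|\varphi_1 - \varphi_2\|$.

In particular, in the case (2.1) it can be shown, after some tedious but straightforward complex calculations, that
\[
\begin{aligned}
&\iint \overline{\varphi(\beta_1)}\,\Gamma(\beta_1,\beta_2)\,\varphi(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2)
= E\left[\left|\int \overline{\varphi(\beta)}\,Z(\beta)\,d\mu(\beta)\right|^2\right] \\
&\quad = E\left[\left(\int \left(\operatorname{Re}[\varphi(\beta)]\operatorname{Re}[Z(\beta)] + \operatorname{Im}[\varphi(\beta)]\operatorname{Im}[Z(\beta)]\right)d\mu(\beta)\right)^2\right] \\
&\qquad + E\left[\left(\int \left(\operatorname{Re}[\varphi(\beta)]\operatorname{Im}[Z(\beta)] - \operatorname{Im}[\varphi(\beta)]\operatorname{Re}[Z(\beta)]\right)d\mu(\beta)\right)^2\right] \ge 0.
\end{aligned}
\tag{2.7}
\]

Moreover, the covariance function $\Gamma(\beta_1,\beta_2)$ in (2.1) is continuous because $\operatorname{Re}[Z(\beta)]$ and $\operatorname{Im}[Z(\beta)]$ are a.s. continuous. In the mathematical literature such a function $\Gamma(\beta_1,\beta_2)$ is called a kernel, and I will do so too.

The interpretation of $\Gamma(\beta_1,\beta_2)$ as a covariance function of a continuous complex-valued Gaussian process is irrelevant for the Hilbert-Schmidt and Mercer theorems. All we need to require is that:

Assumption 2.1. The kernel $\Gamma(\beta_1,\beta_2)$ on $B \times B$ is complex-valued, continuous, and symmetric positive semidefinite with respect to a probability measure $\mu$ on $B$, where $B$ is a compact subset of a Euclidean space, and

Assumption 2.2. $\iint |\Gamma(\beta_1,\beta_2)|^2\,d\mu(\beta_1)\,d\mu(\beta_2) > 0$.

The latter excludes the case that $\Gamma(\beta_1,\beta_2)$ is identically zero.

3. The Hilbert-Schmidt theorem for complex kernels

3.1. The eigenvalue-eigenfunction problem

The general eigenvalue problem is to find a scalar $\lambda$ and a function $\psi \in L^2_{\mathbb{C}}(\mu)$ normalized to unit norm, $\|\psi\| = 1$, such that (see footnote 1)
\[
\int \Gamma(\beta_1,\beta_2)\,\psi(\beta_2)\,d\mu(\beta_2) = \lambda\,\psi(\beta_1) \quad \text{for all } \beta_1 \in B. \tag{3.1}
\]
Taking complex conjugates in (3.1) and using the symmetry (2.2), the latter is equivalent to
\[
\int \overline{\psi(\beta_2)}\,\Gamma(\beta_2,\beta_1)\,d\mu(\beta_2) = \overline{\lambda}\,\overline{\psi(\beta_1)}.
\]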
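(3.2)

If such a pair $(\lambda, \psi)$ exists then by (2.6), (3.1), (3.2) and the normalization $\|\psi\| = 1$,
\[
0 \le \iint \overline{\psi(\beta_1)}\,\Gamma(\beta_1,\beta_2)\,\psi(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2)
= \lambda \int \overline{\psi(\beta_1)}\,\psi(\beta_1)\,d\mu(\beta_1) = \lambda \int |\psi(\beta)|^2\,d\mu(\beta) = \lambda.
\]
Thus, eigenvalues of positive semidefinite kernels are real valued and nonnegative. Moreover, if a solution of (3.1) exists with $\lambda > 0$ then the corresponding eigenfunction $\psi$ is continuous on $B$ because $\Gamma(\beta_1,\beta_2)$ is continuous on $B \times B$.

Footnote 1. In Lemma 3 in BW, (3.1) was incorrectly stated as $\lambda\psi(\beta_1) = \int \Gamma(\beta_1,\beta_2)\psi(\beta_2)\,d\mu(\beta_2)$.

Now denote
\[
G(\lambda,\psi) = \int \left| \int \Gamma(\beta_1,\beta_2)\,\psi(\beta_2)\,d\mu(\beta_2) - \lambda\,\psi(\beta_1) \right|^2 d\mu(\beta_1)
\]
for $\lambda \in (0,\infty)$, $\psi \in L^2_{\mathbb{C}}(\mu)$ with $\|\psi\| = 1$. Then eigenvalue problem (3.1) for $\lambda > 0$ is equivalent (see footnote 2) to the problem:

Find a $\lambda > 0$ and a $\psi \in L^2_{\mathbb{C}}(\mu)$ with $\|\psi\| = 1$ such that $G(\lambda,\psi) = 0$. (3.3)

Footnote 2. See Remark 4.1 in Bierens (2014b).

Observe that
\[
\begin{aligned}
G(\lambda,\psi) &= \int \left( \int \overline{\psi(\beta_2)}\,\Gamma(\beta_2,\beta)\,d\mu(\beta_2) - \lambda\,\overline{\psi(\beta)} \right)
\left( \int \Gamma(\beta,\beta_2)\,\psi(\beta_2)\,d\mu(\beta_2) - \lambda\,\psi(\beta) \right) d\mu(\beta) \\
&= \int \left( \int \overline{\psi(\beta_1)}\,\Gamma(\beta_1,\beta)\,d\mu(\beta_1) - \lambda\,\overline{\psi(\beta)} \right)
\left( \int \Gamma(\beta,\beta_2)\,\psi(\beta_2)\,d\mu(\beta_2) - \lambda\,\psi(\beta) \right) d\mu(\beta) \\
&= \iint \overline{\psi(\beta_1)}\,\Gamma_2(\beta_1,\beta_2)\,\psi(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2)
- 2\lambda \iint \overline{\psi(\beta_1)}\,\Gamma(\beta_1,\beta_2)\,\psi(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2) + \lambda^2,
\end{aligned}
\]
where
\[
\Gamma_2(\beta_1,\beta_2) = \int \Gamma(\beta_1,\beta)\,\Gamma(\beta,\beta_2)\,d\mu(\beta).
\]
Minimizing $G(\lambda,\psi)$ with respect to $\lambda$ yields
\[
\lambda = \iint \overline{\psi(\beta_1)}\,\Gamma(\beta_1,\beta_2)\,\psi(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2), \tag{3.4}
\]
and substituting this solution in $G(\lambda,\psi)$ yields
\[
G(\psi) = \min_{\lambda > 0} G(\lambda,\psi)
= \iint \overline{\psi(\beta_1)}\,\Gamma_2(\beta_1,\beta_2)\,\psi(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2)
- \left( \iint \overline{\psi(\beta_1)}\,\Gamma(\beta_1,\beta_2)\,\psi(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2) \right)^2. \tag{3.5}
\]
Thus, the eigenfunction problem (3.3) is equivalent to the following problem:

Find a $\psi \in L^2_{\mathbb{C}}(\mu)$ with $\|\psi\| = 1$ such that $G(\psi) = 0$. (3.6)

Then the corresponding eigenvalue $\lambda$ is given by (3.4).

3.2. The maximum eigenvalue problem

The solution (3.4) suggests that, possibly, the eigenfunction $\psi_1$ corresponding to the largest eigenvalue $\lambda_1$, and $\lambda_1$ itself, can be determined by
\[
\psi_1 = \arg\max_{\psi \in L^2_{\mathbb{C}}(\mu),\ \|\psi\| = 1} \iint \overline{\psi(\beta_1)}\,\Gamma(\beta_1,\beta_2)\,\psi(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2), \tag{3.7}
\]
\[
\lambda_1 = \iint \overline{\psi_1(\beta_1)}\,\Gamma(\beta_1,\beta_2)\,\psi_1(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2). \tag{3.8}
\]
If so, we must have that $G(\psi_1) = 0$, or equivalently,
\[
\iint \overline{\psi_1(\beta_1)}\,\Gamma_2(\beta_1,\beta_2)\,\psi_1(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2) = \lambda_1^2. \tag{3.9}
\]
Indeed, this conjecture is correct.

Theorem 3.1.
(a) Under Assumption 2.1 the pair $\lambda_1, \psi_1$ determined by (3.7) and (3.8) is the maximum eigenvalue with corresponding eigenfunction of the kernel $\Gamma(\beta_1,\beta_2)$ in Assumption 2.1.
(b) Under the additional Assumption 2.2, $\lambda_1 > 0$.
(c) Otherwise, $\lambda_1 = 0$ implies that $\Gamma(\beta_1,\beta_2) \equiv 0$ on $B \times B$.

Proof. This theorem will be proved by recasting the maximum eigenvalue problem in real terms, as in the addendum to Bierens and Ploberger (1997) [see Bierens (2014b)] for real-valued kernels, using the properties of the real Hilbert space $L^2(\mu)$.

Definition 3.1. $L^2(\mu)$ is the Hilbert space of Borel measurable real functions $f$ on $B$ satisfying $\int f(\beta)^2\,d\mu(\beta) < \infty$, endowed with inner product
\[
\langle f, g \rangle = \int f(\beta)\,g(\beta)\,d\mu(\beta)
\]
and associated norm $\|f\| = \sqrt{\langle f, f \rangle}$ and metric $\|f - g\|$.

Lemma 3.1. The Hilbert space $L^2(\mu)$ has an orthonormal base, say $\{\varphi_j(\beta)\}_{j=1}^{\infty}$, so that every $f \in L^2(\mu)$ has the series representation
\[
f(\beta) = \sum_{j=1}^{\infty} c_j\,\varphi_j(\beta), \tag{3.10}
\]
where $c_j = \langle f, \varphi_j \rangle$, satisfying $\sum_{j=1}^{\infty} c_j^2 = \int f(\beta)^2\,d\mu(\beta) < \infty$.

Note that the series representation (3.10) holds with probability 1, in the sense that
\[
\mu\left( \left\{ \beta \in B : \lim_{n \to \infty} \sum_{j=1}^{n} c_j\,\varphi_j(\beta) = f(\beta) \right\} \right) = 1,
\]
rather than exactly for all $\beta \in B$. Cf. Bierens (2014a,b).

Since the solution $\psi_1$ of (3.7) is an element of $L^2_{\mathbb{C}}(\mu)$, and therefore $\operatorname{Re}[\psi_1]$ and $\operatorname{Im}[\psi_1]$ are elements of $L^2(\mu)$, $\psi_1$ has the series representation
\[
\psi_1(\beta) = \sum_{j=1}^{\infty} (c_j + i\,d_j)\,\varphi_j(\beta), \tag{3.11}
\]
where
\[
c_j = \langle \operatorname{Re}[\psi_1], \varphi_j \rangle, \qquad d_j = \langle \operatorname{Im}[\psi_1], \varphi_j \rangle, \qquad
\sum_{j=1}^{\infty} \left( c_j^2 + d_j^2 \right) = \int \psi_1(\beta)\overline{\psi_1(\beta)}\,d\mu(\beta) = 1.
\]
In particular, denoting for $n \in \mathbb{N}$,
\[
\psi_{1,n}(\beta) = \sum_{j=1}^{n} (c_{n,j} + i\,d_{n,j})\,\varphi_j(\beta), \quad \text{where} \quad
c_{n,j} = \frac{c_j}{\sqrt{\sum_{i=1}^{n} (c_i^2 + d_i^2)}}, \qquad
d_{n,j} = \frac{d_j}{\sqrt{\sum_{i=1}^{n} (c_i^2 + d_i^2)}}, \tag{3.12}
\]
it follows that
\[
\lim_{n \to \infty} \int |\psi_1(\beta) - \psi_{1,n}(\beta)|^2\,d\mu(\beta) = 0, \tag{3.13}
\]
as is not hard to verify. See Bierens (2014b) for a similar result.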
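The limit (3.13) can be checked by hand: by the orthonormality of the $\varphi_j$, with $s_n = \sqrt{\sum_{i=1}^{n}(c_i^2 + d_i^2)}$, the squared distance equals $s_n^2(1 - 1/s_n)^2 + (1 - s_n^2) = 2(1 - s_n)$, which tends to zero because $s_n \to 1$. The Python sketch below verifies this identity numerically. The geometric coefficient sequence and the helper name `truncation_error` are illustrative assumptions, not taken from BW or Bierens (2014b), and a finite list stands in for the infinite sequence.

```python
import math

def truncation_error(coeffs, n):
    """Squared L2 distance between psi_1 and its renormalized n-term truncation.

    coeffs: complex coefficients a_j = c_j + i*d_j with sum |a_j|^2 = 1
    (a finite list standing in for the infinite sequence in (3.11)).
    Returns the distance and s_n = sqrt(sum_{i<=n} |a_i|^2).
    """
    s_n = math.sqrt(sum(abs(a) ** 2 for a in coeffs[:n]))
    head = sum(abs(a - a / s_n) ** 2 for a in coeffs[:n])  # terms j <= n
    tail = sum(abs(a) ** 2 for a in coeffs[n:])            # terms j > n
    return head + tail, s_n

# Hypothetical coefficients with geometric decay, normalized to unit norm.
raw = [(0.5 ** j) * complex(1.0, (-1.0) ** j) for j in range(1, 41)]
norm = math.sqrt(sum(abs(a) ** 2 for a in raw))
coeffs = [a / norm for a in raw]

errors = []
for n in (1, 2, 5, 10, 20):
    err, s_n = truncation_error(coeffs, n)
    errors.append(err)
    # Closed form implied by orthonormality: 2 * (1 - s_n).
    assert abs(err - 2.0 * (1.0 - s_n)) < 1e-12
```

The list `errors` decreases towards zero as n grows, which is the content of (3.13) for this particular sequence.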
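This implies the following result.

Lemma 3.2. Let $\psi_1$ in (3.11) be a solution of (3.7) and let $\psi_{1,n}$ be defined by (3.12). Then
\[
\lim_{n \to \infty} \iint \overline{\psi_{1,n}(\beta_1)}\,\Gamma(\beta_1,\beta_2)\,\psi_{1,n}(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2)
= \iint \overline{\psi_1(\beta_1)}\,\Gamma(\beta_1,\beta_2)\,\psi_1(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2).
\]

Proof. To prove Lemma 3.2, observe first that
\[
\begin{aligned}
&\left| \iint \overline{\psi_1(\beta_1)}\,\Gamma(\beta_1,\beta_2)\,\psi_1(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2)
- \iint \overline{\psi_{1,n}(\beta_1)}\,\Gamma(\beta_1,\beta_2)\,\psi_{1,n}(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2) \right| \\
&\le \left| \iint \left( \overline{\psi_1(\beta_1)} - \overline{\psi_{1,n}(\beta_1)} \right) \Gamma(\beta_1,\beta_2)\,\psi_1(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2) \right. \\
&\qquad \left. + \iint \overline{\psi_{1,n}(\beta_1)}\,\Gamma(\beta_1,\beta_2)\,\psi_1(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2)
- \iint \overline{\psi_{1,n}(\beta_1)}\,\Gamma(\beta_1,\beta_2)\,\psi_{1,n}(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2) \right| \\
&\le \left| \int \left( \overline{\psi_1(\beta_1)} - \overline{\psi_{1,n}(\beta_1)} \right) \left( \int \Gamma(\beta_1,\beta_2)\,\psi_1(\beta_2)\,d\mu(\beta_2) \right) d\mu(\beta_1) \right| \\
&\qquad + \left| \int \left( \int \overline{\psi_{1,n}(\beta_1)}\,\Gamma(\beta_1,\beta_2)\,d\mu(\beta_1) \right) \left( \psi_1(\beta_2) - \psi_{1,n}(\beta_2) \right) d\mu(\beta_2) \right| \\
&\le \sqrt{\int \left| \psi_1(\beta_1) - \psi_{1,n}(\beta_1) \right|^2 d\mu(\beta_1)}
\times \sqrt{\int \left| \int \Gamma(\beta_1,\beta_2)\,\psi_1(\beta_2)\,d\mu(\beta_2) \right|^2 d\mu(\beta_1)} \\
&\qquad + \sqrt{\int \left| \int \overline{\psi_{1,n}(\beta_1)}\,\Gamma(\beta_1,\beta_2)\,d\mu(\beta_1) \right|^2 d\mu(\beta_2)}
\times \sqrt{\int \left| \psi_1(\beta_2) - \psi_{1,n}(\beta_2) \right|^2 d\mu(\beta_2)} \\
&\le 2 \sup_{(\beta_1,\beta_2) \in B \times B} |\Gamma(\beta_1,\beta_2)| \times \sqrt{\int \left| \psi_1(\beta) - \psi_{1,n}(\beta) \right|^2 d\mu(\beta)},
\end{aligned}
\]
where the third inequality follows from the Cauchy-Schwarz inequality for inner products, $|\langle x, y \rangle| \le \|x\| \cdot \|y\|$, which also holds for the complex case $\langle x, y \rangle = \overline{\langle y, x \rangle}$. To prove the last inequality, observe that by the same Cauchy-Schwarz inequality,
\[
\left| \int \Gamma(\beta_1,\beta_2)\,\psi_1(\beta_2)\,d\mu(\beta_2) \right|
\le \sqrt{\int |\Gamma(\beta_1,\beta_2)|^2\,d\mu(\beta_2)}\,\sqrt{\int |\psi_1(\beta_2)|^2\,d\mu(\beta_2)}
= \sqrt{\int |\Gamma(\beta_1,\beta_2)|^2\,d\mu(\beta_2)}
\le \sup_{(\beta_1,\beta_2) \in B \times B} |\Gamma(\beta_1,\beta_2)| < \infty,
\]
where the last inequality follows from the (uniform) continuity of $\Gamma(\beta_1,\beta_2)$ on the compact set $B \times B$. Hence,
\[
\sqrt{\int \left| \int \Gamma(\beta_1,\beta_2)\,\psi_1(\beta_2)\,d\mu(\beta_2) \right|^2 d\mu(\beta_1)} \le \sup_{(\beta_1,\beta_2) \in B \times B} |\Gamma(\beta_1,\beta_2)| < \infty
\]
and similarly, because $\psi_{1,n}$ has unit norm as well,
\[
\sqrt{\int \left| \int \overline{\psi_{1,n}(\beta_1)}\,\Gamma(\beta_1,\beta_2)\,d\mu(\beta_1) \right|^2 d\mu(\beta_2)} \le \sup_{(\beta_1,\beta_2) \in B \times B} |\Gamma(\beta_1,\beta_2)| < \infty.
\]
Lemma 3.2 follows now from (3.13).

The following lemma is also a well-known result, related to Lemma 3.1.

Lemma 3.3.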
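Given the orthonormal base $\{\varphi_j(\beta)\}_{j=1}^{\infty}$ of $L^2(\mu)$ in Lemma 3.1, every Borel measurable real function $g(\beta_1,\beta_2)$ on $B \times B$ satisfying
\[
\iint g(\beta_1,\beta_2)^2\,d\mu(\beta_1)\,d\mu(\beta_2) < \infty
\]
has the series representation
\[
g(\beta_1,\beta_2) = \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} c_{i,j}\,\varphi_i(\beta_1)\,\varphi_j(\beta_2), \tag{3.14}
\]
where
\[
c_{i,j} = \iint \varphi_i(\beta_1)\,g(\beta_1,\beta_2)\,\varphi_j(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2), \quad \text{with} \quad
\sum_{i=1}^{\infty} \sum_{j=1}^{\infty} c_{i,j}^2 = \iint g(\beta_1,\beta_2)^2\,d\mu(\beta_1)\,d\mu(\beta_2) < \infty.
\]

Similar to Lemma 3.1, the series representation (3.14) holds with probability 1, in the sense that
\[
\mu \times \mu \left( \left\{ (\beta_1,\beta_2) \in B \times B : \lim_{\min(n_1,n_2) \to \infty} \sum_{i=1}^{n_1} \sum_{j=1}^{n_2} c_{i,j}\,\varphi_i(\beta_1)\,\varphi_j(\beta_2) = g(\beta_1,\beta_2) \right\} \right) = 1,
\]
rather than exactly for all $(\beta_1,\beta_2) \in B \times B$, where $\mu \times \mu$ is the product measure defined as follows.

Definition 3.2. Let $\tilde{\beta}_1$ and $\tilde{\beta}_2$ be independent random drawings from the distribution $\mu$. Then the product measure $\mu \times \mu$ is the probability measure on $B \times B$ induced by $(\tilde{\beta}_1, \tilde{\beta}_2)$.

Lemma 3.3 implies that
\[
\operatorname{Re}[\Gamma(\beta_1,\beta_2)] = \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} \alpha_{i,j}\,\varphi_i(\beta_1)\,\varphi_j(\beta_2), \tag{3.15}
\]
where
\[
\alpha_{i,j} = \iint \varphi_i(\beta_1)\,\operatorname{Re}[\Gamma(\beta_1,\beta_2)]\,\varphi_j(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2), \qquad
\sum_{i=1}^{\infty} \sum_{j=1}^{\infty} \alpha_{i,j}^2 = \iint \left( \operatorname{Re}[\Gamma(\beta_1,\beta_2)] \right)^2 d\mu(\beta_1)\,d\mu(\beta_2) < \infty, \tag{3.16}
\]
and
\[
\operatorname{Im}[\Gamma(\beta_1,\beta_2)] = \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} \gamma_{i,j}\,\varphi_i(\beta_1)\,\varphi_j(\beta_2), \tag{3.17}
\]
where
\[
\gamma_{i,j} = \iint \varphi_i(\beta_1)\,\operatorname{Im}[\Gamma(\beta_1,\beta_2)]\,\varphi_j(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2), \qquad
\sum_{i=1}^{\infty} \sum_{j=1}^{\infty} \gamma_{i,j}^2 = \iint \left( \operatorname{Im}[\Gamma(\beta_1,\beta_2)] \right)^2 d\mu(\beta_1)\,d\mu(\beta_2) < \infty. \tag{3.18}
\]
Note that by (2.4) and (2.5),
\[
\alpha_{i,j} = \alpha_{j,i}, \qquad \gamma_{i,j} = -\gamma_{j,i}, \qquad \gamma_{i,i} = 0. \tag{3.19}
\]
Hence,
\[
\Gamma(\beta_1,\beta_2) = \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} (\alpha_{i,j} + i\,\gamma_{i,j})\,\varphi_i(\beta_1)\,\varphi_j(\beta_2). \tag{3.20}
\]
Moreover,
\[
\begin{aligned}
\Gamma_2(\beta_1,\beta_2) &= \int \Gamma(\beta_1,\beta)\,\Gamma(\beta,\beta_2)\,d\mu(\beta) \\
&= \int \left( \sum_{i_1=1}^{\infty} \sum_{j_1=1}^{\infty} (\alpha_{i_1,j_1} + i\,\gamma_{i_1,j_1})\,\varphi_{i_1}(\beta_1)\,\varphi_{j_1}(\beta) \right)
\times \left( \sum_{j_2=1}^{\infty} \sum_{i_2=1}^{\infty} (\alpha_{j_2,i_2} + i\,\gamma_{j_2,i_2})\,\varphi_{j_2}(\beta)\,\varphi_{i_2}(\beta_2) \right) d\mu(\beta) \\
&= \sum_{i_1=1}^{\infty} \sum_{j_1=1}^{\infty} \sum_{j_2=1}^{\infty} \sum_{i_2=1}^{\infty} (\alpha_{i_1,j_1} + i\,\gamma_{i_1,j_1})(\alpha_{j_2,i_2} + i\,\gamma_{j_2,i_2})\,\varphi_{i_1}(\beta_1)\,\varphi_{i_2}(\beta_2)
\times \int \varphi_{j_1}(\beta)\,\varphi_{j_2}(\beta)\,d\mu(\beta)
\end{aligned}
\]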
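\[
\begin{aligned}
&= \sum_{i_1=1}^{\infty} \sum_{i_2=1}^{\infty} \left( \sum_{j=1}^{\infty} (\alpha_{i_1,j} + i\,\gamma_{i_1,j})(\alpha_{j,i_2} + i\,\gamma_{j,i_2}) \right) \varphi_{i_1}(\beta_1)\,\varphi_{i_2}(\beta_2) \\
&= \sum_{i_1=1}^{\infty} \sum_{i_2=1}^{\infty} \sum_{j=1}^{\infty} (\alpha_{i_1,j}\,\alpha_{j,i_2} - \gamma_{i_1,j}\,\gamma_{j,i_2})\,\varphi_{i_1}(\beta_1)\,\varphi_{i_2}(\beta_2) \\
&\quad + i \sum_{i_1=1}^{\infty} \sum_{i_2=1}^{\infty} \sum_{j=1}^{\infty} (\alpha_{i_1,j}\,\gamma_{j,i_2} + \gamma_{i_1,j}\,\alpha_{j,i_2})\,\varphi_{i_1}(\beta_1)\,\varphi_{i_2}(\beta_2).
\end{aligned}
\tag{3.21}
\]

Combining (3.11) and (3.20) it follows that
\[
\begin{aligned}
&\iint \overline{\psi_1(\beta_1)}\,\Gamma(\beta_1,\beta_2)\,\psi_1(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2) \\
&= \iint \left( \sum_{m=1}^{\infty} (c_m - i\,d_m)\,\varphi_m(\beta_1) \right) \left( \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} (\alpha_{i,j} + i\,\gamma_{i,j})\,\varphi_i(\beta_1)\,\varphi_j(\beta_2) \right)
\left( \sum_{k=1}^{\infty} (c_k + i\,d_k)\,\varphi_k(\beta_2) \right) d\mu(\beta_1)\,d\mu(\beta_2) \\
&= \sum_{m=1}^{\infty} \sum_{k=1}^{\infty} (c_m - i\,d_m)(c_k + i\,d_k) \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} (\alpha_{i,j} + i\,\gamma_{i,j})
\int \varphi_i(\beta_1)\,\varphi_m(\beta_1)\,d\mu(\beta_1) \int \varphi_j(\beta_2)\,\varphi_k(\beta_2)\,d\mu(\beta_2) \\
&= \sum_{m=1}^{\infty} \sum_{k=1}^{\infty} (c_m - i\,d_m)(c_k + i\,d_k)(\alpha_{m,k} + i\,\gamma_{m,k}) \\
&= \sum_{m=1}^{\infty} \sum_{k=1}^{\infty} \left( (c_m c_k + d_m d_k) + i\,(c_m d_k - d_m c_k) \right)(\alpha_{m,k} + i\,\gamma_{m,k}) \\
&= \sum_{m=1}^{\infty} \sum_{k=1}^{\infty} c_m \alpha_{m,k} c_k + \sum_{m=1}^{\infty} \sum_{k=1}^{\infty} d_m \alpha_{m,k} d_k
- \sum_{m=1}^{\infty} \sum_{k=1}^{\infty} c_m \gamma_{m,k} d_k + \sum_{m=1}^{\infty} \sum_{k=1}^{\infty} d_m \gamma_{m,k} c_k \\
&\quad + i \sum_{m=1}^{\infty} \sum_{k=1}^{\infty} (c_m \alpha_{m,k} d_k - d_m \alpha_{m,k} c_k)
+ i \sum_{m=1}^{\infty} \sum_{k=1}^{\infty} (c_m \gamma_{m,k} c_k + d_m \gamma_{m,k} d_k) \\
&= \sum_{m=1}^{\infty} \sum_{k=1}^{\infty} c_m \alpha_{m,k} c_k + \sum_{m=1}^{\infty} \sum_{k=1}^{\infty} d_m \alpha_{m,k} d_k - 2 \sum_{m=1}^{\infty} \sum_{k=1}^{\infty} c_m \gamma_{m,k} d_k \\
&= \sum_{m=1}^{\infty} \sum_{k=1}^{\infty} (c_m, d_m) \begin{pmatrix} \alpha_{m,k} & -\gamma_{m,k} \\ \gamma_{m,k} & \alpha_{m,k} \end{pmatrix} \begin{pmatrix} c_k \\ d_k \end{pmatrix},
\end{aligned}
\]
where the last two equalities follow from the fact that by (3.19),
\[
\sum_{m=1}^{\infty} \sum_{k=1}^{\infty} c_m \alpha_{m,k} d_k = \sum_{m=1}^{\infty} \sum_{k=1}^{\infty} d_m \alpha_{m,k} c_k, \qquad
\sum_{m=1}^{\infty} \sum_{k=1}^{\infty} d_m \gamma_{m,k} c_k = -\sum_{m=1}^{\infty} \sum_{k=1}^{\infty} c_m \gamma_{m,k} d_k,
\]
\[
\sum_{m=1}^{\infty} \sum_{k=1}^{\infty} c_m \gamma_{m,k} c_k = 0, \qquad \sum_{m=1}^{\infty} \sum_{k=1}^{\infty} d_m \gamma_{m,k} d_k = 0,
\]
and from the easy equality
\[
\sum_{m=1}^{\infty} \sum_{k=1}^{\infty} c_m \alpha_{m,k} c_k + \sum_{m=1}^{\infty} \sum_{k=1}^{\infty} d_m \alpha_{m,k} d_k - 2 \sum_{m=1}^{\infty} \sum_{k=1}^{\infty} c_m \gamma_{m,k} d_k
= \sum_{m=1}^{\infty} \sum_{k=1}^{\infty} (c_m, d_m) \begin{pmatrix} \alpha_{m,k} & -\gamma_{m,k} \\ \gamma_{m,k} & \alpha_{m,k} \end{pmatrix} \begin{pmatrix} c_k \\ d_k \end{pmatrix}.
\]
Thus, denoting
\[
A_{m,k} = \begin{pmatrix} \alpha_{m,k} & -\gamma_{m,k} \\ \gamma_{m,k} & \alpha_{m,k} \end{pmatrix}, \qquad b_m = (c_m, d_m)', \tag{3.22}
\]
we have
\[
\iint \overline{\psi_1(\beta_1)}\,\Gamma(\beta_1,\beta_2)\,\psi_1(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2) = \sum_{m=1}^{\infty} \sum_{k=1}^{\infty} b_m' A_{m,k}\,b_k.
\]
Similarly,
\[
\iint \overline{\psi_{1,n}(\beta_1)}\,\Gamma(\beta_1,\beta_2)\,\psi_{1,n}(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2) = \sum_{m=1}^{n} \sum_{k=1}^{n} b_{n,m}' A_{m,k}\,b_{n,k}, \tag{3.23}
\]
where $b_{n,m} = (c_{n,m}, d_{n,m})'$. Cf. (3.12).

Now stack the $b_{n,m}$'s in a $2n \times 1$ vector $x_n$, and recall from (3.12) that $x_n' x_n = 1$.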
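Moreover, denote
\[
A_n = \begin{pmatrix}
A_{1,1} & A_{1,2} & \cdots & A_{1,n-1} & A_{1,n} \\
A_{2,1} & A_{2,2} & \cdots & A_{2,n-1} & A_{2,n} \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
A_{n-1,1} & A_{n-1,2} & \cdots & A_{n-1,n-1} & A_{n-1,n} \\
A_{n,1} & A_{n,2} & \cdots & A_{n,n-1} & A_{n,n}
\end{pmatrix}, \tag{3.24}
\]
which is a symmetric $2n \times 2n$ matrix (see footnote 3). Then (3.23) reads
\[
\iint \overline{\psi_{1,n}(\beta_1)}\,\Gamma(\beta_1,\beta_2)\,\psi_{1,n}(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2) = x_n' A_n x_n, \tag{3.25}
\]
whereas obviously,
\[
x_n' A_n x_n \le \iint \overline{\psi_1(\beta_1)}\,\Gamma(\beta_1,\beta_2)\,\psi_1(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2). \tag{3.26}
\]

Footnote 3. The symmetry follows from
\[
A_{k,m} = \begin{pmatrix} \alpha_{k,m} & -\gamma_{k,m} \\ \gamma_{k,m} & \alpha_{k,m} \end{pmatrix}
= \begin{pmatrix} \alpha_{m,k} & \gamma_{m,k} \\ -\gamma_{m,k} & \alpha_{m,k} \end{pmatrix} = A_{m,k}'.
\]

Similarly, replacing $\psi_{1,n}(\beta)$ by $\psi_n(\beta) = \sum_{j=1}^{n} (y_{1,j} + i\,y_{2,j})\,\varphi_j(\beta)$, where the $y_{i,j}$'s are arbitrary subject to the restriction $\sum_{j=1}^{n} y_{1,j}^2 + \sum_{j=1}^{n} y_{2,j}^2 = 1$, and denoting $y_n = (y_{1,1}, y_{2,1}, y_{1,2}, y_{2,2}, \ldots, y_{1,n}, y_{2,n})'$, we have
\[
\sup_{y_n \in \mathbb{R}^{2n}:\ y_n' y_n = 1} y_n' A_n y_n \le \iint \overline{\psi_1(\beta_1)}\,\Gamma(\beta_1,\beta_2)\,\psi_1(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2). \tag{3.27}
\]
Recall from linear algebra that the maximum eigenvalue $\lambda_n$ of $A_n$ is equal to
\[
\lambda_n = \sup_{y \in \mathbb{R}^{2n}:\ y'y = 1} y' A_n y, \tag{3.28}
\]
so that by (3.25), (3.26) and (3.27),
\[
\iint \overline{\psi_{1,n}(\beta_1)}\,\Gamma(\beta_1,\beta_2)\,\psi_{1,n}(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2) \le \lambda_n
\le \iint \overline{\psi_1(\beta_1)}\,\Gamma(\beta_1,\beta_2)\,\psi_1(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2). \tag{3.29}
\]
Consequently, it follows from Lemma 3.2 that

Lemma 3.4. $\iint \overline{\psi_1(\beta_1)}\,\Gamma(\beta_1,\beta_2)\,\psi_1(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2) = \lim_{n \to \infty} \lambda_n$, where $\lambda_n$ is the maximum eigenvalue of the matrix $A_n$ in (3.24).

Next, let us focus on the case $\Gamma_2(\beta_1,\beta_2)$. Obviously, Lemma 3.2 carries over if we replace $\Gamma$ by $\Gamma_2$.

Lemma 3.5. Let $\psi_1$ in (3.11) be a solution of (3.7) and let $\psi_{1,n}$ be defined by (3.12). Then
\[
\lim_{n \to \infty} \iint \overline{\psi_{1,n}(\beta_1)}\,\Gamma_2(\beta_1,\beta_2)\,\psi_{1,n}(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2)
= \iint \overline{\psi_1(\beta_1)}\,\Gamma_2(\beta_1,\beta_2)\,\psi_1(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2).
\]

Moreover, after some tedious complex calculus exercises it can be shown from (3.21) and (3.11) that
\[
\begin{aligned}
&\iint \overline{\psi_1(\beta_1)}\,\Gamma_2(\beta_1,\beta_2)\,\psi_1(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2) \\
&= \sum_{k=1}^{\infty} \sum_{m=1}^{\infty} \sum_{j=1}^{\infty} (c_k, d_k)
\begin{pmatrix}
\alpha_{k,j}\,\alpha_{j,m} - \gamma_{k,j}\,\gamma_{j,m} & -\alpha_{k,j}\,\gamma_{j,m} - \gamma_{k,j}\,\alpha_{j,m} \\
\alpha_{k,j}\,\gamma_{j,m} + \gamma_{k,j}\,\alpha_{j,m} & \alpha_{k,j}\,\alpha_{j,m} - \gamma_{k,j}\,\gamma_{j,m}
\end{pmatrix}
\begin{pmatrix} c_m \\ d_m \end{pmatrix} \\
&= \sum_{k=1}^{\infty} \sum_{m=1}^{\infty} b_k' \left( \sum_{j=1}^{\infty} A_{k,j}\,A_{j,m} \right) b_m,
\end{aligned}
\tag{3.30}
\]
where $A_{k,j}$, $A_{j,m}$ and $b_m$ are defined in (3.22).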
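The passage from the complex quadratic form to the real form $x_n' A_n x_n$ in (3.25) can be checked numerically. The following Python sketch reads each block of (3.22) as the usual real representation $\begin{pmatrix} \alpha & -\gamma \\ \gamma & \alpha \end{pmatrix}$ of the complex number $\alpha + i\gamma$. The $3 \times 3$ Hermitian coefficient matrix, the vector `z`, and all function names are hypothetical illustrations, not objects from BW.

```python
def real_block_matrix(h):
    """Build the real 2n x 2n matrix A_n from an n x n Hermitian matrix h.

    Entry h[m][k] = alpha + i*gamma is mapped to the 2 x 2 block
    [[alpha, -gamma], [gamma, alpha]].
    """
    n = len(h)
    a = [[0.0] * (2 * n) for _ in range(2 * n)]
    for m in range(n):
        for k in range(n):
            alpha, gamma = h[m][k].real, h[m][k].imag
            a[2 * m][2 * k] = alpha
            a[2 * m][2 * k + 1] = -gamma
            a[2 * m + 1][2 * k] = gamma
            a[2 * m + 1][2 * k + 1] = alpha
    return a

def complex_quadratic_form(h, z):
    """sum_m sum_k conj(z_m) * h[m][k] * z_k."""
    n = len(h)
    return sum(z[m].conjugate() * h[m][k] * z[k]
               for m in range(n) for k in range(n))

def real_quadratic_form(a, x):
    """x' A x for a real square matrix a and real vector x."""
    n = len(a)
    return sum(x[i] * a[i][j] * x[j] for i in range(n) for j in range(n))

# Hypothetical coefficients alpha_{m,k} + i*gamma_{m,k}:
# alpha symmetric, gamma antisymmetric, as in (3.19).
h = [[2.0 + 0.0j, 0.5 + 1.0j, 0.0 - 0.3j],
     [0.5 - 1.0j, 1.5 + 0.0j, 0.2 + 0.4j],
     [0.0 + 0.3j, 0.2 - 0.4j, 1.0 + 0.0j]]
z = [0.3 + 0.1j, -0.2 + 0.5j, 0.4 - 0.6j]               # z_m = c_m + i*d_m
x = [part for zm in z for part in (zm.real, zm.imag)]   # (c_1, d_1, ..., c_n, d_n)

a_n = real_block_matrix(h)
q_complex = complex_quadratic_form(h, z)
q_real = real_quadratic_form(a_n, x)
```

Under these assumptions `q_complex` is real up to rounding error and agrees with `q_real`, and `a_n` is symmetric, in line with footnote 3.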
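Furthermore, similar to (3.23) we have
\[
\iint \overline{\psi_{1,n}(\beta_1)}\,\Gamma_2(\beta_1,\beta_2)\,\psi_{1,n}(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2)
= \sum_{k=1}^{n} \sum_{m=1}^{n} b_{n,k}' \left( \sum_{j=1}^{\infty} A_{k,j}\,A_{j,m} \right) b_{n,m}, \tag{3.31}
\]
where $b_{n,m} = (c_{n,m}, d_{n,m})'$. Cf. (3.12). Stacking the $b_{n,m}$'s in a $2n \times 1$ vector $x_n$ as before, and denoting
\[
C_n = \begin{pmatrix} C_{1,1}(n) & \cdots & C_{1,n}(n) \\ \vdots & \ddots & \vdots \\ C_{n,1}(n) & \cdots & C_{n,n}(n) \end{pmatrix}, \quad \text{where} \quad C_{k,m}(n) = \sum_{j=n+1}^{\infty} A_{k,j}\,A_{j,m},
\]
we can write the right-hand side of (3.31) as
\[
\sum_{k=1}^{n} \sum_{m=1}^{n} b_{n,k}' \left( \sum_{j=1}^{n} A_{k,j}\,A_{j,m} \right) b_{n,m}
+ \sum_{k=1}^{n} \sum_{m=1}^{n} b_{n,k}' \left( \sum_{j=n+1}^{\infty} A_{k,j}\,A_{j,m} \right) b_{n,m}
= x_n' A_n^2 x_n + x_n' C_n x_n.
\]
Because $x_n' x_n = 1$, the term $x_n' A_n^2 x_n$ is dominated by the maximum eigenvalue of $A_n^2$, which is the square of the maximum eigenvalue $\lambda_n$ of $A_n$, and $x_n' C_n x_n$ is dominated by the trace of $C_n$, where
\[
\begin{aligned}
\operatorname{trace}(C_n) &= \sum_{m=1}^{n} \operatorname{trace}\left( C_{m,m}(n) \right)
= 2 \sum_{m=1}^{n} \sum_{j=n+1}^{\infty} \alpha_{m,j}\,\alpha_{j,m} - 2 \sum_{m=1}^{n} \sum_{j=n+1}^{\infty} \gamma_{m,j}\,\gamma_{j,m} \\
&= 2 \sum_{m=1}^{n} \sum_{j=n+1}^{\infty} \left( \alpha_{m,j}^2 + \gamma_{m,j}^2 \right)
\le 2 \left( \sum_{j=n+1}^{\infty} \sum_{m=1}^{\infty} \alpha_{m,j}^2 + \sum_{j=n+1}^{\infty} \sum_{m=1}^{\infty} \gamma_{m,j}^2 \right).
\end{aligned}
\]
Due to (3.16) and (3.18) the latter converges to zero as $n \to \infty$. Thus, it has been shown that
\[
\iint \overline{\psi_{1,n}(\beta_1)}\,\Gamma_2(\beta_1,\beta_2)\,\psi_{1,n}(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2) \le \lambda_n^2 + o(1).
\]
It follows now from Lemmas 3.4 and 3.5 that
\[
\iint \overline{\psi_1(\beta_1)}\,\Gamma_2(\beta_1,\beta_2)\,\psi_1(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2)
\le \left( \iint \overline{\psi_1(\beta_1)}\,\Gamma(\beta_1,\beta_2)\,\psi_1(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2) \right)^2. \tag{3.32}
\]
However, note that the function $G(\psi)$ in (3.5) is always nonnegative, which implies that
\[
\iint \overline{\psi_1(\beta_1)}\,\Gamma_2(\beta_1,\beta_2)\,\psi_1(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2)
\ge \left( \iint \overline{\psi_1(\beta_1)}\,\Gamma(\beta_1,\beta_2)\,\psi_1(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2) \right)^2. \tag{3.33}
\]
Part (a) of Theorem 3.1 now follows from (3.32) and (3.33), which together give $G(\psi_1) = 0$.

Finally, observe from (3.28) and (3.29) that the sequence $\lambda_n$ is monotonic nondecreasing and bounded, hence $\lim_{n \to \infty} \lambda_n = \sup_{n \ge 1} \lambda_n$. If the latter supremum is zero, then for all $n \ge 1$, $A_n = O_{2n,2n}$, hence the $\alpha_{i,j}$'s and $\gamma_{i,j}$'s are all zero and thus by (3.20), $\Gamma(\beta_1,\beta_2) \equiv 0$ on $B \times B$. This proves parts (b) and (c) of Theorem 3.1.

3.3. The other eigenvalues and eigenfunctions

Given that $\lambda_1 > 0$, let
\[
\Gamma^{(2)}(\beta_1,\beta_2) = \Gamma(\beta_1,\beta_2) - \lambda_1\,\psi_1(\beta_1)\,\overline{\psi_1(\beta_2)},
\]
which is symmetric and continuous on $B \times B$. To prove that $\Gamma^{(2)}(\beta_1,\beta_2)$ is positive semidefinite, let $\phi \in L^2_{\mathbb{C}}(\mu)$ be arbitrary. We can write $\phi = \langle \phi, \psi_1 \rangle \psi_1 + r$, where $\langle r, \psi_1 \rangle = 0$, hence
\[
\iint \overline{\phi(\beta_1)}\,\Gamma^{(2)}(\beta_1,\beta_2)\,\phi(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2)
= \iint \overline{r(\beta_1)}\,\Gamma(\beta_1,\beta_2)\,r(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2) \ge 0,
\]
as is not hard to verify. Then the maximum eigenvalue $\lambda_2$ of $\Gamma^{(2)}(\beta_1,\beta_2)$ with corresponding eigenfunction $\psi_2$ can be derived in the same way as before. These are an eigenvalue and corresponding eigenfunction of $\Gamma(\beta_1,\beta_2)$ as well, with
\[
\int \overline{\psi_1(\beta)}\,\psi_2(\beta)\,d\mu(\beta) = 0 \quad \text{if } \lambda_2 > 0.
\]
To see this, note that $\int \overline{\psi_1(\beta_1)}\,\Gamma^{(2)}(\beta_1,\beta_2)\,d\mu(\beta_1) = 0$, hence
\[
\lambda_2 \int \overline{\psi_1(\beta_1)}\,\psi_2(\beta_1)\,d\mu(\beta_1)
= \iint \overline{\psi_1(\beta_1)}\,\Gamma^{(2)}(\beta_1,\beta_2)\,\psi_2(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2) = 0.
\]
Now
\[
\begin{aligned}
\lambda_2\,\psi_2(\beta_1) &= \int \Gamma^{(2)}(\beta_1,\beta_2)\,\psi_2(\beta_2)\,d\mu(\beta_2) \\
&= \int \Gamma(\beta_1,\beta_2)\,\psi_2(\beta_2)\,d\mu(\beta_2) - \lambda_1\,\psi_1(\beta_1) \int \overline{\psi_1(\beta_2)}\,\psi_2(\beta_2)\,d\mu(\beta_2) \\
&= \int \Gamma(\beta_1,\beta_2)\,\psi_2(\beta_2)\,d\mu(\beta_2).
\end{aligned}
\]

Repeating this construction $n$ times yields eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$ of $\Gamma(\beta_1,\beta_2)$ with corresponding orthonormal eigenfunctions $\psi_m$, $m = 1, 2, \ldots, n$. At this point the next eigenvalue $\lambda_{n+1} \le \lambda_n$ and eigenfunction $\psi_{n+1}$ are the maximum eigenvalue and corresponding eigenfunction of the kernel
\[
\Gamma^{(n)}(\beta_1,\beta_2) = \Gamma(\beta_1,\beta_2) - \sum_{m=1}^{n} \lambda_m\,\psi_m(\beta_1)\,\overline{\psi_m(\beta_2)}. \tag{3.34}
\]
Suppose that for some $n$, $\lambda_n > 0$ but $\lambda_{n+1} = 0$. Then the maximum eigenvalue of $\Gamma^{(n)}(\beta_1,\beta_2)$ is zero, hence by part (c) of Theorem 3.1, $\Gamma^{(n)}(\beta_1,\beta_2) \equiv 0$ on $B \times B$ and thus
\[
\Gamma(\beta_1,\beta_2) \equiv \sum_{m=1}^{n} \lambda_m\,\psi_m(\beta_1)\,\overline{\psi_m(\beta_2)}.
\]
Then any function $\phi$ in the orthogonal complement of $\operatorname{span}(\{\psi_m\}_{m=1}^{n})$, i.e., the space
\[
U_n = \left\{ \phi \in L^2_{\mathbb{C}}(\mu) : \langle \phi, \psi_m \rangle = 0 \text{ for } m = 1, 2, \ldots, n \right\},
\]
is an eigenfunction of $\Gamma(\beta_1,\beta_2)$ with zero eigenvalue.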
Since $U_n$ is a Hilbert space itself contained in $L^2_{\mathbb{C}}(\mu)$, it is possible to select an orthonormal basis $\{\psi_m\}_{m=n+1}^{\infty}$ for $U_n$, and the extended orthonormal sequence $\{\psi_m\}_{m=1}^{\infty}$ is then an orthonormal basis of $L^2_{\mathbb{C}}(\mu)$, i.e.,
\[
L^2_{\mathbb{C}}(\mu) = \overline{\operatorname{span}}(\{\psi_m\}_{m=1}^{\infty}). \tag{3.35}
\]

If there does not exist an $n$ such that $\lambda_n = 0$ then we can repeat this construction indefinitely, yielding a non-increasing sequence $\{\lambda_n\}_{n=1}^{\infty}$ of positive eigenvalues of $\Gamma(\beta_1,\beta_2)$ with corresponding orthonormal sequence $\{\psi_m\}_{m=1}^{\infty}$ of eigenfunctions. However, that does not mean that $\Gamma(\beta_1,\beta_2)$ has only positive eigenvalues. If the Hilbert space
\[
U_\infty = \left\{ \phi \in L^2_{\mathbb{C}}(\mu) : \langle \phi, \psi_m \rangle = 0 \text{ for all } m \in \mathbb{N} \right\}
\]
is non-trivial, in the sense that it is larger than the singleton $\{0\}$, then by Mercer's theorem below, all the functions in $U_\infty$ are eigenfunctions of $\Gamma(\beta_1,\beta_2)$ with zero eigenvalues.

3.4. The Hilbert-Schmidt theorem

Summarizing, the following main result has been proved.

Theorem 3.2. (Hilbert-Schmidt Theorem) Under Assumption 2.1 the eigenvalue problem: "Find a scalar $\lambda$ and a function $\psi \in L^2_{\mathbb{C}}(\mu)$ normalized to unit norm, $\|\psi\| = 1$, such that $\int \Gamma(\beta_1,\beta_2)\,\psi(\beta_2)\,d\mu(\beta_2) = \lambda\,\psi(\beta_1)$ for all $\beta_1 \in B$" has countably many solutions $\{\lambda_m, \psi_m\}_{m=1}^{\infty}$ (see footnote 4), i.e.,
\[
\int \Gamma(\beta_1,\beta_2)\,\psi_m(\beta_2)\,d\mu(\beta_2) \equiv \lambda_m\,\psi_m(\beta_1), \tag{3.36}
\]
where the eigenvalues $\lambda_m$ are real valued and nonnegative and the eigenfunctions $\psi_m$ are orthonormal. Moreover, the eigenfunctions corresponding to the positive eigenvalues are continuous on $B$. If all the eigenvalues are zero then $\Gamma(\beta_1,\beta_2) \equiv 0$ on $B \times B$.

This theorem is named after David Hilbert and his Ph.D. student Erhard Schmidt, who published a series of papers in the period 1904-1908 regarding the existence of eigenvalues and corresponding eigenfunctions for real-valued kernels on a rectangle $[a,b] \times [a,b]$. Their results are nowadays referred to as the Hilbert-Schmidt theorem. See for example Bernkopf (1966) and Siegmund-Schultze (1986) and the references therein.
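The construction in Sections 3.2 and 3.3 has a direct finite analogue when $\mu$ puts mass on finitely many points: the largest eigenvalue is the maximum of the quadratic form, and the next one is obtained after subtracting $\lambda_1 \psi_1(\beta_1)\overline{\psi_1(\beta_2)}$. The Python sketch below uses power iteration as a stand-in for the maximization in (3.7). That choice, the two-point kernel, and all function names are assumptions made for illustration only, not the method of this addendum.

```python
import math

def inner(f, g, w):
    """<f, g> = sum_j f_j * conj(g_j) * w_j, the inner product of L^2_C(mu)."""
    return sum(fj * gj.conjugate() * wj for fj, gj, wj in zip(f, g, w))

def apply_kernel(kern, psi, w):
    """(T psi)(b_i) = sum_j Gamma(b_i, b_j) * psi(b_j) * mu({b_j})."""
    return [sum(row[j] * psi[j] * w[j] for j in range(len(psi))) for row in kern]

def max_eigenpair(kern, w, start, iters=200):
    """Power iteration for the largest eigenvalue of a positive semidefinite kernel."""
    psi = list(start)
    for _ in range(iters):
        t = apply_kernel(kern, psi, w)
        nrm = math.sqrt(inner(t, t, w).real)
        if nrm == 0.0:  # the kernel annihilates psi
            return 0.0, psi
        psi = [v / nrm for v in t]
    # Quadratic form at the normalized iterate, cf. (3.8).
    lam = inner(apply_kernel(kern, psi, w), psi, w).real
    return lam, psi

def deflate(kern, lam, psi):
    """Gamma^(2)(b1, b2) = Gamma(b1, b2) - lam * psi(b1) * conj(psi(b2))."""
    n = len(psi)
    return [[kern[i][j] - lam * psi[i] * psi[j].conjugate() for j in range(n)]
            for i in range(n)]

# Hypothetical example: B has two points with mass 1/2 each.
w = [0.5, 0.5]
gamma = [[2.0 + 0.0j, 0.0 + 1.0j],
         [0.0 - 1.0j, 2.0 + 0.0j]]
start = [1.0 + 0.0j, 0.0 + 0.0j]

lam1, psi1 = max_eigenpair(gamma, w, start)
lam2, psi2 = max_eigenpair(deflate(gamma, lam1, psi1), w, start)
```

For this kernel the integral operator is `gamma` scaled by 1/2, so the two eigenvalues are 3/2 and 1/2, and `psi2` is also an eigenfunction of the original kernel, orthogonal to `psi1`.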
The complex case in Theorem 3.2 is not a new result, of course. See for example Krein (1998). However, the proof of Theorem 3.2 is different from what I have seen in the literature, where the proof is based on operator theory.

Footnote 4. In Lemma 3 in BW, (3.36) was incorrectly stated as $\lambda_m\psi_m(\beta_1) = \int \Gamma(\beta_1,\beta_2)\psi_m(\beta_2)\,d\mu(\beta_2)$.

4. Mercer's theorem for complex kernels

The original Mercer's theorem is due to Mercer (1909), who proved it for real-valued kernels on the rectangle $[a,b] \times [a,b]$. The current version of Mercer's theorem can be stated somewhat more briefly than in Lemma 3 in BW because some parts are already covered by Theorem 3.2.

Theorem 4.1. (Mercer's Theorem) Under Assumptions 2.1 and 2.2 the complex kernel $\Gamma(\beta_1,\beta_2)$ involved has the series representation
\[
\Gamma(\beta_1,\beta_2) = \sum_{m=1}^{\infty} \lambda_m\,\psi_m(\beta_1)\,\overline{\psi_m(\beta_2)}, \tag{4.1}
\]
where the $\lambda_m$'s are the eigenvalues of $\Gamma(\beta_1,\beta_2)$ and the $\psi_m$'s are the corresponding orthonormal eigenfunctions. Then in addition to the results in Theorem 3.2 the following hold.
(a) The eigenvalues satisfy $\sum_{m=1}^{\infty} \lambda_m < \infty$.
(b) The convergence of the right-hand side of (4.1) is uniform on $B \times B$, i.e.,
\[
\lim_{n \to \infty} \sup_{(\beta_1,\beta_2) \in B \times B} \left| \Gamma(\beta_1,\beta_2) - \sum_{m=1}^{n} \lambda_m\,\psi_m(\beta_1)\,\overline{\psi_m(\beta_2)} \right| = 0. \tag{4.2}
\]
(c) The orthonormal sequence $\{\psi_m\}_{m=1}^{\infty}$ of eigenfunctions, including the eigenfunctions with zero eigenvalues, is complete in $L^2_{\mathbb{C}}(\mu)$, i.e., $L^2_{\mathbb{C}}(\mu) = \overline{\operatorname{span}}(\{\psi_m\}_{m=1}^{\infty})$.

Proof. Let $\{\psi_m\}_{m=1}^{\infty}$ be the sequence of all eigenfunctions, thus including those with zero eigenvalues. Denote
\[
S_2 = \overline{\operatorname{span}}\left( \left\{ \psi_k(\beta_1)\,\overline{\psi_m(\beta_2)} \right\}_{k,m=1}^{\infty} \right),
\]
which is a subspace of the Hilbert space $L^2_{\mathbb{C}}(\mu \times \mu)$ of all square-integrable complex-valued Borel measurable functions on $B \times B$ endowed with the usual inner product and associated norm and metric, and note that $\Gamma \in L^2_{\mathbb{C}}(\mu \times \mu)$.
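To make the statement of Theorem 4.1 concrete, the Python sketch below builds a genuinely complex kernel directly from a series of the form (4.1) and checks the eigenvalue equation (3.36) and the trace identity $\int \Gamma(\beta,\beta)\,d\mu(\beta) = \sum_m \lambda_m$. Everything specific here is an assumption made for illustration: $\mu$ is taken to be the uniform measure on an 8-point grid in $[0,1)$, the eigenfunctions are complex exponentials, and the two eigenvalues are arbitrary.

```python
import cmath

N = 8                                # grid size; mu puts mass 1/N on each grid point
grid = [j / N for j in range(N)]
eigvals = {1: 0.5, 2: 0.25}          # hypothetical positive eigenvalues lambda_m

def psi(m, beta):
    """psi_m(beta) = exp(2*pi*i*m*beta); orthonormal under the uniform grid measure."""
    return cmath.exp(2j * cmath.pi * m * beta)

def kernel(b1, b2):
    """Gamma(b1, b2) = sum_m lambda_m * psi_m(b1) * conj(psi_m(b2)), cf. (4.1)."""
    return sum(lam * psi(m, b1) * psi(m, b2).conjugate()
               for m, lam in eigvals.items())

def integral_operator(m, b1):
    """int Gamma(b1, b2) psi_m(b2) dmu(b2) under the uniform grid measure."""
    return sum(kernel(b1, b2) * psi(m, b2) for b2 in grid) / N

# Largest deviation from the eigenvalue equation (3.36) over the grid.
residual = max(abs(integral_operator(m, b1) - lam * psi(m, b1))
               for m, lam in eigvals.items() for b1 in grid)

# int Gamma(beta, beta) dmu(beta), which should equal the sum of the eigenvalues.
trace = sum(kernel(b, b) for b in grid) / N

# psi_3 is orthogonal to psi_1 and psi_2, so it has eigenvalue zero.
zero_mode = max(abs(integral_operator(3, b1)) for b1 in grid)
```

In this sketch `residual` and `zero_mode` vanish up to rounding error and `trace` equals 0.75, the sum of the two assumed eigenvalues.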
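By the projection theorem, the projection of $\Gamma$ on $S_2$ takes the form
\[
\widetilde{\Gamma}(\beta_1,\beta_2) = \sum_{m=1}^{\infty} \sum_{k=1}^{\infty} c_{k,m}\,\psi_m(\beta_1)\,\overline{\psi_k(\beta_2)},
\]
where
\[
c_{k,m} = \iint \overline{\psi_m(\beta_1)}\,\Gamma(\beta_1,\beta_2)\,\psi_k(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2)
= \lambda_k \int \overline{\psi_m(\beta_1)}\,\psi_k(\beta_1)\,d\mu(\beta_1) = \lambda_k\,1(k = m),
\]
where as in BW, $1(\cdot)$ denotes the indicator function. Hence, the projection of $\Gamma$ on $S_2$ is
\[
\widetilde{\Gamma}(\beta_1,\beta_2) = \sum_{m=1}^{\infty} \lambda_m\,\psi_m(\beta_1)\,\overline{\psi_m(\beta_2)},
\]
with projection residual
\[
R(\beta_1,\beta_2) = \Gamma(\beta_1,\beta_2) - \widetilde{\Gamma}(\beta_1,\beta_2) \in S_2^{\perp},
\]
where $S_2^{\perp}$ is the orthogonal complement of $S_2$.

If $R(\beta_1,\beta_2)$ is continuous and symmetric positive semidefinite then by Theorem 3.2, $R$ has an eigenfunction $\varphi \in S_2^{\perp}$. But then $\varphi$ is also an eigenfunction of $\Gamma$, and therefore already contained in $S_2$. As in the proof of Mercer's theorem in the real-valued case in Bierens (2014b), we then must have that $R(\beta_1,\beta_2) = 0$ on $B \times B$.

It is obvious that $R(\beta_1,\beta_2)$ is symmetric. To prove that $R(\beta_1,\beta_2)$ is positive semidefinite, let $f \in L^2_{\mathbb{C}}(\mu)$ be arbitrary. Project $f$ on $S_1 = \overline{\operatorname{span}}(\{\psi_m\}_{m=1}^{\infty})$, and let $f_1 \in S_1$ be the projection and $f_2 \in S_1^{\perp}$ be the projection residual. Then $\int R(\beta_1,\beta_2)\,f_1(\beta_2)\,d\mu(\beta_2) = 0$ and $\int \widetilde{\Gamma}(\beta_1,\beta_2)\,f_2(\beta_2)\,d\mu(\beta_2) = 0$, hence
\[
\begin{aligned}
\iint \overline{f(\beta_1)}\,R(\beta_1,\beta_2)\,f(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2)
&= \iint \overline{f_2(\beta_1)}\,R(\beta_1,\beta_2)\,f_2(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2) \\
&= \iint \overline{f_2(\beta_1)}\,\Gamma(\beta_1,\beta_2)\,f_2(\beta_2)\,d\mu(\beta_1)\,d\mu(\beta_2) \ge 0,
\end{aligned}
\tag{4.3}
\]
as is not hard to verify. Thus, $R(\beta_1,\beta_2)$ is positive semidefinite.

To prove that $R(\beta_1,\beta_2)$ is continuous it suffices to prove that $\widetilde{\Gamma}(\beta_1,\beta_2)$ is continuous, as follows. Note that (4.3) implies that $R(\beta,\beta) \ge 0$ for all $\beta \in B$, which in its turn implies that for all $\beta \in B$,
\[
\sum_{m=1}^{\infty} \lambda_m\,|\psi_m(\beta)|^2 = \sum_{m=1}^{\infty} \lambda_m\,\psi_m(\beta)\,\overline{\psi_m(\beta)} = \widetilde{\Gamma}(\beta,\beta)
\le \Gamma(\beta,\beta) \le \sup_{\beta \in B} \Gamma(\beta,\beta) < \infty. \tag{4.4}
\]
Integrating $\beta$ out yields $\sum_{m=1}^{\infty} \lambda_m < \infty$, which is just part (a) of Theorem 4.1.

Next, observe that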
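\[
\begin{aligned}
|\psi_m(\beta_1) - \psi_m(\beta_2)|^2 &= \left( \psi_m(\beta_1) - \psi_m(\beta_2) \right) \left( \overline{\psi_m(\beta_1)} - \overline{\psi_m(\beta_2)} \right) \\
&= \psi_m(\beta_1)\overline{\psi_m(\beta_1)} - \psi_m(\beta_2)\overline{\psi_m(\beta_1)} - \psi_m(\beta_1)\overline{\psi_m(\beta_2)} + \psi_m(\beta_2)\overline{\psi_m(\beta_2)} \\
&= |\psi_m(\beta_1)|^2 + |\psi_m(\beta_2)|^2 - \psi_m(\beta_1)\overline{\psi_m(\beta_2)} - \psi_m(\beta_2)\overline{\psi_m(\beta_1)},
\end{aligned}
\tag{4.5}
\]
hence,
\[
\psi_m(\beta_1)\overline{\psi_m(\beta_2)} + \psi_m(\beta_2)\overline{\psi_m(\beta_1)} \le |\psi_m(\beta_1)|^2 + |\psi_m(\beta_2)|^2,
\]
where the left-hand side is real valued. Similarly, replacing $-$ by $+$ in (4.5) yields
\[
\psi_m(\beta_1)\overline{\psi_m(\beta_2)} + \psi_m(\beta_2)\overline{\psi_m(\beta_1)} \ge -\left( |\psi_m(\beta_1)|^2 + |\psi_m(\beta_2)|^2 \right).
\]
Thus
\[
\frac{1}{2} \left| \psi_m(\beta_1)\overline{\psi_m(\beta_2)} + \psi_m(\beta_2)\overline{\psi_m(\beta_1)} \right| \le \frac{1}{2} |\psi_m(\beta_1)|^2 + \frac{1}{2} |\psi_m(\beta_2)|^2,
\]
and applying the same argument to the real numbers $|\psi_m(\beta_1)|$ and $|\psi_m(\beta_2)|$ yields
\[
\left| \psi_m(\beta_1)\overline{\psi_m(\beta_2)} \right| \le \frac{1}{2} |\psi_m(\beta_1)|^2 + \frac{1}{2} |\psi_m(\beta_2)|^2. \tag{4.6}
\]
It follows now from (4.4) and (4.6) that for all $(\beta_1,\beta_2) \in B \times B$,
\[
\sum_{m=1}^{\infty} \lambda_m \left| \psi_m(\beta_1)\overline{\psi_m(\beta_2)} \right|
\le \frac{1}{2} \sum_{m=1}^{\infty} \lambda_m\,|\psi_m(\beta_1)|^2 + \frac{1}{2} \sum_{m=1}^{\infty} \lambda_m\,|\psi_m(\beta_2)|^2
\le \sup_{\beta \in B} \Gamma(\beta,\beta) < \infty. \tag{4.7}
\]
By the same argument as in the proof of Mercer's theorem for real-valued kernels in Bierens (2014b) it follows that (4.7) implies
\[
\lim_{n \to \infty} \sup_{(\beta_1,\beta_2) \in B \times B} \left| \widetilde{\Gamma}(\beta_1,\beta_2) - \sum_{m=1}^{n} \lambda_m\,\psi_m(\beta_1)\,\overline{\psi_m(\beta_2)} \right| = 0, \tag{4.8}
\]
which in its turn implies, by the continuity of $\sum_{m=1}^{n} \lambda_m\,\psi_m(\beta_1)\,\overline{\psi_m(\beta_2)}$ for all $n \in \mathbb{N}$, that $\widetilde{\Gamma}(\beta_1,\beta_2)$ is continuous on $B \times B$, and so is $R(\beta_1,\beta_2)$. But then $R(\beta_1,\beta_2) \equiv 0$ on $B \times B$, hence
\[
\Gamma(\beta_1,\beta_2) \equiv \widetilde{\Gamma}(\beta_1,\beta_2). \tag{4.9}
\]
Part (b) of Theorem 4.1 follows now from (4.8) and (4.9).

As to part (c), suppose that $\{\psi_m\}_{m=1}^{\infty}$ is not complete in $L^2_{\mathbb{C}}(\mu)$. Then the orthogonal complement $S_1^{\perp}$ of $S_1 = \overline{\operatorname{span}}(\{\psi_m\}_{m=1}^{\infty})$ contains at least one nonzero function $\varphi$ with unit norm. Since now (4.1) holds exactly on $B \times B$, this $\varphi$ is an eigenfunction of $\Gamma$ with zero eigenvalue, but then $\varphi$ is already included in $\{\psi_m\}_{m=1}^{\infty}$. Therefore, $S_1^{\perp} = \{0\}$ and thus $\{\psi_m\}_{m=1}^{\infty}$ is complete in $L^2_{\mathbb{C}}(\mu)$.

This completes the proof of Theorem 4.1.

Remark 4.1. Similar to Remark 5.2 in Bierens (2014b), the condition in Assumption 2.1 that $B$ is compact is only used in the proof of Theorem 3.2 to guarantee
\[
\iint |\Gamma(\beta_1,\beta_2)|^2\,d\mu(\beta_1)\,d\mu(\beta_2) < \infty \tag{4.10}
\]
and is only used in the proof of Theorem 4.1 to guarantee that $\sup_{\beta \in B} \Gamma(\beta,\beta) < \infty$.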
Therefore, Mercer's theorem carries over to probability measures $\mu$ on unbounded domains $B$ as long as (4.10) holds and $\Gamma(\beta,\beta)$ is uniformly bounded.

5. Lemma 4 revised

Lemma 4 in BW claims that, with $Z(\beta)$ a complex-valued continuous Gaussian process on a compact subset $B$ of a Euclidean space and $\mu$ a probability measure on $B$,
\[
\int |Z(\beta)|^2\,d\mu(\beta) = \sum_{m=1}^{\infty} \lambda_m\,e_m' e_m, \tag{5.1}
\]
where the $\lambda_m$'s are the eigenvalues of the covariance kernel
\[
\Gamma(\beta_1,\beta_2) = E\left[ Z(\beta_1)\overline{Z(\beta_2)} \right]
\]
and the $e_m$'s are independently $N_2[0, I_2]$ distributed. However, it follows from Mercer's theorem that
\[
E\left[ \int |Z(\beta)|^2\,d\mu(\beta) \right] = \int \Gamma(\beta,\beta)\,d\mu(\beta) = \sum_{m=1}^{\infty} \lambda_m,
\]
whereas (5.1) implies $E\left[ \int |Z(\beta)|^2\,d\mu(\beta) \right] = 2 \sum_{m=1}^{\infty} \lambda_m$.

Apart from this impossibility result, the main flaw in the original proof of Lemma 4 is due to equation (A.6) in BW, which reads
\[
E\left[ Z_2(\beta_1)\,Z_2(\beta_2)' \right] = \sum_{m=1}^{\infty} \lambda_m\,Q_m(\beta_1)\,Q_m(\beta_2), \tag{5.2}
\]
where
\[
Z_2(\beta) = \begin{pmatrix} \operatorname{Re}[Z(\beta)] \\ \operatorname{Im}[Z(\beta)] \end{pmatrix}, \tag{5.3}
\]
instead of the correct expression
\[
E\left[ Z_2(\beta_1)\,Z_2(\beta_2)' + Z_2^*(\beta_1)\,Z_2^*(\beta_2)' \right] = \sum_{m=1}^{\infty} \lambda_m\,Q_m(\beta_1)\,Q_m(\beta_2), \tag{5.4}
\]
where
\[
Z_2^*(\beta) = \begin{pmatrix} \operatorname{Im}[Z(\beta)] \\ -\operatorname{Re}[Z(\beta)] \end{pmatrix}.
\]

Actually, the following corrected version of Lemma 4 is related to Bierens and Ploberger (1997, Theorem 3):

Theorem 5.1. (Revised Lemma 4 in BW) Let $Z(\beta)$ be a complex-valued continuous Gaussian process on a compact subset $B$ of a Euclidean space and let $\mu$ be a probability measure on $B$. Then there exists a nonnegative sequence $\omega_m$ satisfying $\sum_{m=1}^{\infty} \omega_m < \infty$ such that
\[
\int |Z(\beta)|^2\,d\mu(\beta) = \sum_{m=1}^{\infty} \omega_m\,\varepsilon_m^2,
\]
where the $\varepsilon_m$'s are independent standard normally distributed.

Proof. Let $\{\lambda_m\}_{m=1}^{\infty}$ be the sequence of eigenvalues of the covariance kernel
\[
\Gamma(\beta_1,\beta_2) = E\left[ Z(\beta_1)\overline{Z(\beta_2)} \right]
\]
with corresponding sequence $\{\psi_m(\beta)\}_{m=1}^{\infty}$ of orthonormal eigenfunctions (relative to $\mu$). By the completeness of $\{\psi_m(\beta)\}_{m=1}^{\infty}$ we can write $Z(\beta) = \sum_{m=1}^{\infty} g_m\,\psi_m(\beta)$ a.e. $\mu$ (see footnote 5), where $g_m = \int Z(\beta)\,\overline{\psi_m(\beta)}\,d\mu(\beta)$. Consequently
\[
\int |Z(\beta)|^2\,d\mu(\beta) = \sum_{m=1}^{\infty} |g_m|^2.
\]
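(5.5)

Footnote 5. I.e., $\mu\left( \left\{ \beta \in B : Z(\beta) = \sum_{m=1}^{\infty} g_m\,\psi_m(\beta) \right\} \right) = 1$.

Since $Z(\beta)$ is zero-mean Gaussian, the $g_m$'s are jointly zero-mean complex-valued normally distributed. Moreover, by Mercer's theorem and the orthonormality of the eigenfunctions,
\[
\begin{aligned}
E\left[ \overline{g_k}\,g_m \right] &= \iint \psi_k(\beta_2)\,E\left[ \overline{Z(\beta_2)}\,Z(\beta_1) \right] \overline{\psi_m(\beta_1)}\,d\mu(\beta_1)\,d\mu(\beta_2) \\
&= \iint \psi_k(\beta_2)\,\Gamma(\beta_1,\beta_2)\,\overline{\psi_m(\beta_1)}\,d\mu(\beta_1)\,d\mu(\beta_2) \\
&= \sum_{j=1}^{\infty} \lambda_j \iint \psi_k(\beta_2)\,\psi_j(\beta_1)\,\overline{\psi_j(\beta_2)}\,\overline{\psi_m(\beta_1)}\,d\mu(\beta_1)\,d\mu(\beta_2) \\
&= \sum_{j=1}^{\infty} \lambda_j \int \psi_k(\beta_2)\,\overline{\psi_j(\beta_2)}\,d\mu(\beta_2) \int \psi_j(\beta_1)\,\overline{\psi_m(\beta_1)}\,d\mu(\beta_1) \\
&= \sum_{j=1}^{\infty} \lambda_j\,1(k = j)\,1(m = j) = \lambda_m\,1(k = m).
\end{aligned}
\tag{5.6}
\]
By joint normality, (5.6) implies that the sequence $\{g_m\}_{m=1}^{\infty}$ is independent, and so is the sequence $G_m = (\operatorname{Re}[g_m], \operatorname{Im}[g_m])'$. This is a well-known result, but will be proved in Lemma 5.1 below (see footnote 6).

Footnote 6. Because I could not find a formal proof in the literature.

Each $G_m$ is bivariate zero-mean normally distributed, i.e., $G_m \sim N_2[0, \Sigma_m]$. Using the well-known decomposition $\Sigma_m = Q_m \Omega_m Q_m'$, where $\Omega_m = \operatorname{diag}(\omega_{1,m}, \omega_{2,m})$ is the diagonal matrix of eigenvalues of $\Sigma_m$ and $Q_m$ is the orthogonal matrix of the two corresponding eigenvectors, we can write
\[
Q_m' G_m = \begin{pmatrix} \sqrt{\omega_{1,m}}\,e_{1,m} \\ \sqrt{\omega_{2,m}}\,e_{2,m} \end{pmatrix},
\]
where the sequence $(e_{1,m}, e_{2,m})'$ is i.i.d. $N_2[0, I_2]$. Now
\[
|g_m|^2 = g_m\overline{g_m} = G_m' G_m = G_m' Q_m Q_m' G_m = \omega_{1,m}\,e_{1,m}^2 + \omega_{2,m}\,e_{2,m}^2
\]
and by (5.6) and Mercer's theorem, $\omega_{1,m} + \omega_{2,m} = \lambda_m$ and
\[
\sum_{m=1}^{\infty} \omega_{1,m} + \sum_{m=1}^{\infty} \omega_{2,m} = \sum_{m=1}^{\infty} E\left[ g_m\overline{g_m} \right] = \sum_{m=1}^{\infty} \lambda_m < \infty.
\]
Thus (5.5) now reads
\[
\int |Z(\beta)|^2\,d\mu(\beta) = \sum_{m=1}^{\infty} \omega_{1,m}\,e_{1,m}^2 + \sum_{m=1}^{\infty} \omega_{2,m}\,e_{2,m}^2.
\]
Finally, denoting for $m \in \mathbb{N}$, $\omega_{2m-1} = \omega_{1,m}$, $\omega_{2m} = \omega_{2,m}$, $\varepsilon_{2m-1} = e_{1,m}$, $\varepsilon_{2m} = e_{2,m}$, for example, the result of Theorem 5.1 follows.

As said before, the claim that (5.6) implies that the sequence $\{g_m\}_{m=1}^{\infty}$ is independent, and so is the sequence $G_m = (\operatorname{Re}[g_m], \operatorname{Im}[g_m])'$, is a well-known result. However, as appears from the proof of the following lemma, this result is far from obvious.

Lemma 5.1. Let $\{g_m\}_{m=1}^{\infty}$ be a sequence of zero-mean complex-valued jointly Gaussian random variables satisfying
\[
E\left[ \overline{g_m}\,g_k \right] = 0 \quad \text{for } k \ne m.
\]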
(5.7) Then the sequence Gm = (Re[gm ], Im[gm ])0 , m ∈ N, is independent. Proof. By the joint normality of the sequence {Gm }∞ m=1 it suffices to verify that for m 6= k, E [Gk G0m ] = O, as follows. First, note that gm gk = (Re[gm ] Re[gk ] + Im[gm ] Im[gk ]) +i. (Re[gm ] Im[gk ] − Im[gm ] Re[gk ]) , hence (5.7) implies E (Re[gm ] Re[gk ]) + E (Im[gm ] Im[gk ]) = 0, E (Re[gm ] Im[gk ]) − E (Im[gm ] Re[gk ]) = 0. (5.8) It follows straightforwardly from (5.8) that ∙µ ¶µ ¶¸ Re[gk ] − Im[gk ] Re[gm ] Im[gm ] E = O, Im[gk ] − Im[gm ] Re[gm ] Re[gk ] which can be written as E [Gk G0m ] + P2 E [Gk G0m ] P20 ∙ = E (Gk , P2 Gk ) where P2 = µ 0 1 −1 0 ¶ µ G0m G0m P20 ¶¸ = O, . Next, observe from (5.8) that ¶ µ E (Re[gk ] Re[gm ]) E (Re[gk ] Im[gm ]) 0 E [Gk Gm ] = E (Im[gk ] Re[gm ]) E (Im[gk ] Im[gm ]) µ ¶ E (Re[gk ] Re[gm ]) E (Re[gk ] Im[gm ]) = , E (Re[gk ] Im[gm ]) −E (Re[gk ] Re[gm ]) 26 (5.9) hence E [Gk G0m ] is symmetric, with eigenvalues q λ1 = (E (Re[gm ] Re[gk ]))2 + (E (Re[gm ] Im[gk ]))2 , λ2 = −λ1 , Therefore, E [Gm G0k ] can be written as E [Gm G0k ] = λ1 Qk,m µ 1 0 0 −1 ¶ Q0k,m , (5.10) where Qk,m is an orthogonal 2 × 2 matrix. Substituting the expression (5.10) in (5.9) yields ¶ ¶ µµ µ ¶ 1 0 1 0 0 0 0 + Qk,m P2 Qk,m Qk,m P2 Qk,m = O. λ1 (5.11) 0 −1 0 −1 Since P2 and Qk,m are orthogonal, the matrix Q0k,m P2 Qk,m is orthogonal, which in the 2 × 2 case takes the general form ¶ µ cos(φ) sin(φ) 0 (5.12) Qk,m P2 Qk,m = − sin(φ) cos(φ) for some φ ∈ [0, 2π]. It is now easy to verify that equation (5.11) reads ¶ µ cos(φ) − sin(φ) 2λ1 cos(φ) = O. − sin(φ) − cos(φ) (5.13) Since the matrix in (5.13) is non-zero for all φ ∈ [0, 2π], it follows now that λ1 cos(φ) = 0, so that either λ1 = 0 or φ ∈ {π/2, 3π/4} . (5.14) Suppose that φ = π/2. Then by (5.12), Q0k,m P2 Qk,m = P2 . Again, without loss of generality we may assume that for some θ ∈ [0, 2π], µ ¶ cos(θ) − sin(θ) . 
Qk,m = sin(θ) cos(θ) 27 (5.15) Then it is easy to verify that ¶ µ −2 cos(θ) sin(θ) 1 − 2 sin2 (θ) 0 , Qk,m P2 Qk,m = −2 cos(θ) sin(θ) 1 − 2 cos2 (θ) which is obviously unequal to P2 for all θ ∈ [0, 2π]. Thus, the equality (5.15) is not possible, hence φ 6= π/2. Similarly, φ = 3π/4 implies Q0k,m P2 Qk,m = P20 , which is also not possible, so that φ 6= 3π/4 as well. Consequently, it follows from (5.14) that λ1 = 0, which by (5.10) implies that E [Gm G0k ] = O. 6. Upper bounds of the critical values of the SICM test Similar to Bierens and Ploberger (1997, Theorem 7) it follows from Theorem 5.1 that the following result holds. Theorem 6.1. Let the conditions of Theorem 5.1 hold, and let χ21 = sup n≥1 n 1X 2 ε n m=1 m Then "µZ # ¶−1 Z ∙P∞ ¸ ωm ε2m 2 m=1 Pr Γ(β, β)dµ(β) |Z(β)| dµ(β) > t = Pr P∞ > t ≤ Pr[χ21 > t] m=1 ωm for all t > 0. Therefore, for α ∈ (0, 1) and t(α) such that Pr[χ21 > t(α)] = α, ¸ ∙Z Z 2 Pr |Z(β)| dµ(β) > t(α). Γ(β, β)dµ(β) ≤ α. The values of t(α) for α = 0.01, α = 0.05 and α = 0.10 have been calculated in Bierens and Ploberger (1997), i.e., t(0.01) = 6.81, t(0.05) = 4.26, t(0.10) = 3.23 (6.1) To apply these upper bounds of Rthe asymptotic critical values to the SICM test we need a consistent estimate of Γ(β, β)dµ(β). Recall from Section 3 in BW that the empirical process on which the SICM test is based takes the form n ´ 1 X³ (s) 0 0˜ b exp(i.τ Yj ) − exp(i.τ Yj ) exp(i.ξ 0 Xj ) Zn (τ, ξ) = √ n j=1 28 where Yj and Xj are (made) bounded random vectors, Y˜j is a random drawing from the estimated conditional distribution F (y|Xj ; ˆθ) of Yj , and (τ, ξ) ∈ Υ × Ξ. 
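The critical values t(α) in (6.1) are quantiles of the supremum of the running mean of squared independent standard normal variables defined in Theorem 6.1, so they can be approximated by simulation. The following Python sketch is only an illustration and is not taken from BW or from Bierens and Ploberger (1997). It assumes NumPy is available, it truncates the supremum at a finite `n_max` (an assumption, justified informally by the running mean settling close to 1 for large n), and the function names `simulate_sup_running_mean` and `upper_bound_critical_values` are hypothetical.

```python
import numpy as np


def simulate_sup_running_mean(n_rep=20000, n_max=200, seed=0):
    """Simulate max over 1 <= n <= n_max of (1/n) * sum_{m<=n} eps_m^2.

    The supremum over all n >= 1 in Theorem 6.1 is truncated at n_max,
    which is an assumption of this sketch.
    """
    rng = np.random.default_rng(seed)
    # Squared independent standard normal draws, one row per replication.
    eps_squared = rng.standard_normal((n_rep, n_max)) ** 2
    # Running means (1/n) * sum_{m<=n} eps_m^2 for n = 1, ..., n_max.
    running_mean = np.cumsum(eps_squared, axis=1) / np.arange(1, n_max + 1)
    return running_mean.max(axis=1)


def upper_bound_critical_values(draws, alphas=(0.01, 0.05, 0.10)):
    """Empirical (1 - alpha)-quantiles of the simulated supremum."""
    return {alpha: float(np.quantile(draws, 1.0 - alpha)) for alpha in alphas}


if __name__ == "__main__":
    draws = simulate_sup_running_mean()
    for alpha, value in upper_bound_critical_values(draws).items():
        print(f"alpha = {alpha:.2f}: simulated t(alpha) = {value:.2f}")
```

With enough replications the simulated quantiles should land near the values in (6.1), up to Monte Carlo error, which is largest for α = 0.01.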
The corresponding estimated covariance function takes the form

$$\begin{aligned}
\widehat{\Gamma}_n^{(s)}((\tau_1,\xi_1),(\tau_2,\xi_2))
&= \frac{1}{n}\sum_{j=1}^{n}\left(\exp(i\tau_1' Y_j) - \exp(i\tau_1' \widetilde{Y}_j)\right)\exp(i\xi_1' X_j) \\
&\qquad \times \left(\exp(-i\tau_2' Y_j) - \exp(-i\tau_2' \widetilde{Y}_j)\right)\exp(-i\xi_2' X_j) \\
&= \frac{1}{n}\sum_{j=1}^{n}\Big(\exp(i(\tau_1-\tau_2)' Y_j) + \exp(i(\tau_1-\tau_2)' \widetilde{Y}_j) \\
&\qquad - \exp(i\tau_1' Y_j)\exp(-i\tau_2' \widetilde{Y}_j) - \exp(i\tau_1' \widetilde{Y}_j)\exp(-i\tau_2' Y_j)\Big) \\
&\qquad \times \exp(i(\xi_1-\xi_2)' X_j),
\end{aligned}$$

hence

$$\begin{aligned}
\widehat{\Gamma}_n^{(s)}((\tau,\xi),(\tau,\xi))
&= \frac{1}{n}\sum_{j=1}^{n}\left(2 - \exp(i\tau'(Y_j - \widetilde{Y}_j)) - \exp(-i\tau'(Y_j - \widetilde{Y}_j))\right) \\
&= 2 - 2\,\frac{1}{n}\sum_{j=1}^{n}\cos(\tau'(Y_j - \widetilde{Y}_j)).
\end{aligned}$$

Now let $\Upsilon = [-c,c]^m$ and $\Xi = [-c,c]^k$ and let $\mu$ be the uniform probability measure on $\Upsilon \times \Xi$. Then the expression for $\widehat{T}_n^{(s)}(c) = \int \left|\widehat{Z}_n^{(s)}(\tau,\xi)\right|^2 d\mu(\tau,\xi)$ is given in equation (31) in BW, whereas

$$\widehat{R}_n^{(s)}(c) = \int \widehat{\Gamma}_n^{(s)}((\tau,\xi),(\tau,\xi)) \, d\mu(\tau,\xi) = 2 - 2\,\frac{1}{n}\sum_{j=1}^{n}\prod_{i=1}^{m}\frac{\sin\left(c(Y_{i,j} - \widetilde{Y}_{i,j})\right)}{c(Y_{i,j} - \widetilde{Y}_{i,j})},$$

where $Y_{i,j}$ and $\widetilde{Y}_{i,j}$ are component $i$ of $Y_j$ and $\widetilde{Y}_j$, respectively. The upper bounds (6.1) are now applicable to the standardized SICM test statistic $\widehat{T}_n^{(s)}(c)/\widehat{R}_n^{(s)}(c)$, i.e., under the null hypothesis,

$$\begin{aligned}
\limsup_{n\to\infty} \Pr\left[\widehat{T}_n^{(s)}(c)/\widehat{R}_n^{(s)}(c) > 6.81\right] &\leq 0.01, \\
\limsup_{n\to\infty} \Pr\left[\widehat{T}_n^{(s)}(c)/\widehat{R}_n^{(s)}(c) > 4.26\right] &\leq 0.05, \\
\limsup_{n\to\infty} \Pr\left[\widehat{T}_n^{(s)}(c)/\widehat{R}_n^{(s)}(c) > 3.23\right] &\leq 0.10.
\end{aligned}$$

7. Concluding remarks

In the mathematical literature the Hilbert-Schmidt and Mercer theorems are nowadays usually derived as by-products of linear operator theory,[7] which however is way over my head. Therefore, in this addendum to BW I have presented alternative proofs which only require elementary Hilbert space theory and basic knowledge of linear algebra and complex calculus. I do not claim any originality. It seems likely that these proofs have already been done in this way somewhere, but if so I am not aware of any references. The only originality claim I can make is that I figured out these proofs all by myself.
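As a practical footnote to Section 6, the standardizing quantity there has a closed form that is straightforward to compute: 2 minus twice the sample average of the products of sin(c d)/(c d) over the components d of Y_j minus its simulated counterpart. The Python sketch below is an illustration under stated assumptions, not the implementation used in BW. It assumes NumPy, it takes the value of the SICM statistic from equation (31) in BW as a given input `t_hat` because that formula is not reproduced here, and the names `r_hat` and `rejects_at_upper_bound` are hypothetical.

```python
import numpy as np

# Upper bounds of the asymptotic critical values, taken from (6.1).
UPPER_BOUND_CRITICAL_VALUES = {0.01: 6.81, 0.05: 4.26, 0.10: 3.23}


def r_hat(y, y_tilde, c):
    """Closed form 2 - 2 * mean_j prod_i sin(c * d_ij) / (c * d_ij).

    y and y_tilde hold the observed and simulated Y_j, one row per j,
    and d_ij is component i of Y_j minus its simulated counterpart.
    """
    y = np.asarray(y, dtype=float)
    y_tilde = np.asarray(y_tilde, dtype=float)
    n = y.shape[0]
    d = y.reshape(n, -1) - y_tilde.reshape(n, -1)
    # np.sinc(x) = sin(pi x) / (pi x), so sin(z) / z = np.sinc(z / pi);
    # this also returns 1 at z = 0, the limit of sin(z) / z.
    return 2.0 - 2.0 * float(np.mean(np.prod(np.sinc(c * d / np.pi), axis=1)))


def rejects_at_upper_bound(t_hat, r_hat_value, alpha=0.05):
    """Compare the standardized statistic with the upper bound t(alpha)."""
    return t_hat / r_hat_value > UPPER_BOUND_CRITICAL_VALUES[alpha]
```

Because the critical values are upper bounds, a rejection by this rule is conservative: under the null hypothesis the asymptotic rejection probability is at most α.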
Note that we could have used moment generating functions rather than characteristic functions because all the variables involved are bounded or made bounded. Then we could have used the Hilbert-Schmidt and Mercer theorems in Bierens and Ploberger (1997) and its addendum Bierens (2014b).

Finally, note that recently, in Bierens and Wang (2014), the SICM test has been generalized to parametric conditional distributions of stationary time series models, by combining the weighted ICM testing idea in Bierens (1984) with the approach in the current paper under review.

[7] See for example Krein (1998), among others.

References

Bernkopf, M. (1966): "The Development of Function Spaces with Particular Reference to their Origins in Integral Equation Theory", Archive for History of Exact Sciences 3, 1-96.

Bierens, H. J. (1982): "Consistent Model Specification Tests", Journal of Econometrics 20, 105-134.

Bierens, H. J. (1984): "Model Specification Testing of Time Series Regressions", Journal of Econometrics 26, 323-353.

Bierens, H. J. (1990): "A Consistent Conditional Moment Test of Functional Form", Econometrica 58, 1443-1458.

Bierens, H. J. (2004): Introduction to the Mathematical and Statistical Foundations of Econometrics, Cambridge University Press.

Bierens, H. J. (2014a): "The Hilbert Space Theoretical Foundation of Semi-Nonparametric Modeling", Chapter 1 in: J. Racine, L. Su and A. Ullah (eds), The Oxford Handbook of Applied Nonparametric and Semiparametric Econometrics and Statistics, Oxford University Press.

Bierens, H. J. (2014b): "Addendum to Asymptotic Theory of Integrated Conditional Moment Tests", http://grizzly.econ.psu.edu/~hbierens/ADDENDUM_BP1997.PDF.

Bierens, H. J. (2015): "Addendum to Consistent Model Specification Tests", http://grizzly.econ.psu.edu/~hbierens/ADDENDUM_B1982.PDF.

Bierens, H. J., and W. Ploberger (1997): "Asymptotic Theory of Integrated Conditional Moment Tests", Econometrica 65, 1129-1151.

Bierens, H. J., and L. Wang (2012): "Integrated Conditional Moment Tests for Parametric Conditional Distributions", Econometric Theory 28, 328-362.

Bierens, H. J., and L. Wang (2014): "Weighted Simulated Integrated Conditional Moment Tests for Parametric Conditional Distributions of Stationary Time Series Processes", forthcoming in Econometric Reviews.

Hadinejad-Mahram, H., D. Dahlhaus and D. Blomker (2002): "Karhunen-Loeve Expansion of Vector Random Processes", Technical Report No. IKT-NT 1019, Communications Technological Laboratory, Swiss Federal Institute of Technology, Zurich.

Krein, M. G. (1998): "Compact Linear Operators on Functional Spaces with Two Norms", Integral Equations and Operator Theory 30, 140-162.

Mercer, J. (1909): "Functions of Positive and Negative Type and their Connection with the Theory of Integral Equations", Philosophical Transactions of the Royal Society A 209, 415-446.

Siegmund-Schultze, R. (1986): "Der Beweis des Hilbert-Schmidt-Theorems", Archive for History of Exact Sciences 36, 251-270.