Nonparametric adaptive estimation for pure jump Lévy processes.
Fixed sample step versus high frequency data.
Fabienne Comte(1)
Joint works with V. Genon-Catalot(1)
(1) MAP5, UMR 8145, Université Paris Descartes.
Introduction:
Lévy processes for modelling purposes in many areas: Eberlein and Keller (1995), Barndorff-Nielsen and Shephard (2001), Cont and Tankov (2004).
Bertoin (1996) or Sato (1999): comprehensive studies of these processes.
Distribution of a Lévy process: specified by its characteristic triple (drift, Gaussian component and Lévy measure), rather than by the distribution of its independent increments (intractable).
⇒ The standard parametric approach by likelihood methods is a difficult task.
⇒ Nonparametric methods.
The Lévy measure is interesting to estimate because it specifies the jump behavior.
Nonparametric estimation of the Lévy measure
Recent contributions:
Basawa and Brockwell (1982): non-decreasing Lévy processes, and observations of jumps with size larger than some positive ε, or discrete observations with fixed sampling interval. Nonparametric estimators of a distribution function linked with the Lévy measure.
See also Hoffmann and Krell (2007).
Figueroa-López and Houdré (2006): continuous-time observation of a general Lévy process; study of penalized projection estimators of the Lévy density.
Neumann and Reiss (2009).
Setting: real-valued Lévy processes of pure jump type, i.e. without drift and Gaussian component.
Assumption: the Lévy measure admits a density n(x) on R.
Context: the process is discretely observed with sampling interval ∆.
Let (L_t) denote the underlying Lévy process and
(Z_k^∆ = L_{k∆} − L_{(k−1)∆}, k = 1, . . . , n)
the observed random variables (i.i.d.).
The characteristic function of L_∆ = Z_1^∆ is given by:
ψ_∆(u) = E(exp(iuZ_1^∆)) = exp(∆ ∫_R (e^{iux} − 1) n(x) dx),
where the unknown function is the Lévy density n(x).
ψ′_∆(u) = iE(Z_1^∆ exp(iuZ_1^∆)) = i∆ (∫_R e^{iux} x n(x) dx) ψ_∆(u).   (1)
Thus, if ∫_R |x| n(x) dx < ∞, we get with g(x) = x n(x):
g*(u) = ∫ e^{iux} g(x) dx = −i ψ′_∆(u)/(∆ ψ_∆(u)).   (2)
Nonparametric estimation strategy using empirical estimators of
the characteristic functions and Fourier inversion.
See also Watteel-Kulperger (2003) and Neumann-Reiss (2009).
⇒ Estimate g*(u) by using empirical counterparts of ψ_∆(u) and ψ′_∆(u) = iE(Z_1 e^{iuZ_1}) only.
⇒ Problem of estimating g = deconvolution-type problem.
But the deconvolution problem arising from (2) is not standard:
both the numerator and the denominator are estimated
⇒ deconvolution in the presence of an unknown error density,
which moreover has to be estimated from the same data.
The estimator of ψ_∆(u) is not a simple empirical counterpart, but a truncated version analogous to the one used in Neumann (1997) and Neumann and Reiss (2007).
Technical assumptions up to now:
(H1) ∫_R |x| n(x) dx < ∞.
(H2(p)) For p integer, ∫_R |x|^{p−1} |g(x)| dx < ∞.
(H3) The function g belongs to L²(R).
Our estimation procedure is based on the i.i.d. r.v.
Z_i^∆ = L_{i∆} − L_{(i−1)∆}, i = 1, . . . , n,   (3)
with common characteristic function ψ_∆(u).
Moments of Z_1^∆ linked with g:
Proposition 1. Let p ≥ 1 be an integer. Under (H2(p)), E(|Z_1^∆|^p) < ∞. Moreover, setting, for k = 1, . . . , p, M_k = ∫_R x^{k−1} g(x) dx, we have
E(Z_1^∆) = ∆M_1,
E[(Z_1^∆)²] = ∆M_2 + ∆²M_1²,
and more generally,
E[(Z_1^∆)^l] = ∆M_l + o(∆) for all l = 1, . . . , p.
Control of ψ_∆:
(H4) ∀x ∈ R, c_ψ(1 + x²)^{−∆β/2} ≤ |ψ_∆(x)| ≤ C_ψ(1 + x²)^{−∆β/2},
for some given constants c_ψ, C_ψ and β ≥ 0.
Also considered in Neumann and Reiss (2007).
For the adaptive version of our estimator, we need additional assumptions for g:
(H5) There exists some positive a such that ∫ |g*(x)|² (1 + x²)^a dx < +∞,
and
(H6) ∫ x² g²(x) dx < +∞.
Independent assumptions for ψ∆ and g: there may be no
relation at all between these two functions.
Examples.
Compound Poisson processes.
L_t = Σ_{i=1}^{N_t} Y_i, with Y_i i.i.d. with density f, (Y_i) independent of (N_t), and (N_t) a Poisson process with intensity c.
P(L_∆ = 0) = e^{−c∆},
n(x) = c f(x),
e^{−2c∆} ≤ |ψ_∆(u)| ≤ 1.
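As an illustration (my sketch, not part of the slides), compound Poisson increments are easy to simulate, and the identity P(L_∆ = 0) = e^{−c∆} can be checked empirically; the standard normal jump density f and all numerical values below are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
c, delta, n = 1.0, 0.5, 100_000   # jump intensity, sampling step, sample size

# Number of jumps in each interval of length delta: Poisson(c * delta)
counts = rng.poisson(c * delta, size=n)
# Increment Z_k = sum of counts[k] i.i.d. jumps; here f = N(0, 1) (arbitrary)
Z = np.array([rng.normal(size=k).sum() for k in counts])

# P(L_delta = 0) = exp(-c * delta): compare with the fraction of zero increments
print(np.mean(counts == 0), np.exp(-c * delta))
```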
The Lévy Gamma process.
L_t ∼ Γ(βt, α),
n(x) = βx^{−1} e^{−αx} 1(x > 0),
ψ_∆(u) = (α/(α − iu))^{β∆}.
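The closed form of ψ_∆ above can be checked by Monte Carlo, since Z_1^∆ = L_∆ ∼ Γ(β∆, α). A quick sketch (my illustration; the parameter values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
alpha, beta, delta, n = 2.0, 1.5, 0.5, 200_000   # arbitrary parameters

# Increments of a Levy Gamma process: Gamma with shape beta*delta, rate alpha
Z = rng.gamma(shape=beta * delta, scale=1.0 / alpha, size=n)

u = 1.3                                           # an arbitrary frequency
emp = np.mean(np.exp(1j * u * Z))                 # empirical characteristic function
exact = (alpha / (alpha - 1j * u)) ** (beta * delta)
print(abs(emp - exact))                           # small Monte Carlo error
```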
Bilateral Gamma process. Küchler and Tappe (2008).
L_t = L_t^{(1)} − L_t^{(2)}, with L^{(1)} and L^{(2)} independent and Lévy-Gamma.
Parameters (β′, α′; β, α).
Special case: β′ = β and α′ = α.
Variance-Gamma (Madan and Seneta, 1990).
L_t = W_{Z_t}, (W) Brownian motion independent of Z, and Z Lévy-Gamma.
• Bilateral Gamma. n(x) = x^{−1} g(x) with
g(x) = β′ e^{−α′x} 1(x > 0) − β e^{−α|x|} 1(x < 0).
ψ_∆(u) = (α′/(α′ − iu))^{β′∆} (α/(α + iu))^{β∆}.
Notations
u* the Fourier transform of the function u: u*(y) = ∫ e^{iyx} u(x) dx,
‖u‖² = ∫ |u(x)|² dx,
⟨u, v⟩ = ∫ u(x) v̄(x) dx with z z̄ = |z|².
For any integrable and square-integrable functions u, u1, u2,
(u*)*(x) = 2πu(−x) and ⟨u1, u2⟩ = (2π)^{−1} ⟨u1*, u2*⟩.   (4)
Definition of the estimator.
g*(x) = −i ψ′_∆(x)/(∆ψ_∆(x)) = θ_∆(x)/(∆ψ_∆(x)),   (5)
with
ψ_∆(x) = E(e^{ixZ_1^∆}), θ_∆(x) = −iψ′_∆(x) = E(Z_1^∆ e^{ixZ_1^∆}).
ψ̂_∆(x) = (1/n) Σ_{k=1}^n e^{ixZ_k^∆}, θ̂_∆(x) = (1/n) Σ_{k=1}^n Z_k^∆ e^{ixZ_k^∆}.
Although |ψ_∆(x)| > 0 for all x, this is not true for ψ̂_∆. As in Neumann (1997) and Neumann and Reiss (2007), truncate 1/ψ̂_∆:
1/ψ̃_∆(x) = (1/ψ̂_∆(x)) 1{|ψ̂_∆(x)| > κ_ψ n^{−1/2}}.
ĝ_m(x) = (1/2π) ∫_{−πm}^{πm} e^{−ixu} θ̂_∆(u)/(∆ψ̃_∆(u)) du.   (6)
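A minimal numerical sketch of the estimator (6) (my illustration, not the authors' code), on simulated Lévy-Gamma increments, for which g(x) = βe^{−αx} on x > 0 and g*(u) = β/(α − iu); the truncation constant κ_ψ = 1, the cutoff m and the grid size are arbitrary choices. The estimator is compared with its deterministic target, obtained by applying the same Fourier inversion to the exact g*.

```python
import numpy as np

rng = np.random.default_rng(2)
alpha, beta, delta, n = 2.0, 1.5, 0.5, 20_000    # arbitrary Levy-Gamma setup
Z = rng.gamma(shape=beta * delta, scale=1.0 / alpha, size=n)  # increments Z_k

m = 2                                            # arbitrary frequency cutoff
u = np.linspace(-np.pi * m, np.pi * m, 257)      # grid on [-pi m, pi m]

e = np.exp(1j * u[:, None] * Z[None, :])         # e^{iuZ_k}
psi_hat = e.mean(axis=1)                         # empirical psi_Delta
theta_hat = (e * Z).mean(axis=1)                 # empirical theta_Delta
# truncation of 1/psi_hat: keep only |psi_hat| > kappa_psi * n^{-1/2}
inv_psi = np.where(np.abs(psi_hat) > 1.0 / np.sqrt(n), 1.0 / psi_hat, 0.0)

def fourier_inv(vals, x):
    """(1/2pi) * int e^{-ixu} vals(u) du, trapezoidal rule on the grid u."""
    f = np.exp(-1j * x * u) * vals
    return np.real((f[:-1] + f[1:]).sum() / 2 * (u[1] - u[0])) / (2 * np.pi)

x0 = 1.0
g_hat_m = fourier_inv(theta_hat * inv_psi / delta, x0)   # estimator (6) at x0
g_m = fourier_inv(beta / (alpha - 1j * u), x0)           # its target, from exact g*
print(g_hat_m, g_m)
```

The two printed values agree up to the statistical error; the remaining gap to g(x0) = βe^{−αx0} is the bias of the frequency cutoff m.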
Projection formulation.
ϕ(x) = sin(πx)/(πx) and ϕ_{m,j}(x) = √m ϕ(mx − j).
ϕ*_{m,j}(x) = (e^{ixj/m}/√m) 1_{[−πm,πm]}(x).   (7)
S_m = Span{ϕ_{m,j}, j ∈ Z} = {h ∈ L²(R), supp(h*) ⊂ [−mπ, mπ]}.
{ϕ_{m,j}}_{j∈Z} orthonormal basis.
(S_m)_{m∈M_n} the collection of linear spaces, M_n = {1, . . . , m_n}, and m_n ≤ n is the maximal admissible value of m, subject to constraints to be given later.
Consider g_m the orthogonal projection of g on S_m:
g_m = Σ_{j∈Z} a_{m,j}(g) ϕ_{m,j} with a_{m,j}(g) = ∫_R ϕ_{m,j}(x) g(x) dx = ⟨ϕ_{m,j}, g⟩,
and ⟨ϕ_{m,j}, g⟩ = (1/2π) ⟨ϕ*_{m,j}, g*⟩.
The estimator can be defined by:
ĝ_m = Σ_{j∈Z} â_{m,j} ϕ_{m,j}, with
â_{m,j} = (1/(2πn∆)) Σ_{k=1}^n Z_k^∆ ∫ e^{ixZ_k^∆} ϕ*_{m,j}(−x)/ψ̃_∆(x) dx,
or
â_{m,j} = (1/(2π∆)) ∫ θ̂_∆(x) ϕ*_{m,j}(−x)/ψ̃_∆(x) dx.
Let t ∈ S_m of the collection (S_m)_{m∈M_n}, and define
γ_n(t) = ‖t‖² − (1/(π∆)) (1/n) Σ_{k=1}^n Z_k^∆ ∫ e^{ixZ_k^∆} t*(−x)/ψ̃_∆(x) dx.   (8)
Consider γ_n(t) as an approximation of the theoretical contrast
γ_n^{th}(t) = ‖t‖² − (1/(π∆)) (1/n) Σ_{k=1}^n Z_k^∆ ∫ e^{ixZ_k^∆} t*(−x)/ψ_∆(x) dx,
E(γ_n^{th}(t)) = ‖t‖² − 2⟨g, t⟩ = ‖t − g‖² − ‖g‖², minimal for t = g.
We also have
ĝ_m = Argmin_{t∈S_m} γ_n(t).   (9)
We define (analogy with the deconvolution setting):
Φ_ψ(m) = ∫_{−πm}^{πm} dx/|ψ_∆(x)|².
Proposition 2. Under (H1)-(H2(4))-(H3), for all m:
E(‖g − ĝ_m‖²) ≤ ‖g − g_m‖² + K E^{1/2}[(Z_1^∆)^4] Φ_ψ(m)/(n∆²),   (10)
where K is a constant.
Discussion about the rates. The bias term is ‖g − g_m‖² = ∫_{|x|≥πm} |g*(x)|² dx.
Suppose that g belongs to the Sobolev class
S(a, L) = {f, ∫ |f*(x)|² (x² + 1)^a dx ≤ L}.
Then the bias term satisfies ‖g − g_m‖² = O(m^{−2a}).
Under (H4), the bound of the variance term satisfies
(∫_{−πm}^{πm} dx/|ψ_∆(x)|²)/(n∆) = O(m^{2β∆+1}/(n∆)).
The optimal choice for m is O((n∆)^{1/(2β∆+2a+1)}) and the resulting rate for the risk is (n∆)^{−2a/(2β∆+2a+1)}.
Sampling interval ∆ explicitly appears in the exponent of the
rate.
⇒ for positive β, the rate is worse for large ∆ than for small ∆.
Thus we can state the following corollary of Proposition 2:
Corollary 1. Under assumptions (H1)-(H2(4))-(H3)-(H5), E(‖ĝ_m − g‖²) = O((n∆)^{−2a/(2β∆+2a+1)}) when m = O((n∆)^{1/(2β∆+2a+1)}).
Study of the adaptive estimator
We have to select an adequate value of m.
pen(m) = κ(1 + E[(Z_1^∆)²]/∆) Φ_ψ(m)/(n∆).   (11)
We set
m̂ = arg min_{m∈M_n} {γ_n(ĝ_m) + pen(m)},
and study first the “risk” of ĝ_m̂.
Theorem 1. Assume that assumptions (H1)-(H2(8))-(H3)-(H7) hold. Then
E(‖ĝ_m̂ − g‖²) ≤ C inf_{m∈M_n} (‖g − g_m‖² + pen(m)) + K ln²(n)/(n∆),
where K is a constant.
Assumption on the collection of models M_n = {1, . . . , m_n}, m_n ≤ n:
(H7) ∃ε, 0 < ε < 1, m_n^{2β∆} ≤ Cn^{1−ε},
where C is a fixed constant and β is defined by (H4).
For instance, Assumption (H7) is fulfilled if:
1. pen(m_n) ≤ C. In such a case, we have m_n ≤ C(n∆)^{1/(2β∆+1)}.
2. ∆ is small enough to ensure 2β∆ < 1. Take M_n = {1, . . . , n}.
(H7) is a problem because it depends on the unknown β, while concrete implementation requires the knowledge of m_n.
Analogous to deconvolution with unknown error density.
In the compound Poisson model, β = 0 and nothing is needed.
Otherwise one should at least know whether ψ_∆ is in a class of polynomial decay. Estimator of β (see e.g. Diggle and Hall (1993))?
To get an estimator, we replace the theoretical penalty by:
p̂en(m) = κ′ (1 + (1/n) Σ_{i=1}^n (Z_i^∆)²) (∫_{−πm}^{πm} dx/|ψ̃_∆(x)|²)/(n∆²).
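A sketch of how the data-driven selection could be implemented (my illustration, not the authors' code). For the minimum-contrast estimator one checks that γ_n(ĝ_m) = −‖ĝ_m‖², and ‖ĝ_m‖² can be computed by Parseval from ĝ_m* = θ̂_∆/(∆ψ̃_∆) on [−πm, πm]. The constant κ′ = 2, the Lévy-Gamma test data and the grid of m values are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(3)
alpha, beta, delta, n = 2.0, 1.5, 0.5, 10_000    # arbitrary Levy-Gamma setup
Z = rng.gamma(shape=beta * delta, scale=1.0 / alpha, size=n)
kappa_prime = 2.0                                # arbitrary penalty constant

def penalized_contrast(m, n_grid=257):
    u = np.linspace(-np.pi * m, np.pi * m, n_grid)
    du = u[1] - u[0]
    e = np.exp(1j * u[:, None] * Z[None, :])
    psi_hat = e.mean(axis=1)
    theta_hat = (e * Z).mean(axis=1)
    keep = np.abs(psi_hat) > 1.0 / np.sqrt(n)    # truncation, kappa_psi = 1
    inv_psi2 = np.where(keep, 1.0 / np.abs(psi_hat) ** 2, 0.0)
    # gamma_n(g_hat_m) = -||g_hat_m||^2 (Parseval on [-pi m, pi m])
    norm2 = (np.abs(theta_hat) ** 2 * inv_psi2).sum() * du / (2 * np.pi * delta**2)
    # empirical penalty pen_hat(m)
    pen = kappa_prime * (1 + np.mean(Z**2)) * inv_psi2.sum() * du / (n * delta**2)
    return -norm2 + pen

m_hat = min(range(1, 9), key=penalized_contrast)
print("selected m:", m_hat)
```

The bias term decreases and the penalty grows with m, so the minimizer balances the two, mimicking the oracle trade-off of Theorem 2.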
In that case we can prove:
Theorem 2. Assume that assumptions (H1)-(H2(8))-(H3)-(H7) hold and let g̃ = ĝ_m̃ be the estimator defined with
m̃ = arg min_{m∈M_n} (γ_n(ĝ_m) + p̂en(m)). Then
E(‖g̃ − g‖²) ≤ C inf_{m∈M_n} (‖g − g_m‖² + pen(m)) + K′_∆ ln²(n)/n,
where K′_∆ is a constant depending on ∆ (and on fixed quantities but not on n).
If g belongs to the Sobolev ball S(a, L), and under (H4), the rate is
automatically of order O((n∆)−2a/(2β∆+2a+1) ).
High frequency context: ∆ small, n∆ large,
ψ_∆(u) = 1 + O(∆).
Define
ğ_m = arg min_{t∈S_m} γ̄_n(t)
where
γ̄_n(t) = ‖t‖² − (1/(πn∆)) Σ_{k=1}^n Z_k ∫ exp(iuZ_k) t*(−u) du.   (12)
Then
E(γ̄_n(t)) = ‖t‖² − 2⟨t, g⟩ − (1/π) ∫ (ψ_∆(u) − 1) g*(u) t*(−u) du.   (13)
As ψ_∆(u) − 1 = O(∆), the last term above is negligible if ∆ is small, and ‖t‖² − 2⟨t, g⟩ = ‖t − g‖² − ‖g‖².
Deterministic residual term:
R_n(t) = (1/2π) ∫ (ψ_∆(u) − 1) g*(u) t*(−u) du.
New risk bound:
Proposition 3. Assume that (H2) holds. Then
E(‖g − ğ_m‖²) ≤ 3‖g − g_m‖² + 2[E(Z_1²)/∆] m/(n∆) + C∆² ∫_{−πm}^{πm} u² |g*(u)|² du.
⇒ ∆ small, n∆ large.
S(a, L) = {g ∈ (L¹ ∩ L²)(R), ∫ (1 + u²)^a |g*(u)|² du ≤ L}.
‖g − g_m‖² = ∫_{|u|≥πm} |g*(u)|² du ≤ L(πm)^{−2a}.
Proposition 4. Assume that (H2) holds and that M_2 < +∞. Then if g belongs to S(a, L) and if n → +∞, ∆ → 0 and n∆² ≤ 1, we have
E(‖g − ğ_m‖²) = O((n∆)^{−2a/(2a+1)}).
If a ≥ 1, then the constraint n∆² ≤ 1 becomes n∆³ = o(1).
m̆ = arg min_{m∈M_n} (γ̄_n(ğ_m) + p̆en(m))
with
p̆en(m) = κ ((1/(n∆)) Σ_{i=1}^n Z_i²) m/(n∆).
Denote by
pen^{th}(m) = E(p̆en(m)) = κ(E(Z_1²)/∆) m/(n∆).
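In the high frequency regime no division by an estimated ψ_∆ is needed: ğ_m* = θ̂_∆/∆ on [−πm, πm], so that γ̄_n(ğ_m) = −‖ğ_m‖². A selection sketch built on this observation (my illustration; κ and all numerical values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
alpha, beta, delta, n = 2.0, 1.5, 0.05, 20_000   # small Delta, n*Delta = 1000
Z = rng.gamma(shape=beta * delta, scale=1.0 / alpha, size=n)
kappa = 2.0                                       # arbitrary penalty constant

def crit(m, n_grid=129):
    u = np.linspace(-np.pi * m, np.pi * m, n_grid)
    du = u[1] - u[0]
    theta_hat = (Z[None, :] * np.exp(1j * u[:, None] * Z[None, :])).mean(axis=1)
    # gamma_bar_n(g_breve_m) = -||g_breve_m||^2, with g_breve_m^* = theta_hat/Delta
    norm2 = (np.abs(theta_hat / delta) ** 2).sum() * du / (2 * np.pi)
    pen = kappa * (np.sum(Z**2) / (n * delta)) * m / (n * delta)
    return -norm2 + pen

m_breve = min(range(1, 9), key=crit)
print("selected m:", m_breve)
```

Note the simpler penalty shape m/(n∆): without deconvolution the variance grows only linearly in m.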
Theorem 3. Assume that (H1)-(H2(8))-(H3) are fulfilled, that n is large and ∆ is small, and that n∆ tends to infinity when n tends to infinity. Then
E(‖g − ğ_m̆‖²) ≤ C inf_{m∈M_n} (‖g − g_m‖² + pen^{th}(m)) + (C∆²/2π) ∫_{−πm_n}^{πm_n} u² |g*(u)|² du.