Hawkes processes and application in neuroscience and genomics

Patricia Reynaud-Bouret
CNRS, Université de Nice Sophia Antipolis

Journées Aussois 2015, Modélisation Mathématique et Biodiversité
Table of Contents
1 Point process and Counting process
2 Multivariate Hawkes processes and Lasso
3 Probabilistic ingredients
4 Back to Lasso
5 PDE and point processes
Point process and Counting process

Table of Contents
1 Point process and Counting process
  Introduction
  Poisson process
  More general counting process and conditional intensity
  Classical statistics for counting processes
2 Multivariate Hawkes processes and Lasso
3 Probabilistic ingredients
4 Back to Lasso
5 PDE and point processes
Point process and Counting process
Introduction

Definition
Point process
N = random countable set of points of X (usually R).
N_A = number of points of N in A;  dN_t = Σ_{T point of N} δ_T.
∫ f(t) dN_t = Σ_{T∈N} f(T).
If X = R_+ and there is no accumulation, the counting process is N_t = N_{[0,t]}.
Point process and Counting process
Introduction

Examples
disintegration of radioactive atoms
date / duration / position of phone calls
date / position / magnitude of earthquakes
position / size / age of trees in a forest
discovery time / position / size of petroleum fields
breakdowns
time of the event "a bee came out of its hive", "a monkey climbed in the tree", "an agent buys a financial asset", "someone clicks on the website" ...

Everything has been recorded, and the quantities are probably linked via a model. There is no reason at all for them to be identically distributed and/or independent (not i.i.d.).
Point process and Counting process
Introduction

Life times
Historically, the first clear statistical records start with Graunt in 1662 and his life table.
NB: at the same time, Pascal and Euler were starting Probability ... Halley, Huygens and many others also recorded life tables.
"Why are people dying?" → record of the times of death in London at that time.

Even if the death times are considered i.i.d., the interesting quantity is the hazard rate q(t), with
    q(t)dt ≃ P(die just after t | alive at t).
If f is the density, q(t) = f(t)/S(t), with the survival function
    S(t) = ∫_t^{+∞} f(s)ds.

q constant: one does not "age", no memory → exponential variable
q increasing: better to be young than old
q decreasing: better to be old than young
[Figure: classical shape of q for human mortality as a function of age, increasing when old (aging).]

In fact the data are not even clearly i.i.d.: people may move out before the disease strikes, may contaminate each other, etc. → what is the interesting quantity?
Point process and Counting process
Introduction

Neuroscience and neuronal unitary activity
[Figure]

Neuronal data and Unitary Events
[Figure]

Genomics and Transcription Regulatory Elements
[Figure]
Point process and Counting process
Poisson process

Definition
Poisson processes
for all integers n, for all A_1, ..., A_n disjoint measurable subsets of X, N_{A_1}, ..., N_{A_n} are independent random variables.
for all measurable subsets A of X, N_A obeys a Poisson law with parameter depending on A and denoted ℓ(A).

If X = R and ℓ([0, t]) = λt, then
    P(N_t = 0) = P(T_1 > t) = e^{−λt}
(since N_t follows a Poisson distribution with parameter λt)
→ the first time is an exponential variable.
All the "intervals" between consecutive points are i.i.d. and exponentially distributed.
No memory ⇐⇒ independence from the past.
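As a concrete illustration of this characterization, here is a minimal simulation sketch (assuming Python with numpy, which the slides do not prescribe): the points are obtained by cumulating i.i.d. exponential inter-arrival times.

import numpy as np

def homogeneous_poisson(lam, T, rng=np.random.default_rng(0)):
    """Points of a Poisson process with constant intensity lam on [0, T]."""
    points, t = [], 0.0
    while True:
        t += rng.exponential(1.0 / lam)   # i.i.d. Exp(lam) inter-arrival times
        if t > T:
            return np.array(points)
        points.append(t)

# N_{[0,T]} is then Poisson(lam * T): about lam * T points on average.
print(len(homogeneous_poisson(lam=2.0, T=100.0)))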
Point process and Counting process
Poisson process

If λ is not constant
If ℓ([0, t]) = ∫_0^t λ(s)ds, let N be a Poisson process in R²_+.
[Figure: the number of points of N below the graph of λ(·) over [0, t] is Poisson, with parameter the corresponding area ∫_0^t λ(s)ds; projected onto the time axis, these points form a Poisson process with measure ℓ.]
If λ(·) is bounded by M, one can simulate N on [0, T] × [0, M]: the points below the curve are accepted, the points above are rejected.
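A sketch of this acceptance/rejection construction (Python with numpy assumed; the intensity λ(t) = 1 + sin t, bounded by M = 2, is an illustrative choice):

import numpy as np

def inhomogeneous_poisson(lam_fun, M, T, rng=np.random.default_rng(1)):
    """Keep the points of a planar simulation that lie below the curve."""
    n = rng.poisson(M * T)                    # points of N in [0,T] x [0,M]
    times = np.sort(rng.uniform(0.0, T, n))
    heights = rng.uniform(0.0, M, n)
    return times[heights < lam_fun(times)]    # accepted: below the graph

lam = lambda t: 1.0 + np.sin(t)               # bounded by M = 2
pts = inhomogeneous_poisson(lam, M=2.0, T=50.0)
print(len(pts))    # on average ∫_0^50 (1 + sin s) ds ≈ 50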
Point process and Counting process
More general counting process and conditional intensity

Conditional intensity
We need the same sort of interpretation as the hazard rate, but given "the Past"!
    dN_t = λ(t) dt + noise,
where dN_t is the number of observed points in [t, t + dt], λ(t)dt (the intensity term) is the expected number given the past before t, and the noise is a martingale difference.
λ(t) = instantaneous frequency
     = random, depends on the Past (previous points ...)
     = characterizes the distribution when it exists.
NB: if λ(t) is deterministic → Poisson process.
Point process and Counting process
More general counting process and conditional intensity

Thinning
Let N be a Poisson process in R²_+.
[Figure: the points of N lying below the random curve t ↦ λ(t) are kept; they form a process with conditional intensity λ(t).]
λ(t) is predictable, i.e. it depends on the past only, and can be drawn from left to right as one discovers one point at a time. It is usually "càg", for "continu à gauche" (left continuous).
For simulation (Ogata's thinning), λ(t) must remain bounded as long as no new point appears.
Point process and Counting process
More general counting process and conditional intensity

One life time
N = {T}, with T of hazard rate q. Then
    P(T > t) = exp(−∫_0^t q(s)ds) = S(t),
and the conditional intensity is
    λ(t) = q(t) 1_{T≥t}.
Point process and Counting process
More general counting process and conditional intensity

Poissonian interaction
Parents: the U_i are either i.i.d. (uniform) or a (homogeneous) Poisson process.
Each parent gives birth according to a Poisson process of intensity h originating at U_i.
What is the process of the children? Its intensity is
    λ(t) = Σ_{i : U_i < t} h(t − U_i).
With some orphans appearing at rate ν?
    λ(t) = ν + Σ_{i : U_i < t} h(t − U_i).
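Because the children do not themselves reproduce here, the process can be simulated directly from its parent/children description. A minimal sketch (Python with numpy assumed; the kernel h = 0.4·1_{[0,2]}, the rates and the horizon are illustrative choices, not from the slides):

import numpy as np

rng = np.random.default_rng(2)
T, parent_rate, nu = 100.0, 0.5, 0.2
h_mass, h_support = 0.8, 2.0          # h = 0.4 on [0, 2], so ∫h = 0.8

parents = rng.uniform(0.0, T, rng.poisson(parent_rate * T))
orphans = rng.uniform(0.0, T, rng.poisson(nu * T))

children = []
for u in parents:
    # each parent has Poisson(∫h) children, placed according to h(. - u)
    offsets = rng.uniform(0.0, h_support, rng.poisson(h_mass))
    children.extend(u + offsets)

process = np.sort(np.concatenate((orphans, [c for c in children if c <= T])))
print(len(process))   # realization of the process with λ(t) = ν + Σ h(t - U_i)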
Point process and Counting process
More general counting process and conditional intensity

Hawkes process: linear case
A self-exciting process introduced by Hawkes in the 70's to model earthquakes.
Ancestors appear at rate ν.
Each point gives birth according to h, and so on recursively (aftershocks).
    λ(t) = ν + Σ_{T<t} h(t − T) = ν + ∫_{−∞}^{t} h(t − u) dN_u
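A simulation sketch based on Ogata's thinning from the slides above (Python with numpy assumed; the exponential kernel h(t) = αβe^{−βt}, with ∫h = α, is an illustrative choice: between two points λ only decays, so its current value is a valid bound M):

import numpy as np

def hawkes_thinning(nu, alpha, beta, T, rng=np.random.default_rng(3)):
    """Linear Hawkes process with h(t) = alpha * beta * exp(-beta * t)."""
    points, t, excess = [], 0.0, 0.0       # excess = λ(t) - ν just after t
    while t < T:
        M = nu + excess                    # bound on λ until the next point
        w = rng.exponential(1.0 / M)       # candidate waiting time at rate M
        excess *= np.exp(-beta * w)        # the excitation decays meanwhile
        t += w
        if t < T and rng.uniform() * M <= nu + excess:
            points.append(t)               # accepted: λ jumps by alpha * beta
            excess += alpha * beta
    return np.array(points)

pts = hawkes_thinning(nu=1.0, alpha=0.5, beta=2.0, T=200.0)
print(len(pts) / 200.0)   # empirical rate ≈ ν / (1 - ∫h) = 2 here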
Point process and Counting process
More general counting process and conditional intensity

Hawkes and modelling
models any sort of self-contamination (earthquakes, clicks, etc.)
can be marked: position or magnitude of earthquakes, neuron on which the spike happens (see later)
can therefore also model interaction (see later)
generally, modellers "want" stationarity: if ∫h < 1, the branching procedure ends ⇐⇒ existence of a stationary version.
for genomics and neuroscience, inhibition is needed: it is possible to use
    λ(t) = Φ( ∫_{−∞}^{t} h(t − u) dN_u ),
with Φ 1-Lipschitz and positive, and h of any sign. Stationarity condition: ∫|h| < 1.
In particular Φ = (·)_+, but then the branching structure is lost (?)
Point process and Counting process
Classical statistics for counting processes

Likelihood
Informally, the likelihood of the observation N should give the probability to see (something near) N. Formally, it is somehow the density w.r.t. (here) a Poisson process, but without depending on the intensity of the reference Poisson.
If N = {t_1, ..., t_n}, we want, for N′ of intensity λ(·) observed on [0, T],
    P(N′ has n points, each T_i in [t_i − dt_i, t_i])
    = P(no point in [0, t_1 − dt_1], 1 point in [t_1 − dt_1, t_1], ..., no point in [t_n, T]).
Point process and Counting process
Classical statistics for counting processes

Likelihood (2)
    P(no point in [t_n, T] | past at t_n) = e^{−∫_{t_n}^{T} λ(s)ds}
Point process and Counting process
Classical statistics for counting processes

Likelihood (3)
    P(1 point in [t_n − dt_n, t_n] | past at t_n − dt_n) ≃ λ(t_n) dt_n e^{−∫_{t_n−dt_n}^{t_n} λ(s)ds}
Point process and Counting process
Classical statistics for counting processes

Likelihood (4)
P(no point in [0, t_1 − dt_1], 1 point in [t_1 − dt_1, t_1], ..., no point in [t_n, T])
  = E[ 1_{no point in [0, t_1−dt_1]} ... 1_{1 point in [t_n−dt_n, t_n]} × P(no point in [t_n, T] | past at t_n) ]
  = E[ 1_{no point in [0, t_1−dt_1]} ... 1_{1 point in [t_n−dt_n, t_n]} e^{−∫_{t_n}^{T} λ(s)ds} ]
  = E[ 1_{no point in [0, t_1−dt_1]} ... P(1 point in [t_n−dt_n, t_n] | past at t_n−dt_n) × e^{−∫_{t_n}^{T} λ(s)ds} ]
  ≃ E[ 1_{no point in [0, t_1−dt_1]} ... 1_{no point in [t_{n−1}, t_n−dt_n]} × λ(t_n) dt_n e^{−∫_{t_n−dt_n}^{T} λ(s)ds} ]
Point process and Counting process
Classical statistics for counting processes

Likelihood and log-likelihood
(Pseudo)likelihood
    Π_i λ(t_i) e^{−∫_0^T λ(s)ds}
The dependence on the past is not explicit (e.g. the past before 0 for a Hawkes process).
Log-likelihood
    ℓ_λ(N) = ∫_0^T log λ(s) dN_s − ∫_0^T λ(s) ds
Point process and Counting process
Classical statistics for counting processes

Maximum likelihood estimator
If we have a model λ → λ_θ, θ ∈ R^d, how do we choose the best θ according to the data N observed on [0, T]?
[Figure: intensity candidates against the observed points; a candidate placing high intensity where the points are observed is "likely", the other one much less.]
Then the MLE is θ̂ = argmax_θ ℓ_{λ_θ}(N).
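A sketch of both formulas in Python (numpy/scipy assumed; the exponential kernel of the simulation sketch above is reused so that ∫_0^T log λ dN can be updated point by point):

import numpy as np
from scipy.optimize import minimize

def hawkes_loglik(theta, points, T):
    """ℓ_λ(N) = ∫ log λ(s) dN_s - ∫ λ(s) ds for h(t) = a * b * exp(-b * t)."""
    nu, a, b = theta
    log_term, excess, last = 0.0, 0.0, 0.0
    for t in points:
        excess *= np.exp(-b * (t - last))   # decay since the previous point
        log_term += np.log(nu + excess)     # log λ(t-) at the new point
        excess += a * b                     # jump of λ at the point
        last = t
    integral = nu * T + np.sum(a * (1.0 - np.exp(-b * (T - points))))
    return log_term - integral

T = 500.0
pts = hawkes_thinning(nu=1.0, alpha=0.5, beta=2.0, T=T)   # sketch above
mle = minimize(lambda th: -hawkes_loglik(th, pts, T),
               x0=np.array([0.5, 0.3, 1.0]), bounds=[(1e-4, None)] * 3)
print(mle.x)   # θ̂ should be close to (1.0, 0.5, 2.0)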
Point process and Counting process
Classical statistics for counting processes

Maximum likelihood estimator (2)
If d, the number of unknown parameters, is fixed, one can usually show nice asymptotic properties (in T ...):
consistency (θ̂ → θ)
asymptotic normality
efficiency (smallest asymptotic variance)
(cf. Andersen, Borgan, Gill and Keiding)
In practice, T and d are fixed → if d is large w.r.t. T, i.e. in the non-asymptotic regime: over-fitting.
Point process and Counting process
Classical statistics for counting processes

Goodness-of-fit test
If Λ_t = ∫_0^t λ(s)ds, then
Λ_t is a nondecreasing, predictable process and its (pseudo)inverse exists.
It is the compensator of the counting process N_t, i.e. (N_t − Λ_t)_t is a martingale.
(N_{Λ^{−1}_t})_t is a Poisson process of intensity 1.
→ test that N (observed) has the desired intensity λ.
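This suggests the classical time-rescaling test, sketched below (numpy/scipy assumed): after the change of time Λ, the gaps between rescaled points should be i.i.d. Exp(1), which a Kolmogorov-Smirnov test can check.

import numpy as np
from scipy.stats import kstest

def gof_time_rescaling(points, Lambda):
    """If Λ is the true compensator, Λ(t_1), Λ(t_2), ... is a unit-rate
    Poisson process, so its inter-arrival gaps are i.i.d. Exp(1)."""
    gaps = np.diff(np.concatenate(([0.0], Lambda(points))))
    return kstest(gaps, 'expon')             # H0: gaps ~ Exp(1)

# e.g. test the constant intensity λ = 2 on data simulated at rate 2
pts = homogeneous_poisson(lam=2.0, T=300.0)  # from the earlier sketch
print(gof_time_rescaling(pts, lambda t: 2.0 * t))   # large p-value expected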
Multivariate Hawkes processes and Lasso

Table of Contents
1 Point process and Counting process
2 Multivariate Hawkes processes and Lasso
  A model for neuroscience and genomics
  Parametric estimation
  What can we do against over-fitting? (Adaptation)
  Lasso criterion
  Simulations
  Real data analysis
3 Probabilistic ingredients
4 Back to Lasso
5 PDE and point processes
Multivariate Hawkes processes and Lasso
A model for neuroscience and genomics

Biological framework
[Figures: (a) One neuron; (b) Connected neurons.]
Multivariate Hawkes processes and Lasso
A model for neuroscience and genomics

Synaptic integration: without synchronization
[Animated figure; labels (translated from French): the neuron's excitation threshold; the neuron's resting potential; different dendrites; input signals; action potential.]
Synaptic integration: with synchronization
[Animated figure; labels (translated from French): the neuron's excitation threshold; the neuron's resting potential; different dendrites; input signals; action potential (twice in the final frame).]
Multivariate Hawkes processes and Lasso
A model for neuroscience and genomics

Point processes and genomics
There are several "events" of different types on the DNA that may "work" together in synergy.
Motifs = words in the DNA alphabet {a, c, t, g}.
How can statisticians suggest functional motifs based on the statistical properties of their occurrences?
Unexpected frequency → Markov models (see Reinert, Schbath, Waterman (2000) for a review)
Poor or rich regions → scan statistics (see, for instance, Robin, Daudin (1999) or Stefanov (2003))
If two motifs are part of a common biological process, the spacing between their occurrences (not necessarily consecutive) should be somehow fixed → favored or avoided distances (Gusto, Schbath (2005)); a pairwise study.
Multivariate Hawkes processes and Lasso
A model for neuroscience and genomics

Why real multivariate point processes in genomics?
TRE
Transcription Regulatory Elements = "everything" that may enhance or repress gene expression.
promoters, enhancers, silencers, histone modifications, replication origins on the DNA... They should interact, but how? Can we make a statistical guess?
There are methods (ChIP-chip and ChIP-seq experiments) where, after preprocessing the data, one has access to the (almost exact) positions of several types of TREs at once, and this under different experimental conditions (ENCODE).
On the real line = DNA, if the 3D structure of the DNA is negligible (typically, interaction range between points ≤ 10 kb).
If the real 3D structure matters → point processes on graphs... (??)
Why just DNA? RNA etc...
Multivariate Hawkes processes and Lasso
A model for neuroscience and genomics

Point processes and conditional intensity
    dN_t = λ(t) dt + noise,
where dN_t is the number of observed points in [t, t + dt], λ(t)dt is the expected number given the past before t (λ(t) is the intensity), and the noise is a martingale difference.
λ(t) = instantaneous frequency = random, depends on previous points.
Multivariate Hawkes processes and Lasso
A model for neuroscience and genomics

Local independence graphs
(Didelez (2008))
[Figure: three examples of directed graphs on the nodes 1, 2, 3.]
If one is able to infer a local independence graph, then we may have access to "functional connectivity".
→ this needs a "model".
Multivariate Hawkes processes and Lasso
A model for neuroscience and genomics

Multivariate Hawkes processes
[Animated figure: three spike trains N1, N2, N3 on a common time axis; kernels h_{1→1}, h_{2→1} and h_{3→1} attached to the points of N1, N2 and N3 are summed to form λ^{(1)}.]
    λ^{(1)}(t) = ν + Σ_{T<t, T point of N1} h_{1→1}(t − T) + Σ_{m≠1} Σ_{T<t, T point of Nm} h_{m→1}(t − T)
                 (spontaneous)   (self-interaction)            (interaction)
Multivariate Hawkes processes and Lasso
A model for neuroscience and genomics

More formally
Only excitation (all the h_ℓ^{(r)} are positive): for all r,
    λ^{(r)}(t) = ν_r + Σ_{ℓ=1}^{M} ∫_{−∞}^{t−} h_ℓ^{(r)}(t − u) dN_u^{(ℓ)}.
Branching / cluster representation; a stationary process exists if the spectral radius of the matrix ( ∫ h_ℓ^{(r)}(t)dt )_{r,ℓ} is < 1.
Interaction, for instance (in general any 1-Lipschitz function, Brémaud, Massoulié 1996):
    λ^{(r)}(t) = ( ν_r + Σ_{ℓ=1}^{M} ∫_{−∞}^{t−} h_ℓ^{(r)}(t − u) dN_u^{(ℓ)} )_+.
Exponential (multiplicative shape, but no guarantee of a stationary version ...):
    λ^{(r)}(t) = exp( ν_r + Σ_{ℓ=1}^{M} ∫_{−∞}^{t−} h_ℓ^{(r)}(t − u) dN_u^{(ℓ)} ).
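The excitation condition can be checked numerically; a small sketch (numpy assumed; the matrix of kernel masses is an illustrative example):

import numpy as np

# Entry (r, ℓ) = ∫ h_ℓ^{(r)}(t) dt: mean number of points of N^(r) directly
# triggered by one point of N^(ℓ).
H = np.array([[0.5, 0.0, 0.1],
              [0.3, 0.2, 0.0],
              [0.0, 0.4, 0.1]])

rho = max(abs(np.linalg.eigvals(H)))
print(rho, rho < 1)   # spectral radius < 1 → a stationary version exists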
Multivariate Hawkes processes and Lasso
Parametric estimation

Previous works
Maximum likelihood estimates (Ogata, Vere-Jones etc., mainly for seismology; Chornoboy et al. for neuroscience; Gusto and Schbath for genomics)
Parametric tests for the detection of an edge + maximum likelihood + exponential formula + spline estimation (Carstensen et al., in genomics)
Main problem: not enough spikes! So either over-fitting, because d (the number of parameters) is too large, or bad estimation if d is too small (see later).
Also, MLEs are not that easy to compute (EM algorithm etc.) ≠ least squares if the parametric model is linear.
Multivariate Hawkes processes and Lasso
Parametric estimation

Another parametric method: least squares
Observation on [0, T].
A good estimate of the parameters should correspond to an intensity candidate close to the true one.
Hence one would like to minimize, over η, ∥η − λ∥² = ∫_0^T [η(t) − λ(t)]² dt,
or equivalently −2 ∫_0^T η(t)λ(t)dt + ∫_0^T η(t)² dt.
But dN_t randomly fluctuates around λ(t)dt and is observable.
Hence minimize the least-squares contrast
    γ(η) = −2 ∫_0^T η(t) dN_t + ∫_0^T η(t)² dt,
for a model η = λ_a(t). If multivariate, minimize Σ_m γ_m(η^{(m)}).
Multivariate Hawkes processes and Lasso
Parametric estimation

Least-squares contrast on a toy model
Let [a, b] be an interval of R_+ and take the one-parameter model
    η(t) = Σ_{T∈N} α 1_{a ≤ t−T ≤ b} = α N_{[t−b, t−a]}.
There is only one parameter α → minimizing
    γ(η) = −2α ∫_0^T N_{[t−b,t−a]} dN_t + α² ∫_0^T N_{[t−b,t−a]}² dt
gives
    α̂ = ( ∫_0^T N_{[t−b,t−a]} dN_t ) / ( ∫_0^T N_{[t−b,t−a]}² dt ).
[Animated figure: a window of extent [a, b] slides behind each point.]
The numerator is the number of pairs of points with delay in [a, b].
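A sketch of α̂ computed from a spike train (numpy assumed; the grid approximation of the denominator and the step dt are conveniences of the sketch, for 0 < a < b):

import numpy as np

def alpha_hat(points, a, b, T, dt=1e-3):
    """Least-squares estimate for the toy model η(t) = α N_{[t-b, t-a]}."""
    points = np.sort(points)
    # Numerator: ∫ N_{[t-b,t-a]} dN_t = # pairs of points with delay in [a, b].
    num = sum(np.sum((points >= t - b) & (points <= t - a)) for t in points)
    # Denominator: ∫_0^T N_{[t-b,t-a]}^2 dt, approximated on a fine grid.
    grid = np.arange(0.0, T, dt)
    counts = (np.searchsorted(points, grid - a)
              - np.searchsorted(points, grid - b))
    return num / (np.sum(counts.astype(float) ** 2) * dt)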
Multivariate Hawkes processes and Lasso
Parametric estimation

Full multivariate processes and linear parametrisation
    λ^{(r)}(t) = ν^{(r)} + Σ_{ℓ=1}^{M} Σ_{T<t, T in N^{(ℓ)}} h_ℓ^{(r)}(t − T).
+ piecewise constant model with parameter a.
By linearity,
    λ^{(r)}(t) = (Rc_t)′ a,
with Rc_t the renormalized instantaneous count, given by
    (Rc_t)′ = ( 1, δ^{−1/2} (c_t^{(1)})′, ..., δ^{−1/2} (c_t^{(M)})′ ),
and with c_t^{(ℓ)} the vector of instantaneous counts with delay of N^{(ℓ)}, i.e.
    (c_t^{(ℓ)})′ = ( N^{(ℓ)}_{[t−δ, t)}, ..., N^{(ℓ)}_{[t−Kδ, t−(K−1)δ)} ).
Multivariate Hawkes processes and Lasso
Parametric estimation

A heuristic for the least-squares estimator
Informally, the link between the point process and its intensity can be written as
    dN^{(r)}(t) ≃ (Rc_t)′ a* dt + noise.
Let
    G = ∫_0^T Rc_t (Rc_t)′ dt,
the integrated covariation of the renormalized instantaneous count. Then
    b^{(r)} := ∫_0^T Rc_t dN^{(r)}(t) ≃ G a* + noise,
where in b again lies the number of couples of points with a certain delay (cross-correlogram).
Multivariate Hawkes processes and Lasso
Parametric estimation

Least-squares estimate
    G = ∫_0^T Rc_t (Rc_t)′ dt,    b^{(r)} := ∫_0^T Rc_t dN^{(r)}(t).
Least-squares estimate:
    â^{(r)} = G^{−1} b^{(r)}
→ a simpler formula than maximum likelihood estimators, for similar properties (except efficiency).
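A univariate sketch of the whole pipeline (numpy assumed; the grid discretization of G and the bin parameters K, δ are conveniences of the sketch): build Rc_t with K piecewise-constant bins of width δ, assemble G and b, and solve.

import numpy as np

def least_squares_estimate(points, T, K=8, delta=0.005, dt=1e-3):
    """â = G^{-1} b for λ(t) = ν + Σ_k a_k δ^{-1/2} N_{[t-kδ, t-(k-1)δ)}."""
    points = np.sort(points)

    def Rc(times):
        # renormalized instantaneous count (1, δ^{-1/2} c_t^{(1)}, ...)
        cols = [np.ones_like(times)]
        for k in range(1, K + 1):
            c = (np.searchsorted(points, times - (k - 1) * delta)
                 - np.searchsorted(points, times - k * delta))
            cols.append(c / np.sqrt(delta))
        return np.column_stack(cols)

    grid = np.arange(0.0, T, dt)
    G = Rc(grid).T @ Rc(grid) * dt        # ∫ Rc_t (Rc_t)' dt, grid version
    b = Rc(points).sum(axis=0)            # ∫ Rc_t dN_t = Σ_i Rc_{t_i}
    return np.linalg.solve(G, b)          # â = G^{-1} b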
Multivariate Hawkes processes and Lasso
Parametric estimation

What do we gain w.r.t. the cross-correlogram?
[Figure: a network of 3 neurons with interactions at a 5 ms delay; only 2 interaction functions out of 9 are non-zero.]
[Figure: histograms of pairwise delays (cross-correlograms) over [−0.03, 0.03] for the pairs 1→2, 2→3 and 1→3.]
[Figure: estimated interaction functions over [0, 0.03] for 1→2, 2→3 and 1→3.]
Multivariate Hawkes processes and Lasso
What can we do against over-fitting? (Adaptation)

Adaptive statistics
If no model is known, one usually wants to consider the largest possible model (piecewise constant with hundreds of parameters, etc.)
With MLE or OLS, each parameter estimate has a variance ≃ 1/T.
If there are d parameters, the global variance is ≃ d/T.
Over-fitting!!!!
Multivariate Hawkes processes and Lasso
What can we do against over-fitting? (Adaptation)

Penalty and parameter choices
[Animated figure; labels (translated from French): least-squares contrast vs. model complexity (dimension: 1, 10, 270); penalty; contrast + penalty = penalized criterion.]
Multivariate Hawkes processes and Lasso
What can we do against over fitting ? (Adaptation)
Previous works
log-likelihood penalized by AIC (Akaike Criterion) (Ogata, Vere-Jones,
Gusto and Schbath) : choose the model with d parameters such that
ℓ(λθˆd ) + d
Generally works well if few models and largest dimension fixed
whereas T → ∞
least-squares+ ℓ0 penalty, (RB and Schbath)
γ(λθˆd ) + cˆd
Magill
with cˆ data-driven. Proof that if cˆ > κ, it works even if d grows with
T (moderately, log(T ) with d)
→ possible mathematically to search for the ”zeros”
Problem only for univariate because computational time and memory
size awfully large !
→ convex criterion
P.Reynaud-Bouret
Hawkes
Aussois 2015
45 / 129
Multivariate Hawkes processes and Lasso
What can we do against overfitting? (Adaptation)
Other works

- Maximum likelihood + exponential formula + ℓ1 "group Lasso" penalty (Pillow et al. in neuroscience), but no mathematical proof.
- Thresholding + tests for very particular bivariate models, oracle inequality (Sansonnet).
Multivariate Hawkes processes and Lasso
Lasso criterion
ℓ1 penalty

with N.R. Hansen (Copenhagen) and V. Rivoirard (Dauphine) (2012)

Lasso = Least Absolute Shrinkage and Selection Operator, introduced by Tibshirani (1996): minimizing the quadratic contrast with an ℓ1 penalty is equivalent to minimizing it under a constraint on the ℓ1 norm of a.

The Lasso criterion can be expressed independently for each sub-process by:

Lasso criterion
â^{(r)} = argmin_a { γ_T(λ_a^{(r)}) + pen(a) }
        = argmin_a { −2a′b^{(r)} + a′Ga + 2(d^{(r)})′|a| },
with d′|a| = Σ_k d_k|a_k|.

- The crucial choice is d^{(r)}: it should be data-driven!
- Theoretical validation: be able to state that our choice is the best possible one.
- Practical validation: on simulated Hawkes processes (done), on simulated neuronal networks, on real data (RNRP Paris 6, work in progress)...
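Since the criterion is a convex quadratic plus a weighted ℓ1 term, it can be minimized by cyclic coordinate descent with soft-thresholding. The sketch below is my own illustration of that computation, not the authors' code; b, G and the weight vector d are assumed precomputed as in the slides, and the toy G is arbitrary.

```python
import numpy as np

def soft(z, t):
    """Soft-thresholding: argmin_a (a - z)^2 + 2 t |a|, for t >= 0."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_hawkes(b, G, d, n_iter=200):
    """Minimize -2 a'b + a'Ga + 2 d'|a| by cyclic coordinate descent."""
    a = np.zeros_like(b)
    for _ in range(n_iter):
        for i in range(len(a)):
            # partial residual: b_i minus the contribution of the other coordinates
            r = b[i] - G[i].dot(a) + G[i, i] * a[i]
            a[i] = soft(r, d[i]) / G[i, i]
    return a

# Toy usage with a well-conditioned G (illustration only)
rng = np.random.default_rng(1)
G = np.eye(5) + 0.1 * np.ones((5, 5))
a_true = np.array([2.0, 0.0, 0.0, -1.0, 0.0])
b = G @ a_true
print(lasso_hawkes(b, G, d=0.05 * np.ones(5)))   # sparse, close to a_true
```

Each coordinate update solves the one-dimensional penalized problem exactly, which is why the weighted ℓ1 penalty translates into the per-coordinate thresholds d_i.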
Multivariate Hawkes processes and Lasso
Lasso criterion
Theoretical Validation

Recall that
b^{(r)} = ∫_0^T Rc_t dN^{(r)}(t)   and   G = ∫_0^T Rc_t (Rc_t)′ dt.

Hansen, Rivoirard, RB
If G ≥ cI with c > 0 and if
| ∫_0^T Rc_t (dN^{(r)}(t) − λ^{(r)}(t)dt) | ≤ d^{(r)},   ∀r,
then
Σ_r ||λ^{(r)} − Rc_t′â^{(r)}||² ≤ □ inf_a { Σ_r ||λ^{(r)} − Rc_t′a||² + (1/c) Σ_{i∈supp(a)} (d_i^{(r)})² },
where □ denotes a positive absolute constant.
Multivariate Hawkes processes and Lasso
Lasso criterion
Small proof (Univariate case)

Let us fix some a and let η_a(t) = Rc_t′a, our candidate intensity.
Since G = ∫_0^T Rc_t(Rc_t)′dt, we have a′Ga = ||η_a||².

||λ − η_â||² = ||λ||² + ||η_â||² − 2∫_0^T η_â(t)λ(t)dt
            = ||λ||² + γ(η_â) + 2∫_0^T η_â(t)(dN_t − λ(t)dt)
            ≤ ||λ||² + γ(η_a) + 2d′(|a| − |â|) + 2∫_0^T η_â(t)(dN_t − λ(t)dt)   [â minimizes the penalized criterion]
            ≤ ||λ − η_a||² + 2d′(|a| − |â|) + 2∫_0^T [η_â(t) − η_a(t)](dN_t − λ(t)dt)
            = ||λ − η_a||² + 2d′(|a| − |â|) + 2[ ∫_0^T Rc_t(dN_t − λ(t)dt) ]′(â − a)
            ≤ ||λ − η_a||² + 2d′(|a| − |â|) + 2d′|a − â|,
using in the last step the assumption | ∫_0^T Rc_t(dN_t − λ(t)dt) | ≤ d coordinatewise.
Multivariate Hawkes processes and Lasso
Lasso criterion
Small proof (Univariate case)(2)

||λ − η_â||² ≤ ||λ − η_a||² + 2d′(|a| − |â|) + 2d′|a − â|
            ≤ ||λ − η_a||² + 4 Σ_{i∈supp(a)} d_i|a_i − â_i|     [outside supp(a), a_i = 0 and −|â_i| + |â_i| = 0]
            ≤ ||λ − η_a||² + 4||a − â|| ( Σ_{i∈supp(a)} d_i² )^{1/2}     [Cauchy-Schwarz]

But
||a − â||² ≤ (1/c)(a − â)′G(a − â) = (1/c)||η_a − η_â||² ≤ (2/c)( ||η_a − λ||² + ||λ − η_â||² ).

Hence for all α > 0,
||λ − η_â||² ≤ ||λ − η_a||² + α( ||η_a − λ||² + ||λ − η_â||² ) + (8/(αc)) Σ_{i∈supp(a)} d_i²,
and taking α small enough gives the announced oracle inequality.
Multivariate Hawkes processes and Lasso
Lasso criterion
What did we prove? (Univariate)

Recall that
b = ∫_0^T Rc_t dN(t)   and   G = ∫_0^T Rc_t(Rc_t)′dt.

Hansen, Rivoirard, RB
If G ≥ cI with c > 0 and if
| ∫_0^T Rc_t(dN(t) − λ(t)dt) | ≤ d,
then
||λ − Rc_t′â||² ≤ □ inf_a { ||λ − Rc_t′a||² + (1/c) Σ_{i∈supp(a)} d_i² }.
Multivariate Hawkes processes and Lasso
Lasso criterion
Known support

If there is a true parameter a*, then λ(t) = Rc_t′a* and
b̄ = ∫_0^T Rc_t λ(t)dt = Ga*.

If the support S of a* is known, the least-squares estimate on S gives â_S, and
||λ − Rc_t′â_S||² = (a* − â_S)′G(a* − â_S) = (b − b̄)′G^{−1}(b − b̄) ≤ (1/c)||b − b̄||².
Multivariate Hawkes processes and Lasso
Lasso criterion
Oracle inequality

Lasso criterion
â^{(r)} = argmin_a { γ_T(λ_a^{(r)}) + pen(a) } = argmin_a { −2a′b^{(r)} + a′Ga + 2(d^{(r)})′|a| }

Hansen, Rivoirard, RB
If G ≥ cI with c > 0 and if |b^{(r)} − b̄^{(r)}| ≤ d^{(r)} for all r, then
Σ_r ||λ^{(r)} − Rc_t′â^{(r)}||² ≤ □ inf_a { Σ_r ||λ^{(r)} − Rc_t′a||² + (1/c) Σ_{i∈supp(a)} (d_i^{(r)})² }.
Multivariate Hawkes processes and Lasso
Lasso criterion
Comments

- If we can find d sharp and data-driven, then this is an oracle inequality!!: only an oracle could know the support in advance, and up to a constant we do as well as the oracle.
- We do not pay anything here for the size, except that G must be invertible: it is a "cheap" oracle inequality... In fact it could be derived under more relaxed assumptions...
- If c is large (and we can observe it!), we can be quite confident of a good reconstruction: c is a quality indicator.
- True even if the linear Hawkes model is not true!!! We then just pay the quality of approximation: loss = quality of approximation + unavoidable price of estimation.
Multivariate Hawkes processes and Lasso
Lasso criterion
Mathematical "debts"

- Works for other models (anything linear), for other bases and dictionaries...
- Choice of d → martingale calculus.
- G invertible in some cases??? → branching structure.
Multivariate Hawkes processes and Lasso
Simulations
Simulation study - Estimation (T = 20, M = 8, K = 8)

[Figure: interactions reconstructed with 'Adaptive Lasso', 'Bernstein Lasso' and 'Bernstein Lasso+OLS'; above the graphs, estimation of the spontaneous rates.]
Multivariate Hawkes processes and Lasso
Simulations
Another example with inhibition

[Figure: reconstructed interaction functions for an example with inhibition.]
Multivariate Hawkes processes and Lasso
Real data analysis
On real (genomic) data

[Figure: estimated interaction functions plotted against distance in bps (0 to 5000), with estimated spontaneous rates Spont = 0.000385 (top row) and Spont = 9.9e−05 (bottom row).]

4290 genes and 1036 tataat of E. coli (T = 9288442, A = 10000)
Multivariate Hawkes processes and Lasso
Real data analysis
Sensory-motor task
(F. Grammont (Nice), A. Riehle (Marseille))

[Figure: illustration of the sensory-motor task.]
Multivariate Hawkes processes and Lasso
Real data analysis
On neuronal data (sensorimotor task)

[Figure: estimated interaction functions in Hz against the delay in seconds (0 to 0.04); panels labeled SB = 22.2, SBO = 37.2 and SB = 55.5, SBO = 88.9.]

30 trials: monkey trained to touch the correct target when illuminated. The test of the Hawkes hypothesis is accepted. Work with F. Grammont and V. Rivoirard.
Multivariate Hawkes processes and Lasso
Real data analysis
Tests

with C. Tuleau-Malot, F. Grammont (Nice) and V. Rivoirard (Dauphine) (2013)

- It is possible to test whether a process is a Hawkes process with prescribed interaction functions by using the time-rescaling theorem (known since Ogata).
- By using subsampling, it is possible to plug in an estimate of the functions and still test with a controlled asymptotic level.
- On both previous data sets, the Hawkes hypothesis is accepted (p-values depend on the sub-sample, usually between 20 and 80%), whereas the homogeneous Poisson hypothesis (i.e. no interactions) is rejected (p-values between 10^{−16} and 10^{−4} for the neuronal data).
Multivariate Hawkes processes and Lasso
Real data analysis
Functional Connectivity?
(Equipe RNRP de Paris 6)

[Figure]

Multivariate Hawkes processes and Lasso
Real data analysis
Tetrode data on the rat
(Equipe RNRP de Paris 6)

[Figure]
Multivariate Hawkes processes and Lasso
Real data analysis
On neuronal data (vibrissa excitation)

Joint work with RNRP (Paris 6). Behavior: vibrissa excitation at low frequency. T = 90.5, M = 10.

Neuron   Spikes
TT 102   9191
TT 201   99
TT 202   544
TT 204   149
TT 301   15
TT 303   18
TT 401   136
TT 402   282
TT 403   8
TT 404   6

[Figure: spike raster of the ten neurons and estimated interaction functions in Hz against the delay in seconds (0 to 0.05) for TT1_02 → TT1_02, TT2_02 → TT1_02, TT1_02 → TT2_02, TT1_02 → TT2_04, TT1_02 → TT4_02; Behavior 1, k = 10, delta = 0.005, gamma = 1.]
Multivariate Hawkes processes and Lasso
Real data analysis
Application on real data

Data:                      Simulation:
Neuron   Spikes            Neuron   Spikes
TT 102   9191              TT 102   9327
TT 201   99                TT 201   92
TT 202   544               TT 202   585
TT 204   149               TT 204   148
TT 301   15                TT 301   13
TT 303   18                TT 303   23
TT 401   136               TT 401   133
TT 402   282               TT 402   271
TT 403   8                 TT 403   8
TT 404   6                 TT 404   3

[Figure: spike raster and estimated interaction functions in Hz, time in seconds (0.00 to 0.05), for TT1_02 → TT1_02, TT2_02 → TT1_02, TT1_02 → TT2_02.]
Multivariate Hawkes processes and Lasso
Real data analysis
Evolution of the dependence graph as a function of the vibrissa excitation

[Figure: estimated dependence graphs on the neurons TT1_01, TT1_02, TT2_03, TT2_0F, TT2_14, TT3_04, TT4_05, TT4_0A, TT4_18, for excitation frequencies 10, 25, 50, 75, 100, 125, 150, 175 and 200.]
Probabilistic ingredients
Table of Contents

1 Point process and Counting process
2 Multivariate Hawkes processes and Lasso
3 Probabilistic ingredients
  Exponential inequalities (Concentration of measure)
  Controls via a branching structure
4 Back to Lasso
5 PDE and point processes
Probabilistic ingredients
Exponential inequalities (Concentration of measure)
Table of Contents

1 Point process and Counting process
2 Multivariate Hawkes processes and Lasso
3 Probabilistic ingredients
  Exponential inequalities (Concentration of measure)
    The Poisson case
    The martingale case
    Back to Lasso
  Controls via a branching structure
4 Back to Lasso
5 PDE and point processes
Probabilistic ingredients
Exponential inequalities (Concentration of measure)
Probability generating functional and Laplace transform

p.g.fl
For any point process N, the functional which associates to any positive function h
E( ∏_{x∈N} h(x) ).

Laplace functional
For any point process N, the functional which associates to any function f
E[ exp( ∫ f(x) dN_x ) ].

Equivalence with f = log(h).
Probabilistic ingredients
Exponential inequalities (Concentration of measure)
Campbell's theorem

Let N be a Poisson process with mean measure µ:
- for all integers n and all disjoint measurable subsets A_1, ..., A_n of X, the variables N_{A_1}, ..., N_{A_n} are independent;
- for all measurable subsets A of X, N_A obeys a Poisson law with a parameter depending on A and denoted µ(A).
Typically, dµ_t = λ(t)dt with λ deterministic.

Campbell's theorem
If f is such that ∫ min(|f(x)|, 1) dµ_x < ∞, then
E[ exp( ∫ f(x)dN_x ) ] = exp( ∫ (e^{f(x)} − 1) dµ_x ).
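Campbell's formula can be checked by a quick Monte Carlo experiment; the sketch below is my own illustration, with an arbitrary bounded f and a homogeneous Poisson process of rate λ on [0, 1].

```python
import numpy as np

rng = np.random.default_rng(2)
lam, T, n_mc = 3.0, 1.0, 100_000
f = lambda x: -0.5 * np.sin(2 * np.pi * x)   # any bounded test function

# Left-hand side: E[exp(int f dN)] over many Poisson realizations
vals = np.empty(n_mc)
for m in range(n_mc):
    pts = rng.uniform(0.0, T, size=rng.poisson(lam * T))
    vals[m] = np.exp(f(pts).sum())
lhs = vals.mean()

# Right-hand side: exp(int (e^f - 1) dmu) with dmu = lam dx, by a fine Riemann sum
x = np.linspace(0.0, T, 10_001)
rhs = np.exp(((np.exp(f(x)) - 1.0) * lam).mean() * T)
print(lhs, rhs)   # the two numbers should agree up to Monte Carlo error
```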
Probabilistic ingredients
Exponential inequalities (Concentration of measure)
Small proof

Take f piecewise constant, f = Σ_I α_I 1_I. Then

E[ exp( ∫ f(x)dN_x ) ] = E( ∏_I e^{α_I N_I} )
                       = ∏_I E( e^{α_I N_I} )            (by independence)
                       = ∏_I exp( (e^{α_I} − 1) µ_I )     (Laplace transform of a Poisson variable)
                       = exp( ∫ (e^{f(x)} − 1) dµ_x ).
Probabilistic ingredients
Exponential inequalities (Concentration of measure)
Exponential inequality for Poisson process

Aim
For all y > 0,
P( ∫ f(x)[dN_x − λ(x)dx] ≥ ???(y) ) ≤ e^{−y}.
NB: easier this way for interpretation in statistics... (quantile)

Let f be some fixed (deterministic) function and let θ > 0. Apply Campbell's theorem to θf:
E[ exp( ∫ θf(x)[dN_x − λ(x)dx] ) ] = exp( ∫ (e^{θf(x)} − θf(x) − 1) dµ_x ).

Hence, setting E = exp( ∫ θf(x)[dN_x − λ(x)dx] − ∫ (e^{θf(x)} − θf(x) − 1) dµ_x ), we get E(E) = 1.
Probabilistic ingredients
Exponential inequalities (Concentration of measure)
Towards a (weak) Bernstein inequality...

Therefore for all y > 0,
P( ∫ θf(x)[dN_x − λ(x)dx] ≥ ∫ (e^{θf(x)} − θf(x) − 1)dµ_x + y ) = P(E ≥ e^y).

By Markov,
P( ∫ θf(x)[dN_x − λ(x)dx] ≥ ∫ (e^{θf(x)} − θf(x) − 1)dµ_x + y ) ≤ E(E) e^{−y}.

Hence for all θ, y > 0,
P( ∫ f(x)[dN_x − λ(x)dx] ≥ (1/θ) ∫ (e^{θf(x)} − θf(x) − 1)dµ_x + y/θ ) ≤ e^{−y}.

But
e^{θf(x)} − θf(x) − 1 ≤ Σ_{k≥2} (θ^k/k!) f(x)² ||f||_∞^{k−2}
                      ≤ θ²f(x)² / ( 2(1 − θ||f||_∞/3) )     [using k! ≥ 2·3^{k−2}].

Let v = ∫ f(x)² dµ_x.
Probabilistic ingredients
Exponential inequalities (Concentration of measure)
Towards a (weak) Bernstein inequality...(2)

Let v = ∫ f(x)²λ(x)dx. Hence for all θ, y > 0,
P( ∫ f(x)[dN_x − λ(x)dx] ≥ θv/( 2(1 − θ||f||_∞/3) ) + y/θ ) ≤ e^{−y}.

Taking the optimum θ = g(v, ||f||_∞, y) gives:

Exponential inequality "à la" Bernstein
For all y > 0,
P( ∫ f(x)[dN_x − λ(x)dx] ≥ √(2vy) + (||f||_∞/3) y ) ≤ e^{−y}.

NB: v = E[ ( ∫ f(x)[dN_x − λ(x)dx] )² ] = Var( ∫ f(x)dN_x ). Hence a first-order Gaussian tail with variance v.
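The resulting bound is easy to test numerically. Below is a minimal sketch (my own illustration, with an arbitrary f chosen so that ||f||_∞ = 1): the empirical tail probability should stay below e^{−y}.

```python
import numpy as np

rng = np.random.default_rng(3)
lam, T, n_mc = 3.0, 1.0, 100_000
f = lambda x: np.cos(2 * np.pi * x)          # bounded test function, ||f||_inf = 1

grid = np.linspace(0.0, T, 10_001)
v = (f(grid) ** 2 * lam).mean() * T          # v = int f^2 lam dx
mean_f = (f(grid) * lam).mean() * T          # compensator int f lam dx

devs = np.empty(n_mc)
for m in range(n_mc):
    pts = rng.uniform(0.0, T, size=rng.poisson(lam * T))
    devs[m] = f(pts).sum() - mean_f          # int f (dN - lam dx)

for y in (1.0, 2.0, 4.0):
    thresh = np.sqrt(2 * v * y) + y / 3.0    # ||f||_inf / 3 * y with ||f||_inf = 1
    print(y, (devs >= thresh).mean(), "<=", np.exp(-y))
```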
Probabilistic ingredients
Exponential inequalities (Concentration of measure)
Aim

- the same kind of exponential inequality for Hawkes (or other general counting processes);
- f(·) deterministic has to be replaced by a predictable process H_t (e.g. Rc_t);
- the "variance" term v should become a conditional expectation given the past (the compensator ∫ H_s² λ(s)ds);
- optional: v still depends on the unknown λ → it has to be replaced by an observable quantity.
Probabilistic ingredients
Exponential inequalities (Concentration of measure)
Martingale

Let N be a counting process with (predictable) intensity λ, let H_s be any predictable process and t > u. Then
E( H_t dN_t | past at t ) = H_t E( dN_t | past at t ) = H_t λ(t)dt
and
E( ∫_0^t H_s[dN_s − λ(s)ds] | past at u ) = ∫_0^u H_s[dN_s − λ(s)ds]
→ martingale.
Probabilistic ingredients
Exponential inequalities (Concentration of measure)
Bracket

A Stieltjes integration by parts: for any bounded-variation càdlàg functions (i.e. it works for dN_t, λ(t)dt or dN_t − λ(t)dt),
f(t)g(t) = f(0)g(0) + ∫_0^t f(s)dg_s + ∫_0^t g(s⁻)df_s.

Hence, with f(t) = g(t) = ∫_0^t H_s[dN_s − λ(s)ds],
( ∫_0^t H_s[dN_s − λ(s)ds] )² = ∫_0^t ( ∫_0^s H_u[dN_u − λ(u)du] ) H_s[dN_s − λ(s)ds]
                               + ∫_0^t ( ∫_0^{s⁻} H_u[dN_u − λ(u)du] ) H_s[dN_s − λ(s)ds]
                             = ∫_0^t H_s² dN_s + martingale.

→ compensator (≃ variance given the past): V_t = ∫_0^t H_s² λ(s)ds is predictable and
( ∫_0^t H_s[dN_s − λ(s)ds] )² − V_t is a martingale.
Probabilistic ingredients
Exponential inequalities (Concentration of measure)
Exponential martingale

Aim: the martingale equivalent of Campbell's theorem.

E_t = exp( ∫_0^t H_s[dN_s − λ(s)ds] − ∫_0^t (e^{H_s} − H_s − 1)λ(s)ds )
is the unique solution of
E_t = E_0 + ∫_0^t E_{s⁻}(e^{H_s} − 1)[dN_s − λ(s)ds].

Hence a martingale, and E(E_t) = E_0 = 1.
NB: there may be integrability problems, in which case only E(E_t) ≤ 1...
Probabilistic ingredients
Exponential inequalities (Concentration of measure)
Exponential inequality à la Bernstein

Replace H_s by θH_s and the corresponding E_t by E in the Poisson Bernstein proof: for all θ, y > 0,
P( ∫_0^t H_s[dN_s − λ(s)ds] ≥ θV_t/( 2(1 − θ||H||_∞/3) ) + y/θ ) ≤ e^{−y}.

- Assuming ||H||_∞ ≤ b deterministic and known is not a big deal, since it only enters a second-order term,
- but assuming V_t ≤ v and replacing V_t by a deterministic v is not sharp at all!
- However, optimizing in θ requires a deterministic θ (because ultimately E_t depends on θ).
Probabilistic ingredients
Exponential inequalities (Concentration of measure)
Slicing

Let S = {w ≤ V_t ≤ v = (1+ϵ)^K w} and consider the slices
S_k = {(1+ϵ)^k w ≤ V_t ≤ (1+ϵ)^{k+1} w}.

Then
P( ∫_0^t H_s[dN_s − λ(s)ds] ≥ θV_t/( 2(1 − θb/3) ) + y/θ and S_k ) ≤ e^{−y},
so, since V_t ≤ (1+ϵ)^{k+1}w on S_k,
P( ∫_0^t H_s[dN_s − λ(s)ds] ≥ θ(1+ϵ)^{k+1}w/( 2(1 − θb/3) ) + y/θ and S_k ) ≤ e^{−y}.

Here choose θ = g( (1+ϵ)^{k+1}w, b, y ) optimizing as before:
P( ∫_0^t H_s[dN_s − λ(s)ds] ≥ √( 2(1+ϵ)^{k+1}wy ) + by/3 and S_k ) ≤ e^{−y},
hence, since (1+ϵ)^{k+1}w ≤ (1+ϵ)V_t on S_k,
P( ∫_0^t H_s[dN_s − λ(s)ds] ≥ √( 2(1+ϵ)V_t y ) + by/3 and S_k ) ≤ e^{−y}.
Probabilistic ingredients
Exponential inequalities (Concentration of measure)
Almost there

Grouping the slices:
P( ∫_0^t H_s[dN_s − λ(s)ds] ≥ √( 2(1+ϵ)V_t y ) + by/3 and S ) ≤ K e^{−y}.

But V_t = ∫_0^t H_s² λ(s)ds still depends on the unknown λ
→ one more turn, using the fact that
V̂_t = ∫_0^t H_s² dN_s
is observable and V̂_t − V_t is a martingale.
Probabilistic ingredients
Exponential inequalities (Concentration of measure)
Bernstein-type inequality for general counting processes
(Hansen, RB, Rivoirard)

Let (H_s)_{s≥0} be a predictable process and M_t = ∫_0^t H_s(dN_s − λ(s)ds).
Let b > 0 and v > w > 0. For all x, µ > 0 such that µ > φ(µ), let
V̂_τ^µ = ( µ/(µ − φ(µ)) ) ∫_0^τ H_s² dN_s + b²x/(µ − φ(µ)),
where φ(u) = exp(u) − u − 1.
Then for every stopping time τ and every ε > 0,
P( M_τ ≥ √( 2(1+ε)V̂_τ^µ x ) + bx/3, w ≤ V̂_τ^µ ≤ v and sup_{s∈[0,τ]}|H_s| ≤ b ) ≤ 2 ( log(v/w)/log(1+ε) ) e^{−x}.

→ a generic choice of d for the Lasso, whatever the underlying process....
Probabilistic ingredients
Exponential inequalities (Concentration of measure)
Back to Lasso for Hawkes

In the Hawkes case, H_s → Rc_s, renormalized counts in small intervals. Hence there is no absolute v or b.

- If the number of points in small intervals is controlled via an exponential inequality, then one can find v and b such that P( V_t ≥ v or ||H||_∞ ≥ b ) is exponentially small.
- This is used to stop the martingale before V_t reaches the level v (same for b).
- One can always choose w = b²x/(µ − φ(µ)): nice, since one cannot stop the martingale on this side.
Probabilistic ingredients
Exponential inequalities (Concentration of measure)
Lasso criterion

â^{(r)} = argmin_a { γ_T(λ_a^{(r)}) + pen(a) } = argmin_a { −2a′b^{(r)} + a′Ga + 2(d^{(r)})′|a| }

M number of interacting processes, K number of bins, A maximal interaction delay.

Oracle inequality in probability
If d = √( 2(1+ε)V̂_T^µ x ) + bx/3, then with probability larger than
1 − □ M²K log(T)² e^{−x} − P( ∃ t < T, i : N^i_{[t−A,t]} > □ log(T)² ) − P( G ≱ cI ),
Σ_r ||λ^{(r)} − Rc_t′â^{(r)}||² ≤ □ inf_a { Σ_r ||λ^{(r)} − Rc_t′a||² + (1/c) Σ_{i∈supp(a)} (d_i^{(r)})² }.
Probabilistic ingredients
Exponential inequalities (Concentration of measure)
Mathematical debts

- Control of the number of points per interval.
- Control of c, the smallest eigenvalue of G.
- Choice of x.
Probabilistic ingredients
Controls via a branching structure
Table of Contents

1 Point process and Counting process
2 Multivariate Hawkes processes and Lasso
3 Probabilistic ingredients
  Exponential inequalities (Concentration of measure)
  Controls via a branching structure
    Branching structure
    Number of points
    Control of G
4 Back to Lasso
5 PDE and point processes
Probabilistic ingredients
Controls via a branching structure
A branching representation of the linear Hawkes model (univariate)

Linear Hawkes process = branching process on a Poisson process:
- Start = homogeneous Poisson(ν) = ancestors.
- Each point generates children according to a Poisson process of intensity h.
- Again and again until extinction (almost sure when ∫h < 1).
- Hawkes = the final process, without the colors.
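This branching construction is also a simulation algorithm. Below is a minimal sketch for the univariate case, assuming for concreteness the kernel h = α·1_{[0,A]} with ∫h = αA < 1 (the kernel choice and all names are mine). Note that it starts from an empty past at time 0, so it is only approximately stationary near 0 (see the extinction-time slide further on).

```python
import numpy as np

def simulate_hawkes_cluster(nu, alpha, A, T, rng):
    """Univariate linear Hawkes on [0, T] via the branching representation.

    Ancestors: homogeneous Poisson(nu) on [0, T]; each point has a Poisson(alpha*A)
    number of children, uniform on (t, t + A], i.e. offspring intensity alpha*1_[0,A].
    """
    ancestors = rng.uniform(0.0, T, size=rng.poisson(nu * T))
    points, generation = list(ancestors), list(ancestors)
    while generation:                                   # breed until extinction
        children = []
        for t in generation:
            n_kids = rng.poisson(alpha * A)
            children.extend(t + rng.uniform(0.0, A, size=n_kids))
        # discarding children beyond T is exact: their descendants are also beyond T
        generation = [t for t in children if t <= T]
        points.extend(generation)
    return np.sort(np.array(points))

rng = np.random.default_rng(4)
spikes = simulate_hawkes_cluster(nu=1.0, alpha=0.5, A=1.0, T=100.0, rng=rng)
print(len(spikes), "points; stationary mean would be about", 1.0 * 100.0 / (1 - 0.5))
```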
Probabilistic ingredients
Controls via a branching structure
A branching representation of the linear Hawkes model (multivariate)

[Figure: the same construction for interacting processes N_1, N_2 — ancestors for each coordinate, then the first, second and third generations, a point of type m generating children of type ℓ according to a Poisson process of intensity h_ℓ^{(m)}.]
Probabilistic ingredients
Controls via a branching structure
Univariate and multivariate clusters

Cluster process = ancestor in 0 + all its family.

[Figure: a cluster started at 0, of total length H.]

If multivariate, the distribution of the cluster depends on the type of the ancestor.
Probabilistic ingredients
Controls via a branching structure
Number of points of one cluster

- If the ancestor is of type ℓ → expectation E_ℓ.
- K(n): vector of the numbers of points per type in the nth generation.
- If the ancestor is of type ℓ, K(0) = e_ℓ.
- W(n) = Σ_{j=0}^n K(j): number of points in the cluster, per type, up to generation n.

Has W = W(∞) a Laplace transform? I.e., can we find a vector θ with positive coordinates such that
E_ℓ( e^{θ′W(∞)} ) < ∞ ?

If yes,
P_ℓ(N_tot ≥ x) ≤ E_ℓ( e^{min_i(θ_i) N_tot} ) e^{−min_i(θ_i)x} ≤ E_ℓ( e^{θ′W(∞)} ) e^{−min_i(θ_i)x},
which is exponentially small.
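The exponential tail of the total progeny can be observed on a direct simulation of the embedded Galton-Watson process. The sketch below is my own illustration, in the univariate case with Poisson(m) offspring and m < 1; it estimates P(W(∞) ≥ x) empirically.

```python
import numpy as np

rng = np.random.default_rng(5)
m, n_mc = 0.7, 200_000            # subcritical mean offspring number

totals = np.empty(n_mc, dtype=int)
for i in range(n_mc):
    total, generation = 1, 1      # one ancestor
    while generation > 0:
        # size of the next generation: sum of Poisson(m) offspring counts
        generation = rng.poisson(m, size=generation).sum()
        total += generation
    totals[i] = total             # total progeny W(infinity)

for x in (5, 10, 20, 40):
    print(x, (totals >= x).mean())   # the tail decays geometrically in x
```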
Probabilistic ingredients
Controls via a branching structure
Laplace transform for the cluster

φ(θ)′ = (φ_1(θ), ..., φ_M(θ)),   φ_ℓ(θ) = log E_ℓ( e^{θ′K(1)} ).

E_ℓ( e^{θ′W(n)} ) = E_ℓ( e^{θ′W(n−1)} E( e^{θ′K(n)} | generations < n ) )
                  = E_ℓ( e^{θ′W(n−1)} e^{φ(θ)′K(n−1)} )
                  = exp( u_n(θ)_ℓ ),
with u_n(θ) = θ + φ(u_{n−1}(θ)), u_0(θ) = θ.
Probabilistic ingredients
Controls via a branching structure
Laplace transform for the cluster (2)

If φ is a local contraction for a certain norm ||·||: there exist r > 0 and C < 1 such that if ||s|| < r then ||φ(s)|| ≤ C||s||.

If ||θ|| ≤ r(1 − C),
||u_n(θ)|| ≤ ||θ|| + ||φ(u_{n−1}(θ))||
           ≤ ||θ|| + C||u_{n−1}(θ)||
           ≤ ||θ||(1 + C + ... + C^n)
           ≤ ||θ||/(1 − C) ≤ r.

Hence each coordinate of u_n(θ) increases (since W(n) increases) but remains in a compact set. Therefore it converges and, for θ small enough, W(∞) has a Laplace transform.
Probabilistic ingredients
Controls via a branching structure
Contraction?

K(1) has a Laplace transform (the only thing needed ⇐ it is Poisson).
φ_ℓ(θ) = log E_ℓ( e^{θ′K(1)} ).
Hence ∂φ(0) = Γ = ( ∫h_ℓ^{(m)} )_{ℓ,m}, with spectral radius < 1.

Householder's theorem
There exists a norm on R^M such that the associated operator norm of Γ is < 1.

By continuity (φ is infinitely differentiable in a neighborhood of 0),
∃ c ∈ (0,1), ξ > 0 : ∀ ||s|| < ξ, ||∂φ(s)|| ≤ c.
Since φ(0) = 0, it follows that
∃ C ∈ (0,1), r > 0 : ∀ ||s|| < r, ||φ(s)|| ≤ C||s||.
Probabilistic ingredients
Controls via a branching structure
Number of points for Hawkes processes per interval

Assume
- stationarity (spectral radius of Γ < 1),
- h has bounded support.

[Figure: ancestors before 0 on the time axis and the window [−A, 0).]

Let Ñ_{A,ℓ} be the number of points in [−A, 0) whose ancestor is of type ℓ. Then
Ñ_{A,ℓ} ≤ Σ_n Σ_{k=1}^{A_n} max{ W_{n,k} − n + ⌈A⌉, 0 },
where, roughly, A_n counts the type-ℓ ancestors at distance of order n before 0 and W_{n,k} is the total progeny of the kth such ancestor: a distant ancestor can only contribute to [−A, 0) through a long chain of intermediate births.
Probabilistic ingredients
Controls via a branching structure
Number of points for Hawkes processes per interval (2)

Let H_n(θ_ℓ) = E_ℓ( e^{θ_ℓ max{W − n + ⌈A⌉, 0}} ). Then
E( e^{θ_ℓ Ñ_{A,ℓ}} ) ≤ ∏_n E( H_n(θ_ℓ)^{A_n} ) ≤ ∏_n exp( ν_ℓ (H_n(θ_ℓ) − 1) ).

Since W has a Laplace transform, H_n(θ_ℓ) − 1 decreases exponentially with n and the product converges.
→ the Laplace transform of N_{[−A,0)} exists,
→ P( N_{[−A,0)} > y ) is exponentially small...

NB: true as soon as the branching distribution has a Laplace transform; the constant is unknown... and depends on M!!!
Probabilistic ingredients
Controls via a branching structure
The expectation of G

Recall that
η_a(t) = Rc_t′a = ν + Σ_k a_k ∫_{−∞}^t δ^{−1/2} 1_{[t−(k+1)δ, t−kδ]}(u) dN_u
and
a′Ga = ∫_0^T η_a(t)² dt.

Hence
E(G) ≥ cI ⟺ ∀a, E( ∫_0^T η_a(t)²dt ) ≥ c||a||².

- If the process is stationary, E(η_a(t)²) does not depend on t.
- Hence we want to show that c = Tα with E(η_a(0)²) ≥ α||a||².
- Then we need to concentrate (1/T)G around its expectation.
Probabilistic ingredients
Controls via a branching structure
Lower bound for the expectation

If
η_a(t) = ν + Σ_k a_k ∫_{−∞}^t δ^{−1/2} 1_{[t−(k+1)δ, t−kδ]}(u) dN_u
and N is a homogeneous Poisson process, then we can find α such that E(η_a(0)²) ≥ α||a||².

- Use the likelihood and Girsanov's theorem to transfer the lower bound for Poisson to a lower bound for Hawkes....
- Because of the likelihood, we again need a Laplace transform of the number of points...
Probabilistic ingredients
Controls via a branching structure
Clusters and their size (Univariate)

Cluster = ancestor in 0 + all its family.

[Figure: a cluster started at 0, of total length H.]

- Size of the cluster: H ≤ GW × A, where A is the maximal support size of h
- and GW is the total number of births in a Galton-Watson process with Poisson(∫h) offspring.
- Hence P(H > t) is exponentially small.
Probabilistic ingredients
Controls via a branching structure
Extinction time

Extinction time T_e = time from 0 until the last birth, if there is no ancestor after 0.

[Figure: ancestors a before 0 and the extinction time T_e of their descendants.]

- Poisson ancestor process before 0, marked by the size H of the associated cluster.
- An ancestor at time −t contributes no birth after a iff its cluster length satisfies H ≤ a + t; the marked-Poisson computation then gives
P(T_e ≤ a) = exp( −ν ∫_a^{+∞} P(H > t)dt ).
- Hence P(T_e > a) is exponentially small.
→ approximate simulation of a stationary Hawkes process.
For exact simulation, see the work of Møller et al.
Probabilistic ingredients
Controls via a branching structure
Towards independence

[Figure: the interval [0, T] cut into blocks separated by empty regions.]

- Cut [0, T] into almost independent pieces (cf. Berbee's lemma; discrete case: Baraud, Comte and Viennet).
- Control of the total variation distance.
- Up to this error: sum of two groups of independent variables.
- All concentration inequalities for i.i.d. variables apply (Bernstein ...).
Back to Lasso
Table of Contents

1 Point process and Counting process
2 Multivariate Hawkes processes and Lasso
3 Probabilistic ingredients
4 Back to Lasso
  Theory
  Simulations
5 PDE and point processes
Back to Lasso
Theory
Lasso criterion

â^{(r)} = argmin_a { γ_T(λ_a^{(r)}) + pen(a) } = argmin_a { −2a′b^{(r)} + a′Ga + 2(d^{(r)})′|a| }

M number of interacting processes, K number of bins, A maximal interaction delay. If linear stationary Hawkes:

Oracle inequality in probability
If d = √( 2(1+ε)V̂_T^µ x ) + bx/3, then with probability larger than
1 − □ M²K log(T)² e^{−x} − P( ∃ t < T, i : N^i_{[t−A,t]} > □ log(T)² ) − P( G ≱ □ I/T ),
Σ_r ||λ^{(r)} − Rc_t′â^{(r)}||² ≤ □ inf_a { Σ_r ||λ^{(r)} − Rc_t′a||² + (1/T) Σ_{i∈supp(a)} (d_i^{(r)})² }.
Back to Lasso
Theory
Lasso criterion...

â^{(r)} = argmin_a { γ_T(λ_a^{(r)}) + pen(a) } = argmin_a { −2a′b^{(r)} + a′Ga + 2(d^{(r)})′|a| }

M number of interacting processes, K number of bins, A maximal interaction delay. If linear stationary Hawkes and K not too large (K ≤ √T / log(T)³):

Oracle inequality in probability
If d = √( 2(1+ε)V̂_T^µ x ) + bx/3, then with probability larger than
1 − □ M²K log(T)² e^{−x} − □ T^{−1},
Σ_r ||λ^{(r)} − Rc_t′â^{(r)}||² ≤ □ inf_a { Σ_r ||λ^{(r)} − Rc_t′a||² + (1/T) Σ_{i∈supp(a)} (d_i^{(r)})² }.
Back to Lasso
Theory
Choice of x

If x = γ log(T), then with probability larger than
1 − □ M²K log(T)² T^{−γ} − □ T^{−1},
Σ_r ||λ^{(r)} − Rc_t′â^{(r)}||² ≤ □ inf_a { Σ_r ||λ^{(r)} − Rc_t′a||² + (1/T) Σ_{i∈supp(a)} (d_i^{(r)})² }.

- Good renormalization by T → if the true interactions are piecewise constant,
loss on the interaction functions ≤ 0 + □ |a*|_0 log(T)³ / T,
with probability larger than 1 − □ M²K log(T)² T^{−γ} − □ T^{−1}.
- If the distance is bounded (reasonable),
E(loss) ≤ □ |a*|_0 log(T)³ / T + □ M²K log(T)² T^{−γ} + □ T^{−1}.
- Hence γ > 1.
- Good reason to believe (proof in the Poisson case) that γ ∈ [1, □]: a smaller γ changes the rate, a larger γ changes the constant ...
Bernstein-type inequality for general counting processes
(Hansen, RB, Rivoirard)

Let $(H_s)_{s\ge 0}$ be a predictable process and $M_t = \int_0^t H_s\,(dN_s - \lambda(s)\,ds)$. Let $b > 0$ and $v > w > 0$. For all $x, \mu > 0$ such that $\mu > \phi(\mu)$, let

\[
\hat V_\tau^{\mu} = \frac{\mu}{\mu - \phi(\mu)} \int_0^{\tau} H_s^2\, dN_s + \frac{b^2 x}{\mu - \phi(\mu)},
\]

where $\phi(u) = \exp(u) - u - 1$. Then for every stopping time $\tau$ and every $\varepsilon > 0$,

\[
\mathbb P\Bigl( M_\tau \ge \sqrt{2(1+\varepsilon)\,\hat V_\tau^{\mu}\, x} + \frac{bx}{3},\;
 w \le \hat V_\tau^{\mu} \le v \text{ and } \sup_{s\in[0,\tau]} |H_s| \le b \Bigr)
\le 2\, \frac{\log(v/w)}{\log(1+\varepsilon)}\, e^{-x}.
\]

→ a generic choice of d for the Lasso, whatever the underlying process....
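As a sanity check, the bound can be confronted with simulation in the simplest case: a homogeneous Poisson process and a deterministic (hence predictable) H. The sketch below is only indicative — the rate, horizon and test function are arbitrary choices, and the peeling event $w \le \hat V \le v$ is ignored for simplicity.

```python
import numpy as np

# Monte Carlo sanity check of the Bernstein-type bound in the simplest case:
# homogeneous Poisson process, deterministic H_s = cos(s) (so |H| <= b = 1).
rng = np.random.default_rng(0)
lam, tau, b = 2.0, 50.0, 1.0
x, eps, mu = 3.0, 0.5, 1.0
phi = np.exp(mu) - mu - 1.0          # phi(mu) = e^mu - mu - 1, needs mu > phi(mu)

n_runs, exceed = 20000, 0
for _ in range(n_runs):
    pts = rng.uniform(0.0, tau, rng.poisson(lam * tau))  # one Poisson path
    M = np.cos(pts).sum() - lam * np.sin(tau)            # M_tau = ∫H dN - ∫H λ ds
    V = mu / (mu - phi) * (np.cos(pts) ** 2).sum() + b**2 * x / (mu - phi)
    exceed += M >= np.sqrt(2.0 * (1.0 + eps) * V * x) + b * x / 3.0
print("empirical:", exceed / n_runs, " bound scale e^{-x}:", np.exp(-x))
```

The empirical exceedance frequency should sit well below the $e^{-x}$ scale, the bound being conservative by design.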
Practical choice of d

Recall that the oracle inequality needs

\[
\bigl| b^{(r)} - \bar b^{(r)} \bigr| \le d^{(r)}, \quad \forall r,
\]

with

\[
b^{(r)} = \int_0^T R_{ct}\, dN_t^{(r)} \quad\text{and}\quad \bar b^{(r)} = \int_0^T R_{ct}\, \lambda^{(r)}(t)\, dt.
\]

- 'Bernstein Lasso' and 'Bernstein Lasso + OLS':
\[
d_i = \sqrt{2\gamma \log(T)\, \hat V_i} + \frac{B_i\, \gamma \log(T)}{3},
\qquad
\hat V_i = \int_0^T (R_{ct,i})^2\, dN_t^{(r)}, \quad B_i = \sup_{t\in[0,T],\,m} |R_{ct,i}|.
\]

- 'Adaptive Lasso' (Zou):
\[
d_i = \frac{\gamma}{2\,|\hat a_i^{\,ols}|}.
\]
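In practice $\hat V_i$ and $B_i$ can be read off the binned design. The sketch below assumes a piecewise-constant Hawkes design in which coefficient $i = (\ell, k)$ counts the spikes of neuron $\ell$ in the $k$-th bin before the current time; the data format and the function name `bernstein_weights` are hypothetical, and $B_i$ is approximated by a maximum over the observed spike times only.

```python
import numpy as np

def bernstein_weights(spikes, target, delta, K, T, gamma=1.5):
    """Hedged sketch of the 'Bernstein Lasso' weights
       d_i = sqrt(2 gamma log(T) V_i) + B_i gamma log(T) / 3.
    spikes: list of sorted spike-time arrays, one per neuron (hypothetical format).
    target: index r of the neuron whose intensity is regressed.
    delta : bin width; K: number of bins (interaction support = K * delta)."""
    M = len(spikes)
    t_r = spikes[target]
    # R[t, (l, k)] = number of spikes of neuron l in (t-(k+1)delta, t-k delta]
    R = np.zeros((len(t_r), M * K))
    for l in range(M):
        s = spikes[l]
        for k in range(K):
            lo = np.searchsorted(s, t_r - (k + 1) * delta)
            hi = np.searchsorted(s, t_r - k * delta)
            R[:, l * K + k] = hi - lo
    V_hat = (R ** 2).sum(axis=0)             # ∫ R_i^2 dN^{(r)} as a sum over spikes
    B = np.abs(R).max(axis=0, initial=0.0)   # crude stand-in for sup_t |R_{t,i}|
    logT = np.log(T)
    return np.sqrt(2.0 * gamma * logT * V_hat) + B * gamma * logT / 3.0
```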
Why +OLS ?

Lasso = Least Absolute Shrinkage and Selection Operator.

If $G = I$, minimizing $-2a'b + a'Ga + 2d'|a|$ yields a soft-thresholding estimator, i.e.

\[
\hat a_i^{\,Lasso} = \operatorname{sign}(b_i)\,\bigl(|b_i| - d_i\bigr)_+ .
\]

Hence a systematic (small) bias on every selected coefficient → once the support is estimated by the Lasso, compute the ordinary least-squares estimate (OLS) on the resulting support.
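A minimal sketch of this two-step procedure, assuming the Gram matrix G and vector b have already been built; `lasso_plus_ols` is a hypothetical name and, for simplicity, the support is selected by plain soft-thresholding, which is exact only in the G = I case discussed above.

```python
import numpy as np

def soft_threshold(b, d):
    # Lasso solution when G = I: coordinatewise soft-thresholding.
    return np.sign(b) * np.maximum(np.abs(b) - d, 0.0)

def lasso_plus_ols(G, b, d):
    """Sketch of the '+OLS' debiasing step: use the Lasso only to pick the
    support, then solve the unpenalized least-squares problem on that support."""
    S = np.flatnonzero(soft_threshold(b, d))   # estimated support (G = I case)
    a = np.zeros_like(b, dtype=float)
    if S.size:
        # argmin_a -2 a'b + a'Ga restricted to supp(a) ⊂ S  =>  G_SS a_S = b_S
        a[S] = np.linalg.solve(G[np.ix_(S, S)], b[S])
    return a
```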
Choices of the weights $d_\lambda$

[Figure: bar plots of the estimated coefficients (spontaneous rates spont1, spont2 and interactions 1→1, 1→2, 2→1, 2→2) together with the bound $|b - \bar b|$, for dgamma = 1 and dgamma = 1.5.]

NB: Adaptive Lasso (Zou): $d_\lambda = \gamma/(2|\hat a_\lambda|)$.
Simulation study – Support recovery

We perform 100 runs with T = 2, M = 8, K = 4 (264 coefficients to be estimated using 636 observations).

                                         Bernstein Lasso        Adaptive Lasso
Tuning parameter γ                       0.5     1      2       2     200   1000
Correct clusters identif. ∈ [0, 100]       0    32     24       0       0     32
False non-zero interactions ∈ [0, 55]     17     6      1      55      13      1
False zero interactions ∈ [0, 9]           0     0      2       0       0      2
False zero spontaneous rates ∈ [0, 8]      0     0      0       0       1      3
False non-zero coeff. ∈ [0, 238]          22     7      1     199      17      1
False zero coeff. ∈ [0, 18]                1     2      7       0       2      7
Simulation study – Support recovery

We perform 100 runs with T = 20, M = 8, K = 4.

                                         Bernstein Lasso        Adaptive Lasso
Tuning parameter γ                       0.5     1      2       2     200   1000
Correct clusters identif. ∈ [0, 100]      63    99    100       0       0     90
False non-zero interactions ∈ [0, 55]      3     1      0      55      10      0
False zero interactions ∈ [0, 9]           0     0      0       0       0      0
False zero spontaneous rates ∈ [0, 8]      0     0      0       0       0      0
False non-zero coeff. ∈ [0, 238]           4     1      0     197      13      0
False zero coeff. ∈ [0, 18]                0     0      0       0       0      0

For T = 20,
- γ = 1 or γ = 2 is convenient for the Bernstein Lasso. It was also the case for T = 2.
- γ = 1000 is convenient for the Adaptive Lasso. It was not the case for T = 2.
Simulation study – Influence of the OLS step

We perform 100 runs with T = 20, M = 8, K = 4.

                                          Bernstein Lasso     Adaptive Lasso
Tuning parameter γ                        γ = 1    γ = 2      γ = 1000
MSE for spontaneous rates                    37       69            27
MSE for spontaneous rates after OLS          10        9            10
MSE for interaction functions                 3        6           0.5
MSE for interaction functions after OLS     0.5      0.4           0.4

with

\[
\text{MSE for spontaneous rates} = \sum_{m=1}^{M} \bigl(\hat\nu^{(m)} - \nu^{(m)}\bigr)^2,
\qquad
\text{MSE for interaction functions} = \sum_{m=1}^{M} \sum_{\ell=1}^{M} \int \bigl(\hat h_\ell^{(m)}(t) - h_\ell^{(m)}(t)\bigr)^2\, dt.
\]
Another (more realistic ?) neuronal network : Integrate and Fire

[Figure: raster plot of the 10 simulated Integrate-and-Fire neurons over T = 60 s.]

[Figure: estimated interaction graph — number of times each interaction is detected, and mean energy (∫h²) when detected.]
Open questions for the Lasso

- Branching arguments are OK for linear Hawkes processes. Hence the number of points is controlled whenever the intensity is smaller than that of a stationary linear Hawkes process (thinning), in particular for $(\cdot)_+$ and inhibition.
- A lower bound on $\mathbb E(G)$ can be obtained if the intensity is lower bounded and upper bounded by a Hawkes intensity.
- "Berbee's lemma": no idea how to get it without linear Hawkes. Hence what about $(\cdot)_+$ ???
- Still, in simulations it works even when the process is not Hawkes, and γ = 1.5 works!
- What about the group Lasso? (no sharp enough concentration inequality yet)
- What about non-stationarity? Segmentation, clustering (F. Picard, C. Tuleau-Malot).
PDE and point processes

Table of Contents
1 Point process and Counting process
2 Multivariate Hawkes processes and Lasso
3 Probabilistic ingredients
4 Back to Lasso
5 PDE and point processes
   Billions of neurons
   Microscopic PPS?
   Microscopic to Macroscopic
Biological context

[Figure: an action potential — voltage (mV) vs. time (ms): a stimulus triggers depolarization above the −55 mV threshold, a peak near +40 mV, repolarization and a refractory period before the return to the −70 mV resting state; failed initiations stay below threshold.]

Physiological constraint: refractory period.
Age structured equations (Pakdaman, Perthame, Salort, 2010)

Age = delay since the last spike.

n(t, s) = probability density of finding a neuron with age s at time t
        = ratio of the population with age s at time t.

\[
\begin{cases}
\dfrac{\partial n(t,s)}{\partial t} + \dfrac{\partial n(t,s)}{\partial s} + p(s, X(t))\, n(t,s) = 0,\\[2ex]
m(t) := n(t,0) = \displaystyle\int_0^{+\infty} p(s, X(t))\, n(t,s)\, ds,
\end{cases}
\tag{PPS}
\]

$p$ represents the firing rate. For example, $p(s, X) = \mathbf 1_{s > \sigma(X)}$.

\[
X(t) = \int_0^t d(v)\, m(t - v)\, dv \quad \text{(global neural activity)},
\]

$d$ = delay function. For example, $d(v) = e^{-\tau v}$.
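To see the (PPS) dynamics concretely, here is a minimal explicit upwind discretization with dt = ds, so that ageing is an exact shift. Everything specific below — σ(X) = 1/(1 + X), the initial density, the step sizes — is an arbitrary illustrative choice, not taken from the slides; only p(s, X) = 1_{s>σ(X)} and d(v) = e^{−τv} follow the examples above.

```python
import numpy as np

# Explicit upwind scheme for (PPS), with dt = ds so that transport in age
# is an exact shift; ages beyond S_max are truncated.
ds = dt = 2e-3
S_max, T_max, tau = 5.0, 5.0, 1.0
s = np.arange(0.0, S_max, ds)

n = np.exp(-s)
n /= n.sum() * ds                       # initial age density (mass 1)
d_ker = np.exp(-tau * np.arange(0.0, T_max, dt))
m_hist = np.zeros(int(T_max / dt) + 1)

for k in range(int(T_max / dt)):
    X = (d_ker[:k + 1][::-1] * m_hist[:k + 1]).sum() * dt  # X(t) = ∫ d(v) m(t-v) dv
    p = (s > 1.0 / (1.0 + X)).astype(float)                # firing rate p(s, X(t))
    m = (p * n).sum() * ds                                 # m(t) = ∫ p n ds
    n = np.roll(n, 1)                                      # every age grows by ds
    n[1:] *= 1.0 - dt * p[:-1]                             # neurons that fire leave...
    n[0] = m                                               # ...and re-enter at age 0
    m_hist[k + 1] = m

print("mass:", n.sum() * ds, " activity m(T):", m_hist[-1])  # mass ≈ 1 up to truncation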
One neuron = point process → age ?

Age = delay since the last spike.

Microscopic age
- We consider the continuous-to-the-left (hence predictable) version of the age.
- The age at time 0 depends on the spiking times before time 0.
- The dynamics are characterized by the spiking times after time 0.
Framework

$\cdots < T_{-1} < T_0 \le 0 < T_1 < \cdots$ is the ordered sequence of points of $N$.

Dichotomy of the behaviour of $N$ with respect to time 0:
- $N_- = N \cap (-\infty, 0]$ is a point process with distribution $P_0$ (initial condition). The age at time 0 is finite $\Leftrightarrow N_- \neq \emptyset$.
- $N_+ = N \cap (0, +\infty)$ is a point process admitting some intensity $\lambda(t, \mathcal F^N_{t-})$ → "probability to find a new point at time t given the past".

$p(s, X(t))$ and $\lambda(t, \mathcal F^N_{t-})$ are analogous.
Some classical point processes in neuroscience

- Poisson process: $\lambda(t, \mathcal F^N_{t-}) = \lambda(t)$ = deterministic function.
- Renewal process: $\lambda(t, \mathcal F^N_{t-}) = f(S_{t-})$ ⇔ i.i.d. ISIs (inter-spike intervals).
- Hawkes process: $\lambda(t, \mathcal F^N_{t-}) = \mu + \int_{-\infty}^{t-} h(t - v)\, N(dv)$, with the correspondence

\[
\int_{-\infty}^{t-} h(t - v)\, N(dv) \quad \longleftrightarrow \quad \int_0^t d(v)\, m(t - v)\, dv = X(t).
\]
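These three conditional intensities are easy to evaluate from a recorded spike train; the sketch below does exactly that at a single time t. The particular rate function, hazard f and kernel h are arbitrary illustrative choices.

```python
import numpy as np

def poisson_intensity(t, spikes, lam=lambda t: 1.0 + 0.5 * np.sin(t)):
    return lam(t)                                # deterministic, ignores the past

def renewal_intensity(t, spikes, f=lambda s: 2.0 * s / (1.0 + s)):
    past = spikes[spikes < t]
    age = t - past[-1] if past.size else np.inf  # S_{t-} = delay since last spike
    return f(age)                                # hazard rate of the i.i.d. ISIs

def hawkes_intensity(t, spikes, mu=0.5, h=lambda u: np.exp(-2.0 * u)):
    past = spikes[spikes < t]
    return mu + h(t - past).sum()                # mu + ∫ h(t-v) N(dv)

spikes = np.array([0.3, 0.9, 1.4, 2.2])
for f in (poisson_intensity, renewal_intensity, hawkes_intensity):
    print(f.__name__, f(2.5, spikes))
```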
Dynamic = Ogata's thinning

[Figure: points of a unit-rate Poisson process Π in the (t, x) plane; those falling below the curve x = λ(t) are kept, and their abscissae form the point process N.]

- Π is a Poisson process with rate 1: $\Pi(dt, dx) = \sum \delta_X$, $\mathbb E[\Pi(dt, dx)] = dt\, dx$.
- λ is random.
- N admits λ as an intensity.
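Ogata's thinning is also the standard way to simulate a Hawkes process: dominate λ by a constant on each inter-candidate interval and accept candidates with probability λ(t)/λ̄. A minimal sketch for the univariate exponential kernel; the kernel and parameter values are illustrative, and with h(u) = αe^{−βu} the intensity only decays between events, so the current value dominates the future one and the domination is valid.

```python
import numpy as np

def simulate_hawkes_thinning(mu, alpha, beta, T, rng=None):
    """Ogata's thinning for a univariate linear Hawkes process with intensity
    λ(t) = mu + Σ_{T_i < t} alpha * exp(-beta (t - T_i)).
    Requires alpha/beta < 1 for stationarity (subcritical branching)."""
    rng = rng or np.random.default_rng()
    t, points = 0.0, []
    while t < T:
        lam_bar = mu + alpha * sum(np.exp(-beta * (t - s)) for s in points)
        t += rng.exponential(1.0 / lam_bar)     # candidate from dominating Poisson
        if t >= T:
            break
        lam_t = mu + alpha * sum(np.exp(-beta * (t - s)) for s in points)
        if rng.uniform(0.0, lam_bar) <= lam_t:  # accept with prob λ(t)/λ̄ (thinning)
            points.append(t)
    return np.array(points)

pts = simulate_hawkes_thinning(mu=1.0, alpha=0.8, beta=2.0, T=100.0,
                               rng=np.random.default_rng(1))
print(len(pts), "points; expected ≈", 100.0 * 1.0 / (1 - 0.8 / 2.0))
```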
A microscopic analogue of n

$n(t, \cdot)$ is the probability density of the age at time t. At fixed time t, we are looking at a Dirac mass at $S_{t-}$.

What we need
A random measure $U$ on $\mathbb R^2_+$ acting on test functions: $\forall \varphi \in \mathcal C^{\infty}_{c,b}(\mathbb R^2_+)$,

\[
\int \varphi(t, s)\, U(dt, ds) = \int \varphi(t, S_{t-})\, dt.
\]

What we define
We construct an ad hoc random measure $U$ which satisfies a system of stochastic differential equations similar to (PPS).
Microscopic equation (Chevallier, Cáceres, Doumic, RB)

Let $\Pi$ be a Poisson measure and let $\bigl(\lambda(t, \mathcal F^N_{t-})\bigr)_{t>0}$ be a nonnegative predictable process which is $L^1_{loc}$ a.s. The measure $U$ satisfies, a.s. and in the weak sense,

\[
\begin{cases}
(\partial_t + \partial_s)\{U(dt, ds)\} + \Bigl(\displaystyle\int_{x=0}^{\lambda(t, \mathcal F^N_{t-})} \Pi(dt, dx)\Bigr)\, U(t, ds) = 0,\\[2ex]
U(dt, 0) = \displaystyle\int_{s\in\mathbb R} \Bigl(\int_{x=0}^{\lambda(t, \mathcal F^N_{t-})} \Pi(dt, dx)\Bigr)\, U(t, ds),
\end{cases}
\]

with initial condition $\lim_{t\to 0^+} U(t, \cdot) = \delta_{-T_0}$ ($-T_0$ is the age at time 0).

→ $p(s, X(t))$ is replaced by $\displaystyle\int_{x=0}^{\lambda(t, \mathcal F^N_{t-})} \Pi(dt, dx)$, and

\[
\mathbb E\Bigl[ \int_{x=0}^{\lambda(t, \mathcal F^N_{t-})} \Pi(dt, dx) \,\Big|\, \mathcal F^N_{t-} \Bigr] = \lambda\bigl(t, \mathcal F^N_{t-}\bigr)\, dt.
\]
Taking the expectation

[...] defining $u = {}$"$\mathbb E(U)$", $u(t, \cdot)$ = distribution of $S_{t-}$.

Let $\bigl(\lambda(t, \mathcal F^N_{t-})\bigr)_{t>0}$ be a nonnegative predictable process which is $L^1_{loc}$ in expectation and admits a finite mean. The measure $u = {}$"$\mathbb E(U)$" satisfies, in the weak sense,

\[
\begin{cases}
(\partial_t + \partial_s)\, u(dt, ds) + \rho_{\lambda, P_0}(t, s)\, u(dt, ds) = 0,\\[2ex]
u(dt, 0) = \displaystyle\int_{s\in\mathbb R} \rho_{\lambda, P_0}(t, s)\, u(t, ds)\, dt,
\end{cases}
\]

where $\rho_{\lambda, P_0}(t, s) = \mathbb E\bigl[\lambda(t, \mathcal F^N_{t-}) \,\big|\, S_{t-} = s\bigr]$ for almost every t. The initial condition $\lim_{t\to 0^+} u(t, \cdot)$ is given by the distribution of $-T_0$.
Law of large numbers

Population-based approach.

Theorem
Let $(N^i)_{i\ge 1}$ be some i.i.d. point processes on $\mathbb R$ with $L^1_{loc}$ intensity in expectation. For each i, let $(S^i_{t-})_{t>0}$ denote the age process associated with $N^i$. Then, for every test function $\varphi$,

\[
\int \varphi(t, s)\, \Bigl( \frac{1}{n} \sum_{i=1}^{n} \delta_{S^i_{t-}}(ds) \Bigr)\, dt
\;\xrightarrow[n\to\infty]{a.s.}\;
\int \varphi(t, s)\, u(dt, ds),
\]

with $u$ satisfying the deterministic system.
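A quick numerical illustration of this law of large numbers, for i.i.d. homogeneous Poisson processes, where the limit is explicit: started far in the past, the age $S_{t-}$ of a rate-λ Poisson process is Exp(λ), so the empirical age distribution over n copies should approach the density $\lambda e^{-\lambda s}$. All numerical values below are arbitrary choices.

```python
import numpy as np

# LLN illustration: empirical age distribution of n i.i.d. Poisson processes
# vs the explicit limit density lam * exp(-lam * s).
rng = np.random.default_rng(2)
lam, t, t0, n = 2.0, 5.0, -20.0, 20000   # t0 far in the past (initial condition)

ages = np.empty(n)
for i in range(n):
    pts = rng.uniform(t0, t, rng.poisson(lam * (t - t0)))  # spikes of process i
    ages[i] = t - pts.max() if pts.size else t - t0        # S^i_{t-}

edges = np.linspace(0.0, 3.0, 7)
emp, _ = np.histogram(ages, bins=edges, density=True)
mid = 0.5 * (edges[:-1] + edges[1:])
print(np.round(emp, 3))                       # empirical age density
print(np.round(lam * np.exp(-lam * mid), 3))  # limit density, should be close
```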
Review of the examples

The system in expectation:

\[
\begin{cases}
(\partial_t + \partial_s)\, u(dt, ds) + \rho_{\lambda, P_0}(t, s)\, u(dt, ds) = 0,\\[2ex]
u(dt, 0) = \displaystyle\int_{s\in\mathbb R} \rho_{\lambda, P_0}(t, s)\, u(t, ds)\, dt,
\end{cases}
\]

where $\rho_{\lambda, P_0}(t, s) = \mathbb E\bigl[\lambda(t, \mathcal F^N_{t-}) \,\big|\, S_{t-} = s\bigr]$.

- This result may seem OK to a probabilist,
- but analysts need some explicit expression for ρ.
- In particular, this system may seem linear, but it is non-linear in general.

Poisson process → $\rho_{\lambda, P_0}(t, s) = f(t)$.
Renewal process → $\rho_{\lambda, P_0}(t, s) = f(s)$.
Hawkes process → $\rho_{\lambda, P_0}$ is much more complex.
For Hawkes

Recall that

\[
\int_{-\infty}^{t-} h(t - x)\, N(dx) \quad \longleftrightarrow \quad \int_0^t d(x)\, m(t - x)\, dx = X(t).
\]

What we expected
Replacement of $p(s, X(t))$ by

\[
\mathbb E\bigl[\lambda(t, \mathcal F^N_{t-})\bigr] = \mu + \int_0^t h(t - x)\, u(dx, 0) \quad \longleftrightarrow \quad X(t).
\]

What we find
$p(s, X(t))$ is replaced by $\rho_{\lambda, P_0}(t, s)$, which is the conditional expectation, not the full expectation.

Technical difficulty
$\rho_{\lambda, P_0}(t, s) = \mathbb E\bigl[\lambda(t, \mathcal F^N_{t-}) \,\big|\, S_{t-} = s\bigr]$ is not so easy to compute.
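Even when ρ has no simple closed form, it can at least be approximated by brute force: simulate many independent Hawkes paths by thinning, bin them by the age at time t, and average λ(t) within each age bin. A hedged sketch, with an exponential kernel and illustrative parameter values:

```python
import numpy as np

# Monte Carlo sketch of rho(t, s) = E[ lambda(t, F_{t-}) | S_{t-} = s ]
# for a linear Hawkes process with kernel alpha * exp(-beta * u).
rng = np.random.default_rng(3)
mu, alpha, beta, t_eval = 1.0, 0.8, 2.0, 5.0

def hawkes_path(T):
    t, pts = 0.0, []
    while True:
        lam_bar = mu + alpha * sum(np.exp(-beta * (t - u)) for u in pts)
        t += rng.exponential(1.0 / lam_bar)
        if t >= T:
            return np.array(pts)
        lam = mu + alpha * sum(np.exp(-beta * (t - u)) for u in pts)
        if rng.uniform(0.0, lam_bar) <= lam:
            pts.append(t)

ages, lams = [], []
for _ in range(5000):
    pts = hawkes_path(t_eval)
    if pts.size:                                  # condition on a finite age
        ages.append(t_eval - pts[-1])             # S_{t-}
        lams.append(mu + alpha * np.exp(-beta * (t_eval - pts)).sum())

ages, lams = np.array(ages), np.array(lams)
bins = np.linspace(0.0, 2.0, 9)
idx = np.digitize(ages, bins)
rho = [lams[idx == j].mean() for j in range(1, len(bins)) if (idx == j).any()]
print(np.round(rho, 3))   # rho(t_eval, s) typically decreases with the age s
```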
Conclusions : work in progress of J. Chevallier

- Univariate Hawkes processes do NOT lead to a PPS of the form $p(s, X_t) = \nu + X_t$, but to another completely explicit closed PDE system (NB: non-Markovian process).
- Interacting Hawkes processes without refractory periods DO lead to this intuition (mean field).
- This is even true for more realistic models with refractory periods and special $p(s, X_t)$.
- Propagation of chaos, CLT ...
Open questions

- But then, when we pick some neurons and measure their activity via electrodes, why do we measure interactions?
- What kind of models would allow both: independence with most of the neurons, and sparse dependence with some (functional connectivity)?
- What exactly can we infer from the reconstructed graphs?
- Can it be coupled to a more global measure of activity (LFP) to infer more information?
Many thanks to:
M. Albert, Y. Bouret, J. Chevallier, F. Delarue, F. Grammont, T. Laloë, A. Rouis, C. Tuleau-Malot (Nice),
N.R. Hansen (Copenhagen), V. Rivoirard (Dauphine), M. Fromont (Rennes 2),
M. Doumic (Inria Rocquencourt), M. Cáceres (Granada), L. Sansonnet (AgroParisTech),
S. Schbath (INRA Jouy-en-Josas), F. Picard (LBBE Lyon),
T. Bessaïh, R. Lambert, N. Leresche (Neuronal Networks and Physiopathological Rhythms, NPA, Paris 6),
and to Sylvie Méléard and Vincent Bansaye for this very nice opportunity!

Slides and references on http://math.unice.fr/ reynaudb/