Hawkes processes and application in neuroscience and genomic Patricia Reynaud-Bouret CNRS, Universit´ e de Nice Sophia Antipolis Journ´ees Aussois 2015, Mod´elisation Math´ematique et Biodiversit´e P.Reynaud-Bouret Hawkes Aussois 2015 1 / 129 Hawkes processes and application in neuroscience and genomic Patricia Reynaud-Bouret CNRS, Universit´ e de Nice Sophia Antipolis Journ´ees Aussois 2015, Mod´elisation Math´ematique et Biodiversit´e P.Reynaud-Bouret Hawkes Aussois 2015 1 / 129 Table of Contents 1 Point process and Counting process 2 Multivariate Hawkes processes and Lasso 3 Probabilistic ingredients 4 Back to Lasso 5 PDE and point processes P.Reynaud-Bouret Hawkes Aussois 2015 2 / 129 Point process and Counting process Table of Contents 1 Point process and Counting process Introduction Poisson process More general counting process and conditional intensity Classical statistics for counting processes 2 Multivariate Hawkes processes and Lasso 3 Probabilistic ingredients 4 Back to Lasso 5 PDE and point processes P.Reynaud-Bouret Hawkes Aussois 2015 3 / 129 Point process and Counting process Introduction Definition Point process N= random countable set of points of X (usually R). P.Reynaud-Bouret Hawkes Aussois 2015 4 / 129 Point process and Counting process Introduction Definition Point process N= random countable set of points of X (usually R). ! NA number of points of N in A, dNt = T point de N δT . " ! f (t)dNt = T ∈N f (T ) If X = R+ and no accumulation, the counting process is Nt = N[0,t] . th . . . . . . . . . . . ... .÷ . . . . . . . . i . . . . H . P.Reynaud-Bouret :-. . . . 2 , . Hawkes Aussois 2015 4 / 129 Point process and Counting process Introduction Examples disintegration of radioactive atoms date / duration / position of phone calls date / position/ magnitude of earthquakes P.Reynaud-Bouret Hawkes Aussois 2015 5 / 129 Point process and Counting process Introduction Examples disintegration of radioactive atoms date / duration / position of phone calls date / position/ magnitude of earthquakes position / size / age of trees in a forest discovery time / position/ size of petroleum fields breakdowns P.Reynaud-Bouret Hawkes Aussois 2015 5 / 129 Point process and Counting process Introduction Examples disintegration of radioactive atoms date / duration / position of phone calls date / position/ magnitude of earthquakes position / size / age of trees in a forest discovery time / position/ size of petroleum fields breakdowns time of the event ”a bee came out of its hive”, ”a monkey climbed in the tree”, ”an agent buys a financial asset”, ”someone clicks on the website” .... P.Reynaud-Bouret Hawkes Aussois 2015 5 / 129 Point process and Counting process Introduction Examples disintegration of radioactive atoms date / duration / position of phone calls date / position/ magnitude of earthquakes position / size / age of trees in a forest discovery time / position/ size of petroleum fields breakdowns time of the event ”a bee came out of its hive”, ”a monkey climbed in the tree”, ”an agent buys a financial asset”, ”someone clicks on the website” .... Everything has been noted, probably link via a model between those quantities. No reason at all that they may be identically distributed and/or independent (not i.i.d.). P.Reynaud-Bouret Hawkes Aussois 2015 5 / 129 Point process and Counting process Introduction Life times Historically, first evident records of statistics start with Graunt in 1662 and its life table. NB : at the same time, Pascal and Euler started Probability ... Halley, Huyghens and many others also recorded life tables ”Why are people dying ?” → record of the time of deaths in London at that time. P.Reynaud-Bouret Hawkes Aussois 2015 6 / 129 Point process and Counting process Introduction Life times Historically, first evident records of statistics start with Graunt in 1662 and its life table. ”Why are people dying ?” → record of the time of deaths in London at that time. Even if considered i.i.d., the interesting quantity is the hazard rate q(t), with q(t)dt ≃ P( die just after t given that alive in t) If f density, q(t) = f (t)/S(t), with S(t) = # +∞ f (s)ds. t P.Reynaud-Bouret Hawkes Aussois 2015 6 / 129 Point process and Counting process Introduction Life times Historically, first evident records of statistics start with Graunt in 1662 and its life table. ”Why are people dying ?” → record of the time of deaths in London at that time. Even if considered i.i.d., the interesting quantity is the hazard rate q(t), q = cte : do not ”age”, no memory → exponential variable q increases : better young than old q q decreases : better old than young. classical shape for human Ould mortality old aging → age P.Reynaud-Bouret Hawkes Aussois 2015 6 / 129 Point process and Counting process Introduction Life times Historically, first evident records of statistics start with Graunt in 1662 and its life table. ”Why are people dying ?” → record of the time of deaths in London at that time. Even if considered i.i.d., the interesting quantity is the hazard rate q(t), not even clearly i.i.d. : people may move out before disease, may contaminate each other etc... → what is the interesting quantity ? P.Reynaud-Bouret Hawkes Aussois 2015 6 / 129 Point process and Counting process Introduction Neuroscience and neuronal unitary activity P.Reynaud-Bouret Hawkes Aussois 2015 7 / 129 Point process and Counting process Introduction Neuronal data and Unitary Events P.Reynaud-Bouret Hawkes Aussois 2015 8 / 129 Point process and Counting process Introduction Genomics and Transcription Regulatory Elements P.Reynaud-Bouret Hawkes Aussois 2015 9 / 129 Point process and Counting process Poisson process Definition Poisson processes for all integer n, for all A1 , . . . , An disjoint measurable subsets of X, NA1 , . . . , NAn are independent random variables. P.Reynaud-Bouret Hawkes Aussois 2015 10 / 129 Point process and Counting process Poisson process Definition :a÷ ah iii. ix:i ! : ' . . Poisson processes for all integer n, for all A1 , . . . , An disjoint measurable subsets of X, NA1 , . . . , NAn are independent random variables. for all measurable subset A of X, NA obeys a Poisson law with parameter depending on A and denoted ℓ(A). P.Reynaud-Bouret Hawkes Aussois 2015 10 / 129 Point process and Counting process Poisson process Definition Poisson processes for all integer n, for all A1 , . . . , An disjoint measurable subsets of X, NA1 , . . . , NAn are independent random variables. for all measurable subset A of X, NA obeys a Poisson law with parameter depending on A and denoted ℓ(A). If X = R and ℓ([0, t]) = λt, then P(Nt = 0) = P(T1 > t) 0 P.Reynaud-Bouret £ j Hawkes . . . . . Aussois 2015 10 / 129 Point process and Counting process Poisson process Definition Poisson processes for all integer n, for all A1 , . . . , An disjoint measurable subsets of X, NA1 , . . . , NAn are independent random variables. for all measurable subset A of X, NA obeys a Poisson law with parameter depending on A and denoted ℓ(A). If X = R and ℓ([0, t]) = λt, then P(Nt = 0) = P(T1 > t) = e −λt H Poisson PINEH P.Reynaud-Bouret dt=o with distribution Hawkes . . Q,÷i0 parameter . Aussois 2015 10 / 129 Point process and Counting process Poisson process Definition Poisson processes for all integer n, for all A1 , . . . , An disjoint measurable subsets of X, NA1 , . . . , NAn are independent random variables. for all measurable subset A of X, NA obeys a Poisson law with parameter depending on A and denoted ℓ(A). If X = R and ℓ([0, t]) = λt, then P(Nt = 0) = P(T1 > t) = e −λt → first time is an exponential variable P.Reynaud-Bouret Hawkes Aussois 2015 10 / 129 Point process and Counting process Poisson process Definition Poisson processes for all integer n, for all A1 , . . . , An disjoint measurable subsets of X, NA1 , . . . , NAn are independent random variables. for all measurable subset A of X, NA obeys a Poisson law with parameter depending on A and denoted ℓ(A). If X = R and ℓ([0, t]) = λt, then P(Nt = 0) = P(T1 > t) = e −λt → first time is an exponential variable All ”intervals” are i.i.d and exponentially distributed. P.Reynaud-Bouret Hawkes Aussois 2015 10 / 129 Point process and Counting process Poisson process Definition Poisson processes for all integer n, for all A1 , . . . , An disjoint measurable subsets of X, NA1 , . . . , NAn are independent random variables. for all measurable subset A of X, NA obeys a Poisson law with parameter depending on A and denoted ℓ(A). If X = R and ℓ([0, t]) = λt, then P(Nt = 0) = P(T1 > t) = e −λt → first time is an exponential variable All ”intervals” are i.i.d and exponentially distributed. No memory ⇐⇒ independence to the past P.Reynaud-Bouret Hawkes Aussois 2015 10 / 129 Point process and Counting process Poisson process If λ not constant If ℓ([0, t]) = "t 0 λ(s)ds, let N be a Poisson process in R2+ and That @ a! . OP . : . - P.Reynaud-Bouret Hawkes Aussois 2015 11 / 129 Point process and Counting process Poisson process If λ not constant If ℓ([0, t]) = "t 0 λ(s)ds, let N be a Poisson process in R2+ and number fens o°= 0:i Poisson Ftakretu : OP of area . . " : i :3 :mkEnwmei¥ . . AH . . . - P.Reynaud-Bouret Hawkes Aussois 2015 11 / 129 Point process and Counting process Poisson process If λ not constant If ℓ([0, t]) = "t 0 λ(s)ds, let N be a Poisson process in R2+ and N) ECM ) ECM ) :: . Ham ) in ] ) ECM ) If λ(.) bounded by M P.Reynaud-Bouret Hawkes Aussois 2015 11 / 129 Point process and Counting process Poisson process If λ not constant If ℓ([0, t]) = "t 0 λ(s)ds, let N be a Poisson process in R2+ and Rejected Point HGID ) gT.ru 0 Ham ) - accepted points ECM ) If λ(.) bounded by M P.Reynaud-Bouret WGN ) ECM ) ECM ) Hawkes Aussois 2015 11 / 129 Point process and Counting process More general counting process and conditional intensity Conditional intensity Need of the same sort of interpretation as hazard rate but given ”the Past” ! P.Reynaud-Bouret Hawkes Aussois 2015 12 / 129 Point process and Counting process More general counting process and conditional intensity Conditional intensity Need of the same sort of interpretation as hazard rate but given ”the Past” ! dN $ %&t ' Nbr observed points in [t, t + dt] P.Reynaud-Bouret = intensity & '$ % λ(t) dt $ %& ' Expected Number given the past before t Hawkes + $noise %& ' Martingales differences Aussois 2015 12 / 129 Point process and Counting process More general counting process and conditional intensity Conditional intensity Need of the same sort of interpretation as hazard rate but given ”the Past” ! dN $ %&t ' Nbr observed points in [t, t + dt] λ(t) = P.Reynaud-Bouret = intensity & '$ % λ(t) dt $ %& ' Expected Number given the past before t + $noise %& ' Martingales differences instantaneous frequency Hawkes Aussois 2015 12 / 129 Point process and Counting process More general counting process and conditional intensity Conditional intensity Need of the same sort of interpretation as hazard rate but given ”the Past” ! dN $ %&t ' Nbr observed points in [t, t + dt] λ(t) = = P.Reynaud-Bouret = intensity & '$ % λ(t) dt $ %& ' Expected Number given the past before t + $noise %& ' Martingales differences instantaneous frequency random, depends on the Past (previous points ...) Hawkes Aussois 2015 12 / 129 Point process and Counting process More general counting process and conditional intensity Conditional intensity Need of the same sort of interpretation as hazard rate but given ”the Past” ! dN $ %&t ' Nbr observed points in [t, t + dt] λ(t) = = = = intensity & '$ % λ(t) dt $ %& ' Expected Number given the past before t + $noise %& ' Martingales differences instantaneous frequency random, depends on the Past (previous points ...) characterizes the distribution when exists NB : if λ(t) deterministic → Poisson process P.Reynaud-Bouret Hawkes Aussois 2015 12 / 129 Point process and Counting process More general counting process and conditional intensity Thinning Let N be a Poisson process in R2+ . In oil .÷ = . "" . ' ' ftp.ik . . λ(t) is predictable, i.e. depends on the past only, can be drawn from left to right if one discovers one point at a time. Usually ”c-`a-g” for ”continu `a gauche” (left continuous) P.Reynaud-Bouret Hawkes Aussois 2015 13 / 129 Point process and Counting process More general counting process and conditional intensity Thinning Let N be a Poisson process in R2+ . €p~ ÷ ay dtlltipasde points 0 s€,¥HEm] . ) For simulation (Ogata’s thinning), λ(t) bounded if no points appear P.Reynaud-Bouret Hawkes Aussois 2015 13 / 129 Point process and Counting process More general counting process and conditional intensity Thinning Let N be a Poisson process in R2+ . min ydttitpinhoppeanyytu • ELMH I ' For simulation (Ogata’s thinning), λ(t) bounded if no points appear P.Reynaud-Bouret Hawkes Aussois 2015 13 / 129 Point process and Counting process More general counting process and conditional intensity One life time N = {T }, T of hazard rate q. ' 9kt ' . . , , P.Reynaud-Bouret Hawkes Aussois 2015 14 / 129 Point process and Counting process More general counting process and conditional intensity One life time N = {T }, T of hazard rate q. . 941 . . tttsaoatayst 'M .tt?dsi.=explnsLH to ) It . expfftqlslds ) Plt7t-P( = . . =expfs =S(t . ) : λ(t) = q(t)1T ≥t P.Reynaud-Bouret t Hawkes Aussois 2015 14 / 129 Point process and Counting process More general counting process and conditional intensity Poissonian interaction Parents : Ui either (uniform) i.i.d. or (homogeneous) Poisson process Each parent gives birth according to a Poisson process of intensity h originated in Ui . Process of the children ? then aIy¥÷ eE¥ P.Reynaud-Bouret Hawkes Aussois 2015 15 / 129 Point process and Counting process More general counting process and conditional intensity Poissonian interaction Parents : Ui either (uniform) i.i.d. or (homogeneous) Poisson process Each parent gives birth according to a Poisson process of intensity h originated in Ui . " Process of the children ? " int fhlt λ(t) = ! i /Ui <t P.Reynaud-Bouret h(t − Ui ) ← htttithttud Ey Hawkes Aussois 2015 15 / 129 Point process and Counting process More general counting process and conditional intensity Poissonian interaction Parents : Ui either (uniform) i.i.d. or (homogeneous) Poisson process Each parent gives birth according to a Poisson process of intensity h originated in Ui . With some orphans of rate ν ? Othtt U )+h(t Ud . - , ← 0 utan i÷ ; . i . . ; a P.Reynaud-Bouret Hawkes Aussois 2015 15 / 129 Point process and Counting process More general counting process and conditional intensity Poissonian interaction Parents : Ui either (uniform) i.i.d. or (homogeneous) Poisson process Each parent gives birth according to a Poisson process of intensity h originated in Ui . With some orphans of rate ν ? . .÷ = i.in ' D ; 11 I 11 i λ(t) = ν + ! ai ; i /Ui <t P.Reynaud-Bouret I I / l I I I 11 I 1 ' ' 11 i 1 , 1 1 11 . h(t − Ui ) Hawkes Aussois 2015 15 / 129 Point process and Counting process More general counting process and conditional intensity Hawkes process : linear case A self-exciting process introduced by Hawkes in the 70’s to model earthquakes. Ancestors appear at rate ν. Each point gives birth according to h etc etc (aftershocks) €i% =iii :-, , P.Reynaud-Bouret Hawkes , . Aussois 2015 , 16 / 129 Point process and Counting process More general counting process and conditional intensity Hawkes process : linear case A self-exciting process introduced by Hawkes in the 70’s to model earthquakes. Ancestors appear at rate ν. Each point gives birth according to h etc etc (aftershocks) vk.mil#HiIY?Dh λ(t) = ν + ! T <t P.Reynaud-Bouret h(t − T ) = ν + "t −∞ h(t Hawkes − u)dNu Aussois 2015 16 / 129 Point process and Counting process More general counting process and conditional intensity Hawkes and Modelisation models any sort of self contamination (earthquakes, clicks, etc) P.Reynaud-Bouret Hawkes Aussois 2015 17 / 129 Point process and Counting process More general counting process and conditional intensity Hawkes and Modelisation models any sort of self contamination (earthquakes, clicks, etc) can be marked : position or magnitude of earthquakes, neuron on which the spike happens (see later) P.Reynaud-Bouret Hawkes Aussois 2015 17 / 129 Point process and Counting process More general counting process and conditional intensity Hawkes and Modelisation models any sort of self contamination (earthquakes, clicks, etc) can be marked : position or magnitude of earthquakes, neuron on which the spike happens (see later) can therefore model interaction also (see later) P.Reynaud-Bouret Hawkes Aussois 2015 17 / 129 Point process and Counting process More general counting process and conditional intensity Hawkes and Modelisation models any sort of self contamination (earthquakes, clicks, etc) can be marked : position or magnitude of earthquakes, neuron on which the spike happens (see later) can therefore model interaction also (see later) " generally, modelling people ”want” stationnarity : if h < 1, the branching procedure ends ⇐⇒ existence of stationnary version. P.Reynaud-Bouret Hawkes Aussois 2015 17 / 129 Point process and Counting process More general counting process and conditional intensity Hawkes and Modelisation models any sort of self contamination (earthquakes, clicks, etc) can be marked : position or magnitude of earthquakes, neuron on which the spike happens (see later) can therefore model interaction also (see later) " generally, modelling people ”want” stationnarity : if h < 1, the branching procedure ends ⇐⇒ existence of stationnary version. for genomics and neuroscience, need of inhibition : possible to use # t λ(t) = Φ( h(t − u)dNu ), −∞ with Φ 1-Lipschitz positive and h of any sign. Stationnarity condition " |h| < 1. In particular, Φ = (.)+ but loss of the branching structure ( ?) P.Reynaud-Bouret Hawkes Aussois 2015 17 / 129 Point process and Counting process Classical statistics for counting processes Likelihood Informally the likelihood of the observation N should give the probability to see (something near) N. P.Reynaud-Bouret Hawkes Aussois 2015 18 / 129 Point process and Counting process Classical statistics for counting processes Likelihood Informally the likelihood of the observation N should give the probability to see (something near) N. Formally, it is somehow the density wrt (here) Poisson, but without depending on the intensity of the reference Poisson. P.Reynaud-Bouret Hawkes Aussois 2015 18 / 129 Point process and Counting process Classical statistics for counting processes Likelihood Informally the likelihood of the observation N should give the probability to see (something near) N. If N = {t1 , ..., tn }, we want for N ′ of intensity λ(.) observed on [0, T ], P(N ′ has n points, each Ti in [ti − dti , ti ]) . , P.Reynaud-Bouret itit . . . . Hawkes Aussois 2015 18 / 129 Point process and Counting process Classical statistics for counting processes Likelihood Informally the likelihood of the observation N should give the probability to see (something near) N. If N = {t1 , ..., tn }, we want for N ′ of intensity λ(.) observed on [0, T ], P(N ′ has n points, each Ti in [ti − dti , ti ]) . , itit . . . . P(no point in [0, t1 −dt1 ], 1 point in [t1 −dt1 , t1 ], ..., no point in[tn , T ]) P.Reynaud-Bouret Hawkes Aussois 2015 18 / 129 Point process and Counting process Classical statistics for counting processes Likelihood(2) ; jin .jx£+:e , , it . . . . ( !T ( P(no point in[tn , T ](past at tn ) = e − tn λ(s)ds P.Reynaud-Bouret Hawkes H Aussois 2015 19 / 129 Point process and Counting process Classical statistics for counting processes Likelihood(3) =y4odsIeoec ;f , if . . . , it . . . . H ( ! tn +dtn ( λ(s)ds P(1 point in[tn − dtn , tn ](past at tn − dtn ) ≃ λ(tn )dtn e − tn P.Reynaud-Bouret Hawkes Aussois 2015 20 / 129 Point process and Counting process Classical statistics for counting processes Likelihood(4) P(no point in [0, t1 − dt1 ], 1 point in [t1 − dt1 , t1 ], ..., no point in[tn , T ]) ) = E 1no point in [0,t1 −dt1 ] ...11 point in [tn −dtn ,tn ] × ( * ( P(no point in[tn , T ](past at tn ) ) * !T = E 1no point in [0,t1 −dt1 ] ...11 point in [tn −dtn ,tn ] e − tn λ(s)ds Point process and Counting process Classical statistics for counting processes Likelihood(4) P(no point in [0, t1 − dt1 ], 1 point in [t1 − dt1 , t1 ], ..., no point in[tn , T ]) ) = E 1no point in [0,t1 −dt1 ] ...11 point in [tn −dtn ,tn ] × ( * ( P(no point in[tn , T ](past at tn ) ) * !T = E 1no point in [0,t1 −dt1 ] ...11 point in [tn −dtn ,tn ] e − tn λ(s)ds Point process and Counting process Classical statistics for counting processes Likelihood(4) P(no point in [0, t1 − dt1 ], 1 point in [t1 − dt1 , t1 ], ..., no point in[tn , T ]) ) = E 1no point in [0,t1 −dt1 ] ...11 point in [tn −dtn ,tn ] × ( * ( P(no point in[tn , T ](past at tn ) ) * !T = E 1no point in [0,t1 −dt1 ] ...11 point in [tn −dtn ,tn ] e − tn λ(s)ds ( ) * ( = E 1no point in [0,t1 −dt1 ] ...P(1 point in[tn − dtn , tn ](past at tn − dtn ) × e− ) ≃ E 1no point in P.Reynaud-Bouret !T tn λ(s)ds * [tn−1 ,tn −dtn ] × !T λ(tn )dtn e − tn −dtn λ(s)ds [0,t1 −dt1 ] ...1no point in Hawkes Aussois 2015 21 / 129 Point process and Counting process Classical statistics for counting processes Likelihood and log-likelihood (Pseudo)Likelihood + λ(ti )e − !T λ(s)ds 0 i dependance on the past not explicit (ex Past before 0 for Hawkes) na :* IHI :p Eyton t.IN . P.Reynaud-Bouret Hawkes Aussois 2015 22 / 129 Point process and Counting process Classical statistics for counting processes Likelihood and log-likelihood (Pseudo)Likelihood + λ(ti )e − !T 0 λ(s)ds i dependance on the past not explicit (ex Past before 0 for Hawkes) LogLikelihood ℓλ (N) = P.Reynaud-Bouret # T ] λ(s)dNs look 0 Hawkes − # T λ(s)ds 0 Aussois 2015 22 / 129 Point process and Counting process Classical statistics for counting processes Maximum likelihood estimator If model for λ → λθ , θ ∈ Rd , how to choose the best θ according to the observed data N on [0, T ] ? .÷Xh P.Reynaud-Bouret Hawkes ... Aussois 2015 23 / 129 Point process and Counting process Classical statistics for counting processes Maximum likelihood estimator If model for λ → λθ , θ ∈ Rd , how to choose the best θ according to the observed data N on [0, T ] ? *I D= los likely . likely ,÷€"bglH#o5ko& 7 : Imre n - observed points then MLE = θˆ = arg maxθ ℓλθ (N). P.Reynaud-Bouret Hawkes Aussois 2015 23 / 129 Point process and Counting process Classical statistics for counting processes Maximum likelihood estimator(2) If d, number of unknown parameter, fixed one can show usually nice asymptotic properties (in T ... ) : consistency (θˆ → θ) asymptotic normality efficiency (smallest asymptotic variance) (cf. Andersen, Borgan, Gill and Keiding) P.Reynaud-Bouret Hawkes Aussois 2015 24 / 129 Point process and Counting process Classical statistics for counting processes Maximum likelihood estimator(2) If d, number of unknown parameter, fixed one can show usually nice asymptotic properties (in T ... ) : consistency (θˆ → θ) asymptotic normality efficiency (smallest asymptotic variance) (cf. Andersen, Borgan, Gill and Keiding) iterate :* In practice, T and d are fixed → if d large wrt T , i.e. in the non asymptotic regime, over fitting. P.Reynaud-Bouret Hawkes Aussois 2015 24 / 129 Point process and Counting process Classical statistics for counting processes Goodness-of-fit test If Λt = "t 0 λ(s)ds, then Λt is a nondecreasing process, predictable and its (pseudo)inverse exists. It is the compensator of the counting process Nt , i.e. (Nt − Λt )t martingale. . P.Reynaud-Bouret D: At . Hawkes Aussois 2015 25 / 129 Point process and Counting process Classical statistics for counting processes Goodness-of-fit test If Λt = "t 0 λ(s)ds, then Λt is a nondecreasing process, predictable and its (pseudo)inverse exists. It is the compensator of the counting process Nt , i.e. (Nt − Λt )t martingale. NΛ−1 is a Poisson process of intensity 1. t atrp jj ' . . . . P.Reynaud-Bouret . ' ' ' . Hawkes Aussois 2015 25 / 129 Point process and Counting process Classical statistics for counting processes Goodness-of-fit test If Λt = "t 0 λ(s)ds, then Λt is a nondecreasing process, predictable and its (pseudo)inverse exists. It is the compensator of the counting process Nt , i.e. (Nt − Λt )t martingale. NΛ−1 is a Poisson process of intensity 1. t → test that N (observed) has the desired intensity λ. P.Reynaud-Bouret Hawkes Aussois 2015 25 / 129 Multivariate Hawkes processes and Lasso Table of Contents 1 Point process and Counting process 2 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Parametric estimation What can we do against over fitting ? (Adaptation) Lasso criterion Simulations Real data analysis 3 Probabilistic ingredients 4 Back to Lasso 5 PDE and point processes P.Reynaud-Bouret Hawkes Aussois 2015 26 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Biological framework (a) One neuron P.Reynaud-Bouret (b) Connected neurons Hawkes Aussois 2015 27 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Synaptic integration without synchronization seuil d’excitation du neurone potentiel de repos du neurone diff´ erentes dendrites signaux d’entr´ ees P.Reynaud-Bouret Hawkes Aussois 2015 28 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Synaptic integration without synchronization seuil d’excitation du neurone potentiel de repos du neurone diff´ erentes dendrites signaux d’entr´ ees P.Reynaud-Bouret Hawkes Aussois 2015 28 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Synaptic integration without synchronization seuil d’excitation du neurone potentiel de repos du neurone diff´ erentes dendrites signaux d’entr´ ees P.Reynaud-Bouret Hawkes Aussois 2015 28 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Synaptic integration without synchronization seuil d’excitation du neurone potentiel de repos du neurone diff´ erentes dendrites signaux d’entr´ ees P.Reynaud-Bouret Hawkes Aussois 2015 28 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Synaptic integration without synchronization seuil d’excitation du neurone potentiel de repos du neurone diff´ erentes dendrites signaux d’entr´ ees P.Reynaud-Bouret Hawkes Aussois 2015 28 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Synaptic integration without synchronization seuil d’excitation du neurone potentiel de repos du neurone diff´ erentes dendrites signaux d’entr´ ees P.Reynaud-Bouret Hawkes Aussois 2015 28 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Synaptic integration without synchronization seuil d’excitation du neurone potentiel de repos du neurone diff´ erentes dendrites signaux d’entr´ ees P.Reynaud-Bouret Hawkes Aussois 2015 28 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Synaptic integration without synchronization seuil d’excitation du neurone potentiel de repos du neurone diff´ erentes dendrites signaux d’entr´ ees P.Reynaud-Bouret Hawkes Aussois 2015 28 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Synaptic integration without synchronization seuil d’excitation du neurone potentiel de repos du neurone diff´ erentes dendrites signaux d’entr´ ees P.Reynaud-Bouret Hawkes Aussois 2015 28 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Synaptic integration without synchronization seuil d’excitation du neurone potentiel de repos du neurone diff´ erentes dendrites signaux d’entr´ ees P.Reynaud-Bouret Hawkes Aussois 2015 28 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Synaptic integration without synchronization potentiel d’action seuil d’excitation du neurone potentiel de repos du neurone diff´ erentes dendrites signaux d’entr´ ees P.Reynaud-Bouret Hawkes Aussois 2015 28 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Synaptic integration with synchronization seuil d’excitation du neurone potentiel de repos du neurone diff´ erentes dendrites signaux d’entr´ ees P.Reynaud-Bouret Hawkes Aussois 2015 29 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Synaptic integration with synchronization seuil d’excitation du neurone potentiel de repos du neurone diff´ erentes dendrites signaux d’entr´ ees P.Reynaud-Bouret Hawkes Aussois 2015 29 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Synaptic integration with synchronization seuil d’excitation du neurone potentiel de repos du neurone diff´ erentes dendrites signaux d’entr´ ees P.Reynaud-Bouret Hawkes Aussois 2015 29 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Synaptic integration with synchronization potentiel d’action seuil d’excitation du neurone potentiel de repos du neurone diff´ erentes dendrites signaux d’entr´ ees P.Reynaud-Bouret Hawkes Aussois 2015 29 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Synaptic integration with synchronization potentiel d’action seuil d’excitation du neurone potentiel de repos du neurone diff´ erentes dendrites signaux d’entr´ ees P.Reynaud-Bouret Hawkes Aussois 2015 29 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Synaptic integration with synchronization potentiel d’action seuil d’excitation du neurone potentiel de repos du neurone diff´ erentes dendrites signaux d’entr´ ees P.Reynaud-Bouret Hawkes Aussois 2015 29 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Synaptic integration with synchronization potentiel d’action seuil d’excitation du neurone potentiel de repos du neurone diff´ erentes dendrites signaux d’entr´ ees P.Reynaud-Bouret Hawkes Aussois 2015 29 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Synaptic integration with synchronization potentiel d’action seuil d’excitation du neurone potentiel de repos du neurone diff´ erentes dendrites signaux d’entr´ ees P.Reynaud-Bouret Hawkes Aussois 2015 29 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Synaptic integration with synchronization potentiel d’action potentiel d’action seuil d’excitation du neurone potentiel de repos du neurone diff´ erentes dendrites signaux d’entr´ ees P.Reynaud-Bouret Hawkes Aussois 2015 29 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Point process and genomics There are several ”events” of different types on the DNA that may ”work” together in synergy. P.Reynaud-Bouret Hawkes Aussois 2015 30 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Point process and genomics There are several ”events” of different types on the DNA that may ”work” together in synergy. Motifs = words in the DNA-alphabet {actg}. P.Reynaud-Bouret Hawkes Aussois 2015 30 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Point process and genomics There are several ”events” of different types on the DNA that may ”work” together in synergy. Motifs = words in the DNA-alphabet {actg}. How can statistician suggest functional motifs based on the statistical properties of their occurrences ? Unexpected frequency → Markov models (see for a review Reinert, Schbath, Waterman (2000)) P.Reynaud-Bouret Hawkes Aussois 2015 30 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Point process and genomics There are several ”events” of different types on the DNA that may ”work” together in synergy. Motifs = words in the DNA-alphabet {actg}. How can statistician suggest functional motifs based on the statistical properties of their occurrences ? Unexpected frequency → Markov models (see for a review Reinert, Schbath, Waterman (2000)) Poor or rich regions → scan statistics (see, for instance, Robin Daudin (1999) or Stefanov (2003)) P.Reynaud-Bouret Hawkes Aussois 2015 30 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Point process and genomics There are several ”events” of different types on the DNA that may ”work” together in synergy. Motifs = words in the DNA-alphabet {actg}. How can statistician suggest functional motifs based on the statistical properties of their occurrences ? Unexpected frequency → Markov models (see for a review Reinert, Schbath, Waterman (2000)) Poor or rich regions → scan statistics (see, for instance, Robin Daudin (1999) or Stefanov (2003)) If two motifs are part of a common biological process, the space between their occurrences (not necessarily consecutive) should be somehow fixed → favored or avoided distances (Gusto, Schbath (2005)) P.Reynaud-Bouret Hawkes Aussois 2015 30 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Point process and genomics There are several ”events” of different types on the DNA that may ”work” together in synergy. Motifs = words in the DNA-alphabet {actg}. How can statistician suggest functional motifs based on the statistical properties of their occurrences ? Unexpected frequency → Markov models (see for a review Reinert, Schbath, Waterman (2000)) Poor or rich regions → scan statistics (see, for instance, Robin Daudin (1999) or Stefanov (2003)) If two motifs are part of a common biological process, the space between their occurrences (not necessarily consecutive) should be somehow fixed → favored or avoided distances (Gusto, Schbath (2005)) pairwise study. P.Reynaud-Bouret Hawkes Aussois 2015 30 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Why real multivariate point processes in genomics ? TRE Transcription Regulatory Elements = ”everything” that may enhance or repress gene expression P.Reynaud-Bouret Hawkes Aussois 2015 31 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Why real multivariate point processes in genomics ? TRE Transcription Regulatory Elements = ”everything” that may enhance or repress gene expression promoter, enhancer, silencer, histone modifications, replication origin on the DNA.... They should interact but how ? Can we have a statistical guess ? P.Reynaud-Bouret Hawkes Aussois 2015 31 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Why real multivariate point processes in genomics ? TRE Transcription Regulatory Elements = ”everything” that may enhance or repress gene expression promoter, enhancer, silencer, histone modifications, replication origin on the DNA.... They should interact but how ? Can we have a statistical guess ? There are methods (ChIP-chip experiments, ChIP-seq experiments) where after preprocessing the data one has access to the (almost exact) positions of several type of TREs at one time, and this under different experimental conditions. (ENCODE) P.Reynaud-Bouret Hawkes Aussois 2015 31 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Why real multivariate point processes in genomics ? TRE Transcription Regulatory Elements = ”everything” that may enhance or repress gene expression promoter, enhancer, silencer, histone modifications, replication origin on the DNA.... They should interact but how ? Can we have a statistical guess ? There are methods (ChIP-chip experiments, ChIP-seq experiments) where after preprocessing the data one has access to the (almost exact) positions of several type of TREs at one time, and this under different experimental conditions. (ENCODE) On the real line = DNA if the 3D structure of the DNA is negligible (typically interaction range between points ≤ 10 kB) P.Reynaud-Bouret Hawkes Aussois 2015 31 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Why real multivariate point processes in genomics ? TRE Transcription Regulatory Elements = ”everything” that may enhance or repress gene expression promoter, enhancer, silencer, histone modifications, replication origin on the DNA.... They should interact but how ? Can we have a statistical guess ? There are methods (ChIP-chip experiments, ChIP-seq experiments) where after preprocessing the data one has access to the (almost exact) positions of several type of TREs at one time, and this under different experimental conditions. (ENCODE) On the real line = DNA if the 3D structure of the DNA is negligible (typically interaction range between points ≤ 10 kB) If the real structure → 3D point processes on graphs... ( ? ?) Why just DNA ? RNA etc ... P.Reynaud-Bouret Hawkes Aussois 2015 31 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Point processes and conditional intensity dN $ %&t ' Nbr observed points in [t, t + dt] λ(t) P.Reynaud-Bouret = = = intensity & '$ % λ(t) dt $ %& ' Expected Number given the past before t + $noise %& ' Martingales differences instantaneous frequency random, depends on previous points Hawkes Aussois 2015 32 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Local independence graphs (Didelez (2008)) 1 3 1 2 P.Reynaud-Bouret 3 1 2 Hawkes 3 2 Aussois 2015 33 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Local independence graphs (Didelez (2008)) 1 3 1 2 3 1 2 3 2 If one is able to infer a local independence graph, then we may have access to ”functional connectivity”. &→ needs a ”model” . P.Reynaud-Bouret Hawkes Aussois 2015 33 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Multivariate Hawkes processes t N1 N2 N3 P.Reynaud-Bouret Hawkes Aussois 2015 34 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Multivariate Hawkes processes t N1 N2 N3 λ(1) (t) P.Reynaud-Bouret = ν $%&' spontaneous Hawkes Aussois 2015 34 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Multivariate Hawkes processes t h N1 1->1 N2 N3 λ(1) (t) = ν $%&' spontaneous P.Reynaud-Bouret + , h1→1 (t − T ) T <t pt de N1 $ %& self-interaction Hawkes ' Aussois 2015 34 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Multivariate Hawkes processes t h N1 1->1 x N2 N3 λ(1) (t) = ν $%&' spontaneous P.Reynaud-Bouret + , h1→1 (t − T ) T <t pt de N1 $ %& self-interaction Hawkes ' Aussois 2015 34 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Multivariate Hawkes processes t h N1 x 1->1 x N2 N3 λ(1) (t) = ν $%&' spontaneous P.Reynaud-Bouret + , h1→1 (t − T ) T <t pt de N1 $ %& self-interaction Hawkes ' Aussois 2015 34 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Multivariate Hawkes processes t h N1 x 1->1 x x h N2 2->1 N3 λ(1) (t) = ν $%&' spontaneous P.Reynaud-Bouret + , h1→1 (t − T ) T <t pt de N1 %& $ self-interaction Hawkes ' + , hm→1 (t − T ) m ̸= 1, T < t, pt de Nm %& $ interaction Aussois 2015 ' 34 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Multivariate Hawkes processes t h N1 x 1->1 x x h N2 2->1 x N3 λ(1) (t) = ν $%&' spontaneous P.Reynaud-Bouret + , h1→1 (t − T ) T <t pt de N1 %& $ self-interaction Hawkes ' + , hm→1 (t − T ) m ̸= 1, T < t, pt de Nm %& $ interaction Aussois 2015 ' 34 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Multivariate Hawkes processes t h N1 x 1->1 x x h N2 2->1 x h N3 λ(1) (t) = ν $%&' spontaneous P.Reynaud-Bouret + , h1→1 (t − T ) T <t pt de N1 %& $ self-interaction Hawkes x 3->1 ' + , hm→1 (t − T ) m ̸= 1, T < t, pt de Nm %& $ interaction Aussois 2015 ' 34 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Multivariate Hawkes processes t h N1 x 1->1 x x h N2 2->1 x h N3 λ(1) (t) = ν $%&' spontaneous P.Reynaud-Bouret + , h1→1 (t − T ) T <t pt de N1 %& $ self-interaction Hawkes x 3->1 ' + x , hm→1 (t − T ) m ̸= 1, T < t, pt de Nm %& $ interaction Aussois 2015 ' 34 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics Multivariate Hawkes processes t h N1 x 1->1 x x h N2 2->1 x h N3 λ(1) (t) = ν $%&' spontaneous P.Reynaud-Bouret + , h1→1 (t − T ) T <t pt de N1 %& $ self-interaction Hawkes x 3->1 ' + x x , hm→1 (t − T ) m ̸= 1, T < t, pt de Nm %& $ interaction Aussois 2015 ' 34 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics More formally (r ) Only excitation (all the hℓ are positive) : for all r , M # t− , (r ) (ℓ) λ(r ) (t) = νr + hℓ (t − u)dNu . ℓ=1 −∞ Branching)/ Cluster representation, stationary process if the spectral * " (r ) radius of hℓ (t)dt is < 1. P.Reynaud-Bouret Hawkes Aussois 2015 35 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics More formally (r ) Only excitation (all the hℓ are positive) : for all r , M # t− , (r ) (ℓ) λ(r ) (t) = νr + hℓ (t − u)dNu . ℓ=1 −∞ Branching)/ Cluster representation, stationary process if the spectral * " (r ) radius of hℓ (t)dt is < 1. Interaction, for instance, (in general any 1-Lipschitz function, Br´emaud Massouli´e 1996) . M # t− , (ℓ) (r ) (r ) . λ (t) = νr + hℓ (t − u)dNu ℓ=1 P.Reynaud-Bouret −∞ Hawkes + Aussois 2015 35 / 129 Multivariate Hawkes processes and Lasso A model for neuroscience and genomics More formally (r ) Only excitation (all the hℓ are positive) : for all r , M # t− , (r ) (ℓ) λ(r ) (t) = νr + hℓ (t − u)dNu . ℓ=1 −∞ Branching)/ Cluster representation, stationary process if the spectral * " (r ) radius of hℓ (t)dt is < 1. Interaction, for instance, (in general any 1-Lipschitz function, Br´emaud Massouli´e 1996) . M # t− , (ℓ) (r ) (r ) . λ (t) = νr + hℓ (t − u)dNu ℓ=1 −∞ + Exponential (Multiplicative no guarantee of a stationary . - shapeMbut , # t− (r ) (ℓ) . hℓ (t − u)dNu version ...) λ(r ) (t) = exp νr + P.Reynaud-Bouret ℓ=1 Hawkes −∞ Aussois 2015 35 / 129 Multivariate Hawkes processes and Lasso Parametric estimation Previous works Maximum likelihood estimates eventually (Ogata, Vere-Jones etc mainly for sismology, Chornoboy et al., for neuroscience, Gusto and Schbath for genomics) Parametric tests for the detection of edge + Maximum likelihood + exponential formula + spline estimation (Carstensen et al., in genomics) P.Reynaud-Bouret Hawkes Aussois 2015 36 / 129 Multivariate Hawkes processes and Lasso Parametric estimation Previous works Maximum likelihood estimates eventually (Ogata, Vere-Jones etc mainly for sismology, Chornoboy et al., for neuroscience, Gusto and Schbath for genomics) Parametric tests for the detection of edge + Maximum likelihood + exponential formula + spline estimation (Carstensen et al., in genomics) Main problem not enough spikes ! so either over-fitting because d (number of parameters) too large or bad estimation if d too small (see later). Also MLE are not that easy to compute (EM algorithm etc) ̸= least-squares if linear parametric model. P.Reynaud-Bouret Hawkes Aussois 2015 36 / 129 Multivariate Hawkes processes and Lasso Parametric estimation Another parametric method : Least-squares Observation on [0, T ] A good estimate of the parameters should correspond to an intensity candidate close to the true one. P.Reynaud-Bouret Hawkes Aussois 2015 37 / 129 Multivariate Hawkes processes and Lasso Parametric estimation Another parametric method : Least-squares Observation on [0, T ] A good estimate of the parameters should correspond to an intensity candidate close to the true one. "T Hence one would like to minimize ∥η − λ∥2 = 0 [η(t) − λ(t)]2 dt in η P.Reynaud-Bouret Hawkes Aussois 2015 37 / 129 Multivariate Hawkes processes and Lasso Parametric estimation Another parametric method : Least-squares Observation on [0, T ] A good estimate of the parameters should correspond to an intensity candidate close to the true one. "T Hence one would like to minimize ∥η − λ∥2 = 0 [η(t) − λ(t)]2 dt in η "T "T or equivalently −2 0 η(t)λ(t)dt + 0 η(t)2 dt. P.Reynaud-Bouret Hawkes Aussois 2015 37 / 129 Multivariate Hawkes processes and Lasso Parametric estimation Another parametric method : Least-squares Observation on [0, T ] A good estimate of the parameters should correspond to an intensity candidate close to the true one. "T Hence one would like to minimize ∥η − λ∥2 = 0 [η(t) − λ(t)]2 dt in η "T "T or equivalently −2 0 η(t)λ(t)dt + 0 η(t)2 dt. But dNt randomly fluctuates around λ(t)dt and is observable. P.Reynaud-Bouret Hawkes Aussois 2015 37 / 129 Multivariate Hawkes processes and Lasso Parametric estimation Another parametric method : Least-squares Observation on [0, T ] A good estimate of the parameters should correspond to an intensity candidate close to the true one. "T Hence one would like to minimize ∥η − λ∥2 = 0 [η(t) − λ(t)]2 dt in η "T "T or equivalently −2 0 η(t)λ(t)dt + 0 η(t)2 dt. But dNt randomly fluctuates around λ(t)dt and is observable. Hence minimize γ(η) = −2 # T η(t)dNt + 0 # T η(t)2 dt, 0 for a model η = λa (t), If multivariate minimize P.Reynaud-Bouret ! m γm (η (m) ). Hawkes Aussois 2015 37 / 129 Multivariate Hawkes processes and Lasso Parametric estimation Least-square contrast on a toy model Let [a, b] an interval of R+ . P.Reynaud-Bouret Hawkes Aussois 2015 38 / 129 Multivariate Hawkes processes and Lasso Parametric estimation Least-square contrast on a toy model Let [a, b] an interval of R+ . η(t) = , α1a≤t−T ≤b = αN[t−b,t−a] T ∈N There is only one parameter α → minimizing γ P.Reynaud-Bouret Hawkes Aussois 2015 38 / 129 Multivariate Hawkes processes and Lasso Parametric estimation Least-square contrast on a toy model Let [a, b] an interval of R+ . η(t) = , α1a≤t−T ≤b = αN[t−b,t−a] T ∈N There is only one parameter α → minimizing γ γ(η) = −2α P.Reynaud-Bouret # T 0 2 N[t−b,t−a] dNt + α Hawkes # 0 T 2 N[t−b,t−a] dt Aussois 2015 38 / 129 Multivariate Hawkes processes and Lasso Parametric estimation Least-square contrast on a toy model Let [a, b] an interval of R+ . η(t) = , α1a≤t−T ≤b = αN[t−b,t−a] T ∈N There is only one parameter α → minimizing γ "T N[t−b,t−a] dNt α ˆ = "0 T 2 0 N[t−b,t−a] dt P.Reynaud-Bouret Hawkes Aussois 2015 38 / 129 Multivariate Hawkes processes and Lasso Parametric estimation Least-square contrast on a toy model Let [a, b] an interval of R+ . η(t) = , α1a≤t−T ≤b = αN[t−b,t−a] T ∈N There is only one parameter α → minimizing γ "T N[t−b,t−a] dNt α ˆ = "0 T 2 0 N[t−b,t−a] dt b a P.Reynaud-Bouret Hawkes Aussois 2015 38 / 129 Multivariate Hawkes processes and Lasso Parametric estimation Least-square contrast on a toy model Let [a, b] an interval of R+ . η(t) = , α1a≤t−T ≤b = αN[t−b,t−a] T ∈N There is only one parameter α → minimizing γ "T N[t−b,t−a] dNt α ˆ = "0 T 2 0 N[t−b,t−a] dt b a P.Reynaud-Bouret Hawkes Aussois 2015 38 / 129 Multivariate Hawkes processes and Lasso Parametric estimation Least-square contrast on a toy model Let [a, b] an interval of R+ . η(t) = , α1a≤t−T ≤b = αN[t−b,t−a] T ∈N There is only one parameter α → minimizing γ "T N[t−b,t−a] dNt α ˆ = "0 T 2 0 N[t−b,t−a] dt b a P.Reynaud-Bouret Hawkes Aussois 2015 38 / 129 Multivariate Hawkes processes and Lasso Parametric estimation Least-square contrast on a toy model Let [a, b] an interval of R+ . η(t) = , α1a≤t−T ≤b = αN[t−b,t−a] T ∈N There is only one parameter α → minimizing γ "T N[t−b,t−a] dNt α ˆ = "0 T 2 0 N[t−b,t−a] dt b a P.Reynaud-Bouret Hawkes Aussois 2015 38 / 129 Multivariate Hawkes processes and Lasso Parametric estimation Least-square contrast on a toy model Let [a, b] an interval of R+ . η(t) = , α1a≤t−T ≤b = αN[t−b,t−a] T ∈N There is only one parameter α → minimizing γ "T N[t−b,t−a] dNt α ˆ = "0 T 2 0 N[t−b,t−a] dt The numerator is the number of pairs of points with delay in [a, b] P.Reynaud-Bouret Hawkes Aussois 2015 38 / 129 Multivariate Hawkes processes and Lasso Parametric estimation Full multivariate processes and linear parametrisation (r ) ? λ (t) = ν (r ) + M , , ℓ=1 T <t,T in N (ℓ) P.Reynaud-Bouret Hawkes (r ) hℓ (t − T ). Aussois 2015 39 / 129 Multivariate Hawkes processes and Lasso Parametric estimation Full multivariate processes and linear parametrisation (r ) ? λ (t) = ν (r ) + M , , ℓ=1 T <t,T in N (ℓ) (r ) hℓ (t − T ). + Piecewise constant model with parameter a P.Reynaud-Bouret Hawkes Aussois 2015 39 / 129 Multivariate Hawkes processes and Lasso Parametric estimation Full multivariate processes and linear parametrisation (r ) ? λ (t) = ν (r ) + M , , ℓ=1 T <t,T in N (ℓ) (r ) hℓ (t − T ). + Piecewise constant model with parameter a By linearity, ? λ(r ) (t) = (Rct )′ a, P.Reynaud-Bouret Hawkes Aussois 2015 39 / 129 Multivariate Hawkes processes and Lasso Parametric estimation Full multivariate processes and linear parametrisation (r ) ? λ (t) = ν (r ) + M , , ℓ=1 T <t,T in N (ℓ) (r ) hℓ (t − T ). + Piecewise constant model with parameter a By linearity, ? λ(r ) (t) = (Rct )′ a, with Rct being the renormalized instantaneous count given by ′ (Rct ) = P.Reynaud-Bouret ! " −1/2 (1) ′ −1/2 (M) ′ 1, δ (ct ) , ..., δ (ct ) , Hawkes Aussois 2015 39 / 129 Multivariate Hawkes processes and Lasso Parametric estimation Full multivariate processes and linear parametrisation (r ) ? λ (t) = ν (r ) + M , , ℓ=1 T <t,T in N (ℓ) (r ) hℓ (t − T ). + Piecewise constant model with parameter a By linearity, ? λ(r ) (t) = (Rct )′ a, with Rct being the renormalized instantaneous count given by ′ (Rct ) = (ℓ) and with ct ! " −1/2 (1) ′ −1/2 (M) ′ 1, δ (ct ) , ..., δ (ct ) , being the vector of instantaneous count with delay of Nℓ i.e. (ℓ) ′ (ct ) = P.Reynaud-Bouret ! " (ℓ) (ℓ) N[t−δ,t) , ..., N[t−K δ,t−(K −1)δ) . Hawkes Aussois 2015 39 / 129 Multivariate Hawkes processes and Lasso Parametric estimation An heuristic for the least-square estimator Informally, the link between the point process and its intensity can be written as (r ) dN (r ) (t) ≃ (Rct )′ a∗ dt + noise. P.Reynaud-Bouret Hawkes Aussois 2015 40 / 129 Multivariate Hawkes processes and Lasso Parametric estimation An heuristic for the least-square estimator Informally, the link between the point process and its intensity can be written as (r ) dN (r ) (t) ≃ (Rct )′ a∗ dt + noise. Let G= # T Rct (Rct )′ dt, 0 the integrated covariation of the renormalized instantaneous count. P.Reynaud-Bouret Hawkes Aussois 2015 40 / 129 Multivariate Hawkes processes and Lasso Parametric estimation An heuristic for the least-square estimator Informally, the link between the point process and its intensity can be written as (r ) dN (r ) (t) ≃ (Rct )′ a∗ dt + noise. Let G= # T Rct (Rct )′ dt, 0 the integrated covariation of the renormalized instantaneous count. b(r ) := P.Reynaud-Bouret # T 0 (r ) Rct dN (r ) (t) ≃ Ga∗ + noise. Hawkes Aussois 2015 40 / 129 Multivariate Hawkes processes and Lasso Parametric estimation An heuristic for the least-square estimator Informally, the link between the point process and its intensity can be written as (r ) dN (r ) (t) ≃ (Rct )′ a∗ dt + noise. Let G= # T Rct (Rct )′ dt, 0 the integrated covariation of the renormalized instantaneous count. b(r ) := # T 0 (r ) Rct dN (r ) (t) ≃ Ga∗ + noise. where in b lies again the number of couples with a certain delay (cross-correlogram). P.Reynaud-Bouret Hawkes Aussois 2015 40 / 129 Multivariate Hawkes processes and Lasso Parametric estimation Least-square estimate G= # T Rct (Rct )′ dt, 0 (r ) b := # T Rct dN (r ) (t) 0 Least-square estimate ˆ(r ) = G−1 b(r ) , a → simpler formula than Maximum Likelihood Estimators for similar properties (except efficiency). P.Reynaud-Bouret Hawkes Aussois 2015 41 / 129 Multivariate Hawkes processes and Lasso Parametric estimation What gain wrt cross correlogramm ? 1 5ms 3 2 5ms only 2 non zero interaction functions over 9 P.Reynaud-Bouret Hawkes Aussois 2015 42 / 129 Multivariate Hawkes processes and Lasso Parametric estimation What gain wrt cross correlogramm ? Histogram of dist2 Histogram of dist2 −0.03 −0.02 −0.01 0.00 1−>2 P.Reynaud-Bouret 0.01 0.02 0.03 40 Frequency 0 0 0 20 50 20 40 100 Frequency 60 Frequency 150 80 60 200 100 80 Histogram of dist2 −0.03 −0.02 −0.01 0.00 2−>3 Hawkes 0.01 0.02 0.03 −0.03 −0.02 −0.01 0.00 0.01 0.02 0.03 1−>3 Aussois 2015 42 / 129 Multivariate Hawkes processes and Lasso Parametric estimation 4 0 1 2 interaction 3 4 3 2 interaction 0 1 2 0 1 interaction 3 4 What gain wrt cross correlogramm ? 0.000 0.005 0.010 0.015 0.020 0.025 0.030 0.000 0.005 0.010 0.015 0.020 0.025 0.030 0.000 0.005 0.010 0.015 0.020 0.025 0.030 1−>2 2−>3 1−>3 P.Reynaud-Bouret Hawkes Aussois 2015 42 / 129 Multivariate Hawkes processes and Lasso What can we do against over fitting ? (Adaptation) Adaptive statistics If no model known, one usually wants to consider the largest possible model (piecewise constant with hundreds of parameters etc) If use MLE or OLS, each parameter estimate has a variance ≃ 1/T P.Reynaud-Bouret Hawkes Aussois 2015 43 / 129 Multivariate Hawkes processes and Lasso What can we do against over fitting ? (Adaptation) Adaptive statistics If no model known, one usually wants to consider the largest possible model (piecewise constant with hundreds of parameters etc) If use MLE or OLS, each parameter estimate has a variance ≃ 1/T If there are d parameters, global variance ≃ d/T P.Reynaud-Bouret Hawkes Aussois 2015 43 / 129 Multivariate Hawkes processes and Lasso What can we do against over fitting ? (Adaptation) Adaptive statistics If no model known, one usually wants to consider the largest possible model (piecewise constant with hundreds of parameters etc) If use MLE or OLS, each parameter estimate has a variance ≃ 1/T If there are d parameters, global variance ≃ d/T Over-fitting ! ! ! ! P.Reynaud-Bouret Hawkes Aussois 2015 43 / 129 Multivariate Hawkes processes and Lasso What can we do against over fitting ? (Adaptation) Moindres carrés Penalty and parameter choices Contraste Complexité du modèle P.Reynaud-Bouret Hawkes Aussois 2015 44 / 129 Multivariate Hawkes processes and Lasso What can we do against over fitting ? (Adaptation) Penalty and parameter choices 1 2 Moindres carr 3 P.Reynaud-Bouret Hawkes Aussois 2015 44 / 129 Multivariate Hawkes processes and Lasso What can we do against over fitting ? (Adaptation) Penalty and parameter choices 1 2 Moindres carr 3 1 10 xité du modèle (dimension) P.Reynaud-Bouret Hawkes 270 Aussois 2015 44 / 129 Multivariate Hawkes processes and Lasso What can we do against over fitting ? (Adaptation) Penalty and parameter choices Moindres carrés 1 3 2 Complexité du modèle P.Reynaud-Bouret Hawkes Aussois 2015 44 / 129 Multivariate Hawkes processes and Lasso What can we do against over fitting ? (Adaptation) Moindres carrés Penalty and parameter choices Pénalité Contraste Complexité du modèle P.Reynaud-Bouret Hawkes Aussois 2015 44 / 129 Multivariate Hawkes processes and Lasso What can we do against over fitting ? (Adaptation) Penalty and parameter choices Contraste + Pénalité = Critère pénalisé 1 Moindres carrés 3 2 Pénalité Contraste P.Reynaud-Bouret Complexité du modèle Hawkes Aussois 2015 44 / 129 Multivariate Hawkes processes and Lasso What can we do against over fitting ? (Adaptation) Previous works log-likelihood penalized by AIC (Akaike Criterion) (Ogata, Vere-Jones, Gusto and Schbath) : choose the model with d parameters such that ℓ(λθˆd ) + d Generally works well if few models and largest dimension fixed whereas T → ∞ P.Reynaud-Bouret Hawkes Aussois 2015 45 / 129 Multivariate Hawkes processes and Lasso What can we do against over fitting ? (Adaptation) Previous works log-likelihood penalized by AIC (Akaike Criterion) (Ogata, Vere-Jones, Gusto and Schbath) : choose the model with d parameters such that ℓ(λθˆd ) + d Generally works well if few models and largest dimension fixed whereas T → ∞ least-squares+ ℓ0 penalty, (RB and Schbath) γ(λθˆd ) + cˆd with cˆ data-driven. Proof that if cˆ > κ, it works even if d grows with T (moderately, log(T ) with d) P.Reynaud-Bouret Hawkes Aussois 2015 45 / 129 Multivariate Hawkes processes and Lasso What can we do against over fitting ? (Adaptation) Previous works log-likelihood penalized by AIC (Akaike Criterion) (Ogata, Vere-Jones, Gusto and Schbath) : choose the model with d parameters such that ℓ(λθˆd ) + d Generally works well if few models and largest dimension fixed whereas T → ∞ least-squares+ ℓ0 penalty, (RB and Schbath) γ(λθˆd ) + cˆd with cˆ data-driven. Proof that if cˆ > κ, it works even if d grows with T (moderately, log(T ) with d) → possible mathematically to search for the ”zeros” P.Reynaud-Bouret Hawkes Aussois 2015 45 / 129 Multivariate Hawkes processes and Lasso What can we do against over fitting ? (Adaptation) Previous works log-likelihood penalized by AIC (Akaike Criterion) (Ogata, Vere-Jones, Gusto and Schbath) : choose the model with d parameters such that ℓ(λθˆd ) + d Generally works well if few models and largest dimension fixed whereas T → ∞ least-squares+ ℓ0 penalty, (RB and Schbath) γ(λθˆd ) + cˆd with cˆ data-driven. Proof that if cˆ > κ, it works even if d grows with T (moderately, log(T ) with d) → possible mathematically to search for the ”zeros” Problem only for univariate because computational time and memory size awfully large ! P.Reynaud-Bouret Hawkes Aussois 2015 45 / 129 Multivariate Hawkes processes and Lasso What can we do against over fitting ? (Adaptation) Previous works log-likelihood penalized by AIC (Akaike Criterion) (Ogata, Vere-Jones, Gusto and Schbath) : choose the model with d parameters such that ℓ(λθˆd ) + d Generally works well if few models and largest dimension fixed whereas T → ∞ least-squares+ ℓ0 penalty, (RB and Schbath) γ(λθˆd ) + cˆd Magill with cˆ data-driven. Proof that if cˆ > κ, it works even if d grows with T (moderately, log(T ) with d) → possible mathematically to search for the ”zeros” Problem only for univariate because computational time and memory size awfully large ! → convex criterion P.Reynaud-Bouret Hawkes Aussois 2015 45 / 129 Multivariate Hawkes processes and Lasso What can we do against over fitting ? (Adaptation) Other works Maximum likelihood + exponential formula + ℓ1 ”group Lasso” penalty (Pillow et al. in neuroscience) but no mathematical proof Thresholding + tests for very particular bivariate models, oracle inequality (Sansonnet) P.Reynaud-Bouret Hawkes Aussois 2015 46 / 129 Multivariate Hawkes processes and Lasso Lasso criterion ℓ1 penalty with N.R. Hansen (Copenhagen), and V. Rivoirard (Dauphine) (2012) The Lasso criterion can be expressed independently for each sub-process by : Least Absolute Shrinkage introduced c- byTlbswnanid996)= Lasso criterion . And selection Operate (r ) ˆ(r ) = argmina {γT (λa ) + pen(a)} a = argmina {−2a′ br + a′ Ga + 2(d(r ) )′ |a|} with d′ |a| = lasso ! k dk |ak | ball minimizing ⇐, minimizing klinloutm Under Constraint lannm a. on curves bilinear . solution with ago issue P.Reynaud-Bouret level ofthe Hawkes J form n Aussois 2015 47 / 129 Multivariate Hawkes processes and Lasso Lasso criterion ℓ1 penalty with N.R. Hansen (Copenhagen), and V. Rivoirard (Dauphine) (2012) The Lasso criterion can be expressed independently for each sub-process by : Lasso criterion (r ) ˆ(r ) = argmina {γT (λa ) + pen(a)} a = argmina {−2a′ br + a′ Ga + 2(d(r ) )′ |a|} The crucial choice is the d(r ) , should be data-driven ! The theoretical validation : be able to state that our choice is the best possible choice. The practical validation : on simulated Hawkes processes (done), on simulated neuronal networks, on real data (RNRP Paris 6 work in progress)... P.Reynaud-Bouret Hawkes Aussois 2015 47 / 129 Multivariate Hawkes processes and Lasso Lasso criterion Theoretical Validation Recall that (r ) b # = T Rct dN (r ) (t) 0 and G= # T Rct (Rct )′ dt. 0 Hansen,Rivoirard,RB If G ≥ cI with c > 0 and if # ) *( ( T ( Rct dN (r ) (t) − λ(r ) (t)dt ( ≤ d(r ) , 0 then ⎧ ⎨, , 1 ˆ(r ) ∥2 ≤ ! inf ∥λ(r ) − Rct a ∥λ(r ) − Rct a∥2 + a ⎩ c r r P.Reynaud-Bouret Hawkes ∀r , (r ) (di )2 i ∈supp(a) Aussois 2015 ⎫ ⎬ ⎭ . 48 / 129 Multivariate Hawkes processes and Lasso Lasso criterion Small proof (Univariate case) Let us fix some a and let ηa (t) = Rc′t a, our candidate intensity. "T Since G = 0 Rct (Rct )′ dt, a′ Ga = ||ηa ||2 P.Reynaud-Bouret Hawkes Aussois 2015 49 / 129 Multivariate Hawkes processes and Lasso Lasso criterion Small proof (Univariate case) Let us fix some a and let ηa (t) = Rc′t a, our candidate intensity. "T Since G = 0 Rct (Rct )′ dt, a′ Ga = ||ηa ||2 ||λ − ηˆa ||2 = ||λ||2 + ||ηˆa || − 2 # T ηˆa (t)λ(t)dt # T 2 = ||λ|| + γ(ηˆa ) + 2 ηˆa (t)(dNt − λ(t)dt) 0 0 P.Reynaud-Bouret Hawkes Aussois 2015 49 / 129 Multivariate Hawkes processes and Lasso Lasso criterion Small proof (Univariate case) Let us fix some a and let ηa (t) = Rc′t a, our candidate intensity. "T Since G = 0 Rct (Rct )′ dt, a′ Ga = ||ηa ||2 2 2 ||λ − ηˆa || = ||λ|| + ||ηˆa || − 2 # T ηˆa (t)λ(t)dt # T 2 ηˆa (t)(dNt − λ(t)dt) = ||λ|| + γ(ηˆa ) + 2 0 # T 2 ′ ηˆa (t)(dNt − λ(t)dt) ≤ ||λ|| + γ(ηa ) + 2d (|a| − |ˆ a|) + 2 0 0 P.Reynaud-Bouret Hawkes Aussois 2015 49 / 129 Multivariate Hawkes processes and Lasso Lasso criterion Small proof (Univariate case) Let us fix some a and let ηa (t) = Rc′t a, our candidate intensity. "T Since G = 0 Rct (Rct )′ dt, a′ Ga = ||ηa ||2 2 2 ||λ − ηˆa || = ||λ|| + ||ηˆa || − 2 # T ηˆa (t)λ(t)dt # T = ||λ||2 + γ(ηˆa ) + 2 ηˆa (t)(dNt − λ(t)dt) 0 # T 2 ′ ≤ ||λ|| + γ(ηa ) + 2d (|a| − |ˆ a|) + 2 ηˆa (t)(dNt − λ(t)dt) 0 # T 2 ′ [ηˆa (t) − ηa (t)](dNt − λ(t)dt) ≤ ||λ − ηa || + 2d (|a| − |ˆ a|) + 2 0 0 P.Reynaud-Bouret Hawkes Aussois 2015 49 / 129 Multivariate Hawkes processes and Lasso Lasso criterion Small proof (Univariate case) Let us fix some a and let ηa (t) = Rc′t a, our candidate intensity. "T Since G = 0 Rct (Rct )′ dt, a′ Ga = ||ηa ||2 ||λ − ηˆa ||2 = ||λ||2 + ||ηˆa || − 2 # T ηˆa (t)λ(t)dt # T ηˆa (t)(dNt − λ(t)dt) = ||λ||2 + γ(ηˆa ) + 2 0 # T 2 ′ ≤ ||λ|| + γ(ηa ) + 2d (|a| − |ˆ a|) + 2 ηˆa (t)(dNt − λ(t)dt) 0 6 5# T ′ 2 ′ Rct (dNt − λ(t)dt (ˆ a − a) ≤ ||λ − ηa || + 2d (|a| − |ˆ a|) + 2 0 0 P.Reynaud-Bouret Hawkes Aussois 2015 49 / 129 Multivariate Hawkes processes and Lasso Lasso criterion Small proof (Univariate case) Let us fix some a and let ηa (t) = Rc′t a, our candidate intensity. "T Since G = 0 Rct (Rct )′ dt, a′ Ga = ||ηa ||2 2 2 ||λ − ηˆa || = ||λ|| + ||ηˆa || − 2 # T ηˆa (t)λ(t)dt # T 2 ηˆa (t)(dNt − λ(t)dt) = ||λ|| + γ(ηˆa ) + 2 0 # T ηˆa (t)(dNt − λ(t)dt) ≤ ||λ||2 + γ(ηa ) + 2d′ (|a| − |ˆ a|) + 2 0 0 ˆ| ≤ ||λ − ηa ||2 + 2d′ (|a| − |ˆ a|) + 2d′ |a − a P.Reynaud-Bouret Hawkes Aussois 2015 49 / 129 Multivariate Hawkes processes and Lasso Lasso criterion Small proof (Univariate case)(2) ˆ| ||λ − ηˆa ||2 ≤ ||λ − ηa ||2 + 2d′ (|a| − |ˆ a|) + 2d′ |a − a P.Reynaud-Bouret Hawkes Aussois 2015 50 / 129 Multivariate Hawkes processes and Lasso Lasso criterion Small proof (Univariate case)(2) ˆ| ||λ − ηˆa ||2 ≤ ||λ − ηa ||2 + 2d′ (|a| − |ˆ a|) + 2d′ |a − a , ˆi | ≤ ||λ − ηa ||2 + 4 di |ai − a i ∈supp(a) P.Reynaud-Bouret Hawkes Aussois 2015 50 / 129 Multivariate Hawkes processes and Lasso Lasso criterion Small proof (Univariate case)(2) ˆ| ||λ − ηˆa ||2 ≤ ||λ − ηa ||2 + 2d′ (|a| − |ˆ a|) + 2d′ |a − a , ˆi | ≤ ||λ − ηa ||2 + 4 di |ai − a i ∈supp(a) ⎛ ˆ|| ⎝ ≤ ||λ − ηa ||2 + 4||a − a P.Reynaud-Bouret Hawkes , i ∈supp(a) ⎞1/2 d2i ⎠ Aussois 2015 50 / 129 Multivariate Hawkes processes and Lasso Lasso criterion Small proof (Univariate case)(2) ˆ| ||λ − ηˆa ||2 ≤ ||λ − ηa ||2 + 2d′ (|a| − |ˆ a|) + 2d′ |a − a , ˆi | ≤ ||λ − ηa ||2 + 4 di |ai − a i ∈supp(a) ⎛ ˆ|| ⎝ ≤ ||λ − ηa ||2 + 4||a − a But ˆ||2 ≤ ||a − a , i ∈supp(a) ⎞1/2 d2i ⎠ 1 1 ˆ)′ G(a − a ˆ) = ||ηa − ηˆa ||2 (a − a c c < 2; ||ηa − λ||2 + ||λ − ηˆa ||2 ≤ c P.Reynaud-Bouret Hawkes Aussois 2015 50 / 129 Multivariate Hawkes processes and Lasso Lasso criterion Small proof (Univariate case)(2) KB : ˆ| ||λ − ηˆa ||2 ≤ ||λ − ηa ||2 + 2d′ (|a| − |ˆ a|) + 2d′ |a − a , ˆi | ≤ ||λ − ηa ||2 + 4 di |ai − a ' i ∈supp(a) too Iuvfcuittev ( Futral .eu?iE2thw ⎛ ' ⇐ . Hence for all α > 0 ˆ|| ⎝ ≤ ||λ − ηa ||2 + 4||a − a , i ∈supp(a) ; < 8α ||λ − ηˆa ||2 ≤ ||λ − ηa ||2 + α ||ηa − λ||2 + ||λ − ηˆa ||2 + c P.Reynaud-Bouret Hawkes ⎞1/2 d2i ⎠ , d2i i ∈supp(a) Aussois 2015 50 / 129 Multivariate Hawkes processes and Lasso Lasso criterion What did we proved ? (Univariate) Recall that # b= T Rct dN(t) 0 and G= # T Rct (Rct )′ dt. 0 Hansen,Rivoirard,RB If G ≥ cI with c > 0 and if # ( T ( ( Rct (dN(t) − λ(t)dt) ( ≤ d, 0 then ˆ∥2 ≤ ! inf ∥λ − Rct a a P.Reynaud-Bouret ⎧ ⎨ ⎩ ∥λ − Rct a∥2 + 1 c Hawkes , i ∈supp(a) (di )2 ⎫ ⎬ ⎭ . Aussois 2015 51 / 129 Multivariate Hawkes processes and Lasso Lasso criterion Known support If there is a true parameter a∗ , then λ(t) = Rc′t a∗ and ¯= b # T Rct λ(t)dt = Ga∗ 0 P.Reynaud-Bouret Hawkes Aussois 2015 52 / 129 Multivariate Hawkes processes and Lasso Lasso criterion Known support If there is a true parameter a∗ , then λ(t) = Rc′t a∗ and ¯= b # T Rct λ(t)dt = Ga∗ 0 ˆS If support of a∗ , S, known, the least-square estimate on S → a P.Reynaud-Bouret Hawkes Aussois 2015 52 / 129 Multivariate Hawkes processes and Lasso Lasso criterion Known support If there is a true parameter a∗ , then λ(t) = Rc′t a∗ and ¯= b # T Rct λ(t)dt = Ga∗ 0 ˆS If support of a∗ , S, known, the least-square estimate on S → a ˆS ∥2 = (a∗ − a ˆS )′ G(a∗ − a ˆS ) ∥λ − Rct a ′ −1 ¯ ¯ = (b − b) G (b − b) ≤ P.Reynaud-Bouret 1 ¯ ||b − b||2 c Hawkes Aussois 2015 52 / 129 Multivariate Hawkes processes and Lasso Lasso criterion Oracle inequality Lasso criterion (r ) ˆ(r ) = argmina {γT (λa ) + pen(a)} a = argmina {−2a′ br + a′ Ga + 2(d(r ) )′ |a|} Hansen,Rivoirard,RB If G ≥ cI with c > 0 and if ( (r ) ( (b − b ¯ (r ) ( ≤ d(r ) , ∀r then ⎧ ⎨, , 1 ˆ(r ) ∥2 ≤ ! inf ∥λ(r ) − Rct a ∥λ(r ) − Rct a∥2 + a ⎩ c r r P.Reynaud-Bouret Hawkes , (r ) (di )2 i ∈supp(a) Aussois 2015 ⎫ ⎬ ⎭ . 53 / 129 Multivariate Hawkes processes and Lasso Lasso criterion Comments If we can find d sharp and data-driven, then this is an oracle inequality ! ! : Only an oracle could know the support in advance Up to constant, we can do as well as the oracle P.Reynaud-Bouret Hawkes Aussois 2015 54 / 129 Multivariate Hawkes processes and Lasso Lasso criterion Comments If we can find d sharp and data-driven, then this is an oracle inequality ! ! : Only an oracle could know the support in advance Up to constant, we can do as well as the oracle We do not pay anything here for the size, except G invertible it’s a ”cheap” oracle inequality ... In fact could be done under more relaxed assumption... c large (and we can observe it !) → quite confident for good reconstruction =quality indicator. P.Reynaud-Bouret Hawkes Aussois 2015 54 / 129 Multivariate Hawkes processes and Lasso Lasso criterion Comments If we can find d sharp and data-driven, then this is an oracle inequality ! ! : Only an oracle could know the support in advance Up to constant, we can do as well as the oracle We do not pay anything here for the size, except G invertible it’s a ”cheap” oracle inequality ... In fact could be done under more relaxed assumption... c large (and we can observe it !) → quite confident for good reconstruction =quality indicator. True even if the Hawkes linear model not true ! ! ! and we just pay ”quality of approximation” quality of approximation + unavoidable price due to estimation. P.Reynaud-Bouret Hawkes Aussois 2015 54 / 129 Multivariate Hawkes processes and Lasso Lasso criterion Mathematical ”debts” works for other models (anything that is linear), works for other basis and dictionary ... Choice of d : → martingale calculus G invertible in some cases ? ? ? → branching structure P.Reynaud-Bouret Hawkes Aussois 2015 55 / 129 Multivariate Hawkes processes and Lasso Simulations Simulation study - Estimation (T = 20, M = 8, K = 8) oedema Interactions reconstructed with ’Adaptive Lasso’, ’Bernstein Lasso’ and ’Bernstein Lasso+OLS’. Above graphs, estimation of spontaneous rates P.Reynaud-Bouret Hawkes Aussois 2015 56 / 129 Multivariate Hawkes processes and Lasso Simulations Another example with inhibition P.Reynaud-Bouret Hawkes Aussois 2015 57 / 129 Multivariate Hawkes processes and Lasso Real data analysis On real (genomic) data Spont= 0.000385 8e−04 0e+00 4e−04 interaction 0e+00 −5e−04 interaction 5e−04 Spont= 0.000385 1000 2000 3000 4000 5000 0 1000 2000 3000 bps bps Spont= 9.9e−05 Spont= 9.9e−05 4000 5000 4000 5000 4e−04 interaction −1.0 0e+00 2e−04 0.0 −0.5 interaction 0.5 6e−04 1.0 0 0 1000 2000 3000 4000 5000 bps 0 1000 2000 3000 bps 4290 genes and 1036 tataat of E. coli (T = 9288442, A = 10000) P.Reynaud-Bouret Hawkes Aussois 2015 58 / 129 Multivariate Hawkes processes and Lasso Real data analysis Sensory-motor task (F. Grammont (Nice), A. Riehle (Marseille)) P.Reynaud-Bouret Hawkes Aussois 2015 59 / 129 Multivariate Hawkes processes and Lasso Real data analysis On neuronal data (sensorimotor task) Hz −20 Hz −40 −1.0 −60 0.00 0.01 0.02 0.03 0.04 0.00 0.01 0.02 0.03 0.04 seconds SB= 55.5 SBO= 88.9 SB= 55.5 SBO= 88.9 −150 Hz −50 0 0.0 0.5 1.0 seconds −1.0 Hz 0.0 0.5 1.0 SB= 22.2 SBO= 37.2 0 SB= 22.2 SBO= 37.2 0.00 0.01 0.02 0.03 0.04 seconds 0.00 0.01 0.02 0.03 0.04 seconds 30 trials : monkey trained to touch the correct target when illuminated. Accept the test of Hawkes hypothesis. Work with F. Grammont, V. P.Reynaud-Bouret Hawkes Aussois 2015 60 / 129 Multivariate Hawkes processes and Lasso Real data analysis Tests with C. Tuleau-Malot, F. Grammont (Nice) and V. Rivoirard (Dauphine) (2013) It is possible to test whether a process is a Hawkes process with prescribed interaction functions by using the time-rescaling theorem. (known since Ogata) P.Reynaud-Bouret Hawkes Aussois 2015 61 / 129 Multivariate Hawkes processes and Lasso Real data analysis Tests with C. Tuleau-Malot, F. Grammont (Nice) and V. Rivoirard (Dauphine) (2013) It is possible to test whether a process is a Hawkes process with prescribed interaction functions by using the time-rescaling theorem. (known since Ogata) By using subsampling, it is possible to plug an estimate of the functions and still to test with controlled asymptotic level. P.Reynaud-Bouret Hawkes Aussois 2015 61 / 129 Multivariate Hawkes processes and Lasso Real data analysis Tests with C. Tuleau-Malot, F. Grammont (Nice) and V. Rivoirard (Dauphine) (2013) It is possible to test whether a process is a Hawkes process with prescribed interaction functions by using the time-rescaling theorem. (known since Ogata) By using subsampling, it is possible to plug an estimate of the functions and still to test with controlled asymptotic level. On both previous data sets, the Hawkes hypothesis is accepted (p-values depends on the sub-sample, usually between 20 and 80 %), whereas the homogeneous Poisson hypothesis (i.e. no interactions) is rejected (p-values in 10−4 , 10−16 for the neuronal data) P.Reynaud-Bouret Hawkes Aussois 2015 61 / 129 Multivariate Hawkes processes and Lasso Real data analysis Functional Connectivity ? (Equipe RNRP de Paris 6) P.Reynaud-Bouret Hawkes Aussois 2015 62 / 129 Multivariate Hawkes processes and Lasso Real data analysis Tetrode data on the rat (Equipe RNRP de Paris 6) P.Reynaud-Bouret Hawkes Aussois 2015 63 / 129 Multivariate Hawkes processes and Lasso Real data analysis On neuronal data (vibrissa excitation) Joint work with RNRP (Paris 6). Behavior : vibrissa excitation at low frequency. T = 90.5, M = 10 TT 102 9191 TT 201 99 TT 202 544 TT 204 149 TT 301 15 TT 303 18 TT 401 136 TT 402 282 TT 403 8 TT 404 6 4 40 Comportement 1 ; k 10 ; delta 0.005 ; gamma 1 TT1_02 −> TT1_02 TT2_02 −> TT1_02 TT1_02 −> TT2_02 TT4_01 TT4_03 TT2_04 TT1_02 −> TT4_02 1 10 2 20 TT2_02 Hz TT4_02 TT3_01 TT3_03 TT2_01 0 TT4_04 0 P.Reynaud-Bouret TT1_02 −> TT2_04 30 3 TT1_02 0 Neuron Spikes 1 2 3 4 0.00 Hawkes 0.01 0.02 0.03 0.04 0.05 Aussois 2015 64 / 129 Multivariate Hawkes processes and Lasso Real data analysis Application on real data Data : Neuron Spikes TT 102 9191 TT 201 99 TT 202 544 TT 204 149 TT 301 15 TT 303 18 TT 401 136 TT 402 282 TT 403 8 TT 404 6 TT 201 92 TT 202 585 TT 204 148 TT 301 13 TT 303 23 TT 401 133 TT 402 271 TT 403 8 TT 404 3 Simulation : Neuron Spikes TT 102 9327 P.Reynaud-Bouret Hawkes Aussois 2015 65 / 129 Multivariate Hawkes processes and Lasso Real data analysis Application on real data Data : Neuron Spikes TT 102 9191 TT 201 99 TT 202 544 TT 204 149 TT 301 15 TT 303 18 TT 401 136 TT 402 282 TT 403 8 TT 404 6 TT 201 92 TT 202 585 TT 204 148 TT 301 13 TT 303 23 TT 401 133 TT 402 271 TT 403 8 TT 404 3 Simulation : 40 4 TT 102 9327 TT1_02 −> TT1_02 TT2_02 −> TT1_02 TT4_02 TT2_02 TT4_01 TT4_03 TT2_04 1 10 2 Hz TT1_02 20 30 3 TT1_02 −> TT2_02 TT3_01 TT3_03 TT2_01 0 TT4_04 0 Neuron Spikes 0 1 2 3 4 0.00 0.01 0.02 0.03 0.04 0.05 time in seconds P.Reynaud-Bouret Hawkes Aussois 2015 65 / 129 Multivariate Hawkes processes and Lasso Real data analysis Evolution of the dependance graph as a fonction of the vibrissa excitation 10 25 TT2_14 50 TT2_14 TT2_0F TT4_0A TT2_0F TT4_0A TT4_0A TT4_05 TT4_18 TT4_05 TT4_18 TT3_04 TT2_03 TT1_01 TT2_03 TT1_01 TT2_03 TT1_02 TT1_02 75 100 125 TT2_14 TT2_14 TT2_14 TT2_0F TT4_0A TT2_0F TT4_0A TT4_0A TT4_05 TT4_05 TT4_18 TT3_04 TT2_03 TT4_05 TT4_18 TT3_04 TT1_01 TT3_04 TT1_01 TT2_03 TT1_01 TT2_03 TT1_02 TT1_02 TT1_02 150 175 200 TT2_14 TT2_14 TT2_0F TT2_14 TT2_0F TT4_0A TT2_0F TT4_0A TT4_05 P.Reynaud-Bouret TT4_0A TT4_05 TT4_18 TT3_04 TT1_01 TT2_03 TT3_04 TT1_02 TT2_0F TT4_18 TT4_05 TT4_18 TT3_04 TT1_01 TT4_18 TT2_14 TT2_0F TT4_05 TT4_18 TT3_04 TT1_01 TT2_03 Hawkes TT3_04 TT1_01 TT2_03 Aussois 2015 66 / 129 Probabilistic ingredients Table of Contents 1 Point process and Counting process 2 Multivariate Hawkes processes and Lasso 3 Probabilistic ingredients Exponential inequalities (Concentration of measure) Controls via a branching structure 4 Back to Lasso 5 PDE and point processes P.Reynaud-Bouret Hawkes Aussois 2015 67 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Table of Contents 1 Point process and Counting process 2 Multivariate Hawkes processes and Lasso 3 Probabilistic ingredients Exponential inequalities (Concentration of measure) The Poisson case The martingale case Back to Lasso Controls via a branching structure 4 Back to Lasso 5 PDE and point processes P.Reynaud-Bouret Hawkes Aussois 2015 68 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Probability generating functional and Laplace transform p.g.fl For any point process N, the functional which associates to any positive function h + E( h(x)) x∈N Laplace functional For any point process N, the functional which associates to any function f 5 =# >6 f (x)dNx E exp . equivalence with f = log(h). P.Reynaud-Bouret Hawkes Aussois 2015 69 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Campbell’s theorem Let N be a Poisson process with mean measure µ, Poisson processes for all integer n, for all A1 , . . . , An disjoint measurable subsets of X, NA1 , . . . , NAn are independent random variables. for all measurable subset A of X, NA obeys a Poisson law with parameter depending on A and denoted µ(A). P.Reynaud-Bouret Hawkes Aussois 2015 70 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Campbell’s theorem Let N be a Poisson process with mean measure µ, typically, dµt = λ(t)dt, with λ deterministic, P.Reynaud-Bouret Hawkes Aussois 2015 70 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Campbell’s theorem Let N be a Poisson process with mean measure µ, typically, dµt = λ(t)dt, with λ deterministic, Campbell’s theorem " If f is such that min(|f (x)|, 1)dµx < ∞, then 6 5 =# >6 5# ? @ f (x) e − 1 dµx . = exp f (x)dNx E exp P.Reynaud-Bouret Hawkes Aussois 2015 70 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Small proof Lz Take f piecewise constant, f = , αI 1I . I 5 E exp =# f (x)dNx >6 = E + I = + I = + I = exp P.Reynaud-Bouret e αI NI . In D: Iz I } 24 In ) * E e αI NI (by independence) exp ((e αI − 1)µI ) (Laplace of a Poisson) 5# (e Hawkes f (x) − 1)dµx 6 Aussois 2015 71 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Exponential inequality for Poisson process Aim For all ξ > 0, P 5# 6 f (x)[dNx − λ(x)dx] ≥???(y ) ≤ e −y NB : easier this way for interpretation in statistics... (quantile) Let f be some fixed function (deterministic) and let θ > 0. Apply Campbell’s theorem to θf : >6 5# 5 =# 6 θf (x) (e − θf (x) − 1)dµx . θf (x)[dNx − λ(x)dx] = exp E exp P.Reynaud-Bouret Hawkes Aussois 2015 72 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Exponential inequality for Poisson process Aim For all ξ > 0, P 5# 6 f (x)[dNx − λ(x)dx] ≥???(y ) ≤ e −y NB : easier this way for interpretation in statistics... (quantile) Let f be some fixed function (deterministic) and let θ > 0. Apply Campbell’s theorem to θf : >6 5# 5 =# 6 θf (x) (e − θf (x) − 1)dµx . θf (x)[dNx − λ(x)dx] = exp E exp Hence if E = exp P.Reynaud-Bouret ;" < " θf (x)[dNx − λ(x)dx] − (e θf (x) − θf (x) − 1)dµx , E(E ) = 1. Hawkes Aussois 2015 72 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Towards a (weak) Bernstein inequality... Therefore for all y > 0 5# 6 # θf (x) P θf (x)[dNx − λ(x)dx] ≥ (e − θf (x) − 1)dµx + y = P(E ≥ e y ) P.Reynaud-Bouret Hawkes Aussois 2015 73 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Towards a (weak) Bernstein inequality... Therefore for all y > 0 5# 6 # θf (x) P θf (x)[dNx − λ(x)dx] ≥ (e − θf (x) − 1)dµx + y = P(E ≥ e y ) By Markov, 5# 6 # θf (x) θf (x)[dNx − λ(x)dx] ≥ (e − θf (x) − 1)dµx + y ≤ E(E )e −y . P P.Reynaud-Bouret Hawkes Aussois 2015 73 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Towards a (weak) Bernstein inequality... Therefore for all y > 0 5# 6 # θf (x) P θf (x)[dNx − λ(x)dx] ≥ (e − θf (x) − 1)dµx + y = P(E ≥ e y ) Hence for all θ, y > 0 5# 6 # 1 θf (x) f (x)[dNx − λ(x)dx] ≥ P (e − θf (x) − 1)dµx + y /θ ≤ e −y . θ P.Reynaud-Bouret Hawkes Aussois 2015 73 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Towards a (weak) Bernstein inequality... Therefore for all y > 0 5# 6 # θf (x) P θf (x)[dNx − λ(x)dx] ≥ (e − θf (x) − 1)dµx + y = P(E ≥ e y ) Hence for all θ, y > 0 5# 6 # 1 θf (x) P f (x)[dNx − λ(x)dx] ≥ (e − θf (x) − 1)dµx + y /θ ≤ e −y . θ But e θf (x) − θf (x) − 1 ≤ ≤ Let v = " f (x)2 dµx . P.Reynaud-Bouret Hawkes ∞ , θk k=2 k! k−2 f (x)2 ||f ||∞ θ 2 f (x)2 ) 2 1− θ||f ||∞ 3 * Aussois 2015 73 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Towards a (weak) Bernstein inequality...(2) " Let v = f (x)2 λ(x)dx. Hence for all θ, y > 0 ⎛ # P ⎝ f (x)[dNx − λ(x)dx] ≥ P.Reynaud-Bouret θv ) 2 1− Hawkes θ||f ||∞ 3 ⎞ * + y /θ ⎠ ≤ e −y Aussois 2015 74 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Towards a (weak) Bernstein inequality...(2) " Let v = f (x)2 λ(x)dx. Hence for all θ, y > 0 ⎛ # P ⎝ f (x)[dNx − λ(x)dx] ≥ θv ) 2 1− θ||f ||∞ 3 Nd Optimum in θ = g (v , ||f ||∞ , y ) and : ⎞ * + y /θ ⎠ ≤ e −y si Putt 4 Xnorlmf e- ¥ Pam , Frett , ' Exponential inequality ”`a la” Bernstein ie For all y > 0, 5# 6 A ||f ||∞ −y P f (x)[dNx − λ(x)dx] ≥ 2vy + y ≤e 3 );" C B" <2 * NB : v = E = Var f (x)dNx . Hence first f (x)[dNx − λ(x)dx] order Gaussian tail with variance v P.Reynaud-Bouret Hawkes Aussois 2015 74 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Aim same kind of exponential inequality for Hawkes (or other general counting process) f (.) deterministic has to be replaced by Ht (ex : Rct ) predictable " The ”variance” term v should follow Hs dNs → expectation given the past. P.Reynaud-Bouret Hawkes Aussois 2015 75 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Aim same kind of exponential inequality for Hawkes (or other general counting process) f (.) deterministic has to be replaced by Ht (ex : Rct ) predictable " The ”variance” term v should follow Hs dNs → expectation given the past. optional : v still depends on λ, unknown → as to be replaced by observable quantity. P.Reynaud-Bouret Hawkes Aussois 2015 75 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Martingale Let N counting process with (predictable) intensity λ. Let Hs be any predictable process and t > u, then ( ( E(Ht dNt ( past at t) = Ht E(dNt | past at t) = Ht λ(t)dt P.Reynaud-Bouret Hawkes Aussois 2015 76 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Martingale Let N counting process with (predictable) intensity λ. Let Hs be any predictable process and t > u, then ( ( E(Ht dNt ( past at t) = Ht E(dNt | past at t) = Ht λ(t)dt E 5# t 0 6 # ( ( Hs [dNs − λ(s)ds]( past at u = u 0 Hs [dNs − λ(s)ds] → martingale. P.Reynaud-Bouret Hawkes Aussois 2015 76 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Bracket A Stieljes integration by part : whatever the bounded variation c`ad-l`ag functions (i.e. works for dNt , λ(t)dt or dNt − λ((t)dt) # t # t g (s − )dfs f (t)g (t) = f (0)g (0) + f (s)dgs + 0 0 dg : df t get " a f P.Reynaud-Bouret Hawkes Aussois 2015 77 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Bracket A Stieljes integration by part : whatever the bounded variation c`ad-l`ag functions (i.e. works for dNt , λ(t)dt or dNt − λ((t)dt) # t # t g (s − )dfs f (t)g (t) = f (0)g (0) + f (s)dgs + 0 0 "t Hence, with f (t) = g (t) = 0 Hs [dNs − λ(s)ds], 5# t 62 Hs [dNs − λ(s)ds] = 0 # t $# s 0 0 % % # t $# s− Hu [dNu − λ(u)du] Hs [dNs − λ(s)ds] + Hu [dNu − λ(u)du] Hs [dNs − λ(s)ds] P.Reynaud-Bouret 0 Hawkes 0 Aussois 2015 77 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Bracket A Stieljes integration by part : whatever the bounded variation c`ad-l`ag functions (i.e. works for dNt , λ(t)dt or dNt − λ((t)dt) # t # t g (s − )dfs f (t)g (t) = f (0)g (0) + f (s)dgs + 0 0 "t Hence, with f (t) = g (t) = 0 Hs [dNs − λ(s)ds], 5# t 62 Hs [dNs − λ(s)ds] = 0 # t $# ;s 0 0 % % # t $# s− i . Hu [dNu − λ(u)du] Hs [dNs − λ(s)ds] + Hu [dNu − λ(u)du] Hs [dNs − λ(s)ds] 0 = # 0 t 0 Hs2 dNs + martingale "t → compensator (≃ variance given the past) Vt = 0 Hs2 λ(s)ds 62 5# t Hs [dNs − λ(s)ds] − Vt martingale predictable and P.Reynaud-Bouret 0 Hawkes Aussois 2015 77 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Exponential martingale Aim : the martingale equivalent of Campbell’s theorem. P.Reynaud-Bouret Hawkes Aussois 2015 78 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Exponential martingale fi N Poison (7) HIT Aim : the martingale equivalent of Campbell’s theorem. 5# t 6 # t Hs Et = exp Hs [dNs − λ(s)ds] − (e − Hs − 1)λ(s)ds 0 0 is the unique solution of Et = E0 + # 0 t Es− (e Hs − 1)[dNs − λ(s)ds]. - P.Reynaud-Bouret Hawkes Aussois 2015 78 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Exponential martingale Aim : the martingale equivalent of Campbell’s theorem. 5# t 6 # t Hs Et = exp Hs [dNs − λ(s)ds] − (e − Hs − 1)λ(s)ds 0 0 is the unique solution of Et = E0 + # 0 t Es− (e Hs − 1)[dNs − λ(s)ds]. Hence martingale and E(Et ) = E0 = 1. NB : eventually, integrability problems, so E(Et ) ≤ 1... P.Reynaud-Bouret Hawkes Aussois 2015 78 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Exponential inequality `a la Bernstein Hs → θHs and the corresponding Et → E in the Poisson Bernstein proof : for all θ, y > 0 ⎛ # t ⎝ P Hs [dNs − λ(s)ds] ≥ 0 P.Reynaud-Bouret ) θVt 2 1− Hawkes θ||H||∞ 3 ⎞ * + y /θ ⎠ ≤ e −y Aussois 2015 79 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Exponential inequality `a la Bernstein Hs → θHs and the corresponding Et → E in the Poisson Bernstein proof : for all θ, y > 0 ⎛ # t ⎝ P Hs [dNs − λ(s)ds] ≥ 0 ) θVt 2 1− θ||H||∞ 3 ⎞ * + y /θ ⎠ ≤ e −y Assuming ||H||∞ ≤ b deterministic and known, not a big deal since second order term, but assuming Vt ≤ v and replacing Vt by v deterministic is not sharp at all ! However optimising in θ needs a θ deterministic (because ultimately Et depends on θ) P.Reynaud-Bouret Hawkes Aussois 2015 79 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Slicing Let S = {w ≤ Vt ≤ v = (1 + ϵ)K w } and consider the slice Sk = {(1 + ϵ)k w ≤ Vt ≤ (1 + ϵ)k+1 w )} P.Reynaud-Bouret Hawkes Aussois 2015 80 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Slicing Let S = {w ≤ Vt ≤ v = (1 + ϵ)K w } and consider the slice Sk = {(1 + ϵ)k w ≤ Vt ≤ (1 + ϵ)k+1 w )} Then P -# t 0 θVt C + y /θ and Sk Hs [dNs − λ(s)ds] ≥ B 2 1 − θb 3 P.Reynaud-Bouret Hawkes . ≤ e −y Aussois 2015 80 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Slicing Let S = {w ≤ Vt ≤ v = (1 + ϵ)K w } and consider the slice Sk = {(1 + ϵ)k w ≤ Vt ≤ (1 + ϵ)k+1 w )} Then P P -# -# 0 t 0 t θVt C + y /θ and Sk Hs [dNs − λ(s)ds] ≥ B 2 1 − θb 3 . θ(1 + ϵ)k+1 w B C + y /θ and Sk Hs [dNs − λ(s)ds] ≥ 2 1 − θb 3 P.Reynaud-Bouret Hawkes ≤ e −y . ≤ e −y Aussois 2015 80 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Slicing Let S = {w ≤ Vt ≤ v = (1 + ϵ)K w } and consider the slice Sk = {(1 + ϵ)k w ≤ Vt ≤ (1 + ϵ)k+1 w )} Then P P -# -# 0 t 0 t θVt C + y /θ and Sk Hs [dNs − λ(s)ds] ≥ B 2 1 − θb 3 . θ(1 + ϵ)k+1 w B C + y /θ and Sk Hs [dNs − λ(s)ds] ≥ 2 1 − θb 3 ≤ e −y . ≤ e −y Here choose θ = g ((1 + ϵ)k+1 w , b, y ) optimising ... 6 5# t D by k+1 and Sk ≤ e −y Hs [dNs − λ(s)ds] ≥ 2(1 + ϵ) wy + P 3 0 P.Reynaud-Bouret Hawkes Aussois 2015 80 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Slicing Let S = {w ≤ Vt ≤ v = (1 + ϵ)K w } and consider the slice Sk = {(1 + ϵ)k w ≤ Vt ≤ (1 + ϵ)k+1 w )} Then P P -# -# 0 t 0 t θVt C + y /θ and Sk Hs [dNs − λ(s)ds] ≥ B 2 1 − θb 3 . θ(1 + ϵ)k+1 w B C + y /θ and Sk Hs [dNs − λ(s)ds] ≥ 2 1 − θb 3 ≤ e −y . ≤ e −y Here choose θ = g ((1 + ϵ)k+1 w , b, y ) optimising ... 6 5# t D by k+1 and Sk ≤ e −y Hs [dNs − λ(s)ds] ≥ 2(1 + ϵ) wy + P 3 0 5# t 6 A by Hs [dNs − λ(s)ds] ≥ 2(1 + ϵ)Vt y + P and Sk ≤ e −y 3 0 P.Reynaud-Bouret Hawkes Aussois 2015 80 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Almost there Grouping the slices 5# t 6 A by P Hs [dNs − λ(s)ds] ≥ 2(1 + ϵ)Vt y + and S ≤ Ke −y 3 0 P.Reynaud-Bouret Hawkes Aussois 2015 81 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Almost there Grouping the slices 5# t 6 A by P Hs [dNs − λ(s)ds] ≥ 2(1 + ϵ)Vt y + and S ≤ Ke −y 3 0 "t But Vt = 0 Hs2 λ(s)ds, still depends on the unknown λ P.Reynaud-Bouret Hawkes Aussois 2015 81 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Almost there Grouping the slices 5# t 6 A by P Hs [dNs − λ(s)ds] ≥ 2(1 + ϵ)Vt y + and S ≤ Ke −y 3 0 "t But Vt = 0 Hs2 λ(s)ds, still depends on the unknown λ → one more turn using the fact that # t ˆ Hs2 dNs Vt = 0 is observable and P.Reynaud-Bouret Vˆt − Vt = martingale Hawkes Aussois 2015 81 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Bernstein-type inequality for general counting process (Hansen, RB, Rivoirard) "t Let (Hs )s≥0 be a predictable process and Mt = 0 Hs (dNs − λ(s)ds). Let b > 0 and v > w > 0. For all x, µ > 0 such that µ > φ(µ), let # τ µ b2x µ ˆ Vτ = Hs2 dNs + , µ − φ(µ) 0 µ − φ(µ) where φ(u) = exp(u) − u − 1. Then for every stopping time τ and every ε > 0 . D µ µ P Mτ ≥ 2(1 + ε)Vˆτ x + bx/3, w ≤ Vˆτ ≤ v and sup |Hs | ≤ b - s∈[0,τ ] ≤2 log(v /w ) −x e . log(1 + ε) " a generic choice of d for the Lasso whatever the underlying process.... P.Reynaud-Bouret Hawkes Aussois 2015 82 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Back to Lasso for Hawkes In the Hawkes case, Hs → Rcs , renormalized counts in small intervals. Hence there is no absolute v or b. P.Reynaud-Bouret Hawkes Aussois 2015 83 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Back to Lasso for Hawkes In the Hawkes case, Hs → Rcs , renormalized counts in small intervals. Hence there is no absolute v or b. If controlled number of points in small intervals via exponential inequality, then can find v and b such that P(Vt ≥ v or ||H||∞ ≥ b) exponentially small P.Reynaud-Bouret Hawkes Aussois 2015 83 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Back to Lasso for Hawkes In the Hawkes case, Hs → Rcs , renormalized counts in small intervals. Hence there is no absolute v or b. If controlled number of points in small intervals via exponential inequality, then can find v and b such that P(Vt ≥ v or ||H||∞ ≥ b) exponentially small i : Used to stop the martingale before Vt reaches level v (same for b) one can always choose w = martingale on this side. P.Reynaud-Bouret b2 x µ−φ(µ) , Hawkes nice since cannot stop the Aussois 2015 83 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Back to Lasso for Hawkes In the Hawkes case, Hs → Rcs , renormalized counts in small intervals. Hence there is no absolute v or b. If controlled number of points in small intervals^ via exponential inequality, then can find v and b such that P(VMt ≥ v or ||H||∞ ≥ b) exponentially small ^ Use to stop the martingale before Vt reaches level v (same for b) µ one can always choose w = martingale on this side. P.Reynaud-Bouret b2 x µ−φ(µ) , Hawkes nice since cannot stop the Aussois 2015 83 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Lasso criterion (r ) ˆ(r ) = argmina {γT (λa ) + pen(a)} a = argmina {−2a′ br + a′ Ga + 2(d(r ) )′ |a|} M number of interacting processes, K number of bins, A maximal interaction delay. Oracle inequality in probability D If d = 2(1 + ε)VˆTµ x + bx/3 then with probability larger than i > ! log(T )2 ) − P(G ̸≥ cI ) 1 − !M 2 K log(T )2 e −x − P(∀t < T , i , N[t−A,t] P.Reynaud-Bouret Hawkes Aussois 2015 84 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Lasso criterion (r ) ˆ(r ) = argmina {γT (λa ) + pen(a)} a = argmina {−2a′ br + a′ Ga + 2(d(r ) )′ |a|} M number of interacting processes, K number of bins, A maximal interaction delay. Oracle inequality in probability D If d = 2(1 + ε)VˆTµ x + bx/3 then with probability larger than i > ! log(T )2 ) − P(G ̸≥ cI ) 1 − !M 2 K log(T )2 e −x − P(∀t < T , i , N[t−A,t] , r ˆ(r ) ∥2 ≤ ! inf ∥λ(r ) − Rct a a P.Reynaud-Bouret ⎧ ⎨, ⎩ r 1 ∥λ(r ) − Rct a∥2 + c Hawkes , (r ) (di )2 i ∈supp(a) Aussois 2015 ⎫ ⎬ ⎭ . 84 / 129 Probabilistic ingredients Exponential inequalities (Concentration of measure) Mathematical debts Control of the number of points per interval Control of c the smallest eigenvalue of G Choice of x. P.Reynaud-Bouret Hawkes Aussois 2015 85 / 129 Probabilistic ingredients Controls via a branching structure Table of Contents 1 Point process and Counting process 2 Multivariate Hawkes processes and Lasso 3 Probabilistic ingredients Exponential inequalities (Concentration of measure) Controls via a branching structure Branching structure Number of points Control of G 4 Back to Lasso 5 PDE and point processes P.Reynaud-Bouret Hawkes Aussois 2015 86 / 129 Probabilistic ingredients Controls via a branching structure A branching representation of the linear Hawkes model (univariate) Linear Hawkes process= Branching process on a Poisson process Start = homogeneous Poisson (ν) = ancestors P.Reynaud-Bouret Hawkes Aussois 2015 87 / 129 Probabilistic ingredients Controls via a branching structure A branching representation of the linear Hawkes model (univariate) Linear Hawkes process= Branching process on a Poisson process Start = homogeneous Poisson (ν) = ancestors Each point generates children according to a Poisson process of intensity h P.Reynaud-Bouret Hawkes Aussois 2015 87 / 129 Probabilistic ingredients Controls via a branching structure A branching representation of the linear Hawkes model (univariate) Linear Hawkes process= Branching process on a Poisson process Start = homogeneous Poisson (ν) = ancestors Each point generates children according to a Poisson process of intensity h P.Reynaud-Bouret Hawkes Aussois 2015 87 / 129 Probabilistic ingredients Controls via a branching structure A branching representation of the linear Hawkes model (univariate) Linear Hawkes process= Branching process on a Poisson process Start = homogeneous Poisson (ν) = ancestors Each point generates children according to a Poisson process of intensity h " Again and again until extinction (almost sure when h < 1) P.Reynaud-Bouret Hawkes Aussois 2015 87 / 129 Probabilistic ingredients Controls via a branching structure A branching representation of the linear Hawkes model (univariate) Linear Hawkes process= Branching process on a Poisson process Start = homogeneous Poisson (ν) = ancestors Each point generates children according to a Poisson process of intensity h " Again and again until extinction (almost sure when h < 1) P.Reynaud-Bouret Hawkes Aussois 2015 87 / 129 Probabilistic ingredients Controls via a branching structure A branching representation of the linear Hawkes model (univariate) Linear Hawkes process= Branching process on a Poisson process Start = homogeneous Poisson (ν) = ancestors Each point generates children according to a Poisson process of intensity h " Again and again until extinction (almost sure when h < 1) P.Reynaud-Bouret Hawkes Aussois 2015 87 / 129 Probabilistic ingredients Controls via a branching structure A branching representation of the linear Hawkes model (univariate) Linear Hawkes process= Branching process on a Poisson process Start = homogeneous Poisson (ν) = ancestors Each point generates children according to a Poisson process of intensity h " Again and again until extinction (almost sure when h < 1) Hawkes= Final process without colors P.Reynaud-Bouret Hawkes Aussois 2015 87 / 129 Probabilistic ingredients Controls via a branching structure A branching representation of the linear Hawkes model (multivariate) not :# :# . Ahn "f%d Mh2→n hzpN Mhm " N , a ancestors P.Reynaud-Bouret Hawkes Aussois 2015 88 / 129 Probabilistic ingredients Controls via a branching structure A branching representation of the linear Hawkes model (multivariate) b. . " "' iii ft HEIKKINEN Eiiiifta An his , first P.Reynaud-Bouret N , generation Hawkes Aussois 2015 88 / 129 Probabilistic ingredients Controls via a branching structure A branching representation of the linear Hawkes model (multivariate) kaka . : - t.tn Ahn An . . . µ An N , N , 2nd generation P.Reynaud-Bouret Hawkes Aussois 2015 88 / 129 Probabilistic ingredients Controls via a branching structure A branching representation of the linear Hawkes model (multivariate) Hstkl i µ, Ni ski ÷ A , : pp : . . , . 3rd P.Reynaud-Bouret n Hawkes generation Aussois 2015 88 / 129 Probabilistic ingredients Controls via a branching structure A branching representation of the linear Hawkes model (multivariate) . Qdustaa = fh£dxas : P.Reynaud-Bouret . . Hawkes Ni Nz Aussois 2015 88 / 129 Probabilistic ingredients Controls via a branching structure Univariate and multivariate clusters Cluster process ancestor in 0 + all its family 0 P.Reynaud-Bouret t H Hawkes Aussois 2015 89 / 129 Probabilistic ingredients Controls via a branching structure Univariate and multivariate clusters Cluster process ancestor in 0 + all its family 0 t H If multivariate, the distribution of the cluster depends on the type of the ancestor. P.Reynaud-Bouret Hawkes Aussois 2015 89 / 129 Probabilistic ingredients Controls via a branching structure Number of points of one cluster If ancestor of type ℓ → Eℓ . K (n) vector of the number of points per type in the nth generation If ancestor of type ℓ, K (0) = eℓ P.Reynaud-Bouret Hawkes Aussois 2015 90 / 129 Probabilistic ingredients Controls via a branching structure Number of points of one cluster If ancestor of type ℓ → Eℓ . K (n) vector of the number of points per type in the nth generation If ancestor of type ℓ, K (0) = eℓ ! W (n) = n0 K (n) : number of points in the cluster per type up to generation n j P.Reynaud-Bouret Hawkes Aussois 2015 90 / 129 Probabilistic ingredients Controls via a branching structure Number of points of one cluster If ancestor of type ℓ → Eℓ . K (n) vector of the number of points per type in the nth generation If ancestor of type ℓ, K (0) = eℓ ! W (n) = n0 K (n) : number of points in the cluster per type up to g generation n Has W = W (∞) a Laplace transform ? i.e. can we find a vector θ of positive coordinates such that ) ′ * Eℓ e θ W (∞) < ∞ P.Reynaud-Bouret Hawkes Aussois 2015 90 / 129 Probabilistic ingredients Controls via a branching structure Number of points of one cluster If ancestor of type ℓ → Eℓ . K (n) vector of the number of points per type in the nth generation If ancestor of type ℓ, K (0) = eℓ ! W (n) = n0 K (n) : number of points in the cluster per type up to j generation n Has W = W (∞) a Laplace transform ? i.e. can we find a vector θ of positive coordinates such that ) ′ * Eℓ e θ W (∞) < ∞ If yes, * * ) ′ ) Pℓ (Ntot ≥ x) ≤ Eℓ e mini (θi )Ntot e − mini (θi )x ≤ Eℓ e θ W (∞) e − mini (θi )x exponentially small P.Reynaud-Bouret Hawkes Aussois 2015 90 / 129 Probabilistic ingredients Controls via a branching structure Laplace transform for the cluster φ(θ)′ = (φ1 (θ), ..., φM (θ)), ′ φℓ (θ) = log Eℓ (e θ K (1) ). P.Reynaud-Bouret Hawkes Aussois 2015 91 / 129 Probabilistic ingredients Controls via a branching structure Laplace transform for the cluster φ(θ)′ = (φ1 (θ), ..., φM (θ)), ′ φℓ (θ) = log Eℓ (e θ K (1) ). ( ) ′ * ) ′ ? ′ @* ( Eℓ e θ W (n) = Eℓ e θ W (n−1) E e θ K (n) ( generations < n P.Reynaud-Bouret Hawkes Aussois 2015 91 / 129 Probabilistic ingredients Controls via a branching structure Laplace transform for the cluster φ(θ)′ = (φ1 (θ), ..., φM (θ)), ′ φℓ (θ) = log Eℓ (e θ K (1) ). ( ) ′ * ) ′ ? ′ @* ( Eℓ e θ W (n) = Eℓ e θ W (n−1) E e θ K (n) ( generations < n ) ′ * ′ = Eℓ e θ W (n−1) e φ(θ) K (n−1) = exp(un (θ)ℓ ) with un (θ) = θ + φ(un−1 (θ)), u0 (θ) = θ P.Reynaud-Bouret Hawkes Aussois 2015 91 / 129 Probabilistic ingredients Controls via a branching structure Laplace transform for the cluster(2) If φ local contraction for a certain norm ||.|| : there exists r > 0 and C < 1 st if ||s|| < r then ||φ(s)|| ≤ C ||s||. P.Reynaud-Bouret Hawkes Aussois 2015 92 / 129 Probabilistic ingredients Controls via a branching structure Laplace transform for the cluster(2) If φ local contraction for a certain norm ||.|| : there exists r > 0 and C < 1 st if ||s|| < r then ||φ(s)|| ≤ C ||s||. If ||θ|| ≤ r (1 − C ), ||un (θ)|| ≤ ||θ|| + ||φ(un−1 (θ))|| ≤ ||θ|| + C ||un−1 (θ)|| P.Reynaud-Bouret Hawkes Aussois 2015 92 / 129 Probabilistic ingredients Controls via a branching structure Laplace transform for the cluster(2) If φ local contraction for a certain norm ||.|| : there exists r > 0 and C < 1 st if ||s|| < r then ||φ(s)|| ≤ C ||s||. If ||θ|| ≤ r (1 − C ), ||un (θ)|| ≤ ||θ|| + ||φ(un−1 (θ))|| ≤ ||θ|| + C ||un−1 (θ)|| ≤ ||θ||(1 + C + ... + C n ) ||θ|| ≤ ≤r 1−C Hence each coordinate of un (θ) increases (since W (n) increases1 but remains in a compact set. Therefore it converges and for θ small enough, W (∞) has a Laplace transform. P.Reynaud-Bouret Hawkes Aussois 2015 92 / 129 Probabilistic ingredients Controls via a branching structure Contraction ? K (1) has a Laplace transform (only thing needed ⇐= it’s Poisson). ′ φℓ (θ) = log Eℓ (e θ K (1) ). " (m) Hence ∂φ(0) = Γ = ( hℓ )ℓ,m with spectral radius < 1. P.Reynaud-Bouret Hawkes Aussois 2015 93 / 129 Probabilistic ingredients Controls via a branching structure Contraction ? K (1) has a Laplace transform (only thing needed ⇐= it’s Poisson). ′ φℓ (θ) = log Eℓ (e θ K (1) ). " (m) Hence ∂φ(0) = Γ = ( hℓ )ℓ,m with spectral radius < 1. Houdeholder theorem He there exists a norm on RM s.t. the associated operator norm of Γ < 1 P.Reynaud-Bouret Hawkes Aussois 2015 93 / 129 Probabilistic ingredients Controls via a branching structure Contraction ? K (1) has a Laplace transform (only thing needed ⇐= it’s Poisson). ′ φℓ (θ) = log Eℓ (e θ K (1) ). " (m) Hence ∂φ(0) = Γ = ( hℓ )ℓ,m with spectral radius < 1. 115L Houdeholder theorem there exists a norm on RM s.t. the associated operator norm of Γ < 1 By continuity (φ infinitely differentiable in a neighborhood of 0), ∃c ∈ (0, 1), ξ > 0, P.Reynaud-Bouret ∀||s|| < ξ, Hawkes ||∂φ(s)|| ≤ c. Aussois 2015 93 / 129 Probabilistic ingredients Controls via a branching structure Contraction ? K (1) has a Laplace transform (only thing needed ⇐= it’s Poisson). ′ φℓ (θ) = log Eℓ (e θ K (1) ). " (m) Hence ∂φ(0) = Γ = ( hℓ )ℓ,m with spectral radius < 1. Houdeholder theorem ill there exists a norm on RM s.t. the associated operator norm of Γ < 1 By continuity (φ infinitely differentiable in a neighborhood of 0), ∃c ∈ (0, 1), ξ > 0, ∀||s|| < ξ, ||∂φ(s)|| ≤ c. Since φ(0) = 0, by continuity ∃C ∈ (0, 1), r > 0, P.Reynaud-Bouret ∀||s|| < r , Hawkes ||∂φ(s)|| ≤ C ||s||. p Aussois 2015 93 / 129 Probabilistic ingredients Controls via a branching structure Number of points for Hawkes processes per interval Assume stationary (spectral radius of Γ < 1) h has bounded support. P.Reynaud-Bouret Hawkes Aussois 2015 94 / 129 Probabilistic ingredients Controls via a branching structure Number of points for Hawkes processes per interval Assume stationary (spectral radius of Γ < 1) A h has bounded support. . . on ancestors ' 't . - . . . . 0 A . . Jlhtwvu N˜A,ℓ number of points in [−A, 0) whose ancestor is of type ℓ theedatleastonePntmgnnaeAtohqandwanunEsEt-o@munbuofpimlsjimnthgetypeeanastnkinC.n . ;D P.Reynaud-Bouret N˜A,ℓ ≤ An ,, n k=1 max{Wn,k − n + ⌈A⌉, 0} Hawkes Aussois 2015 94 / 129 Probabilistic ingredients Controls via a branching structure Number of points for Hawkes processes per interval(2) B C Let Hn (θℓ ) = Eℓ e θℓ max{W −n+⌈A⌉,0} , then * ) + ˜ ≤ E(Hn (θℓ )An ) E e θℓ NA,ℓ n ≤ + n exp (νℓ (Hn (θℓ ) − 1)) Since W has a Laplace transform, Hn exponentially decreasing with n and it converges. P.Reynaud-Bouret Hawkes Aussois 2015 95 / 129 Probabilistic ingredients Controls via a branching structure Number of points for Hawkes processes per interval(2) B C Let Hn (θℓ ) = Eℓ e θℓ max{W −n+⌈A⌉,0} , then * ) + ˜ ≤ E(Hn (θℓ )An ) E e θℓ NA,ℓ n ≤ + n exp (νℓ (Hn (θℓ ) − 1)) Since W has a Laplace transform, Hn exponentially decreasing with n and it converges. → Laplace transform of N[−A,0) exists → P(N[−A,0) > y ) exponentially small... NB : true as soon as the branching distribution has a Laplace, constant unknown... depends on M ! ! ! P.Reynaud-Bouret Hawkes Aussois 2015 95 / 129 Probabilistic ingredients Controls via a branching structure The expectation of G Recall that ηa (t) = Rc′t a =ν+ , ak k ′ a Ga = P.Reynaud-Bouret # # t −∞ T δ−1/2 1[t−(k+1)δ,t−kδ] dNt−u ηa (t)2 dt 0 Hawkes Aussois 2015 96 / 129 Probabilistic ingredients Controls via a branching structure The expectation of G Recall that ηa (t) = Rc′t a =ν+ , ak k ′ a Ga = Hence # # t −∞ T ηa (t)2 dt 0 # E(G) ≥ cI ⇐⇒ ∀a, E( P.Reynaud-Bouret δ−1/2 1[t−(k+1)δ,t−kδ] dNt−u T 0 Hawkes ηa (t)2 dt) ≥ c||a||2 . Aussois 2015 96 / 129 Probabilistic ingredients Controls via a branching structure The expectation of G Recall that ηa (t) = Rc′t a =ν+ , ak k ′ a Ga = Hence # # t −∞ T δ−1/2 1[t−(k+1)δ,t−kδ] dNt−u ηa (t)2 dt 0 # E(G) ≥ cI ⇐⇒ ∀a, E( T 0 ηa (t)2 dt) ≥ c||a||2 . If the process is stationnary E(ηa (t)2 ) does not depend on t. Hence we want to show that c = T α with E(ηa (0)2 ) ≥ α||a||2 . Then we need to concentrate T1 G around its expectation. P.Reynaud-Bouret Hawkes Aussois 2015 96 / 129 Probabilistic ingredients Controls via a branching structure Minoration of the expectation If ηa (t) = ν + , k ak # t −∞ δ−1/2 1[t−(k+1)δ,t−kδ] dNt−u and N homogeneous Poisson process, then we can find α s.t. E(ηa (0)2 ) ≥ α||a||2 . P.Reynaud-Bouret Hawkes Aussois 2015 97 / 129 Probabilistic ingredients Controls via a branching structure Minoration of the expectation If ηa (t) = ν + , k ak # t −∞ δ−1/2 1[t−(k+1)δ,t−kδ] dNt−u and N homogeneous Poisson process, then we can find α s.t. E(ηa (0)2 ) ≥ α||a||2 . Use the likelihood and Girsanov theorem, to transfer the minoration for Poisson to another minoration for Hawkes.... because of the likelihood, need again to have a Laplace transform of the number of points... P.Reynaud-Bouret Hawkes Aussois 2015 97 / 129 Probabilistic ingredients Controls via a branching structure Clusters and their size (Univariate) Cluster = ancestor in 0 + all its family 0 P.Reynaud-Bouret t H Hawkes Aussois 2015 98 / 129 Probabilistic ingredients Controls via a branching structure Clusters and their size (Univariate) Cluster = ancestor in 0 + all its family 0 t H Size of cluster H ≤ GW ∗ A where A maximal support size for h P.Reynaud-Bouret Hawkes Aussois 2015 98 / 129 Probabilistic ingredients Controls via a branching structure Clusters and their size (Univariate) Cluster = ancestor in 0 + all its family 0 t H Size of cluster H ≤ GW ∗ A where A maximal support size for h and GW "total number of births in a Galton-Watson process (Poisson( h)). P.Reynaud-Bouret Hawkes Aussois 2015 98 / 129 Probabilistic ingredients Controls via a branching structure Clusters and their size (Univariate) Cluster = ancestor in 0 + all its family 0 t H Size of cluster H ≤ GW ∗ A where A maximal support size for h and GW "total number of births in a Galton-Watson process (Poisson( h)). Hence P(H > t) exponentially small P.Reynaud-Bouret Hawkes Aussois 2015 98 / 129 Probabilistic ingredients Controls via a branching structure Extinction time Extinction time Te = time from 0 until the last birth, if no ancestor after 0 a a a Te 0 P.Reynaud-Bouret Hawkes Aussois 2015 99 / 129 Probabilistic ingredients Controls via a branching structure Extinction time Extinction time Te = time from 0 until the last birth, if no ancestor after 0 a a a Te 0 Conditionally to the ancestor P H H ( H 7 t . a) PC Tefal ancestors) = H Ay = = I that at t it reaches .PH ) one . al expfaobg Past , a . . at Want 0 Poisson ancestor process before 0 marked by the size of the associated cluster, H P.Reynaud-Bouret Hawkes Aussois 2015 99 / 129 Probabilistic ingredients Controls via a branching structure Extinction time Extinction time Te = time from 0 until the last birth, if no ancestor after 0 a a a 5 P(Te ≤ a) = exp −ν P.Reynaud-Bouret Te 0 # +∞ P(H > t)dt a Hawkes 6 Aussois 2015 99 / 129 Probabilistic ingredients Controls via a branching structure Extinction time Extinction time Te = time from 0 until the last birth, if no ancestor after 0 a a a Te 0 Hence P(Te > a) exponentially small P.Reynaud-Bouret Hawkes Aussois 2015 99 / 129 Probabilistic ingredients Controls via a branching structure Extinction time Extinction time Te = time from 0 until the last birth, if no ancestor after 0 a a a Te 0 Hence P(Te > a) exponentially small Approximated simulation of a stationnary Hawkes process. P.Reynaud-Bouret Hawkes Aussois 2015 99 / 129 Probabilistic ingredients Controls via a branching structure Extinction time Extinction time Te = time from 0 until the last birth, if no ancestor after 0 a a a Te 0 Hence P(Te > a) exponentially small Approximated simulation of a stationnary Hawkes process. For exact simulation see work of M¨ oller etc P.Reynaud-Bouret Hawkes Aussois 2015 99 / 129 Probabilistic ingredients Controls via a branching structure Towards independence [ P.Reynaud-Bouret [ Hawkes Aussois 2015 100 / 129 Probabilistic ingredients Controls via a branching structure Towards independence [ [ Cut [0, T ] in almost independent pieces (cf Berbee’s lemma, discrete case (Baraud,Comte et Viennet)) P.Reynaud-Bouret Hawkes Aussois 2015 100 / 129 Probabilistic ingredients Controls via a branching structure Towards independence [ [ Cut [0, T ] in almost independent pieces (cf Berbee’s lemma, discrete case (Baraud,Comte et Viennet)) Control of the total variation distance. P.Reynaud-Bouret Hawkes Aussois 2015 100 / 129 Probabilistic ingredients Controls via a branching structure Towards independence i. [ ÷ . . . . . . . . [ Cut [0, T ] in almost independent pieces (cf Berbee’s lemma, discrete case (Baraud,Comte et Viennet)) Control of the total variation distance. Up to this error = sum of two groups of independent variables. P.Reynaud-Bouret Hawkes Aussois 2015 100 / 129 Probabilistic ingredients Controls via a branching structure Towards independence [ [ Cut [0, T ] in almost independent pieces (cf Berbee’s lemma, discrete case (Baraud,Comte et Viennet)) Control of the total variation distance. Up to this error = sum of two groups of independent variables. All concentration inequalities for i.i.d. variables apply (Bernstein ...) P.Reynaud-Bouret Hawkes Aussois 2015 100 / 129 Back to Lasso Table of Contents 1 Point process and Counting process 2 Multivariate Hawkes processes and Lasso 3 Probabilistic ingredients 4 Back to Lasso Theory Simulations 5 PDE and point processes P.Reynaud-Bouret Hawkes Aussois 2015 101 / 129 Back to Lasso Theory Lasso criterion (r ) ˆ(r ) = argmina {γT (λa ) + pen(a)} a = argmina {−2a′ br + a′ Ga + 2(d(r ) )′ |a|} M number of interacting processes, K number of bins, A maximal interaction delay. If linear stationnary Hawkes Oracle inequality in probability D If d = 2(1 + ε)VˆTµ x + bx/3 then with probability larger than i 1−!M 2 K log(T )2 e −x −P(∀t < T , i , N[t−A,t] > ! log(T )2 )−P(G ̸≥ !I /T ) P.Reynaud-Bouret Hawkes Aussois 2015 102 / 129 Back to Lasso Theory Lasso criterion (r ) ˆ(r ) = argmina {γT (λa ) + pen(a)} a = argmina {−2a′ br + a′ Ga + 2(d(r ) )′ |a|} M number of interacting processes, K number of bins, A maximal interaction delay. If linear stationnary Hawkes Oracle inequality in probability D If d = 2(1 + ε)VˆTµ x + bx/3 then with probability larger than i 1−!M 2 K log(T )2 e −x −P(∀t < T , i , N[t−A,t] > ! log(T )2 )−P(G ̸≥ !I /T ) , r ˆ(r ) ∥2 ≤ ! inf ∥λ(r ) − Rct a a P.Reynaud-Bouret ⎧ ⎨, ⎩ r 1 ∥λ(r ) − Rct a∥2 + T Hawkes , (r ) (di )2 i ∈supp(a) Aussois 2015 ⎫ ⎬ ⎭ . 102 / 129 Back to Lasso Theory Lasso criterion... (r ) ˆ(r ) = argmina {γT (λa ) + pen(a)} a = argmina {−2a′ br + a′ Ga + 2(d(r ) )′ |a|} M number of interacting processes, K number of bins, A maximal interaction √ delay. If linear stationnary Hawkes and K not too large (K ≤ T / log(T )3 ) Oracle inequality in probability D If d = 2(1 + ε)VˆTµ x + bx/3 then with probability larger than 1 − !M 2 K log(T )2 e −x − !T −1 P.Reynaud-Bouret Hawkes Aussois 2015 103 / 129 Back to Lasso Theory Lasso criterion... (r ) ˆ(r ) = argmina {γT (λa ) + pen(a)} a = argmina {−2a′ br + a′ Ga + 2(d(r ) )′ |a|} M number of interacting processes, K number of bins, A maximal interaction √ delay. If linear stationnary Hawkes and K not too large (K ≤ T / log(T )3 ) Oracle inequality in probability D If d = 2(1 + ε)VˆTµ x + bx/3 then with probability larger than 1 − !M 2 K log(T )2 e −x − !T −1 , r ˆ(r ) ∥2 ≤ ! inf ∥λ(r ) − Rct a a P.Reynaud-Bouret ⎧ ⎨, ⎩ r ∥λ(r ) − Rct a∥2 + Hawkes 1 T , (r ) (di )2 i ∈supp(a) Aussois 2015 ⎫ ⎬ ⎭ . 103 / 129 Back to Lasso Theory Choice of x If x = γ log(T ), with probability larger than 1 − M 2 K log(T )2 T −γ − !T −1 ⎧ ⎨, , 1 ˆ(r ) ∥2 ≤ ! inf ∥λ(r ) − Rct a ∥λ(r ) − Rct a∥2 + a ⎩ T r r P.Reynaud-Bouret Hawkes , (r ) (di )2 i ∈supp(a) Aussois 2015 ⎫ ⎬ . ⎭ 104 / 129 Back to Lasso Theory Choice of x If x = γ log(T ), with probability larger than 1 − M 2 K log(T )2 T −γ − !T −1 ⎧ ⎨, , 1 ˆ(r ) ∥2 ≤ ! inf ∥λ(r ) − Rct a ∥λ(r ) − Rct a∥2 + a ⎩ T r r , i ∈supp(a) good renormalization by T ", if piecewise constant true, loss on interaction function ≤ 0 +D (r ) (di )2 ⎫ ⎬ . ⎭ |a∗ |0 log(T )3 T with probability larger than 1 − !M 2 K log(T )2 T −γ − !T −1 P.Reynaud-Bouret Hawkes Aussois 2015 104 / 129 Back to Lasso Theory Choice of x If x = γ log(T ), with probability larger than 1 − M 2 K log(T )2 T −γ − !T −1 ⎧ ⎨, , 1 ˆ(r ) ∥2 ≤ ! inf ∥λ(r ) − Rct a ∥λ(r ) − Rct a∥2 + a ⎩ T r r , (r ) (di )2 i ∈supp(a) If distance bounded (reasonable), E( loss ) ≤D P.Reynaud-Bouret ⎫ ⎬ . ⎭ |a∗ |0 log(T )3 + !M 2 K log(T )2 T −γ + !T −1 T Hawkes Aussois 2015 104 / 129 Back to Lasso Theory Choice of x If x = γ log(T ), with probability larger than 1 − M 2 K log(T )2 T −γ − !T −1 ⎧ ⎨, , 1 ˆ(r ) ∥2 ≤ ! inf ∥λ(r ) − Rct a ∥λ(r ) − Rct a∥2 + a ⎩ T r r , (r ) (di )2 i ∈supp(a) If distance bounded (reasonable), E( loss ) ≤D ⎫ ⎬ . ⎭ |a∗ |0 log(T )3 + !M 2 K log(T )2 T −γ + !T −1 T Hence γ > 1 P.Reynaud-Bouret Hawkes Aussois 2015 104 / 129 Back to Lasso Theory Choice of x If x = γ log(T ), with probability larger than 1 − M 2 K log(T )2 T −γ − !T −1 ⎧ ⎨, , 1 ˆ(r ) ∥2 ≤ ! inf ∥λ(r ) − Rct a ∥λ(r ) − Rct a∥2 + a ⎩ T r r , (r ) (di )2 i ∈supp(a) If distance bounded (reasonable), E( loss ) ≤ ⎫ ⎬ . ⎭ |a∗ |0 log(T )3 + !M 2 K log(T )2 T −γ + !T −1 T Hence γ > 1 Good reason to believe (proof in Poisson case) that γ ∈ [1, !], γ smaller changes the rate, γ larger changes the constant ... P.Reynaud-Bouret Hawkes Aussois 2015 104 / 129 Back to Lasso Simulations Bernstein-type inequality for general counting process (Hansen, RB, Rivoirard) "t Let (Hs )s≥0 be a predictable process and Mt = 0 Hs (dNs − λ(s)ds). Let b > 0 and v > w > 0. For all x, µ > 0 such that µ > φ(µ), let # τ µ b2x µ ˆ Vτ = Hs2 dNs + , µ − φ(µ) 0 µ − φ(µ) =p where φ(u) = exp(u) − u − 1. Then for every stopping time τ and every ε > 0 . . D µ µ P Mτ ≥ 2(1 + ε)Vˆτ x + bx/3, w ≤ Vˆτ ≤ v and sup |Hs | ≤ b - s∈[0,τ ] ≤2 log(v /w ) −x e . log(1 + ε) " a generic choice of d for the Lasso whatever the underlying process.... P.Reynaud-Bouret Hawkes Aussois 2015 105 / 129 Back to Lasso Simulations Practical choice of d Recall that the oracle inequality needs ( (r ) ( (b − b ¯ (r ) ( ≤ d(r ) , with b= # 0 T (r ) Rct dNt ¯= and b # ∀r T Rct λ(r ) (t)dt. 0 - ’Bernstein Lasso’ and ’Bernstein Lasso + OLS’ D Bi γ log(T ) di = 2γ log(T )Vˆi + , 3 # T ˆ (Rct,i )2 dNt,ri , Bi = sup |Rct,i |. Vi = 0 P.Reynaud-Bouret t∈[0,T ],m Hawkes Aussois 2015 106 / 129 Back to Lasso Simulations Practical choice of d Recall that the oracle inequality needs ( (r ) ( (b − b ¯ (r ) ( ≤ d(r ) , with b= # 0 T (r ) Rct dNt ¯= and b # ∀r T Rct λ(r ) (t)dt. 0 - ’Bernstein Lasso’ and ’Bernstein Lasso + OLS’ D Bi γ log(T ) di = 2γ log(T )Vˆi + , 3 # T ˆ (Rct,i )2 dNt,ri , Bi = sup |Rct,i |. Vi = 0 t∈[0,T ],m - ’Adaptive Lasso’ (Zou) di = P.Reynaud-Bouret γ . 2|ˆ aiols | Hawkes Aussois 2015 106 / 129 Back to Lasso Simulations Why +OLS ? Lasso Least : . absolute Shrinkage and Selection Opera In If G = I , minimizing −2a′ b + a′ G a + 2d′ |a| " a soft-thresholding estimator, i.e. ˆLasso = (b − sign(b)d)+ . a P.Reynaud-Bouret Hawkes Aussois 2015 107 / 129 Back to Lasso Simulations Why +OLS ? If G = I , minimizing −2a′ b + a′ G a + 2d′ |a| " a soft-thresholding estimator, i.e. ˆLasso = (b − sign(b)d)+ . a Hence small bias anytime " once the support is estimated by Lasso, compute the ordinary least-square estimate (OLS) on the resulting support. P.Reynaud-Bouret Hawkes Aussois 2015 107 / 129 ff: Back to Lasso Simulations Choices of the weights dλ o o b o b 30 spont1 10 0.02 1->1 4.2 o dgamma=1 o dgamma=1.5 30 10 10 0 |b-b| 0 0.02 2->1 0 1.4 spont2 0 1->2 10 0 2->2 0 0 spont1 0 Coeff: 10 NB : Adaptive Lasso (Zou) dλ = γ/(2|ˆ aλ |) P.Reynaud-Bouret Hawkes 0.02 1->1 4.2 0.02 2->1 0 1.4 spont2 0 10 1->2 0 2->2 0 0 Aussois 2015 0 108 / 129 Back to Lasso Simulations Simulation study - Support recovery We perform 100 runs with T = 2, M = 8, K = 4 (264 coefficients to be estimated by using 636 observations) Bernstein Lasso 0.5 1 2 0 32 24 17 6 1 0 0 2 0 0 0 22 7 1 1 2 7 Tuning parameter γ Correct clusters identif. ∈ [0, 100] False non-zero interactions ∈ [0, 55] False zero interactions ∈ [0, 9] False zero spontaneous rates ∈ [0, 8] False non-zero coeff. ∈ [0, 238] False zero coeff. ∈ [0, 18] P.Reynaud-Bouret Hawkes Adaptive Lasso 2 200 1000 0 0 32 55 13 1 0 0 2 0 1 3 199 17 1 0 2 7 Aussois 2015 109 / 129 Back to Lasso Simulations Simulation study - Support recovery We perform 100 runs with T = 20, M = 8, K = 4 Bernstein Lasso 0.5 1 2 63 99 100 3 1 0 0 0 0 0 0 0 4 1 0 0 0 0 Tuning parameter γ Correct clusters identif. ∈ [0, 100] False non-zero interactions ∈ [0, 55] False zero interactions ∈ [0, 9] False zero spontaneous rates ∈ [0, 8] False non-zero coeff. ∈ [0, 238] False zero coeff. ∈ [0, 18] Adaptive Lasso 2 200 1000 0 0 90 55 10 0 0 0 0 0 0 0 197 13 0 0 0 0 For T = 20, γ = 1 or γ = 2 is convenient for Bernstein Lasso. It was also the case for T = 2. γ = 1000 is convenient for Adaptive Lasso. It was not the case for T = 2. P.Reynaud-Bouret Hawkes Aussois 2015 110 / 129 Back to Lasso Simulations Simulation study - Influence of the OLS step We perform 100 runs with T = 20, M = 8, K = 4. Bernstein Lasso γ=1 γ=2 37 69 10 9 3 6 0.5 0.4 Tuning parameter γ MSE for spontaneous rates MSE for spontaneous rates after OLS MSE for interaction functions MSE for interaction functions after OLS MSE for spontaneous rates = M , m=1 MSE for interaction functions = M , M # , m=1 ℓ=1 P.Reynaud-Bouret Hawkes Adaptive Lasso γ = 1000 27 10 0.5 0.4 (ˆ ν (m) − ν (m) )2 (m) (m) (hˆℓ (t) − hℓ (t))2 dt Aussois 2015 111 / 129 Back to Lasso Simulations Another (more realistic ?) neuronal network : Integrate and Fire 1 2 3 4 5 6 7 8 9 10 P.Reynaud-Bouret Hawkes Aussois 2015 112 / 129 Back to Lasso Simulations Another (more realistic ?) neuronal network : Integrate and Fire T = 60s P.Reynaud-Bouret Hawkes Aussois 2015 112 / 129 Back to Lasso Simulations Another (more realistic ?) neuronal network : Integrate and Fire 1 2 3 4 5 6 7 8 9 10 P.Reynaud-Bouret Hawkes Aussois 2015 112 / 129 Back to Lasso Simulations Another (more realistic ?) neuronal network : Integrate and Fire Number of times that a given interaction is detected P.Reynaud-Bouret Mean energy ( h2) when detected Hawkes Aussois 2015 112 / 129 Back to Lasso Simulations Open questions for Lasso Branching arguments ok for linear Hawkes Hence number of points controlled if intensity smaller than stationnary Hawkes (thinning) and in particular for (.)+ and inhibition. P.Reynaud-Bouret Hawkes Aussois 2015 113 / 129 Back to Lasso Simulations Open questions for Lasso Branching arguments ok for linear Hawkes Hence number of points controlled if intensity smaller than stationnary Hawkes (thinning) and in particular for (.)+ and inhibition. Minoration of E(G) can be done if intensity lower bounded and upper bounded by Hawkes P.Reynaud-Bouret Hawkes Aussois 2015 113 / 129 Back to Lasso Simulations Open questions for Lasso Branching arguments ok for linear Hawkes Hence number of points controlled if intensity smaller than stationnary Hawkes (thinning) and in particular for (.)+ and inhibition. Minoration of E(G) can be done if intensity lower bounded and upper bounded by Hawkes The ”Berbee’s lemma” : no idea how to get it without linear Hawkes. Hence what about (.)+ ? ? ? P.Reynaud-Bouret Hawkes Aussois 2015 113 / 129 Back to Lasso Simulations Open questions for Lasso Branching arguments ok for linear Hawkes Hence number of points controlled if intensity smaller than stationnary Hawkes (thinning) and in particular for (.)+ and inhibition. Minoration of E(G) can be done if intensity lower bounded and upper bounded by Hawkes The ”Berbee’s lemma” : no idea how to get it without linear Hawkes. Hence what about (.)+ ? ? ? Still on simulation works even if not Hawkes and γ = 1.5 works ! P.Reynaud-Bouret Hawkes Aussois 2015 113 / 129 Back to Lasso Simulations Open questions for Lasso Branching arguments ok for linear Hawkes Hence number of points controlled if intensity smaller than stationnary Hawkes (thinning) and in particular for (.)+ and inhibition. Minoration of E(G) can be done if intensity lower bounded and upper bounded by Hawkes The ”Berbee’s lemma” : no idea how to get it without linear Hawkes. Hence what about (.)+ ? ? ? Still on simulation works even if not Hawkes and γ = 1.5 works ! What about group Lasso ? (no sharp enough concentration inequality yet) What about non stationnarity ? Segmentation, clustering (F. Picard, C. Tuleau-Malot) P.Reynaud-Bouret Hawkes Aussois 2015 113 / 129 PDE and point processes Table of Contents 1 Point process and Counting process 2 Multivariate Hawkes processes and Lasso 3 Probabilistic ingredients 4 Back to Lasso 5 PDE and point processes Billions of neurons Microscopic PPS ? Microscopic to Macroscopic P.Reynaud-Bouret Hawkes Aussois 2015 114 / 129 PDE and point processes Billions of neurons Biological context Action potential Dep olari zatio n 0 -55 Failed initiations Threshold -70 Stimulus 0 ation Repolariz Voltage (mV) +40 1 Resting state Refractory period 2 3 Time (ms) 4 5 Physiological constraint : refractory period. P.Reynaud-Bouret Hawkes Aussois 2015 115 / 129 PDE and point processes Billions of neurons Age structured equations (Pakdaman, Perthame, Salort, 2010) Age = delay since last spike. E probability density of finding a neuron with age s at time t. n(t, s) = ratio of the population with age s at time t. P.Reynaud-Bouret Hawkes Aussois 2015 116 / 129 PDE and point processes Billions of neurons Age structured equations (Pakdaman, Perthame, Salort, 2010) Age = delay since last spike. E probability density of finding a neuron with age s at time t. n(t, s) = ratio of the population with age s at time t. ⎧ ∂n (t, s) ∂n (t, s) ⎪ ⎪ + p (s, X (t)) n (t, s) = 0 + ⎨ ∂t ∂s # +∞ ⎪ ⎪ ⎩ m (t) := n (t, 0) = p (s, X (t)) n (t, s) ds (PPS) 0 P.Reynaud-Bouret Hawkes Aussois 2015 116 / 129 PDE and point processes Billions of neurons Age structured equations (Pakdaman, Perthame, Salort, 2010) Age = delay since last spike. E probability density of finding a neuron with age s at time t. n(t, s) = ratio of the population with age s at time t. ⎧ ∂n (t, s) ∂n (t, s) ⎪ ⎪ + p (s, X (t)) n (t, s) = 0 + ⎨ ∂t ∂s # +∞ ⎪ ⎪ ⎩ m (t) := n (t, 0) = p (s, X (t)) n (t, s) ds (PPS) 0 p represents the firing rate. For example, p(s, X ) = 1{s>σ(X )} . # t d(x)m(t − v )dv (global neural activity) X (t) = 0 d = delay function. For example, d(v ) = e −τ v . P.Reynaud-Bouret Hawkes Aussois 2015 116 / 129 PDE and point processes Billions of neurons One neuron = point process " age ? Age = delay since last spike. Microscopic age We consider the continuous to the left (hence predictable) version of the age. The age at time 0 depends on the spiking times before time 0. The dynamic is characterized by the spiking times after time 0. P.Reynaud-Bouret Hawkes Aussois 2015 117 / 129 PDE and point processes Microscopic PPS ? Framework · · · < T−1 < T0 ≤ 0 < T1 < . . . the ordered sequence of points of N. Dichotomy of the behaviour of N with respect to time 0 : P.Reynaud-Bouret Hawkes Aussois 2015 118 / 129 PDE and point processes Microscopic PPS ? Framework · · · < T−1 < T0 ≤ 0 < T1 < . . . the ordered sequence of points of N. Dichotomy of the behaviour of N with respect to time 0 : N− = N ∩ (−∞, 0] is a point process with distribution P0 (initial condition). The age at time 0 is finite ⇔ N− ̸= ∅. N+ = N ∩ (0, +∞) is a point process admitting some intensity N ) " ”probability to find a new point at time t given the past” λ(t, Ft− P.Reynaud-Bouret Hawkes Aussois 2015 118 / 129 PDE and point processes Microscopic PPS ? Framework · · · < T−1 < T0 ≤ 0 < T1 < . . . the ordered sequence of points of N. Dichotomy of the behaviour of N with respect to time 0 : N− = N ∩ (−∞, 0] is a point process with distribution P0 (initial condition). The age at time 0 is finite ⇔ N− ̸= ∅. N+ = N ∩ (0, +∞) is a point process admitting some intensity N ) " ”probability to find a new point at time t given the past” λ(t, Ft− N ) are analogous. p(s, X (t)) and λ(t, Ft− P.Reynaud-Bouret Hawkes Aussois 2015 118 / 129 PDE and point processes Microscopic PPS ? Some classical point processes in neuroscience N ) = λ(t) = deterministic function. Poisson process : λ(t, Ft− N ) = f (S ) ⇔ i.i.d. ISIs. Renewal process : λ(t, Ft− t− ninth spikes intervals IT Th . . . . N )=µ+ Hawkes process : λ(t, Ft− # # t− −∞ h(t − v )N(dv ) P.Reynaud-Bouret ←→ Hawkes . , Tn t− −∞ # 0 h(t − v )N(dv ) t d(v )m(t − v )dv = X (t). Aussois 2015 119 / 129 PDE and point processes Microscopic PPS ? Dynamic = Ogata’s thinning : Poisson process : Point process N Π is a Poisson process with rate 1. , Π(dt, dx) = δX . E [Π(dt, dx)] = dtdx. λ is random. N admits λ as an intensity. 0 P.Reynaud-Bouret t Hawkes Aussois 2015 120 / 129 PDE and point processes Microscopic PPS ? Dynamic = Ogata’s thinning : Poisson process : Point process N Π is a Poisson process with rate 1. , Π(dt, dx) = δX . E [Π(dt, dx)] = dtdx. λ is random. N admits λ as an intensity. 0 P.Reynaud-Bouret t Hawkes Aussois 2015 120 / 129 PDE and point processes Microscopic PPS ? Dynamic = Ogata’s thinning : Poisson process : Point process N Π is a Poisson process with rate 1. , Π(dt, dx) = δX . E [Π(dt, dx)] = dtdx. λ is random. N admits λ as an intensity. 0 P.Reynaud-Bouret t Hawkes Aussois 2015 120 / 129 PDE and point processes Microscopic PPS ? Dynamic = Ogata’s thinning : Poisson process : Point process N Π is a Poisson process with rate 1. , Π(dt, dx) = δX . E [Π(dt, dx)] = dtdx. λ is random. N admits λ as an intensity. 0 P.Reynaud-Bouret t Hawkes Aussois 2015 120 / 129 PDE and point processes Microscopic PPS ? Dynamic = Ogata’s thinning : Poisson process : Point process N Π is a Poisson process with rate 1. , Π(dt, dx) = δX . E [Π(dt, dx)] = dtdx. λ is random. N admits λ as an intensity. 0 P.Reynaud-Bouret t Hawkes Aussois 2015 120 / 129 PDE and point processes Microscopic PPS ? A microscopic analogous to n n(t, .) is the probability density of the age at time t. At fixed time t, we are looking at a Dirac mass at St− . P.Reynaud-Bouret Hawkes Aussois 2015 121 / 129 PDE and point processes Microscopic PPS ? A microscopic analogous to n n(t, .) is the probability density of the age at time t. At fixed time t, we are looking at a Dirac mass at St− . What we need Random measure U on R2 . ∞ (R2 ), Action over test functions : ∀ϕ ∈ Cc,b + # ϕ(t, s)U(dt, ds) = # ϕ(t, St− )dt. What we define We construct an ad hoc random measure U which satisfies a system of stochastic differential equations similar to (PPS). P.Reynaud-Bouret Hawkes Aussois 2015 121 / 129 PDE and point processes Microscopic PPS ? Microscopic equation (Chevallier, C´aceres, Doumic, RB) B C N ) Let Π be a Poisson measure. Let λ(t, Ft− be some non negative t>0 1 predictable process which is Lloc a.s. The measure U satisfies the following system a.s. ⎧ -# . N λ(t,Ft− ) ⎪ ⎪ ⎪ Π (dt, dx) U (t, ds) = 0, ⎪ ⎨(∂t + ∂s ){U (dt, ds)} + x=0 -# . # N λ(t,Ft− ) ⎪ ⎪ ⎪ Π (dt, dx) U (t, ds) , ⎪ ⎩U (dt, 0) = s∈R x=0 in the weak sense with initial condition limt→0+ U(t, ·) = δ−T0 . (−T0 is the age at time 0) P.Reynaud-Bouret Hawkes Aussois 2015 122 / 129 PDE and point processes Microscopic PPS ? Microscopic equation (Chevallier, C´aceres, Doumic, RB) B C N ) Let Π be a Poisson measure. Let λ(t, Ft− be some non negative t>0 1 predictable process which is Lloc a.s. The measure U satisfies the following system a.s. ⎧ -# . N λ(t,Ft− ) ⎪ ⎪ ⎪ Π (dt, dx) U (t, ds) = 0, ⎪ ⎨(∂t + ∂s ){U (dt, ds)} + x=0 -# . # N λ(t,Ft− ) ⎪ ⎪ ⎪ Π (dt, dx) U (t, ds) , ⎪ ⎩U (dt, 0) = s∈R x=0 in the weak sense with initial condition limt→0+ U(t, ·) = δ−T0 . (−T0 is the age at time 0) # λ(t,F N ) t− Π (dt, dx). " p(s, X (t)) is replaced by x=0 ( H G# N ( * ) λ(t,Ft− ) ( N N dt. E Π (dt, dx)(Ft− = λ t, Ft− ( x=0 P.Reynaud-Bouret Hawkes Aussois 2015 122 / 129 PDE and point processes Microscopic to Macroscopic Taking the expectation [...] Bdefining u C= ”E(U)”, u(t, .) distribution of St− . N ) Let λ(t, Ft− be some non negative predictable process which is L1loc t>0 a.s. The measure U satisfies the following system, ⎧ -# . N λ(t,Ft− ) ⎪ ⎪ ⎪ Π (dt, dx) U (t, ds) = 0, ⎪ ⎨(∂t + ∂s ){U (dt, ds)} + x=0 . # λ(t,F N ) # ⎪ t− ⎪ ⎪ Π (dt, dx) U (t, ds) , ⎪ ⎩U (dt, 0) = s∈R x=0 in the weak sense with initial condition limt→0+ U(t, ·) = δ−T0 . P.Reynaud-Bouret Hawkes Aussois 2015 123 / 129 PDE and point processes Microscopic to Macroscopic Taking the expectation [...] Bdefining u C= ”E(U)”, u(t, .) distribution of St− . N ) Let λ(t, Ft− be some non negative predictable process which is t>0 1 Lloc in expectation, and which admits a finite mean. The measure u = ”E(U)” satisfies the following system, ⎧ ⎨(∂t + ∂s )u (dt, # ds) + ρλ,P0 (t, s)u (dt, ds) = 0, ⎩u (dt, 0) = s∈R ρλ,P0 (t, s)u (t, ds) dt, ; B C( < N (S in the weak sense where ρλ,P0 (t, s) = E λ t, Ft− t− = s for almost every t. The initial condition limt→0+ u (t, ·) is given by the distribution of −T0 . P.Reynaud-Bouret Hawkes Aussois 2015 123 / 129 PDE and point processes Microscopic to Macroscopic Law of large numbers Law of large numbers. Population-based approach. Theorem Let (N i )i ≥1 be some i.i.d. point on R with L1loc intensity in B i processes C expectation. For each i , let St− t>0 denote the age process associated to N i . Then, for every test function ϕ, . - n # # 1, a.s. δS i (ds) dt −−−→ ϕ(t, s)u(dt, ds), ϕ(t, s) t− n→∞ n i =1 with u satisfying the deterministic system. P.Reynaud-Bouret Hawkes Aussois 2015 124 / 129 PDE and point processes Microscopic to Macroscopic Review of the examples The system in expectation ⎧ ⎨(∂t + ∂s )u (dt, # ds) + ρλ,P0 (t, s) u (dt, ds) = 0, ⎩u (dt, 0) = s∈R ρλ,P0 (t, s) u (t, ds) dt. ; B C( < N (S where ρλ,P0 (t, s) = E λ t, Ft− t− = s . This result may seem OK to a probabilist, But analysts need some explicit expression for ρ. In particular, this system may seem linear, but it is non-linear in general. P.Reynaud-Bouret Hawkes Aussois 2015 125 / 129 PDE and point processes Microscopic to Macroscopic Review of the examples The system in expectation ⎧ ⎨(∂t + ∂s )u (dt, # ds) + ρλ,P0 (t, s) u (dt, ds) = 0, ⎩u (dt, 0) = s∈R ρλ,P0 (t, s) u (t, ds) dt. ; B C( < N (S where ρλ,P0 (t, s) = E λ t, Ft− t− = s . This result may seem OK to a probabilist, But analysts need some explicit expression for ρ. In particular, this system may seem linear, but it is non-linear in general. Poisson process. Renewal process. P.Reynaud-Bouret → ρλ,P0 (t, s) = f (t). → ρλ,P0 (t, s) = f (s). Hawkes Aussois 2015 125 / 129 PDE and point processes Microscopic to Macroscopic Review of the examples The system in expectation ⎧ ⎨(∂t + ∂s )u (dt, # ds) + ρλ,P0 (t, s) u (dt, ds) = 0, ⎩u (dt, 0) = s∈R ρλ,P0 (t, s) u (t, ds) dt. ; B C( < N (S where ρλ,P0 (t, s) = E λ t, Ft− t− = s . This result may seem OK to a probabilist, But analysts need some explicit expression for ρ. In particular, this system may seem linear, but it is non-linear in general. Poisson process. Renewal process. Hawkes process. P.Reynaud-Bouret → ρλ,P0 (t, s) = f (t). → ρλ,P0 (t, s) = f (s). → ρλ,P0 is much more complex. Hawkes Aussois 2015 125 / 129 PDE and point processes Microscopic to Macroscopic For Hawkes Recall that # t− −∞ h(t − x)N(dx) P.Reynaud-Bouret ←→ # t 0 Hawkes d(x)m(t − x)dx = X (t). Aussois 2015 126 / 129 PDE and point processes Microscopic to Macroscopic For Hawkes Recall that # t− −∞ h(t − x)N(dx) ←→ # t 0 d(x)m(t − x)dx = X (t). What we expected Replacement of p(s, X (t)) by # t ? @ N E λ(t, Ft− ) = µ + h (t − x) u(dx, 0) ←→ X (t) 0 What we find p(s, X (t)) is replaced by ρλ,P0 (t, s) which is the conditional expectation, not the full expectation. q Technical difficulty ; B C( < N (S ρλ,P0 (t, s) = E λ t, Ft− is not so easy to compute. t− = sHawkes P.Reynaud-Bouret Aussois 2015 126 / 129 PDE and point processes Microscopic to Macroscopic Conclusions : work in progress of J. Chevallier Univariate Hawkes processes do NOT lead to a PPS of the form p(s, Xt ) = ν + Xt , but to another completely explicit closed PDE system (NB : Non Markovian process) P.Reynaud-Bouret Hawkes Aussois 2015 127 / 129 PDE and point processes Microscopic to Macroscopic Conclusions : work in progress of J. Chevallier Univariate Hawkes processes do NOT lead to a PPS of the form p(s, Xt ) = ν + Xt , but to another completely explicit closed PDE system (NB : Non Markovian process) Interacting Hawkes processes without refractory periods DO lead to this intuition (mean field) Even true for more realistic models with refractory periods and special p(s, Xt ). Propagation of chaos, CLT ... P.Reynaud-Bouret Hawkes Aussois 2015 127 / 129 PDE and point processes Microscopic to Macroscopic Open questions But then when we pick some neurons and measure their activity via electrodes, why do we measure interactions ? What kind of models would allow both : independence with most of the neurons, sparse dependence with some (functional connectivity) ? What can we exactly infer in the reconstructed graphs ? Can it be couple to a more global measure of activity (LFP) to infer more information ? P.Reynaud-Bouret Hawkes Aussois 2015 128 / 129 PDE and point processes Microscopic to Macroscopic Many thanks to : M. Albert, Y. Bouret, J. Chevallier, F. Delarue, F. Grammont, T. Lalo¨e, A. Rouis, C. Tuleau-Malot (Nice), N.R Hansen (Copenhagen), V. Rivoirard (Dauphine), M. Fromont (Rennes 2), M. Doumic (Inria Rocquencourt), M. C´ aceres (Granada), L. Sansonnet (AgroParisTech), S. Schbath (INRA Jouy-en-Josas), F. Picard (LBBE Lyon), T. Bessa¨ıh, R. Lambert, N. Leresche (Neuronal Networks and Physiopathological Rhythms, NPA, Paris 6) and to Sylvie M´el´eard and Vincent Bansaye for this very nice opportunity ! Slides and references on http://math.unice.fr/ reynaudb/ N P.Reynaud-Bouret Hawkes Aussois 2015 129 / 129 PDE and point processes Microscopic to Macroscopic Many thanks to : M. Albert, Y. Bouret, J. Chevallier, F. Delarue, F. Grammont, T. Lalo¨e, A. Rouis, C. Tuleau-Malot (Nice), N.R Hansen (Copenhagen), V. Rivoirard (Dauphine), M. Fromont (Rennes 2), M. Doumic (Inria Rocquencourt), M. C´ aceres (Granada), L. Sansonnet (AgroParisTech), S. Schbath (INRA Jouy-en-Josas), F. Picard (LBBE Lyon), T. Bessa¨ıh, R. Lambert, N. Leresche (Neuronal Networks and Physiopathological Rhythms, NPA, Paris 6) and to Sylvie M´el´eard and Vincent Bansaye for this very nice opportunity ! Slides and references on http://math.unice.fr/ reynaudb/ P.Reynaud-Bouret Hawkes Aussois 2015 129 / 129
© Copyright 2025