Investigation of an Automated Approach to Threshold Selection for Generalized Pareto

Kate R. Saunders
Supervisors: Peter Taylor & David Karoly
University of Melbourne
April 8, 2015

Outline

1 Extreme Value Theory
2 Fitting Models

Problem

What are the climate processes that drive extreme rainfall? (El Niño Southern Oscillation, Interdecadal Pacific Oscillation)
How do these drivers differ at different timescales: sub-daily, daily, consecutive-day totals?

Data

Extreme Value Theory

Block Maxima

Let $X_1, X_2, \ldots, X_n$ be a sequence of i.i.d. random variables with distribution function $F$. Define $M_n = \max\{X_1, X_2, \ldots, X_n\}$. (The $X_i$ might be daily rainfall observations and $M_{365}$ the annual maximum rainfall.)

$$\Pr(M_n \le x) = \Pr(X_1 \le x, \ldots, X_n \le x) = \Pr(X_1 \le x) \times \cdots \times \Pr(X_n \le x) = F(x)^n.$$

As $n \to \infty$, the distribution of the suitably normalised $M_n$ converges to a generalised extreme value distribution.

Generalized Extreme Value

Theorem (Fisher-Tippett-Gnedenko). If there exist sequences of constants $\{a_n > 0\}$ and $\{b_n\}$ such that

$$\Pr\left\{\frac{M_n - b_n}{a_n} \le z\right\} \to G(z) \quad \text{as } n \to \infty$$

for a non-degenerate distribution function $G$, then $G$ is a member of the Generalized Extreme Value family

$$G(z) = \exp\left\{-\left[1 + \xi\left(\frac{z - \mu}{\sigma}\right)\right]^{-1/\xi}\right\}$$

defined on $\{z : 1 + \xi(z - \mu)/\sigma > 0\}$, where $-\infty < \mu < \infty$, $\sigma > 0$ and $-\infty < \xi < \infty$.

Leveraging more data

Generalized Pareto Distribution

Let $X_1, X_2, \ldots, X_n$ be a sequence of i.i.d. random variables with marginal distribution function $F$.

$$\Pr\{X > u + y \mid X > u\} = \frac{1 - F(u + y)}{1 - F(u)}, \quad y > 0.$$
If $F$ satisfies the Generalized Extreme Value theorem, then for a large enough threshold $u$ the distribution function of $(X - u)$ conditional on $X > u$ is approximately the GPD.

Generalized Pareto Distribution - Pickands (1975)

$$H(y) = 1 - \left(1 + \frac{\xi y}{\tilde{\sigma}}\right)^{-1/\xi}$$

defined on $\{y : y > 0 \text{ and } 1 + \xi y/\tilde{\sigma} > 0\}$, where $\tilde{\sigma} = \sigma + \xi(u - \mu)$.

Dependence

Rainfall observations are dependent.
Heavy rainfall yesterday affects the probability of heavy rain today.
Heavy rainfall a year ago doesn't.
Extreme Value Theory extends to stationary series with weak long range dependence.
However, for processes with short range dependence, extremes occur in clusters.

Clusters

Dependent Series

Let $\{X_i\}_{i \ge 1}$ be a stationary series and $\{X_i^*\}_{i \ge 1}$ be an independent series of variables with the same marginal distribution. Define $M_n = \max\{X_1, \ldots, X_n\}$ and $M_n^* = \max\{X_1^*, \ldots, X_n^*\}$. Under suitable regularity conditions,

$$\Pr\left\{\frac{M_n^* - b_n}{a_n} \le z\right\} \to G(z)$$

as $n \to \infty$ for normalizing sequences $\{a_n > 0\}$ and $\{b_n\}$, where $G$ is a non-degenerate distribution function, if and only if

$$\Pr\left\{\frac{M_n - b_n}{a_n} \le z\right\} \to G^{\theta}(z)$$

for a constant $\theta$ such that $0 < \theta \le 1$.

Extremal Index

$\theta = \{\text{limiting mean cluster size}\}^{-1} \in (0, 1]$

$\theta = 0.5 \Rightarrow$ 2 observations per cluster on average.

Fitting Models

Select a threshold
Decluster the data for independent observations

Declustering

Blocks
Partition the observation sequence into blocks of length $b$.
Assume extreme observations within the same block belong to the same cluster.

Runs
Specify a run length $K$.
Assume extreme observations with an inter-exceedance time of less than $K$ belong to the same cluster.

Intervals

The limiting process of exceedance times is compound Poisson for stationary series (Hsing et al. 1988).
Ferro and Segers (2003) showed the limiting distribution of inter-exceedance times is a mixture distribution with weight $\theta$,

$$T_\theta(t) = (1 - \theta)\,\epsilon_0 + \theta \cdot \theta \exp(-\theta t),$$

where $\epsilon_0$ is the degenerate distribution at 0 and $T_\theta$ is the distribution of arrival times of exceedances at threshold $u$.
By equating moments, a non-parametric estimator can be found for $\theta$.
The largest $\theta(N - 1)$ inter-exceedance times can be interpreted as between-cluster arrivals.

Fitting Models

→ Select a threshold
Decluster the data for independent observations

Mean Residual Life Plots

For sufficiently high thresholds, the expected exceedance above the threshold should be a linear function of the threshold.

Parameter Stability Plots

Parameter estimates of the (modified) scale and shape parameters should be constant over the range of valid thresholds.

Alternative

Set the threshold according to a high quantile of the non-zero observations, e.g. the 90th percentile.
Is this an appropriate threshold? Is our model misspecified?
The approach suggested by Süveges and Davison (2010) is to test the threshold, $u$, and run parameter, $K$, pair for model misspecification.

Log-Likelihood

Limiting distribution of inter-exceedance times:

$$T_\theta(t) = (1 - \theta)\,\epsilon_0 + \theta^2 \exp(-\theta t).$$

Log-likelihood (strictly positive inter-exceedance times):

$$\sum_{i=1}^{N-1} \log\left[(1 - \theta)^{I(t_i = 0)} \left\{\theta^2 \exp(-\theta t_i)\right\}^{I(t_i > 0)}\right] = \sum_{i=1}^{N-1} \left\{2 I(t_i > 0) \log(\theta) - \theta t_i\right\},$$

where $t_i = N T_i / n$, with $T_i$ the $i$-th inter-exceedance time, $n$ is the total number of observations and $N$ is the number of exceedances.
However, as $n$ gets large our estimate, $\hat{\theta}$, tends to 1, suggesting independence.

Log-Likelihood

Adjustment of the inter-exceedance times using the run parameter $K$:

$$c_i = \max\{t_i - K, 0\}$$

Log-likelihood:

$$\ell(\theta; c_i) = \sum_{i=1}^{N-1} \left\{I(c_i = 0) \log(1 - \theta) + 2 I(c_i > 0) \log(\theta) - \theta c_i\right\}$$

Approach used in Fukutome et al.
(2014) and Süveges and Davison (2010).
Test combinations of threshold, $u$, and run parameter, $K$, for misspecification of the likelihood function.
Select the $(u, K)$ pair that maximizes the number of independent clusters.

Model Misspecification

If a parametric model is misspecified then there is no $\theta$ such that $g = f(\theta)$, where $g$ is the true model and $f$ is the misspecified parametric model.
For a well specified model, the Fisher information matrix, $I(\theta) = -E\{\ell''(\theta; c_j)\}$, is equal to the variance of the score vector, $J(\theta) = \mathrm{Var}\{\ell'(\theta; c_j)\}$.
Test the hypothesis on $D(\theta) = J(\theta) - I(\theta)$, where $H_0 : D(\theta) = 0$ and $H_1 : D(\theta) \ne 0$.

Empirically:

$$I_{N-1}(\hat{\theta}) = \frac{-1}{N - 1} \sum_{j=1}^{N-1} \ell''(\hat{\theta}; c_j)$$

$$J_{N-1}(\hat{\theta}) = \frac{1}{N - 1} \sum_{j=1}^{N-1} \ell'(\hat{\theta}; c_j)^2$$

$$D_{N-1}(\hat{\theta}) = J_{N-1}(\hat{\theta}) - I_{N-1}(\hat{\theta})$$

$$V_{N-1}(\hat{\theta}) = \frac{1}{N - 1} \sum_{j=1}^{N-1} \left\{d_j(\hat{\theta}; c_j) - D'_{N-1}(\hat{\theta})\, I_{N-1}(\hat{\theta})^{-1}\, \ell'(\hat{\theta}; c_j)\right\}^2$$

where $V_{N-1}(\hat{\theta})$ is the sample variance of $D_{N-1}(\hat{\theta})$.

Model Misspecification

Theorem (Information Matrix Test - White 1982). If the assumed model $\ell(\theta; c_i)$ contains the true model for some $\theta = \theta_0$, then as $n \to \infty$,
(i) $\sqrt{N - 1}\, D_{N-1}(\hat{\theta}) \xrightarrow{w} N(0, V(\theta_0))$,
(ii) $V_{N-1}(\hat{\theta}_{N-1}) \xrightarrow{a.s.} V(\theta_0)$, and $V_{N-1}(\hat{\theta})$ is non-singular for sufficiently large $N$,
(iii) the Information Matrix Test statistic,

$$(N - 1)\, D_{N-1}(\hat{\theta})'\, V_{N-1}(\hat{\theta})^{-1}\, D_{N-1}(\hat{\theta}),$$

is asymptotically $\chi^2_1$ distributed.

Example: AR(2)

$Y_i = 0.95 Y_{i-1} - 0.89 Y_{i-2} + Z_i$, where $Z_i \sim GP(1, 1/2)$ and $n = 8000$.
100 simulations

Adjusting inter-exceedance times

It is common to assume stationarity by enforcing seasonal blocking.
Collapse inter-exceedance times across seasonal blocks using the memoryless property of the exponential for fitting.
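The K-adjusted log-likelihood from the Log-Likelihood slides can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the talk. It assumes that the run parameter K is subtracted from the raw gaps T_i (in time steps) before scaling by N/n, and that a gap starts a new cluster exactly when c_i > 0. The function names and the example gaps are hypothetical. The estimate is the root in (0, 1] of the quadratic obtained by setting the derivative of the log-likelihood to zero.

```python
import math


def k_gaps(gaps, K, n, N):
    """Scaled, K-adjusted inter-exceedance times c_i.

    Assumption: K is subtracted from the raw gaps T_i (in time steps)
    before scaling by N / n.
    """
    return [(N / n) * max(T - K, 0) for T in gaps]


def log_likelihood(theta, c):
    """Sum of I(c=0) log(1-theta) + 2 I(c>0) log(theta) - theta * c."""
    total = 0.0
    for ci in c:
        if ci == 0:
            total += math.log(1.0 - theta)
        else:
            total += 2.0 * math.log(theta) - theta * ci
    return total


def score(theta, c):
    """Derivative of the log-likelihood with respect to theta."""
    n0 = sum(1 for ci in c if ci == 0)   # within-cluster gaps
    n1 = len(c) - n0                     # between-cluster gaps
    zero_term = n0 / (1.0 - theta) if n0 else 0.0
    return -zero_term + 2.0 * n1 / theta - sum(c)


def fit_theta(c):
    """Maximum likelihood estimate of theta in closed form.

    Setting the score to zero and multiplying by theta * (1 - theta) gives
    s*theta^2 - (n0 + 2*n1 + s)*theta + 2*n1 = 0; the smaller root lies
    in (0, 1].
    """
    n0 = sum(1 for ci in c if ci == 0)
    n1 = len(c) - n0
    s = sum(c)
    if n1 == 0 or s == 0:
        raise ValueError("need at least one strictly positive adjusted gap")
    b = n0 + 2 * n1 + s
    return (b - math.sqrt(b * b - 8.0 * n1 * s)) / (2.0 * s)


def count_clusters(c):
    """Number of clusters implied by the adjusted gaps (assumed convention)."""
    return 1 + sum(1 for ci in c if ci > 0)


# Hypothetical raw gaps between N = 8 exceedances in n = 100 observations.
gaps = [1, 1, 5, 1, 10, 2, 20]
n, N, K = 100, 8, 1
c = k_gaps(gaps, K, n, N)
theta_hat = fit_theta(c)
print(theta_hat, count_clusters(c))
```

Repeating the fit over a grid of (u, K) pairs and recording the number of clusters for each would give the quantities that the selection rule compares. The misspecification test itself is not included in this sketch.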
Results: Gatton, South East Queensland

Results: Oenpelli, Northern Territory

Summary

Shown how to check if the threshold and run parameter selected violate the assumptions of the model.
Given confidence to threshold selection in the absence of a hard and fast rule and in the presence of subjectivity.

References

Ferro, C. and Segers, J. (2003). Inference for clusters of extreme values. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 65(2), pp.545-556.
Fukutome, S., Liniger, M. and Süveges, M. (2014). Automatic threshold and run parameter selection: a climatology for extreme hourly precipitation in Switzerland. Theoretical and Applied Climatology.
Hsing, T., Hüsler, J. and Leadbetter, M. (1988). On the exceedance point process for a stationary sequence. Probability Theory and Related Fields, 78(1), pp.97-112.
Süveges, M. and Davison, A. (2010). Model misspecification in peaks over threshold analysis. The Annals of Applied Statistics, 4(1), pp.203-221.
White, H. (1982). Maximum Likelihood Estimation of Misspecified Models. Econometrica, 50(1), p.1.

ANZAPW 2015: Barossa Valley, South Australia

This work has been supported by the ARC through the Laureate Fellowship FL130100039.

Questions?

Results: Kalamia, Far North Queensland

Results: Yamba, New South Wales