Ranks and Rank Statistics

Lecture 1. (Rouen)
Plan of my talk
: (1) What we are doing in general.
: Theory of Rank Tests (Hajek & Sidak) to Theory of Estimators derived
from Rank Tests
: Generalized Lehmann’s Alternative Models
Some finance data are rather skewed in distribution.
: From iid sequences to weakly dependent sequences.
Finance data are rather weakly dependent.
Hedge fund return data, tick (trade-by-trade) data
: (2) Statistical Problems
: One Sample, Two Sample, a Simple Linear Regression Model,
with Generalized Lehmann’s Alternative Models (GLAM)
Dabrowska, D.M., Doksum, K.A. and Miura, R. (1989). “Rank Estimates in a class of
Semiparametric Two Sample Problems.” Annals of the Institute of Statistical Mathematics, 41, 63-79.
Today’s Lecture 1 introduces the basic material for these problems;
Lectures 2, 3, and 4 treat them in more detail.
: (1) First, show typical rank statistics.
: (2) Then give a general expression for rank statistics.
: (3) Then sketch some ideas of the asymptotic theory and of asymptotic efficiency.
: (4) Then introduce the Generalized Lehmann’s alternative and its special property:
probabilities can be computed under the alternative hypothesis, which is a fortunate case
for rank statistics.
:****** Generalized Lehmann’s Alternative Models ********************
Remark. The extremal index may be used in example (iv) because of its background and
the hypothesis of whether the maximum is observed or not: tIndex in place of t.
;++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
: 1. Two-sample Wilcoxon rank-sum
The Xs and Ys are iid with distribution functions F and G, respectively.
Testing Hypothesis.
Test the null hypothesis F = G against the alternative hypothesis that “G is stochastically
larger”.
The test statistic is the “sum of the ranks of the Yi’s among all the Xs and Ys”.
Reject the null hypothesis when the rank sum is large.
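The test just described can be sketched in a few lines. This is a minimal illustration, assuming no ties in the pooled sample; the function names and the two small samples are hypothetical and chosen only for the example.

```python
from itertools import combinations

def rank_sum(xs, ys):
    """Sum of the ranks of the ys in the pooled sample (no ties assumed)."""
    pooled = sorted(xs + ys)
    rank = {v: i + 1 for i, v in enumerate(pooled)}  # rank 1 = smallest value
    return sum(rank[y] for y in ys)

def exact_upper_p_value(xs, ys):
    """P(W >= observed) under F = G.

    Under the null hypothesis every set of n ranks out of m + n is equally
    likely to be the set of Y-ranks, so we simply enumerate all of them.
    """
    m, n = len(xs), len(ys)
    w_obs = rank_sum(xs, ys)
    subsets = list(combinations(range(1, m + n + 1), n))
    return sum(1 for s in subsets if sum(s) >= w_obs) / len(subsets)

xs = [0.3, 1.1, 2.0, 2.4]   # hypothetical sample from F
ys = [1.7, 2.9, 3.5, 4.2]   # hypothetical sample from G
w = rank_sum(xs, ys)
p = exact_upper_p_value(xs, ys)
```

The enumeration is only practical for small samples; it is used here because it shows directly that the rejection region depends on the ranks alone.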
Remark 1
The rejection region can be determined from the probability distribution of the ranks under the null
hypothesis. The power of the test is the probability of the rejection region under the
alternative hypothesis.
It is not always possible to calculate the power, but it is possible under the (generalized)
Lehmann’s alternative.
Later, we introduce Lehmann’s alternative models with the Hoeffding Theorem.
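To illustrate why the power is computable in this case, here is a sketch for the classical Lehmann alternative G = F^k (an assumption made for this example; the generalized models are not covered here). Transforming to the uniform scale and integrating over the ordered pooled sample gives the probability of a rank configuration as m! n! k^n divided by the product over i = 1, ..., m+n of (i + (k-1) c_i), where c_i is the number of Y-ranks not exceeding i. The function names are hypothetical.

```python
from itertools import combinations
from math import factorial

def lehmann_rank_prob(y_ranks, m, n, k):
    """P(the Ys occupy exactly the ranks y_ranks) when G = F**k.

    Uses m! n! k**n / prod_{i=1..m+n} (i + (k - 1) * c_i),
    where c_i is the number of Y-ranks <= i.
    """
    y_set = set(y_ranks)
    denom, c = 1.0, 0
    for i in range(1, m + n + 1):
        if i in y_set:
            c += 1
        denom *= i + (k - 1) * c
    return factorial(m) * factorial(n) * k ** n / denom

def rank_sum_power(m, n, k, critical_value):
    """P(rank sum of the Ys >= critical_value) under G = F**k."""
    return sum(
        lehmann_rank_prob(s, m, n, k)
        for s in combinations(range(1, m + n + 1), n)
        if sum(s) >= critical_value
    )

size = rank_sum_power(4, 4, 1, 24)    # k = 1 is the null hypothesis F = G
power = rank_sum_power(4, 4, 3, 24)   # k = 3 makes G stochastically larger
```

Note that the result does not depend on F: only k enters, which is what makes the Lehmann alternative convenient for rank statistics.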
Remark 2
Comparing the performance of two portfolios may be seen as a two-sample problem. But the two are not
stochastically independent. So we may take the difference of their daily (weekly, or
monthly) returns, and try to see whether its distribution is symmetric or skewed to the right
(or left).
Estimation.
What do we estimate in two-sample problems?
: (a) Shift
Show estimation based on rank statistics: the two-sample shift parameter.
Two-sample Hodges-Lehmann estimator.
We need the empirical distribution function and its asymptotic convergence.
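As a minimal sketch of the two-sample Hodges-Lehmann estimator for Wilcoxon scores: the shift estimate is the median of all pairwise differences Y_j - X_i. The function name and the data are hypothetical.

```python
from statistics import median

def hodges_lehmann_shift(xs, ys):
    """Median of all m * n pairwise differences y - x."""
    return median([y - x for x in xs for y in ys])

xs = [1.0, 2.0, 3.0]
ys = [6.0, 7.0, 8.0]          # the xs shifted by 5
shift = hodges_lehmann_shift(xs, ys)
```

Subtracting this estimate from the Ys brings the Wilcoxon rank-sum statistic as close as possible to its null expectation, which is how the estimator is derived from the test.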
: (b) Transformation parameter in GLAM (see Lecture 3).
: 2. One-sample signed rank statistic
Signed rank
Probability of Signed Ranks
Test statistics and its rejection region
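$S = \sum_{i=1}^{n} R_i^{+}\,\mathrm{sgn}(X_i) = \sum_{i:\,X_i>0} J\left(\frac{R_i^{+}}{n+1} + \frac{1}{2}\right) + \sum_{i:\,X_i\le 0} J\left(\frac{1}{2} - \frac{R_i^{*}}{n+1}\right)$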
This tests the null hypothesis that the “center of symmetry is zero”, assuming the
distribution is symmetric.
If S is close to zero, we accept the null hypothesis. If S is large, we reject it
(the alternative hypothesis is that the center is positive).
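A minimal sketch of this test, taking S to be the classical Wilcoxon signed rank statistic, the sum of sgn(X_i) R_i^+ with R_i^+ the rank of |X_i| (an assumption about the scores; a general score function J is not modeled here). No ties and no zeros are assumed; the function names and data are hypothetical.

```python
from itertools import product

def signed_rank_statistic(xs):
    """S = sum_i sgn(X_i) * R_i^+, with R_i^+ the rank of |X_i|."""
    order = sorted(range(len(xs)), key=lambda i: abs(xs[i]))
    s = 0
    for rank, i in enumerate(order, start=1):
        s += rank if xs[i] > 0 else -rank
    return s

def signed_rank_p_value(xs):
    """P(S >= observed) when the distribution is symmetric about zero.

    Under the null hypothesis the signs attached to the ranks 1..n are
    independent fair coin flips, so all 2**n sign patterns are equally likely.
    """
    n, s_obs = len(xs), signed_rank_statistic(xs)
    count = 0
    for signs in product((-1, 1), repeat=n):
        if sum(sign * r for sign, r in zip(signs, range(1, n + 1))) >= s_obs:
            count += 1
    return count / 2 ** n

xs = [1.2, -0.4, 2.3, 0.9, 3.1, -0.7]   # hypothetical data
s = signed_rank_statistic(xs)
p = signed_rank_p_value(xs)
```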
Rejection region and its probability under the alternative hypothesis.
Later on, we introduce Lehmann’s alternative models with the Hoeffding Theorem.
Show estimation based on rank statistics: the location parameter, or center of symmetry
parameter. Hodges-Lehmann estimator.
We need the empirical distribution function and its asymptotic convergence.
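For the Wilcoxon signed rank scores, the one-sample Hodges-Lehmann estimator is the median of the Walsh averages (X_i + X_j)/2, i <= j. A minimal sketch, with a hypothetical function name and data:

```python
from statistics import median

def hodges_lehmann_location(xs):
    """Median of the Walsh averages (x_i + x_j) / 2 over all pairs i <= j."""
    n = len(xs)
    walsh = [(xs[i] + xs[j]) / 2 for i in range(n) for j in range(i, n)]
    return median(walsh)

center = hodges_lehmann_location([1.0, 2.0, 3.0, 4.0, 100.0])
```

The single large observation moves the sample mean to 22 but leaves the Hodges-Lehmann estimate near the bulk of the data.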
Score functions J(.)
If the distribution function K(.) is known,
If K(.) is the logistic distribution, then J(t) = 2t-1.
Note.
d/dx {-log f(x)}= - {f ’(x)/f(x)}
f is called strongly unimodal if {-log f(x)} is convex.
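As a worked check of the expression 2t-1 for the logistic case, the derivative in the note can be computed explicitly. The standard logistic density with location 0 and scale 1 is assumed here, and F denotes its distribution function.

```latex
f(x) = \frac{e^{-x}}{(1+e^{-x})^{2}}, \qquad F(x) = \frac{1}{1+e^{-x}},
\\[4pt]
-\log f(x) = x + 2\log\left(1+e^{-x}\right),
\\[4pt]
\frac{d}{dx}\{-\log f(x)\} = -\frac{f'(x)}{f(x)}
  = 1 - \frac{2e^{-x}}{1+e^{-x}}
  = \frac{1-e^{-x}}{1+e^{-x}}
  = 2F(x) - 1,
\\[4pt]
\frac{d^{2}}{dx^{2}}\{-\log f(x)\} = 2f(x) > 0 .
```

Substituting x = F^{-1}(t) gives 2t-1 on (0,1), and the positive second derivative shows that -log f is convex, so the logistic density is strongly unimodal.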
Hajek and Sidak (1967) [Theory of Rank Tests] treat hypothesis testing for theta.
Here we work on estimation.
Show some ideas from robustness studies.
Gross error model and least informative distribution.
Its relation to the trimmed mean (and to the trimmed Wilcoxon or trimmed H-L estimator).
We can also take the logistic distribution for g(.).
Then it becomes the truncated “Wilcoxon”.
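As a small illustration of the trimmed mean mentioned here, the sketch below discards a fraction alpha of the observations at each end before averaging. The function name, the trimming convention (floor of alpha * n at each end), and the data are assumptions made for the example.

```python
def trimmed_mean(xs, alpha):
    """Mean after discarding the floor(alpha * n) smallest and largest observations."""
    n = len(xs)
    g = int(alpha * n)               # number trimmed at each end
    if 2 * g >= n:
        raise ValueError("alpha trims away the whole sample")
    kept = sorted(xs)[g:n - g]
    return sum(kept) / len(kept)

sample = [9.8, 10.1, 10.0, 9.9, 10.2, 55.0]   # hypothetical data with one gross error
plain = sum(sample) / len(sample)
robust = trimmed_mean(sample, 0.2)
```

With one gross error in six observations the plain mean is pulled far from the bulk of the data, while the trimmed mean is not.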
: 3. A Simple Linear Regression Model
Jaeckel’s explanation: what are we minimizing?
Compare with the least squares method.
Estimate derived from rank statistics: a Hodges-Lehmann type estimator for the
regression parameter.
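A sketch of what is minimized, assuming Wilcoxon scores: Jaeckel’s dispersion of the residuals e_i = y_i - b x_i is D(b) = sum over k of (k - (n+1)/2) e_(k), with e_(k) the ordered residuals. D is convex and piecewise linear in b with kinks at the pairwise slopes, so the sketch simply evaluates D at every pairwise slope. The function names and data are hypothetical.

```python
def jaeckel_dispersion(b, xs, ys):
    """D(b) = sum_k (k - (n + 1) / 2) * e_(k), with e_(k) the ordered residuals y - b * x."""
    residuals = sorted(y - b * x for x, y in zip(xs, ys))
    n = len(residuals)
    return sum((k - (n + 1) / 2) * e for k, e in enumerate(residuals, start=1))

def rank_slope_estimate(xs, ys):
    """Minimise D over the pairwise slopes, where the piecewise-linear D has its kinks."""
    n = len(xs)
    candidates = [
        (ys[j] - ys[i]) / (xs[j] - xs[i])
        for i in range(n) for j in range(i + 1, n)
        if xs[j] != xs[i]
    ]
    return min(candidates, key=lambda b: jaeckel_dispersion(b, xs, ys))

xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [3.1, 4.9, 7.2, 9.0, 10.8, 40.0]   # hypothetical data; the last point is an outlier
slope = rank_slope_estimate(xs, ys)
```

Least squares minimizes the sum of squared residuals, so a single outlier enters quadratically; here each residual enters linearly, weighted by its centered rank, which limits the outlier’s influence on the slope.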
Basic probability theory for ranks and order statistics.
Ranks in the one-sample symmetry problem.
Order statistics.
We use the convergence of empirical distribution functions.
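A minimal sketch of the empirical distribution function F_n and of its sup distance to the true distribution function, here taken to be Uniform(0,1) purely for illustration. With no ties, the rank of X_i in the sample equals n F_n(X_i), which is the link between ranks and F_n. The function names are hypothetical.

```python
from bisect import bisect_right

def ecdf(sample):
    """Return F_n, where F_n(t) is the fraction of observations <= t."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda t: bisect_right(ordered, t) / n

def sup_distance_to_uniform(sample):
    """sup_t |F_n(t) - t| for a sample in [0, 1], attained at a jump point of F_n."""
    ordered = sorted(sample)
    n = len(ordered)
    return max(
        max((i + 1) / n - u, u - i / n) for i, u in enumerate(ordered)
    )

sample = [0.2, 0.5, 0.9]
F_n = ecdf(sample)
ranks = [round(len(sample) * F_n(x)) for x in sample]   # ranks recovered from F_n
distance = sup_distance_to_uniform(sample)
```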
Thank you for your attention.
:**************************************
Lehmann’s Alternative: description in the Hajek & Sidak book
Ferguson’s book (1967)
Description of the Hoeffding theorem, or calculation of the probability of ranks (ordered ranks in
his case), for the two-sample problem
Mathematics used in Lectures 2, 3, and 4 of Miura
The following materials are additional items.
They may be used as references.
In Miura (1987), where a principle of H-L estimators is discussed, phi is used in a slightly
different way, but it is essentially the same.
Remark. The R-estimate in Example 5.4 can be found in Miura (1981a, 1981b).