Introduction & Random Sample

Contents:
• Random sample
• Distribution of the sum of random variables from a random sample
• Random sample from a normal distribution
• Large-sample behavior of important statistics
Definition
If X1, · · · , Xn are independent random variables with a common marginal distribution with cdf F(x), then we say that they are independent and identically distributed (iid) with common cdf F(x), or that X1, · · · , Xn are a random sample from an infinite population with distribution F(x).
Introduction & Random Sample
X1, · · · , Xn is a random sample from F(x):
• X1, · · · , Xn ∼iid F(x) [or f(x)]
• F_{X1,···,Xn}(x1, · · · , xn) = P[X1 ≤ x1, · · · , Xn ≤ xn] = ∏_{i=1}^n P[Xi ≤ xi] = ∏_{i=1}^n F(xi).
• f_{X1,···,Xn}(x1, · · · , xn) = ∏_{i=1}^n f(xi).
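As a numerical illustration of the factorization (a Python sketch; the N(0, 1) population, sample size, and seed are arbitrary choices, with numpy and scipy assumed available):

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
x = rng.normal(size=5)             # one realization of an iid N(0, 1) sample
joint_pdf = np.prod(norm.pdf(x))   # f(x1) · · · f(xn): the joint density factors
print(joint_pdf)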
Sum of RV from a RS
Definition
Any function of the random variables X1, · · · , Xn is called a statistic. [A function of X1, · · · , Xn only, not of parameters.]
B Example:
X̄ = (X1 + · · · + Xn)/n,  S² = ∑_{i=1}^n (Xi − X̄)²/(n − 1),  T(X1, · · · , Xn) = max(X1, · · · , Xn)
• A statistic is also a random variable.
• We are therefore interested in the distribution of a statistic.
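As a quick illustration (a Python sketch; the normal population is an arbitrary choice), each statistic above is a single number computed from one realized sample:

import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(loc=0.0, scale=1.0, size=25)   # one random sample

xbar = x.mean()           # sample mean
s2 = x.var(ddof=1)        # sample variance (divides by n − 1)
t_max = x.max()           # T = max(X1, · · · , Xn)
print(xbar, s2, t_max)

Drawing a new sample gives different values: the statistic is itself a random variable.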
Sum of RV from a RS
Definition
The distribution of a statistic is called the sampling distribution (of the statistic), in contrast to the population distribution.
Definition
Sample mean: X̄ = (X1 + · · · + Xn)/n
Sample variance: S² = ∑_{i=1}^n (Xi − X̄)²/(n − 1)
Sum of RV from a RS
Theorem
Let X1, · · · , Xn ∼iid (µ, σ²). Then
E X̄ = µ,  Var X̄ = σ²/n,  E S² = σ².
Proof:
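Not a proof, but a Monte Carlo sanity check of the three identities (a sketch; the exponential population with µ = 2, σ² = 4 is an arbitrary choice):

import numpy as np

rng = np.random.default_rng(2)
n, reps = 10, 200_000
mu, sigma2 = 2.0, 4.0                          # Exponential(scale=2): mean 2, variance 4
x = rng.exponential(scale=2.0, size=(reps, n))

xbar = x.mean(axis=1)
s2 = x.var(axis=1, ddof=1)
print(xbar.mean(), mu)                         # E[X̄] ≈ µ
print(xbar.var(), sigma2 / n)                  # Var[X̄] ≈ σ²/n
print(s2.mean(), sigma2)                       # E[S²] ≈ σ²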
Sampling from the normal distribution
• Find the sampling distributions of certain statistics that are functions of a random sample X1, · · · , Xn from a normal distribution.
Lemma
Let X1, · · · , Xn be independent random variables, and let gi(xi) be a function of xi only. Then the random variables Ui = gi(Xi), i = 1, · · · , n, are mutually independent.
Theorem
1. If Z ∼ N(0, 1), then Z² ∼ χ²(1).
2. If Xi, i = 1, · · · , n, are independent random variables with Xi ∼ χ²(pi), then X1 + · · · + Xn ∼ χ²(p1 + · · · + pn).
Sampling from the normal distribution
Theorem
Let X1, · · · , Xn ∼iid N(µ, σ²). Then
1. X̄ and S² are independent,
2. X̄ ∼ N(µ, σ²/n),
3. (n − 1)S²/σ² ∼ χ²(n − 1).
• Other sampling distributions, for the standardized sample mean and for the ratio of sample variances: Student's t-distribution and Snedecor's F-distribution.
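A simulation sketch of part 3 (normal data as the theorem requires; the sample size, seed, and use of scipy's KS test are arbitrary choices):

import numpy as np
from scipy.stats import kstest

rng = np.random.default_rng(3)
n, reps, mu, sigma = 8, 50_000, 1.0, 2.0
x = rng.normal(mu, sigma, size=(reps, n))
stat = (n - 1) * x.var(axis=1, ddof=1) / sigma**2   # (n − 1)S²/σ²
print(kstest(stat, 'chi2', args=(n - 1,)))          # large p-value: consistent with χ²(n − 1)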
Sampling from the normal distribution
Definition (Student's t distribution)
The t-distribution with d.f. ν is the distribution of
T = Z / √(W/ν),
where Z and W are independent with Z ∼ N(0, 1) and W ∼ χ²(ν). The pdf of the t distribution is
f_T(t) = Γ[(ν + 1)/2] / (Γ[ν/2] √(πν)) · [1 + t²/ν]^{−(ν+1)/2},  −∞ < t < ∞.
- Similar in shape to N(0, 1)
- Approaches N(0, 1) as ν → ∞
- Has heavier, flatter tails than N(0, 1)
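A sketch checking the construction numerically (ν = 5 is an arbitrary choice): build T from independent Z and W and compare against scipy's t(ν).

import numpy as np
from scipy.stats import kstest

rng = np.random.default_rng(4)
nu, reps = 5, 100_000
z = rng.standard_normal(reps)              # Z ∼ N(0, 1)
w = rng.chisquare(nu, reps)                # W ∼ χ²(ν), independent of Z
t_sample = z / np.sqrt(w / nu)             # T = Z / √(W/ν)
print(kstest(t_sample, 't', args=(nu,)))   # consistent with t(ν)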
Sampling from the normal distribution
Definition (Snedecor's F distribution)
Let W ∼ χ²(ν1) and V ∼ χ²(ν2), and assume W and V are independent. Then the distribution of
F = (W/ν1) / (V/ν2)
is the F distribution with d.f.'s (ν1, ν2). The pdf of the F distribution is
f_F(f) = {Γ[(ν1 + ν2)/2] / (Γ[ν1/2] Γ[ν2/2])} (ν1/ν2)^{ν1/2} f^{(ν1/2)−1} / [1 + (ν1/ν2)f]^{(ν1+ν2)/2},  0 < f < ∞.
Sampling from the normal distribution
B Example of a t-statistic: for X1, · · · , Xn ∼iid N(µ, σ²), √n(X̄ − µ)/S = [√n(X̄ − µ)/σ] / √{[(n − 1)S²/σ²]/(n − 1)} ∼ t(n − 1).
B Example of an F-statistic: for two independent normal samples of sizes n1 and n2, (S1²/σ1²)/(S2²/σ2²) ∼ F(n1 − 1, n2 − 1).
Sampling from the normal distribution
Theorem
1. If X ∼ F (ν1 , ν2 ) then 1/X ∼ F (ν2 , ν1 )
2. If X ∼ t(ν) then X 2 ∼ F (1, ν)
3. If X ∼ F(ν1, ν2), then (ν1/ν2)X / [1 + (ν1/ν2)X] ∼ Beta(ν1/2, ν2/2).
Proof:
Convergence Concepts
• Investigate the large-sample behavior of sequences of random variables.
- Convergence in probability
- Almost sure convergence
- Convergence in distribution
- Delta method
Convergence Concepts
Convergence in probability
Definition (Xn →P X)
A sequence of random variables X1, X2, · · · converges in probability to a random variable X if, for every ε > 0,
lim_{n→∞} P[|Xn − X| ≥ ε] = 0,
or equivalently
lim_{n→∞} P[|Xn − X| < ε] = 1.
B Example: X ∼ F_X(x), Xn = [(n − 1)/n]X. Then Xn →P X.
Convergence Concepts
Convergence in probability
Theorem (WLLN)
If X1, · · · , Xn ∼iid (µ, σ²), then X̄n →P µ.
Proof.
Theorem
If Xn →P X and g is a function defined on the range of X such that Dg = {x : g is discontinuous at x} satisfies P[X ∈ Dg] = 0, then g(Xn) →P g(X).
B Example: S² →P σ²?
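A sketch of the WLLN in action (Uniform(0, 1) data, so µ = 1/2; all choices are arbitrary): the running sample mean settles near µ.

import numpy as np

rng = np.random.default_rng(5)
x = rng.uniform(size=100_000)                          # iid Uniform(0, 1), µ = 1/2
running_mean = np.cumsum(x) / np.arange(1, x.size + 1)
for n in (10, 1_000, 100_000):
    print(n, running_mean[n - 1])                      # approaches µ = 0.5 as n grows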
Convergence Concepts
Almost sure convergence
Definition (Xn → X a.s.)
A sequence of random variables X1, X2, · · · converges almost surely to a random variable X if, for every ε > 0,
P[ lim_{n→∞} |Xn − X| < ε ] = 1.
B Example: X ∼ F_X(x), Xn = [(n − 1)/n]X. Then Xn → X a.s.
Convergence Concepts
Almost sure convergence
• Almost sure convergence −→ convergence in probability.
B Example: Almost sure convergence ←− convergence in probability?
Let U ∼ Uniform(0, 1), Xn = U + In and X = U, where
I1 = I(0 < U < 1),      p1 = P[I1 = 1] = 1
I2 = I(0 < U ≤ 1/2),    p2 = P[I2 = 1] = 1/2
I3 = I(1/2 < U ≤ 1),    p3 = P[I3 = 1] = 1/2
I4 = I(0 < U ≤ 1/3),    p4 = P[I4 = 1] = 1/3
I5 = I(1/3 < U ≤ 2/3),  p5 = P[I5 = 1] = 1/3
I6 = I(2/3 < U ≤ 1),    p6 = P[I6 = 1] = 1/3
· · ·
Convergence Concepts
Convergence in probability
P[|Xn − X| ≥ ε] = P[In ≥ ε]
Does the sequence In converge for a given value of U = u?
Theorem (SLLN)
If X1, · · · , Xn ∼iid (µ, σ²), then X̄n → µ a.s.
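A sketch of the moving-indicator example (the helper below just enumerates the intervals in the pattern above): the interval containing a fixed u occurs once in every block, so In(u) = 1 infinitely often and In(u) does not converge, even though P[In = 1] = 1/k → 0 within block k.

def intervals(n_blocks):
    """Block k splits (0, 1] into k pieces; blocks are concatenated in order."""
    out = []
    for k in range(1, n_blocks + 1):
        out += [(j / k, (j + 1) / k) for j in range(k)]
    return out

u = 0.37                                        # a fixed realization of U
hits = [int(a < u <= b) for a, b in intervals(20)]
print(sum(hits))                                # 20: exactly one hit per block
print(min(b - a for a, b in intervals(20)))     # 1/20: P[In = 1] shrinks to 0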
Convergence Concepts
Convergence in distribution
Definition (Xn →D X)
Let X1, X2, · · · be a sequence of random variables with Xn ∼ Fn, i.e., Fn(t) = Pr[Xn ≤ t], and let X be a random variable with cdf F, i.e., F(t) = Pr[X ≤ t]. The sequence X1, X2, · · · converges in distribution to X if
lim_{n→∞} Fn(t) = F(t)
at all continuity points of F.
B Example: Let X1, X2, · · · be iid random variables. Then Xn →D X1.
Convergence Concepts
Convergence in distribution
Theorem (CLT)
If X1, X2, · · · are iid p-dimensional random vectors with finite mean vector µ and covariance matrix Σ, then
√n (X̄n − µ) →D Np(0, Σ),
where Xi = (x1i, x2i, · · · , xpi)′, X̄n = n^{−1} ∑_{i=1}^n Xi, µ = (µ1, µ2, · · · , µp)′, and
Σ = [ σ1²  σ12  · · ·  σ1p
      σ21  σ2²  · · ·  σ2p
       ⋮    ⋮    ⋱     ⋮
      σp1  σp2  · · ·  σp² ]
Convergence Concepts
Convergence in distribution
Theorem (CLT with p = 1)
If X1, X2, · · · are iid random variables with finite mean µ and variance σ², then
√n (X̄n − µ) →D N(0, σ²),
or equivalently
√n (X̄n − µ)/σ →D N(0, 1).
Proof:
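A simulation sketch of the CLT (Exponential(1) data, so µ = σ = 1; sample size and seed are arbitrary): the standardized mean has roughly standard normal quantiles.

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(6)
n, reps = 100, 20_000
x = rng.exponential(scale=1.0, size=(reps, n))   # iid Exponential(1): µ = σ = 1
z = np.sqrt(n) * (x.mean(axis=1) - 1.0) / 1.0    # √n (X̄n − µ)/σ
for q in (0.05, 0.50, 0.95):
    print(q, np.quantile(z, q), norm.ppf(q))     # empirical vs. N(0, 1) quantiles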
Convergence Concepts
Convergence in distribution
Theorem (Slutsky's theorem)
Let Xn →D X and Yn →P c, where c is a constant. Then
1. Yn Xn →D cX,
2. Yn + Xn →D c + X,
3. and, in general, g(Yn, Xn) →D g(c, X) when g is continuous.
B Example: √n (X̄n − µ)/S →D N(0, 1), since S →P σ while √n (X̄n − µ)/σ →D N(0, 1).
Convergence Concepts
Delta method
Theorem
Let Yn, n = 1, 2, · · · , be a sequence of random variables that satisfies √n (Yn − θ) →D N(0, σ²). For a given function g and a specific value of θ, suppose that g′(θ) exists and is not 0. Then
√n [g(Yn) − g(θ)] →D N(0, σ² [g′(θ)]²).
B Example: X1, · · · , Xn ∼iid (µ, σ²); g(µ) = e^µ, g(µ) = 1/µ.
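A sketch checking the delta method for g(µ) = e^µ (normal data and all constants are arbitrary choices): the simulated variance of g(X̄n) should be close to σ²[g′(µ)]²/n = σ²e^{2µ}/n.

import numpy as np

rng = np.random.default_rng(7)
n, reps, mu, sigma = 200, 20_000, 1.0, 0.5
x = rng.normal(mu, sigma, size=(reps, n))
g_xbar = np.exp(x.mean(axis=1))           # g(X̄n) with g(µ) = e^µ, so g′(µ) = e^µ
print(g_xbar.var())                       # simulated variance
print(sigma**2 * np.exp(2 * mu) / n)      # delta-method variance σ²[g′(µ)]²/n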
Convergence Concepts
Delta method
• Extension to the multi-parameter situation:
Let θ = (θ1, · · · , θp)′ and θ̂ = (θ̂1, · · · , θ̂p)′. Assume
θ̂ ∼ Np[θ, Σ(θ)], approximately.
Consider the function g : Rp → Rk such that
τ = g(θ) = ( g1(θ1, · · · , θp), · · · , gk(θ1, · · · , θp) )′.
Then
∂g/∂θ = [ ∂g1/∂θ1  · · ·  ∂g1/∂θp
             ⋮      ⋱       ⋮
          ∂gk/∂θ1  · · ·  ∂gk/∂θp ],
Convergence Concepts
Delta method
and
τ̂ ∼ Nk( g(θ), (∂g/∂θ) Σ(θ) (∂g/∂θ)′ ), approximately.
B Example: (Xi, Yi) ∼iid (µ, Σ), where µ = (µX, µY)′ and
Σ = [ σX²  σXY
      σYX  σY² ].
Consider the function g(µX, µY) = µXµY.
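For g(µX, µY) = µXµY the gradient is ∇g = (µY, µX)′, so the delta method gives Var(X̄Ȳ) ≈ (µY²σX² + 2µXµYσXY + µX²σY²)/n. A sketch with arbitrary bivariate-normal inputs:

import numpy as np

rng = np.random.default_rng(8)
n, reps = 200, 20_000
mu = np.array([2.0, 3.0])                      # (µX, µY)
Sigma = np.array([[1.0, 0.3],
                  [0.3, 2.0]])                 # population covariance
xy = rng.multivariate_normal(mu, Sigma, size=(reps, n))
prod = xy[..., 0].mean(axis=1) * xy[..., 1].mean(axis=1)   # g(X̄, Ȳ) = X̄ Ȳ

grad = np.array([mu[1], mu[0]])                # ∇g = (µY, µX)
print(prod.var())                              # simulated variance
print(grad @ Sigma @ grad / n)                 # delta-method variance ∇g′ Σ ∇g / n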
Order Statistics
Definition (OS)
The order statistics of a random sample X1 , · · · , Xn are the
sample values placed in ascending order. They are denoted by
X(1) , X(2) , · · · , X(n) ,
where X(i) is the ith smallest of X1 , · · · , Xn .
• Statistics defined in terms of OS:
1. Sample range: R = X(n) − X(1)
2. Sample median: X([n+1]/2) if n is odd; [X(n/2) + X(n/2+1)]/2 if n is even.
Order Statistics
• Distribution of order statistics
Let X1 , · · · , Xn be a random sample with a common pdf fX (x)
and a common cdf FX(x). Then the joint and marginal distributions of the order statistics are as follows:
f_{X(1),···,X(n)}(x1, · · · , xn) = n! fX(x1) · · · fX(xn),  for −∞ < x1 < · · · < xn < ∞,
f_{X(j),X(k)}(xj, xk) = {n! / [(j − 1)!(k − j − 1)!(n − k)!]} [FX(xj)]^{j−1} [FX(xk) − FX(xj)]^{k−j−1} [1 − FX(xk)]^{n−k} fX(xj) fX(xk),  for −∞ < xj < xk < ∞, and
Order Statistics
f_{X(j)}(xj) = {n! / [(j − 1)!(n − j)!]} [FX(xj)]^{j−1} [1 − FX(xj)]^{n−j} fX(xj),  for −∞ < xj < ∞.
B Example: X1, · · · , Xn ∼iid Uniform(0, 1). Find the distribution of R = X(n) − X(1).
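For this example the answer works out to f_R(r) = n(n − 1) r^{n−2}(1 − r) on (0, 1), i.e. R ∼ Beta(n − 1, 2); a simulation sketch (n = 10 and the KS check are arbitrary choices) to confirm:

import numpy as np
from scipy.stats import kstest

rng = np.random.default_rng(9)
n, reps = 10, 50_000
u = rng.uniform(size=(reps, n))
r = u.max(axis=1) - u.min(axis=1)            # sample range R = X(n) − X(1)
print(kstest(r, 'beta', args=(n - 1, 2)))    # consistent with Beta(n − 1, 2)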
Simulation
- Approximation of a parameter, or of the distribution of a statistic.
B Example: Suppose that a particular electrical component is to be modeled with an Exponential(λ) lifetime. Then
p1 = P[component lasts at least h hours] = P[X ≥ h; λ] = e^{−h/λ}.
Assuming the components are independent, consider the probability that, out of c components, at least t will last h hours:
p2 = P[at least t components last h hours] = ∑_{k=t}^{c} (c choose k) p1^k (1 − p1)^{c−k}.
Simulation
With c = 20, t = 15, h = 150, and λ = 300, this gives p1 = 0.60653 and p2 = 0.1382194.
If the lifetime distribution is more complicated, like the gamma distribution, there is no simple closed form for p1 → approximate using simulation:
1. Generate X1, · · · , X20 from Exponential(λ = 300).
2. Define Yj = 1 if at least t = 15 of the Xi's are greater than or equal to h = 150; otherwise Yj = 0.
3. Repeat steps 1–2 for j = 1, · · · , m and estimate p2 by p̂2 = ∑_{j=1}^m Yj / m.
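A sketch of steps 1–3 for the exponential case, where the exact value above provides a check (numpy only; note numpy parameterizes the exponential by its scale, here λ = 300):

import numpy as np

rng = np.random.default_rng(10)
c, t, h, lam = 20, 15, 150, 300
reps = 100_000

x = rng.exponential(scale=lam, size=(reps, c))   # step 1, repeated `reps` times
y = (x >= h).sum(axis=1) >= t                    # step 2: Yj = 1 iff ≥ t components last h hours
print(y.mean())                                  # step 3: p̂2 ≈ 0.1382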
Simulation
B Example: Distribution of X̄n for n = 10 and n = 50, where X1, · · · , Xn ∼iid Bernoulli(p = 0.4).
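A sketch of this final example: simulate the sampling distribution of X̄n at both sample sizes; by the CLT its spread is approximately √(p(1 − p)/n).

import numpy as np

rng = np.random.default_rng(11)
p, reps = 0.4, 100_000
for n in (10, 50):
    xbar = rng.binomial(n, p, size=reps) / n     # X̄n from Bernoulli(p) samples of size n
    print(n, xbar.mean(), xbar.std(),
          np.sqrt(p * (1 - p) / n))              # simulated vs. CLT standard deviation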