Introduction & Random Sample

Contents:
• Random sample
• Distribution of sums of random variables from a random sample
• Random sampling from a normal distribution
• Large-sample behavior of important statistics

Definition. If $X_1, \ldots, X_n$ are independent random variables with a common marginal distribution with cdf $F(x)$, then we say that they are independent and identically distributed (iid) with common cdf $F(x)$, or that $X_1, \ldots, X_n$ are a random sample from an infinite population with distribution $F(x)$.

If $X_1, \ldots, X_n$ is a random sample from $F(x)$, i.e., $X_1, \ldots, X_n \overset{\text{iid}}{\sim} F(x)$ [or $f(x)$], then
$$F_{X_1,\ldots,X_n}(x_1,\ldots,x_n) = P[X_1 \le x_1, \ldots, X_n \le x_n] = \prod_{i=1}^n P[X_i \le x_i] = \prod_{i=1}^n F(x_i),$$
and
$$f_{X_1,\ldots,X_n}(x_1,\ldots,x_n) = \prod_{i=1}^n f(x_i).$$

Sums of Random Variables from a Random Sample

Definition. Any function $T(X_1,\ldots,X_n)$ of the random variables $X_1, \ldots, X_n$ is called a statistic. [A statistic is a function of $X_1,\ldots,X_n$ only, not of unknown parameters.]

Example:
$$\bar{X} = \frac{X_1 + \cdots + X_n}{n}, \qquad S^2 = \sum_{i=1}^n \frac{(X_i - \bar{X})^2}{n-1}, \qquad T(X_1,\ldots,X_n) = \max(X_1,\ldots,X_n).$$

• A statistic is itself a random variable.
• We are interested in the distribution of a statistic.

Definition. The distribution of a statistic is called its sampling distribution, in contrast to the population distribution.

Definition. Sample mean: $\bar{X} = (X_1 + \cdots + X_n)/n$. Sample variance: $S^2 = \sum_{i=1}^n (X_i - \bar{X})^2/(n-1)$.

Theorem. Let $X_1, \ldots, X_n \overset{\text{iid}}{\sim} (\mu, \sigma^2)$. Then
$$E\bar{X} = \mu, \qquad \operatorname{Var}\bar{X} = \frac{\sigma^2}{n}, \qquad E S^2 = \sigma^2.$$
Proof:

Sampling from the Normal Distribution

• Goal: find the sampling distributions of certain statistics that are functions of a random sample $X_1, \ldots, X_n$ from a normal distribution.

Lemma. Let $X_1, \ldots, X_n$ be independent random variables, and let $g_i(x_i)$ be a function of $x_i$ only. Then the random variables $U_i = g_i(X_i)$, $i = 1, \ldots, n$, are mutually independent.

Theorem.
1. If $Z \sim N(0,1)$, then $Z^2 \sim \chi^2(1)$.
2. If $X_i$, $i = 1, \ldots, n$, are independent random variables with $X_i \sim \chi^2(p_i)$, then $X_1 + \cdots + X_n \sim \chi^2(p_1 + \cdots + p_n)$.

Theorem. Let $X_1, \ldots, X_n \overset{\text{iid}}{\sim} N(\mu, \sigma^2)$. Then
1. $\bar{X}$ and $S^2$ are independent;
2. $\bar{X} \sim N(\mu, \sigma^2/n)$;
3. $(n-1)S^2/\sigma^2 \sim \chi^2(n-1)$.

• Other sampling distributions arise from the standardized sample mean and the ratio of sample variances: Student's t-distribution and Snedecor's F-distribution.

Definition (Student's t distribution). The t-distribution with degrees of freedom $\nu$ is the distribution of
$$T = \frac{Z}{\sqrt{W/\nu}},$$
where $Z$ and $W$ are independent with $Z \sim N(0,1)$ and $W \sim \chi^2(\nu)$. The pdf of the t distribution is
$$f_T(t) = \frac{\Gamma[(\nu+1)/2]}{\Gamma[\nu/2]\sqrt{\pi\nu}} \left(1 + \frac{t^2}{\nu}\right)^{-(\nu+1)/2}, \qquad -\infty < t < \infty.$$
- Similar in shape to $N(0,1)$.
- Approaches $N(0,1)$ as $\nu \to \infty$.
- Has heavier, flatter tails than $N(0,1)$.

Definition (Snedecor's F distribution). Let $W \sim \chi^2(\nu_1)$ and $V \sim \chi^2(\nu_2)$ be independent. Then
$$F = \frac{W/\nu_1}{V/\nu_2}$$
has the F distribution with degrees of freedom $(\nu_1, \nu_2)$. The pdf of the F distribution is
$$f_F(f) = \frac{\Gamma[(\nu_1+\nu_2)/2]}{\Gamma[\nu_1/2]\,\Gamma[\nu_2/2]} \left(\frac{\nu_1}{\nu_2}\right)^{\nu_1/2} \frac{f^{(\nu_1/2)-1}}{[1 + (\nu_1/\nu_2)f]^{(\nu_1+\nu_2)/2}}, \qquad 0 < f < \infty.$$

Example of a t-statistic: for a normal random sample, $\sqrt{n}(\bar{X} - \mu)/S \sim t(n-1)$.
Example of an F-statistic: for independent normal samples of sizes $n$ and $m$, $(S_X^2/\sigma_X^2)/(S_Y^2/\sigma_Y^2) \sim F(n-1, m-1)$.

Theorem.
1. If $X \sim F(\nu_1, \nu_2)$, then $1/X \sim F(\nu_2, \nu_1)$.
2. If $X \sim t(\nu)$, then $X^2 \sim F(1, \nu)$.
3. If $X \sim F(\nu_1, \nu_2)$, then $\dfrac{(\nu_1/\nu_2)X}{1 + (\nu_1/\nu_2)X} \sim \mathrm{Beta}(\nu_1/2, \nu_2/2)$.
Proof:
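The normal-sampling results above are easy to check numerically. The following minimal Monte Carlo sketch is illustrative only (not from the original slides); it assumes NumPy and SciPy are available, and the values $\mu = 5$, $\sigma = 2$, $n = 10$ are arbitrary choices.

```python
# Monte Carlo check of the normal-sampling theorem (illustrative sketch;
# assumes NumPy/SciPy; mu, sigma, n, reps are arbitrary choices).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
mu, sigma, n, reps = 5.0, 2.0, 10, 200_000

x = rng.normal(mu, sigma, size=(reps, n))   # reps independent samples of size n
xbar = x.mean(axis=1)
s2 = x.var(axis=1, ddof=1)                  # sample variance with the n-1 divisor

# Xbar ~ N(mu, sigma^2/n): empirical mean/variance vs mu and sigma^2/n
print(xbar.mean(), xbar.var(), sigma**2 / n)

# (n-1)S^2/sigma^2 ~ chi^2(n-1): chi^2(9) has mean 9 and variance 18
w = (n - 1) * s2 / sigma**2
print(w.mean(), w.var())

# sqrt(n)(Xbar - mu)/S ~ t(n-1): compare an empirical tail with the t tail
t_stat = np.sqrt(n) * (xbar - mu) / np.sqrt(s2)
print((t_stat > 2.0).mean(), stats.t.sf(2.0, df=n - 1))
```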
Convergence Concepts

• Investigate the large-sample behavior of a sequence of random variables.

Contents:
• Convergence in probability
• Almost sure convergence
• Convergence in distribution
• Delta method

Convergence in probability

Definition ($X_n \xrightarrow{P} X$). A sequence of random variables $X_1, X_2, \ldots$ converges in probability to a random variable $X$ if, for every $\epsilon > 0$,
$$\lim_{n\to\infty} P[|X_n - X| \ge \epsilon] = 0, \quad \text{or equivalently} \quad \lim_{n\to\infty} P[|X_n - X| < \epsilon] = 1.$$

Example: Let $X \sim F_X(x)$ and $X_n = [(n-1)/n]X$. Then $X_n \xrightarrow{P} X$.

Theorem (WLLN). If $X_1, \ldots, X_n \overset{\text{iid}}{\sim} (\mu, \sigma^2)$, then $\bar{X}_n \xrightarrow{P} \mu$.
Proof: By Chebyshev's inequality, $P[|\bar{X}_n - \mu| \ge \epsilon] \le \operatorname{Var}\bar{X}_n/\epsilon^2 = \sigma^2/(n\epsilon^2) \to 0$.

Theorem. If $X_n \xrightarrow{P} X$ and $g$ is a function defined on the range of $X$ such that $D_g = \{x : g \text{ is discontinuous at } x\}$ satisfies $P[X \in D_g] = 0$, then $g(X_n) \xrightarrow{P} g(X)$.

Example: Does $S^2 \xrightarrow{P} \sigma^2$?

Almost sure convergence

Definition ($X_n \to X$ a.s.). A sequence of random variables $X_1, X_2, \ldots$ converges almost surely to a random variable $X$ if, for every $\epsilon > 0$,
$$P\left[\lim_{n\to\infty} |X_n - X| < \epsilon\right] = 1.$$

Example: Let $X \sim F_X(x)$ and $X_n = [(n-1)/n]X$. Then $X_n \to X$ a.s.

• Almost sure convergence implies convergence in probability.

Example (does convergence in probability imply almost sure convergence?): Let $U \sim \text{Uniform}(0,1)$, $X_n = U + I_n$, and $X = U$, where
$$\begin{aligned}
I_1 &= I(0 < U < 1), & p_1 &= P[I_1 = 1] = 1,\\
I_2 &= I(0 < U \le 1/2), & p_2 &= P[I_2 = 1] = 1/2,\\
I_3 &= I(1/2 < U \le 1), & p_3 &= P[I_3 = 1] = 1/2,\\
I_4 &= I(0 < U \le 1/3), & p_4 &= P[I_4 = 1] = 1/3,\\
I_5 &= I(1/3 < U \le 2/3), & p_5 &= P[I_5 = 1] = 1/3,\\
I_6 &= I(2/3 < U \le 1), & p_6 &= P[I_6 = 1] = 1/3,
\end{aligned}$$
and so on. Then $P[|X_n - X| \ge \epsilon] = P[I_n = 1] \to 0$, so $X_n \xrightarrow{P} X$. But does the sequence $I_n$ converge for a given value $U = u$? No: for every $u$, $I_n(u) = 1$ infinitely often, so $X_n$ does not converge almost surely.

Theorem (SLLN). If $X_1, \ldots, X_n \overset{\text{iid}}{\sim} (\mu, \sigma^2)$, then $\bar{X}_n \to \mu$ a.s.

Convergence in distribution

Definition ($X_n \xrightarrow{D} X$). Let $X_n \sim F_n$, $n = 1, 2, \ldots$, i.e., $F_n(t) = P[X_n \le t]$, and suppose $X$ is a random variable with cdf $F$, i.e., $F(t) = P[X \le t]$. The sequence $X_1, X_2, \ldots$ converges in distribution to $X$ if
$$\lim_{n\to\infty} F_n(t) = F(t) \quad \text{at every continuity point } t \text{ of } F.$$

Example: Let $X_1, X_2, \ldots$ be iid random variables. Then $X_n \xrightarrow{D} X_1$.

Theorem (CLT). If $X_1, X_2, \ldots$ are iid $p$-dimensional random vectors with finite mean vector $\mu$ and covariance matrix $\Sigma$, then
$$\sqrt{n}\,(\bar{X}_n - \mu) \xrightarrow{D} N_p(0, \Sigma),$$
where $X_i = (x_{1i}, x_{2i}, \ldots, x_{pi})'$, $\bar{X}_n = n^{-1}\sum_{i=1}^n X_i$, $\mu = (\mu_1, \mu_2, \ldots, \mu_p)'$, and
$$\Sigma = \begin{pmatrix}
\sigma_1^2 & \sigma_{12} & \cdots & \sigma_{1p}\\
\sigma_{21} & \sigma_2^2 & \cdots & \sigma_{2p}\\
\vdots & \vdots & \ddots & \vdots\\
\sigma_{p1} & \sigma_{p2} & \cdots & \sigma_p^2
\end{pmatrix}.$$

Theorem (CLT with p = 1). If $X_1, X_2, \ldots$ are iid random variables with finite mean $\mu$ and variance $\sigma^2$, then
$$\sqrt{n}\,(\bar{X}_n - \mu) \xrightarrow{D} N(0, \sigma^2), \quad \text{or equivalently} \quad \frac{\sqrt{n}\,(\bar{X}_n - \mu)}{\sigma} \xrightarrow{D} N(0,1).$$
Proof:

Theorem (Slutsky's theorem). Let $X_n \xrightarrow{D} X$ and $Y_n \xrightarrow{P} c$, where $c$ is a constant. Then
1. $Y_n X_n \xrightarrow{D} cX$;
2. $Y_n + X_n \xrightarrow{D} c + X$;
3. in general, $g(Y_n, X_n) \xrightarrow{D} g(c, X)$ when $g$ is continuous.

Example: $\dfrac{\sqrt{n}\,(\bar{X}_n - \mu)}{S} \xrightarrow{D} N(0,1)$, combining the CLT with $S \xrightarrow{P} \sigma$.

Delta method

Theorem. Let $Y_n$, $n = 1, 2, \ldots$, be a sequence of random variables that satisfies $\sqrt{n}\,(Y_n - \theta) \xrightarrow{D} N(0, \sigma^2)$. For a given function $g$ and a specific value of $\theta$, suppose that $g'(\theta)$ exists and is not 0. Then
$$\sqrt{n}\,[g(Y_n) - g(\theta)] \xrightarrow{D} N\left(0,\ \sigma^2\,[g'(\theta)]^2\right).$$

Example: $X_1, \ldots, X_n \overset{\text{iid}}{\sim} (\mu, \sigma^2)$ with $g(\mu) = e^{\mu}$ or $g(\mu) = 1/\mu$.
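As an illustrative sketch (not part of the original slides; the Gamma sampling model and all parameter values are arbitrary choices, and NumPy is assumed), the following simulation checks the delta-method variance for $g(\mu) = 1/\mu$: since $g'(\mu) = -1/\mu^2$, the asymptotic variance is $\sigma^2[g'(\mu)]^2 = \sigma^2/\mu^4$.

```python
# Delta-method check for g(mu) = 1/mu (illustrative sketch; assumes NumPy;
# the Gamma model and parameter values are arbitrary choices).
import numpy as np

rng = np.random.default_rng(1)
mu, sigma, n, reps = 4.0, 1.5, 500, 50_000

# Any iid model with mean mu and variance sigma^2 works; here, Gamma draws.
shape = (mu / sigma) ** 2        # Gamma(shape, scale) has mean shape*scale
scale = sigma**2 / mu            # and variance shape*scale^2
x = rng.gamma(shape, scale, size=(reps, n))

xbar = x.mean(axis=1)
lhs = np.sqrt(n) * (1.0 / xbar - 1.0 / mu)

print(lhs.var())                 # empirical variance of sqrt(n)[g(Xbar) - g(mu)]
print(sigma**2 / mu**4)          # delta-method variance sigma^2 * [g'(mu)]^2
```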
• Extension to the multi-parameter situation. Let $\theta = (\theta_1, \ldots, \theta_p)'$ and $\hat{\theta} = (\hat{\theta}_1, \ldots, \hat{\theta}_p)'$, and assume
$$\hat{\theta} \sim N_p[\theta, \Sigma(\theta)] \quad \text{approximately}.$$
Consider a function $g : \mathbb{R}^p \to \mathbb{R}^k$ with
$$\tau = g(\theta) = \begin{pmatrix} g_1(\theta_1, \ldots, \theta_p) \\ \vdots \\ g_k(\theta_1, \ldots, \theta_p) \end{pmatrix}.$$
Then, with
$$\frac{\partial g}{\partial \theta} = \begin{pmatrix}
\partial g_1/\partial\theta_1 & \cdots & \partial g_1/\partial\theta_p\\
\vdots & \ddots & \vdots\\
\partial g_k/\partial\theta_1 & \cdots & \partial g_k/\partial\theta_p
\end{pmatrix},$$
we have
$$\hat{\tau} = g(\hat{\theta}) \sim N_k\left[g(\theta),\ \frac{\partial g}{\partial \theta}\,\Sigma(\theta)\left(\frac{\partial g}{\partial \theta}\right)'\right] \quad \text{approximately}.$$

Example: $(X_i, Y_i) \overset{\text{iid}}{\sim} (\mu, \Sigma)$, where $\mu = (\mu_X, \mu_Y)'$ and
$$\Sigma = \begin{pmatrix} \sigma_X^2 & \sigma_{XY} \\ \sigma_{YX} & \sigma_Y^2 \end{pmatrix}.$$
Consider the function $g(\mu_X, \mu_Y) = \mu_X \mu_Y$.

Order Statistics

Definition (OS). The order statistics of a random sample $X_1, \ldots, X_n$ are the sample values placed in ascending order. They are denoted by $X_{(1)}, X_{(2)}, \ldots, X_{(n)}$, where $X_{(i)}$ is the $i$th smallest of $X_1, \ldots, X_n$.

• Statistics defined in terms of order statistics:
1. Sample range: $R = X_{(n)} - X_{(1)}$.
2. Sample median: $X_{([n+1]/2)}$ if $n$ is odd; $[X_{(n/2)} + X_{(n/2+1)}]/2$ if $n$ is even.

• Distributions of order statistics. Let $X_1, \ldots, X_n$ be a random sample with common pdf $f_X(x)$ and common cdf $F_X(x)$. The joint and marginal distributions of the order statistics are as follows:
$$f_{X_{(1)},\ldots,X_{(n)}}(x_1, \ldots, x_n) = n!\, f_X(x_1) \cdots f_X(x_n), \qquad -\infty < x_1 < \cdots < x_n < \infty;$$
$$f_{X_{(j)},X_{(k)}}(x_j, x_k) = \frac{n!}{(j-1)!\,(k-j-1)!\,(n-k)!}\,[F_X(x_j)]^{j-1}\,[F_X(x_k) - F_X(x_j)]^{k-j-1}\,[1 - F_X(x_k)]^{n-k}\,f_X(x_j)\,f_X(x_k),$$
for $-\infty < x_j < x_k < \infty$ (with $j < k$); and
$$f_{X_{(j)}}(x_j) = \frac{n!}{(j-1)!\,(n-j)!}\,[F_X(x_j)]^{j-1}\,[1 - F_X(x_j)]^{n-j}\,f_X(x_j), \qquad -\infty < x_j < \infty.$$

Example: $X_1, \ldots, X_n \overset{\text{iid}}{\sim}$ Uniform(0,1). Find the distribution of the range $R = X_{(n)} - X_{(1)}$.

Simulation

- Approximation of a parameter, or of the distribution of a statistic.

Example: Suppose that a particular electrical component is to be modeled with an Exponential($\lambda$) lifetime:
$$p_1 = P[\text{component lasts at least } h \text{ hours}] = P[X \ge h; \lambda] = e^{-h/\lambda}.$$
Assuming the components are independent, consider the probability that, out of $c$ components, at least $t$ last $h$ hours:
$$p_2 = P[\text{at least } t \text{ components last } h \text{ hours}] = \sum_{k=t}^{c} \binom{c}{k}\, p_1^k (1 - p_1)^{c-k}.$$
With $c = 20$, $t = 15$, $h = 150$, $\lambda = 300$: $p_1 = e^{-1/2} = 0.60653$ and $p_2 = 0.1382194$.

If the lifetime distribution is more complicated, e.g. a Gamma distribution, there is no closed form for $p_1$, so we approximate by simulation:
1. Generate $X_1, \ldots, X_{c=20}$ from Exponential($\lambda = 300$).
2. Set $Y_j = 1$ if at least $t = 15$ of the $X$'s are greater than or equal to $h = 150$; otherwise $Y_j = 0$.
3. Repeat steps 1–2 for $j = 1, \ldots, m$ and estimate $p_2$ by $\bar{Y} = m^{-1}\sum_{j=1}^m Y_j$, as sketched in the code below.
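A minimal implementation of steps 1–3 might look as follows (an illustrative sketch assuming NumPy; the replicate count $m$ is an arbitrary choice):

```python
# Monte Carlo estimate of p2 (illustrative sketch; assumes NumPy; the
# replicate count m is an arbitrary choice).
import numpy as np

rng = np.random.default_rng(2)
c, t, h, lam, m = 20, 15, 150.0, 300.0, 100_000

# Each row is one replicate: c component lifetimes from Exponential(mean lam).
lifetimes = rng.exponential(scale=lam, size=(m, c))

# Y_j = 1 if at least t of the c components last at least h hours.
y = (lifetimes >= h).sum(axis=1) >= t
print(y.mean())   # should be close to p2 = 0.1382194

# For a Gamma lifetime model, only the sampling line changes, e.g.
# lifetimes = rng.gamma(shape=2.0, scale=lam / 2.0, size=(m, c)).
```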
Example: Distribution of $\bar{X}_n$ for $n = 10$ and $n = 50$, where $X_1, \ldots, X_n \overset{\text{iid}}{\sim}$ Bernoulli($p = 0.4$).
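A sketch of this experiment (illustrative, assuming NumPy; it summarizes the simulated sampling distribution by its mean and standard deviation rather than by a histogram):

```python
# Sampling distribution of the Bernoulli sample mean (illustrative sketch;
# assumes NumPy). For each n, draw many replicates of Xbar_n.
import numpy as np

rng = np.random.default_rng(3)
p, reps = 0.4, 100_000

for n in (10, 50):
    xbar = rng.binomial(n, p, size=reps) / n   # Xbar_n = (# successes)/n
    # CLT: Xbar_n is approximately N(p, p(1-p)/n)
    print(n, xbar.mean(), xbar.std(), np.sqrt(p * (1 - p) / n))
```

As $n$ grows from 10 to 50, the empirical standard deviation shrinks toward the CLT value $\sqrt{p(1-p)/n}$ and the distribution of $\bar{X}_n$ becomes increasingly normal in shape.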