1. To show that

$$\mathrm{Cov}(X,Y) = \frac{1}{N}\sum_{i=1}^{N}(X_i-\bar{X})(Y_i-\bar{Y}) = \left[\frac{1}{N}\sum_{i=1}^{N}X_iY_i\right]-\bar{X}\bar{Y}$$

expand the product inside the sum and use $\sum_i X_i = N\bar{X}$ and $\sum_i Y_i = N\bar{Y}$:

$$\begin{aligned}
\mathrm{Cov}(X,Y) &= \frac{1}{N}\sum_{i=1}^{N}\left(X_iY_i-\bar{X}Y_i-X_i\bar{Y}+\bar{X}\bar{Y}\right)\\
&= \frac{1}{N}\left[\sum_{i=1}^{N}X_iY_i-\bar{X}\sum_{i=1}^{N}Y_i-\bar{Y}\sum_{i=1}^{N}X_i+N\bar{X}\bar{Y}\right]\\
&= \frac{1}{N}\left[\sum_{i=1}^{N}X_iY_i-N\bar{X}\bar{Y}-N\bar{X}\bar{Y}+N\bar{X}\bar{Y}\right]\\
&= \left[\frac{1}{N}\sum_{i=1}^{N}X_iY_i\right]-\bar{X}\bar{Y}
\end{aligned}$$

First find $\bar{X}$ and $\bar{Y}$:

$$\bar{X} = \frac{1}{5}(11+14+12+16+12) = \frac{65}{5} = 13 \qquad \bar{Y} = \frac{1}{5}(1+6+8+10+5) = \frac{30}{5} = 6$$

    i    X_i - Xbar    Y_i - Ybar    X_i * Y_i
    1        -2            -5            11
    2         1             0            84
    3        -1             2            96
    4         3             4           160
    5        -1            -1            60

So using either formula

$$\mathrm{Cov}(X,Y) = \frac{1}{5}(10+0-2+12+1) = \frac{21}{5} = 4.2$$

or

$$\mathrm{Cov}(X,Y) = \left[\frac{1}{N}\sum_{i=1}^{N}X_iY_i\right]-\bar{X}\bar{Y} = \frac{1}{5}(11+84+96+160+60)-(13\times 6) = 82.2-78 = 4.2$$

Stata commands to obtain sample variance and covariance

Stata reports the variance and covariance with divisor N-1 rather than N, so each reported value is multiplied by (N-1)/N = 4/5 to recover the figures above.

    . list

         +------------------------+
         | age   yearsed   tenure |
         |------------------------|
      1. |  18        11        1 |
      2. |  29        14        6 |
      3. |  33        12        8 |
      4. |  35        16       10 |
      5. |  45        12        5 |
         +------------------------+

    . su yearsed, detail

                               yearsed
    -------------------------------------------------------------
          Percentiles      Smallest
     1%           11             11
     5%           11             12
    10%           11             12       Obs                   5
    25%           12             14       Sum of Wgt.           5

    50%           12                      Mean                 13
                            Largest       Std. Dev.             2
    75%           14             12
    90%           16             12       Variance              4
    95%           16             14       Skewness       .6288941
    99%           16             16       Kurtosis       1.953125

    . di (4*4)/5
    3.2

    . corr yearsed tenure, cov
    (obs=5)

                 |  yearsed   tenure
    -------------+------------------
         yearsed |        4
          tenure |     5.25     11.5

    . di (5.25*4)/5
    4.2

2.
To show that

$$\mathrm{Var}(X) = \frac{1}{N}\sum_{i=1}^{N}(X_i-\bar{X})^2 = \frac{1}{N}\sum_{i=1}^{N}X_i^2-\bar{X}^2$$

$$\begin{aligned}
\frac{1}{N}\sum_{i=1}^{N}(X_i-\bar{X})^2 &= \frac{1}{N}\sum_{i=1}^{N}(X_i-\bar{X})(X_i-\bar{X})\\
&= \frac{1}{N}\sum_{i=1}^{N}\left(X_i^2-\bar{X}X_i-\bar{X}X_i+\bar{X}^2\right)\\
&= \frac{1}{N}\sum_{i=1}^{N}\left(X_i^2-2\bar{X}X_i+\bar{X}^2\right)
\end{aligned}$$

Using $\sum_{i=1}^{N}X_i = N\bar{X}$ and separating the terms in brackets

$$\begin{aligned}
&= \frac{1}{N}\sum_{i=1}^{N}X_i^2-\frac{2N\bar{X}^2}{N}+\frac{N\bar{X}^2}{N}\\
&= \frac{1}{N}\sum_{i=1}^{N}X_i^2-\bar{X}^2
\end{aligned}$$

So to find Var(X)

    i    X_i - Xbar    (X_i - Xbar)^2    X_i^2
    1        -2               4           121
    2         1               1           196
    3        -1               1           144
    4         3               9           256
    5        -1               1           144

Either

$$\mathrm{Var}(X) = \frac{1}{N}\sum_{i}(X_i-\bar{X})^2 = \frac{1}{5}(4+1+1+9+1) = \frac{16}{5} = 3.2$$

or

$$\mathrm{Var}(X) = \frac{1}{N}\sum_{i}X_i^2-\bar{X}^2 = \frac{1}{5}(121+196+144+256+144)-169 = \frac{861}{5}-169 = 172.2-169 = 3.2$$

Note that if the X data are multiplied by 10

$$\bar{X} = \frac{1}{5}(110+140+120+160+120) = \frac{650}{5} = 130$$

then the mean is also multiplied by 10, and the variance is

$$\mathrm{Var}(X) = \frac{1}{N}\sum_{i}X_i^2-\bar{X}^2 = \frac{1}{5}(12100+19600+14400+25600+14400)-16900 = \frac{86100}{5}-16900 = 320$$

The variance is therefore multiplied by 100 if the data are multiplied by 10 [and in general $\mathrm{Var}(aX) = a^2\mathrm{Var}(X)$ if a is a constant].

Similarly, the rules on covariances imply that $\mathrm{Cov}(aX,Y) = a\,\mathrm{Cov}(X,Y)$ (see question 3). So

$$\mathrm{Cov}(X,Y) = \left[\frac{1}{N}\sum_{i=1}^{N}X_iY_i\right]-\bar{X}\bar{Y} = \frac{1}{5}(110+840+960+1600+600)-(130\times 6) = 822-780 = 42$$

so the covariance is multiplied by 10 when the X data are multiplied by 10.

These results help illustrate that neither the variance nor the covariance is scale invariant: their values depend on the units of measurement of the variables.

3.
If Y = A + B, show that Cov(X,Y) = Cov(X,A) + Cov(X,B)

$$\mathrm{Cov}(X,Y) = \frac{1}{N}\sum_{i=1}^{N}(X_i-\bar{X})(Y_i-\bar{Y})$$

and since Y = A + B then

$$\bar{Y} = \frac{\sum_{i=1}^{N}Y_i}{N} = \frac{\sum_{i=1}^{N}(A_i+B_i)}{N} = \frac{\sum_{i=1}^{N}A_i}{N}+\frac{\sum_{i=1}^{N}B_i}{N} = \bar{A}+\bar{B}$$

So

$$\begin{aligned}
\mathrm{Cov}(X,Y) &= \frac{1}{N}\sum_{i=1}^{N}(X_i-\bar{X})(A_i+B_i-\bar{A}-\bar{B})\\
&= \frac{1}{N}\left(\sum_{i=1}^{N}(X_i-\bar{X})(A_i-\bar{A})+\sum_{i=1}^{N}(X_i-\bar{X})(B_i-\bar{B})\right)\\
&= \mathrm{Cov}(X,A)+\mathrm{Cov}(X,B)
\end{aligned}$$

It follows that Var(Y) = Var(A) + Var(B) + 2Cov(A,B), since

$$\begin{aligned}
\mathrm{Var}(Y) &= \frac{1}{N}\sum_{i}(Y_i-\bar{Y})^2 = \frac{1}{N}\sum_{i}(Y_i-\bar{Y})(Y_i-\bar{Y}) = \mathrm{Cov}(Y,Y)\\
&= \mathrm{Cov}(Y,\,A+B) = \mathrm{Cov}(A+B,\,A+B)\\
&= \mathrm{Cov}(A+B,\,A)+\mathrm{Cov}(A+B,\,B) \quad\text{(since Cov(X,Y) = Cov(X,A) + Cov(X,B) if Y = A + B)}\\
&= \mathrm{Cov}(A,A)+\mathrm{Cov}(B,A)+\mathrm{Cov}(A,B)+\mathrm{Cov}(B,B)\\
&= \mathrm{Var}(A)+\mathrm{Var}(B)+2\,\mathrm{Cov}(A,B)
\end{aligned}$$

ii) To show Cov(X,Y) = aCov(X,Z) if Y = aZ

Since

$$\bar{Y} = \frac{\sum_{i=1}^{N}Y_i}{N} = \frac{\sum_{i=1}^{N}aZ_i}{N} = \frac{a\sum_{i=1}^{N}Z_i}{N} = a\bar{Z}$$

it follows that

$$\mathrm{Cov}(X,Y) = \frac{1}{N}\sum_{i=1}^{N}(X_i-\bar{X})(Y_i-\bar{Y}) = \frac{1}{N}\sum_{i=1}^{N}(X_i-\bar{X})(aZ_i-a\bar{Z}) = \frac{a}{N}\sum_{i=1}^{N}(X_i-\bar{X})(Z_i-\bar{Z})$$

So Cov(X,Y) = aCov(X,Z)

iii) To show Cov(X,Y) = 0 if Y = a (constant)

$$\bar{a} = \frac{\sum_{i=1}^{N}a_i}{N} = \frac{Na}{N} = a$$

So $(a_i-\bar{a}) = 0$ for all i, and therefore

$$\mathrm{Cov}(X,Y) = \frac{1}{N}\sum_{i=1}^{N}(X_i-\bar{X})(a_i-\bar{a}) = 0$$

To show $\mathrm{Cov}(\hat{Y},\hat{u}) = 0$

Let $\hat{Y} = \hat{b}_0+\hat{b}_1X$. So

$$\mathrm{Cov}(\hat{Y},\hat{u}) = \mathrm{Cov}(\hat{b}_0+\hat{b}_1X,\ \hat{u}) = \mathrm{Cov}(\hat{b}_0,\hat{u})+\mathrm{Cov}(\hat{b}_1X,\hat{u})$$

Since $\hat{b}_0$ is a constant, $\mathrm{Cov}(\hat{b}_0,\hat{u}) = 0$.

And since $\hat{b}_1$ is a constant, $\mathrm{Cov}(\hat{b}_1X,\hat{u}) = \hat{b}_1\mathrm{Cov}(X,\hat{u})$.

Let $\hat{u} = Y-\hat{Y} = Y-\hat{b}_0-\hat{b}_1X$. So

$$\begin{aligned}
\hat{b}_1\mathrm{Cov}(X,\hat{u}) &= \hat{b}_1\mathrm{Cov}[X,\ (Y-\hat{b}_0-\hat{b}_1X)]\\
&= \hat{b}_1\left[\mathrm{Cov}(X,Y)-\mathrm{Cov}(X,\hat{b}_0)-\mathrm{Cov}(X,\hat{b}_1X)\right]\\
&= \hat{b}_1\left[\mathrm{Cov}(X,Y)-\mathrm{Cov}(X,\hat{b}_0)-\hat{b}_1\mathrm{Var}(X)\right]
\end{aligned}$$

From above, $\mathrm{Cov}(X,\hat{b}_0) = 0$, and we know the OLS formula

$$\hat{b}_1 = \frac{\mathrm{Cov}(X,Y)}{\mathrm{Var}(X)}$$

So

$$\mathrm{Cov}(\hat{Y},\hat{u}) = \hat{b}_1\left[\mathrm{Cov}(X,Y)-\frac{\mathrm{Cov}(X,Y)}{\mathrm{Var}(X)}\mathrm{Var}(X)\right] = 0$$

4. Correlation coefficient given by

$$r_{xy} = \frac{\mathrm{Cov}(X,Y)}{\sqrt{\mathrm{Var}(X)\mathrm{Var}(Y)}}$$

From the answers to earlier questions (with Var(Y) = 11.5 × 4/5 = 9.2 from the Stata output) it follows that

$$r_{xy} = \frac{4.2}{\sqrt{3.2\times 9.2}} = 0.77$$

so X (years of education) and Y (job tenure) are positively related.

Note that if the X data are multiplied by 10 then

$$r_{xy} = \frac{42}{\sqrt{320\times 9.2}} = 0.77$$

so the correlation coefficient (unlike the variance and covariance) is unchanged when the data are re-scaled. It is said to be scale invariant.

5. Given Y = 4000 + 0.7X

This is a simple linear equation which traces out a straight line with an intercept (= 4000) and a slope (= 0.7). So for every £1 of before-tax income, after-tax income rises by 70 pence (slope = dY/dX, so dY = slope × dX).

It follows that the mean is E(Y) = E[4000 + 0.7X].

Since 4000 is a constant its expected value is always the same, E(4000) = 4000.

Since X is a random variable it fluctuates around an average (mean) value, and $E(0.7X) = 0.7E(X) = 0.7\mu_x$.

So $E(Y) = 4000+0.7\mu_x$

It follows that the variance of Y is given by

$$\begin{aligned}
E\left[(Y-\mu_y)^2\right] &= E\left[(4000+0.7X-4000-0.7\mu_x)^2\right]\\
&= E\left[\{0.7(X-\mu_x)\}^2\right] = 0.49\,E\left[(X-\mu_x)^2\right]
\end{aligned}$$

so Var(Y) = 0.49Var(X)

Since the standard deviation is the square root of the variance, s.d.(Y) = 0.7 s.d.(X) (the standard deviation of after-tax income is 70% of the standard deviation of before-tax income).
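As a numerical cross-check of questions 1, 2, 4 and 5, here is a minimal sketch in Python (not the Stata used in these notes). The helper names `mean`, `cov`, `var` and `corr` are illustrative choices, and the `income` figures in the last part are made-up values used only to illustrate the rule in question 5; they do not come from the data set.

```python
from math import sqrt, isclose


def mean(values):
    """Arithmetic mean."""
    return sum(values) / len(values)


def cov(x, y):
    """Population covariance with divisor N (not the N-1 that Stata reports)."""
    mx, my = mean(x), mean(y)
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / len(x)


def var(x):
    """Population variance, using Var(X) = Cov(X, X)."""
    return cov(x, x)


def corr(x, y):
    """Correlation coefficient r_xy = Cov(X,Y) / sqrt(Var(X) Var(Y))."""
    return cov(x, y) / sqrt(var(x) * var(y))


yearsed = [11, 14, 12, 16, 12]   # X, years of education
tenure = [1, 6, 8, 10, 5]        # Y, job tenure
n = len(yearsed)

# Questions 1 and 2: definition versus shortcut formula
cov_shortcut = sum(x * y for x, y in zip(yearsed, tenure)) / n - mean(yearsed) * mean(tenure)
var_shortcut = sum(x * x for x in yearsed) / n - mean(yearsed) ** 2
print(round(cov(yearsed, tenure), 2), round(cov_shortcut, 2))   # 4.2 4.2
print(round(var(yearsed), 2), round(var_shortcut, 2))           # 3.2 3.2

# Question 2: multiply X by 10
yearsed10 = [10 * x for x in yearsed]
print(round(var(yearsed10), 2))           # 320.0 (variance scales by 100)
print(round(cov(yearsed10, tenure), 2))   # 42.0  (covariance scales by 10)

# Question 4: the correlation coefficient is unchanged by the re-scaling
print(round(corr(yearsed, tenure), 2), round(corr(yearsed10, tenure), 2))   # 0.77 0.77

# Question 5: Y = 4000 + 0.7X, checked on hypothetical income figures
income = [10000, 20000, 30000, 40000, 50000]    # assumed values, not from the notes
after_tax = [4000 + 0.7 * x for x in income]
print(isclose(mean(after_tax), 4000 + 0.7 * mean(income)))   # True
print(isclose(var(after_tax), 0.49 * var(income)))           # True
```

Defining `var` through `cov` mirrors the identity Var(Y) = Cov(Y,Y) used in question 3, so a single function carries the divisor-N convention throughout.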