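1. To show
$$\mathrm{Cov}(XY) = \frac{1}{N}\sum_{i=1}^{N}(X_i-\bar{X})(Y_i-\bar{Y}) = \left[\frac{1}{N}\sum_{i=1}^{N}X_iY_i\right]-\bar{X}\bar{Y}$$

$$\mathrm{Cov}(XY) = \frac{1}{N}\sum_{i=1}^{N}(X_i-\bar{X})(Y_i-\bar{Y}) = \frac{1}{N}\sum_{i=1}^{N}\left(X_iY_i-\bar{X}Y_i-X_i\bar{Y}+\bar{X}\bar{Y}\right)$$

$$\mathrm{Cov}(XY) = \frac{1}{N}\left[\sum X_iY_i-\bar{X}\sum Y_i-\bar{Y}\sum X_i+N\bar{X}\bar{Y}\right]$$

$$\mathrm{Cov}(XY) = \frac{1}{N}\left[\sum X_iY_i-N\bar{X}\bar{Y}-N\bar{X}\bar{Y}+N\bar{X}\bar{Y}\right]$$

$$\mathrm{Cov}(XY) = \left[\frac{1}{N}\sum_{i=1}^{N}X_iY_i\right]-\bar{X}\bar{Y}$$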
First find $\bar{X}$ and $\bar{Y}$
$$\bar{X} = \frac{1}{5}(11+14+12+16+12) = \frac{65}{5} = 13$$
$$\bar{Y} = \frac{1}{5}(1+6+8+10+5) = \frac{30}{5} = 6$$
  i    (X_i - \bar{X})    (Y_i - \bar{Y})    X_i Y_i
  1          -2                 -5              11
  2           1                  0              84
  3          -1                  2              96
  4           3                  4             160
  5          -1                 -1              60

So using either formula
$$\mathrm{Cov}(XY) = \frac{1}{5}(10+0-2+12+1) = \frac{21}{5} = 4.2$$
or
$$\mathrm{Cov}(XY) = \left[\frac{1}{N}\sum_{i=1}^{N}X_iY_i\right]-\bar{X}\bar{Y} = \frac{1}{5}(11+84+96+160+60)-(13 \times 6) = 82.2-78 = 4.2$$
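The two covariance formulas can also be checked numerically. The following is a minimal sketch in Python using the five observations from the worked example; the variable names are illustrative and not part of the original answer.

```python
# Data from the worked example: X = years of education, Y = job tenure.
x = [11, 14, 12, 16, 12]
y = [1, 6, 8, 10, 5]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Formula 1: average product of the deviations from the means.
cov_deviations = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / n

# Formula 2: average of the products minus the product of the means.
cov_shortcut = sum(xi * yi for xi, yi in zip(x, y)) / n - x_bar * y_bar

print(cov_deviations, cov_shortcut)
```

Both values agree with the 4.2 found by hand, up to floating-point rounding.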
Stata commands to obtain sample variance and covariance

. list

     +------------------------+
     | age   yearsed   tenure |
     |------------------------|
  1. |  18        11        1 |
  2. |  29        14        6 |
  3. |  33        12        8 |
  4. |  35        16       10 |
  5. |  45        12        5 |
     +------------------------+

. su yearsed, detail

                          yearsed
-------------------------------------------------------------
      Percentiles      Smallest
 1%           11             11
 5%           11             12
10%           11             12       Obs                   5
25%           12             14       Sum of Wgt.           5

50%           12                      Mean                 13
                        Largest       Std. Dev.             2
75%           14             12
90%           16             12       Variance              4
95%           16             14       Skewness       .6288941
99%           16             16       Kurtosis       1.953125

. di (4*4)/5
3.2

. corr yearsed tenure, cov
(obs=5)

             |  yearsed   tenure
-------------+------------------
     yearsed |        4
      tenure |     5.25     11.5

. di (5.25*4)/5
4.2
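Stata reports the variance and covariance with an N-1 divisor, which is why the `di` commands multiply by 4/5 to recover the N-divisor values used in these answers. The sketch below reproduces that adjustment in Python with the standard `statistics` module; the variable names are illustrative.

```python
import statistics

yearsed = [11, 14, 12, 16, 12]
tenure = [1, 6, 8, 10, 5]
n = len(yearsed)

sample_var = statistics.variance(yearsed)       # N - 1 divisor, as in Stata
population_var = statistics.pvariance(yearsed)  # N divisor, as in the answers

# Rescale the N - 1 figure to the N divisor, mirroring `di (4*4)/5`.
rescaled_var = sample_var * (n - 1) / n

# Covariance computed by hand with both divisors.
mean_x = statistics.mean(yearsed)
mean_y = statistics.mean(tenure)
cross_products = sum((a - mean_x) * (b - mean_y) for a, b in zip(yearsed, tenure))
sample_cov = cross_products / (n - 1)      # N - 1 divisor, as in `corr ..., cov`
rescaled_cov = sample_cov * (n - 1) / n    # mirrors `di (5.25*4)/5`

print(sample_var, rescaled_var, sample_cov, rescaled_cov)
```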
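2. To show
$$\mathrm{Var}(X) = \frac{1}{N}\sum_{i}(X_i-\bar{X})^2 = \frac{1}{N}\sum_{i}X_i^2-\bar{X}^2$$

$$\frac{1}{N}\sum_{i=1}^{N}(X_i-\bar{X})^2 = \frac{1}{N}\sum_{i=1}^{N}(X_i-\bar{X})(X_i-\bar{X}) = \frac{1}{N}\sum_{i=1}^{N}\left(X_i^2-\bar{X}X_i-\bar{X}X_i+\bar{X}^2\right)$$

$$= \frac{1}{N}\sum_{i=1}^{N}\left(X_i^2-2\bar{X}X_i+\bar{X}^2\right)$$

Using $\sum_{i=1}^{N}X_i = N\bar{X}$ and separating terms in brackets

$$= \frac{1}{N}\sum_{i=1}^{N}X_i^2-\frac{2N\bar{X}^2}{N}+\frac{N\bar{X}^2}{N}$$

$$= \frac{1}{N}\sum_{i}X_i^2-\bar{X}^2$$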
So to find Var(X)

  i    (X_i - \bar{X})    (X_i - \bar{X})^2    X_i^2
  1          -2                   4              121
  2           1                   1              196
  3          -1                   1              144
  4           3                   9              256
  5          -1                   1              144
Either
$$\mathrm{Var}(X) = \frac{1}{N}\sum_{i}(X_i-\bar{X})^2 = \frac{1}{5}(4+1+1+9+1) = \frac{16}{5} = 3.2$$
or
$$\mathrm{Var}(X) = \frac{1}{N}\sum_{i}X_i^2-\bar{X}^2 = \frac{1}{5}(121+196+144+256+144)-169 = \frac{861}{5}-169 = 172.2-169 = 3.2$$
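As a quick numerical check of the two variance formulas, here is a minimal Python sketch using the same five X values; the variable names are illustrative.

```python
x = [11, 14, 12, 16, 12]
n = len(x)
x_bar = sum(x) / n

# Formula 1: mean squared deviation from the mean.
var_deviations = sum((xi - x_bar) ** 2 for xi in x) / n

# Formula 2: mean of the squares minus the square of the mean.
var_shortcut = sum(xi ** 2 for xi in x) / n - x_bar ** 2

print(var_deviations, var_shortcut)
```

Both values match the 3.2 obtained by hand, up to floating-point rounding.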
Note that if the X data are multiplied by 10
$$\bar{X} = \frac{1}{5}(110+140+120+160+120) = \frac{650}{5} = 130$$
then the mean is also multiplied by 10
and the variance
$$\mathrm{Var}(X) = \frac{1}{N}\sum_{i}X_i^2-\bar{X}^2 = \frac{1}{5}(12100+19600+14400+25600+14400)-16900 = \frac{86100}{5}-16900 = 320$$
the variance is therefore multiplied by 100 if the data are multiplied by 10
[and in general $\mathrm{Var}(aX) = a^2\mathrm{Var}(X)$ if a is a constant]
Similarly the rules on covariances imply that
Cov(aX,Y) = aCov(XY)
(see question 3)
So
$$\mathrm{Cov}(XY) = \left[\frac{1}{N}\sum_{i=1}^{N}X_iY_i\right]-\bar{X}\bar{Y} = \frac{1}{5}(110+840+960+1600+600)-(130 \times 6) = 822-780 = 42$$
so the covariance is multiplied by 10 when the X data are multiplied by 10
These results help illustrate that neither the variance nor the covariance is scale
invariant; their values depend on the units of measurement of the variables
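The scaling results can be confirmed numerically. This is a small Python sketch using the example data; the helper functions `mean`, `cov` and `var` are written here for illustration and use the N divisor, as in the answers.

```python
def mean(values):
    return sum(values) / len(values)

def cov(a, b):
    """Covariance with an N divisor."""
    a_bar, b_bar = mean(a), mean(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / len(a)

def var(a):
    """Variance is the covariance of a variable with itself."""
    return cov(a, a)

x = [11, 14, 12, 16, 12]
y = [1, 6, 8, 10, 5]
x10 = [10 * xi for xi in x]  # X data multiplied by 10

var_ratio = var(x10) / var(x)        # approximately 100
cov_ratio = cov(x10, y) / cov(x, y)  # approximately 10

print(var(x10), cov(x10, y), var_ratio, cov_ratio)
```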
3. If Y= A+B, show that
Cov(X,Y) = Cov(X,A) + Cov(X,B)
$$\mathrm{Cov}(XY) = \frac{1}{N}\sum_{i=1}^{N}(X_i-\bar{X})(Y_i-\bar{Y})$$
and since Y = A + B then
$$\bar{Y} = \frac{\sum_{i=1}^{N}Y_i}{N} = \frac{\sum_{i=1}^{N}(A_i+B_i)}{N} = \frac{\sum_{i=1}^{N}A_i}{N}+\frac{\sum_{i=1}^{N}B_i}{N} = \bar{A}+\bar{B}$$
So
$$\mathrm{Cov}(XY) = \frac{1}{N}\sum_{i=1}^{N}(X_i-\bar{X})(A_i+B_i-\bar{A}-\bar{B})$$
$$= \frac{1}{N}\left(\sum_{i=1}^{N}(X_i-\bar{X})(A_i-\bar{A})+\sum_{i=1}^{N}(X_i-\bar{X})(B_i-\bar{B})\right)$$
= Cov(XA) + Cov(XB)
It follows that
Var(Y)=Var(A) +Var(B) + 2Cov(A,B)
Since
$$\mathrm{Var}(Y) = \frac{1}{N}\sum_{i}(Y_i-\bar{Y})^2 = \frac{1}{N}\sum_{i}(Y_i-\bar{Y})(Y_i-\bar{Y}) = \mathrm{Cov}(YY)$$
= Cov(Y,(A+B)) = Cov[ (A+B), (A+B) ] = Cov[ (A+B), A] + Cov[(A+B), B]
(since Cov(XY) = Cov(XA) + Cov(XB) if Y=A+B )
= Cov(A,A) + Cov(B,A) + Cov(A,B) + Cov(B,B)
= Var(A) + Var(B) + 2Cov(AB)
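Both results can be checked numerically. In the sketch below the choice of series is purely illustrative and is an assumption of this example: X is taken to be age, A years of education and B tenure from the Stata listing, so that Y = A + B. The `cov` helper uses the N divisor.

```python
def cov(a, b):
    """Covariance with an N divisor."""
    n = len(a)
    a_bar, b_bar = sum(a) / n, sum(b) / n
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / n

# Illustrative assignment of the listed variables (an assumption of this sketch).
x = [18, 29, 33, 35, 45]   # age
a = [11, 14, 12, 16, 12]   # years of education
b = [1, 6, 8, 10, 5]       # tenure
y = [ai + bi for ai, bi in zip(a, b)]

# Cov(X, A + B) = Cov(X, A) + Cov(X, B)
lhs_cov = cov(x, y)
rhs_cov = cov(x, a) + cov(x, b)

# Var(A + B) = Var(A) + Var(B) + 2 Cov(A, B)
lhs_var = cov(y, y)
rhs_var = cov(a, a) + cov(b, b) + 2 * cov(a, b)

print(lhs_cov, rhs_cov, lhs_var, rhs_var)
```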
ii) To show Cov(XY) = aCov(XZ) if Y = aZ
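$$\mathrm{Cov}(XY) = \frac{1}{N}\sum_{i=1}^{N}(X_i-\bar{X})(Y_i-\bar{Y}) = \frac{1}{N}\sum_{i=1}^{N}(X_i-\bar{X})(aZ_i-a\bar{Z})$$
(since $\bar{Y} = \frac{\sum_{i=1}^{N}Y_i}{N} = \frac{\sum_{i=1}^{N}aZ_i}{N} = \frac{a\sum_{i=1}^{N}Z_i}{N} = a\bar{Z}$)
$$\mathrm{Cov}(XY) = \frac{a}{N}\sum_{i=1}^{N}(X_i-\bar{X})(Z_i-\bar{Z})$$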
So Cov(XY)=aCov(XZ)
iii) To show Cov(X,Y) = 0 if Y = a (constant)
$$\bar{a} = \frac{\sum_{i=1}^{N}a_i}{N} = \frac{Na}{N} = a$$
So $(a_i-\bar{a}) = 0$ for all i
So
$$\mathrm{Cov}(XY) = \frac{1}{N}\sum_{i=1}^{N}(X_i-\bar{X})(a_i-\bar{a}) = 0$$
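Parts ii) and iii) can be illustrated with a short Python sketch. The data are the education and tenure series from question 1; the scale factor 3 and the constant 7 are arbitrary values chosen for this example only.

```python
def cov(a, b):
    """Covariance with an N divisor."""
    n = len(a)
    a_bar, b_bar = sum(a) / n, sum(b) / n
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / n

x = [11, 14, 12, 16, 12]
z = [1, 6, 8, 10, 5]

# ii) Y = aZ implies Cov(X, Y) = a * Cov(X, Z); a = 3 is an arbitrary choice.
scale = 3
y_scaled = [scale * zi for zi in z]
cov_scaled = cov(x, y_scaled)
cov_expected = scale * cov(x, z)

# iii) Y equal to a constant implies Cov(X, Y) = 0; 7 is an arbitrary choice.
y_constant = [7] * len(x)
cov_constant = cov(x, y_constant)

print(cov_scaled, cov_expected, cov_constant)
```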
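To show $\mathrm{Cov}(\hat{y},\hat{u}) = 0$

Let $\hat{y} = \hat{b}_0+\hat{b}_1X$

So
$$\mathrm{Cov}(\hat{y},\hat{u}) = \mathrm{Cov}(\hat{b}_0+\hat{b}_1X,\ \hat{u}) = \mathrm{Cov}(\hat{b}_0,\hat{u})+\mathrm{Cov}(\hat{b}_1X,\hat{u})$$

Since $\hat{b}_0$ is a constant then $\mathrm{Cov}(\hat{b}_0,\hat{u}) = 0$

And since $\hat{b}_1$ is a constant then $\mathrm{Cov}(\hat{b}_1X,\hat{u}) = \hat{b}_1\mathrm{Cov}(X,\hat{u})$

Let $\hat{u} = y-\hat{y} = y-\hat{b}_0-\hat{b}_1X$

So
$$\hat{b}_1\mathrm{Cov}(X,\hat{u}) = \hat{b}_1\mathrm{Cov}[X,\ (y-\hat{b}_0-\hat{b}_1X)]$$
$$= \hat{b}_1\left[\mathrm{Cov}(X,y)-\mathrm{Cov}(X,\hat{b}_0)-\mathrm{Cov}(X,\hat{b}_1X)\right]$$
$$= \hat{b}_1\left[\mathrm{Cov}(X,y)-\mathrm{Cov}(X,\hat{b}_0)-\hat{b}_1\mathrm{Var}(X)\right]$$

From above, $\mathrm{Cov}(X,\hat{b}_0) = 0$ because $\hat{b}_0$ is a constant,
and we know the OLS formula
$$\hat{b}_1 = \frac{\mathrm{Cov}(X,y)}{\mathrm{Var}(X)}$$
So
$$\mathrm{Cov}(\hat{y},\hat{u}) = \hat{b}_1\left[\mathrm{Cov}(X,y)-\frac{\mathrm{Cov}(X,y)}{\mathrm{Var}(X)}\mathrm{Var}(X)\right] = 0$$

4. Correlation coefficient given by
$$r_{xy} = \frac{\mathrm{Cov}(X,Y)}{\sqrt{\mathrm{Var}(X)\mathrm{Var}(Y)}}$$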
from answers to earlier questions it follows that
$$r_{xy} = \frac{4.2}{\sqrt{3.2 \times 9.2}} = 0.77$$
so X (years of education) and Y (job tenure) are positively related
Note that if the X data are multiplied by 10 then
$$r_{xy} = \frac{42}{\sqrt{320 \times 9.2}} = 0.77$$
so the correlation coefficient (unlike the variance and covariance) is unchanged when the data
are re-scaled; it is said to be scale invariant
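The scale invariance of the correlation coefficient can be verified with a short Python sketch on the example data; the helper names `cov` and `corr` are illustrative and use the N divisor.

```python
import math

def cov(a, b):
    """Covariance with an N divisor."""
    n = len(a)
    a_bar, b_bar = sum(a) / n, sum(b) / n
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / n

def corr(a, b):
    """Correlation: covariance divided by the product of standard deviations."""
    return cov(a, b) / math.sqrt(cov(a, a) * cov(b, b))

x = [11, 14, 12, 16, 12]   # years of education
y = [1, 6, 8, 10, 5]       # job tenure
x10 = [10 * xi for xi in x]

r = corr(x, y)
r_rescaled = corr(x10, y)

print(round(r, 2), round(r_rescaled, 2))
```

The factor of 10 appears once in the numerator and once (as the square root of 100) in the denominator, so it cancels.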
5. Given Y = 4000 + 0.7X
this is a simple linear equation which traces out a straight line with an intercept (= 4000)
and a slope (=0.7)
So for every extra £1 of before-tax income, after-tax income rises by 70 pence
(slope = dY/dX so dY=slope*dX )
Follows that the mean E(Y) = E[4000 + 0.7X]
Since 4000 is a constant its expected value is always the same, E(4000) = 4000
Since X is a random variable it fluctuates around an average (mean) value
E(0.7X) = 0.7E(X) = 0.7μx
So E(Y) = 4000 + 0.7 μx
Follows that the variance of Y is given by
$$E\left[(Y-\mu_y)^2\right] = E\left[(4000+0.7X-4000-0.7\mu_x)^2\right]$$
$$E\left[(Y-\mu_y)^2\right] = E\left[\{0.7(X-\mu_x)\}^2\right] = 0.49\,E\left[(X-\mu_x)^2\right]$$
so
Var(y) = 0.49Var(X)
Since standard deviation is square root of variance
s.d.(y) = 0.7s.d.(X)
(standard deviation of after tax income is 70% of standard deviation in before-tax income)
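These three rules can be illustrated numerically. The before-tax incomes in the sketch below are hypothetical figures invented for this example; only the relationship Y = 4000 + 0.7X comes from the question.

```python
import math

# Hypothetical before-tax incomes (assumed values, not from the question).
before_tax = [12000, 18000, 25000, 31000, 44000]
after_tax = [4000 + 0.7 * x for x in before_tax]

def mean(values):
    return sum(values) / len(values)

def var(values):
    """Variance with an N divisor."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)

mean_x, mean_y = mean(before_tax), mean(after_tax)
var_x, var_y = var(before_tax), var(after_tax)
sd_x, sd_y = math.sqrt(var_x), math.sqrt(var_y)

# E(Y) = 4000 + 0.7 E(X), Var(Y) = 0.49 Var(X), s.d.(Y) = 0.7 s.d.(X)
print(mean_y, 4000 + 0.7 * mean_x)
print(var_y, 0.49 * var_x)
print(sd_y, 0.7 * sd_x)
```

The intercept shifts the mean but drops out of the variance and the standard deviation.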