MATH 38061/MATH48061/MATH68061: MULTIVARIATE STATISTICS Solutions to Problems on One-Sample Inference

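1. Consider testing $H_0 : \mu^T = [7, 11]$ using the data
\[
X = \begin{bmatrix} 2 & 8 & 6 & 8 \\ 12 & 9 & 9 & 10 \end{bmatrix}.
\]
For this data, $n = 4$, $p = 2$,
\[
\mu_0 = \begin{bmatrix} 7 \\ 11 \end{bmatrix}, \quad
\bar{X} = \begin{bmatrix} 6 \\ 10 \end{bmatrix}, \quad
S = \begin{bmatrix} 8 & -10/3 \\ -10/3 & 2 \end{bmatrix},
\quad \text{and} \quad
S^{-1} = \frac{9}{44} \begin{bmatrix} 2 & 10/3 \\ 10/3 & 8 \end{bmatrix}.
\]
So,
\[
T^2 = n \left( \bar{X} - \mu_0 \right)^T S^{-1} \left( \bar{X} - \mu_0 \right) = 13.63636.
\]
Under $H_0$, the distribution of $T^2$ is
\[
\frac{(n-1)p}{n-p} F_{p,n-p} = 3 F_{2,2}.
\]
At $\alpha = 0.05$, we have $3 F_{2,2}(0.05) = 57$. Since $T^2 = 13.63636 < 57$, there is no evidence against the hypothesis $H_0 : \mu^T = [7, 11]$.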
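The arithmetic in this solution can be checked numerically. The following is a minimal sketch in Python, not part of the original solution; the function names are illustrative, and the F quantile uses the closed form $F_{2,m}(\alpha) = (m/2)(\alpha^{-2/m} - 1)$, which holds only when the numerator degrees of freedom equal 2.

```python
def hotelling_t2(xbar, mu0, S, n):
    """T^2 = n (xbar - mu0)^T S^{-1} (xbar - mu0) for a 2x2 covariance S."""
    d1, d2 = xbar[0] - mu0[0], xbar[1] - mu0[1]
    det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    # Quadratic form written out with the explicit 2x2 inverse.
    q = (S[1][1] * d1 * d1 - 2.0 * S[0][1] * d1 * d2 + S[0][0] * d2 * d2) / det
    return n * q


def f_upper_quantile_num_df_2(alpha, m):
    """Upper-alpha point of F_{2,m}; valid only for numerator df 2."""
    return (m / 2.0) * (alpha ** (-2.0 / m) - 1.0)


n, p = 4, 2
xbar = [6.0, 10.0]
mu0 = [7.0, 11.0]
S = [[8.0, -10.0 / 3.0], [-10.0 / 3.0, 2.0]]

t2 = hotelling_t2(xbar, mu0, S, n)
critical = (n - 1) * p / (n - p) * f_upper_quantile_num_df_2(0.05, n - p)
reject = t2 > critical  # reject H0 only if T^2 exceeds the critical value

print(t2, critical, reject)
```

The test rejects only when `t2` exceeds `critical`, so `reject` being false corresponds to "no evidence against $H_0$".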
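2. Measurements of cranial length ($x_1$) and cranial breadth ($x_2$) on a sample of 35 mature female frogs led to the following statistics:
\[
\bar{x}^T = [22.860, 24.397]
\quad \text{and} \quad
S = \begin{bmatrix} 17.178 & 19.710 \\ 19.710 & 23.710 \end{bmatrix}.
\]
To test the hypothesis that $\mu_1 = \mu_2$, set $l = [1, -1]^T$, so that $l^T \bar{x} = 22.860 - 24.397$ and $l^T S l = 17.178 - 2 \times 19.710 + 23.710 = 1.468$. Then the simultaneous confidence interval for $l^T \mu = \mu_1 - \mu_2$, with $p = 2$, $n = 35$ and $\alpha = 0.05$, is
\[
\left( l^T \bar{x} - \sqrt{\frac{p(n-1)}{n(n-p)} F_{p,n-p}(\alpha) \, l^T S l}, \
l^T \bar{x} + \sqrt{\frac{p(n-1)}{n(n-p)} F_{p,n-p}(\alpha) \, l^T S l} \right)
\]
\[
= \left( 22.860 - 24.397 - \sqrt{\frac{2 \times 34}{35 \times 33} \times F_{2,33}(0.05) \times 1.468}, \
22.860 - 24.397 + \sqrt{\frac{2 \times 34}{35 \times 33} \times F_{2,33}(0.05) \times 1.468} \right)
\approx (-2.070, -1.004),
\]
using $F_{2,33}(0.05) \approx 3.285$.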
This interval does not contain zero. So, there is evidence against the hypothesis that µ1 = µ2
at the five percent level.
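The interval for $\mu_1 - \mu_2$ can be evaluated as follows. This is a minimal sketch in Python under the same assumption as the formula above (the $T^2$ simultaneous interval with $p = 2$); the helper name is illustrative, and the closed-form quantile applies only to $F_{2,m}$.

```python
import math


def f_upper_quantile_num_df_2(alpha, m):
    """Upper-alpha point of F_{2,m}; valid only for numerator df 2."""
    return (m / 2.0) * (alpha ** (-2.0 / m) - 1.0)


n, p, alpha = 35, 2, 0.05
xbar = [22.860, 24.397]
S = [[17.178, 19.710], [19.710, 23.710]]
l = [1.0, -1.0]  # contrast picking out mu_1 - mu_2

center = sum(l[i] * xbar[i] for i in range(p))
lSl = sum(l[i] * S[i][j] * l[j] for i in range(p) for j in range(p))
f_crit = f_upper_quantile_num_df_2(alpha, n - p)
half_width = math.sqrt(p * (n - 1) / (n * (n - p)) * f_crit * lSl)

lower, upper = center - half_width, center + half_width
contains_zero = lower <= 0.0 <= upper

print(lower, upper, contains_zero)
```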
3. A physical anthropologist performed a mineral analysis of nine ancient Peruvian hairs. The results for the chromium ($x_1$) and strontium ($x_2$) levels, in parts per million (ppm), were as follows.

   x1 (ppm)    x2 (ppm)
     0.48       12.57
    40.53       73.68
     2.19       11.13
     0.55       20.03
     0.74       20.29
     0.66        0.78
     0.93        4.64
     0.37        0.43
     0.22        1.08
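For this data,
\[
\bar{X} = \begin{bmatrix} 5.185556 \\ 16.070000 \end{bmatrix}, \quad
S = \begin{bmatrix} 176.0042 & 287.2412 \\ 287.2412 & 527.8493 \end{bmatrix},
\quad \text{and} \quad
S^{-1} = \begin{bmatrix} 0.05077341 & -0.02762951 \\ -0.02762951 & 0.01692971 \end{bmatrix}.
\]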
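These summary statistics can be reproduced from the table. The sketch below, in plain Python, assumes the unbiased sample covariance (divisor $n - 1$); the function names are illustrative.

```python
def sample_mean(rows):
    """Column means of a list of observation tuples."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def sample_cov(rows):
    """Unbiased sample covariance matrix (divisor n - 1)."""
    n, p = len(rows), len(rows[0])
    m = sample_mean(rows)
    return [[sum((r[i] - m[i]) * (r[j] - m[j]) for r in rows) / (n - 1)
             for j in range(p)] for i in range(p)]


def inverse_2x2(S):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    return [[S[1][1] / det, -S[0][1] / det],
            [-S[1][0] / det, S[0][0] / det]]


# (chromium, strontium) pairs in ppm, in the order given in the table.
hair = [(0.48, 12.57), (40.53, 73.68), (2.19, 11.13), (0.55, 20.03),
        (0.74, 20.29), (0.66, 0.78), (0.93, 4.64), (0.37, 0.43),
        (0.22, 1.08)]

xbar = sample_mean(hair)
S = sample_cov(hair)
S_inv = inverse_2x2(S)

print(xbar)
print(S)
print(S_inv)
```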
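Then a 90% joint confidence ellipse for the population mean vector $\mu^T = (\mu_1, \mu_2)$ is
\[
9 \begin{bmatrix} 5.185556 - \mu_1 \\ 16.070000 - \mu_2 \end{bmatrix}^T
\begin{bmatrix} 0.05077341 & -0.02762951 \\ -0.02762951 & 0.01692971 \end{bmatrix}
\begin{bmatrix} 5.185556 - \mu_1 \\ 16.070000 - \mu_2 \end{bmatrix}
\le \frac{2 \times 8}{7} F_{2,7}(0.1) = 7.445582,
\]
which can be rewritten as
\[
9 \times \left\{ 0.05077341 (5.185556 - \mu_1)^2 + 0.01692971 (16.070000 - \mu_2)^2 - 2 \times 0.02762951 (5.185556 - \mu_1)(16.070000 - \mu_2) \right\} \le 7.445582.
\]
The individual simultaneous 90% confidence intervals for $\mu_1$ and $\mu_2$ are
\[
\left( \bar{x}_1 - \sqrt{\frac{p(n-1)}{n-p} F_{p,n-p}(\alpha)} \sqrt{\frac{s_{11}}{n}}, \
\bar{x}_1 + \sqrt{\frac{p(n-1)}{n-p} F_{p,n-p}(\alpha)} \sqrt{\frac{s_{11}}{n}} \right)
\]
\[
\equiv \left( 5.185556 - \sqrt{7.445582 \times (176.0042/9)}, \
5.185556 + \sqrt{7.445582 \times (176.0042/9)} \right)
\]
and
\[
\left( \bar{x}_2 - \sqrt{\frac{p(n-1)}{n-p} F_{p,n-p}(\alpha)} \sqrt{\frac{s_{22}}{n}}, \
\bar{x}_2 + \sqrt{\frac{p(n-1)}{n-p} F_{p,n-p}(\alpha)} \sqrt{\frac{s_{22}}{n}} \right)
\]
\[
\equiv \left( 16.070000 - \sqrt{7.445582 \times (527.8493/9)}, \
16.070000 + \sqrt{7.445582 \times (527.8493/9)} \right).
\]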
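The constant $\frac{p(n-1)}{n-p} F_{p,n-p}(\alpha)$, the ellipse condition and the two intervals can be evaluated numerically. The sketch below is in plain Python and assumes $\alpha = 0.10$ for 90% confidence; the function names are illustrative, and the closed-form quantile $F_{2,m}(\alpha) = (m/2)(\alpha^{-2/m} - 1)$ is valid only for numerator degrees of freedom 2.

```python
import math


def f_upper_quantile_num_df_2(alpha, m):
    """Upper-alpha point of F_{2,m}; valid only for numerator df 2."""
    return (m / 2.0) * (alpha ** (-2.0 / m) - 1.0)


n, p, alpha = 9, 2, 0.10
xbar = [5.185556, 16.070000]
S = [[176.0042, 287.2412], [287.2412, 527.8493]]
S_inv = [[0.05077341, -0.02762951], [-0.02762951, 0.01692971]]

f_crit = f_upper_quantile_num_df_2(alpha, n - p)
c = p * (n - 1) / (n - p) * f_crit  # right-hand side of the ellipse inequality


def in_ellipse(mu):
    """True if mu satisfies n (xbar - mu)^T S^{-1} (xbar - mu) <= c."""
    d = [xbar[i] - mu[i] for i in range(p)]
    q = sum(d[i] * S_inv[i][j] * d[j] for i in range(p) for j in range(p))
    return n * q <= c


# Simultaneous intervals: xbar_i -/+ sqrt(c * s_ii / n).
intervals = []
for i in range(p):
    half_width = math.sqrt(c * S[i][i] / n)
    intervals.append((xbar[i] - half_width, xbar[i] + half_width))

print(f_crit, c)
print(intervals)
print(in_ellipse(xbar), in_ellipse([50.0, 16.07]))
```

The intervals are the projections of the ellipse onto the two coordinate axes, so each is wider than a one-at-a-time $t$ interval would be.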
4. Perspiration samples from 10 healthy females were analyzed. Three components, $X_1$ = sweat rate, $X_2$ = sodium content, $X_3$ = potassium content, were measured, and the results are presented in the table below.

   Individual     X1      X2      X3
        1         3.7    48.5     9.3
        2         5.7    65.1     8.0
        3         3.8    47.2    10.9
        4         3.2    53.2    12.0
        5         3.1    55.5     9.7
        6         4.6    36.1     7.9
        7         2.4    24.8    14.0
        8         7.2    33.1     7.6
        9         6.7    47.4     8.5
       10         5.4    54.1    11.3
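For this data,
\[
\bar{X} = \begin{bmatrix} 4.58 \\ 46.50 \\ 9.92 \end{bmatrix}, \quad
S = \begin{bmatrix} 2.612889 & 1.787778 & -2.422889 \\ 1.787778 & 143.724444 & -5.726667 \\ -2.422889 & -5.726667 & 4.381778 \end{bmatrix},
\]
and
\[
S^{-1} = \begin{bmatrix} 0.794150203 & 0.008036875 & 0.449626227 \\ 0.008036875 & 0.007421317 & 0.014143086 \\ 0.449626227 & 0.014143086 & 0.495321130 \end{bmatrix}.
\]
Then a 90% joint confidence ellipse for the population mean vector $\mu^T = (\mu_1, \mu_2, \mu_3)$ is
\[
10 \begin{bmatrix} 4.58 - \mu_1 \\ 46.50 - \mu_2 \\ 9.92 - \mu_3 \end{bmatrix}^T
\begin{bmatrix} 0.794150203 & 0.008036875 & 0.449626227 \\ 0.008036875 & 0.007421317 & 0.014143086 \\ 0.449626227 & 0.014143086 & 0.495321130 \end{bmatrix}
\begin{bmatrix} 4.58 - \mu_1 \\ 46.50 - \mu_2 \\ 9.92 - \mu_3 \end{bmatrix}
\le \frac{9 \times 3}{7} F_{3,7}(0.1) \approx 11.857,
\]
which can be rewritten as
\[
10 \times \left\{ 0.794150203 (4.58 - \mu_1)^2 + 0.007421317 (46.50 - \mu_2)^2 + 0.495321130 (9.92 - \mu_3)^2
+ 2 \times 0.008036875 (4.58 - \mu_1)(46.50 - \mu_2) + 2 \times 0.449626227 (4.58 - \mu_1)(9.92 - \mu_3)
+ 2 \times 0.014143086 (46.50 - \mu_2)(9.92 - \mu_3) \right\} \le 11.857.
\]
The individual simultaneous 90% confidence intervals for $\mu_1$, $\mu_2$ and $\mu_3$ are
\[
\left( \bar{x}_1 - \sqrt{\frac{p(n-1)}{n-p} F_{p,n-p}(\alpha)} \sqrt{\frac{s_{11}}{n}}, \
\bar{x}_1 + \sqrt{\frac{p(n-1)}{n-p} F_{p,n-p}(\alpha)} \sqrt{\frac{s_{11}}{n}} \right)
\equiv \left( 4.58 - \sqrt{11.857 \times (2.612889/10)}, \ 4.58 + \sqrt{11.857 \times (2.612889/10)} \right),
\]
\[
\left( \bar{x}_2 - \sqrt{\frac{p(n-1)}{n-p} F_{p,n-p}(\alpha)} \sqrt{\frac{s_{22}}{n}}, \
\bar{x}_2 + \sqrt{\frac{p(n-1)}{n-p} F_{p,n-p}(\alpha)} \sqrt{\frac{s_{22}}{n}} \right)
\equiv \left( 46.5 - \sqrt{11.857 \times (143.724444/10)}, \ 46.5 + \sqrt{11.857 \times (143.724444/10)} \right),
\]
and
\[
\left( \bar{x}_3 - \sqrt{\frac{p(n-1)}{n-p} F_{p,n-p}(\alpha)} \sqrt{\frac{s_{33}}{n}}, \
\bar{x}_3 + \sqrt{\frac{p(n-1)}{n-p} F_{p,n-p}(\alpha)} \sqrt{\frac{s_{33}}{n}} \right)
\equiv \left( 9.92 - \sqrt{11.857 \times (4.381778/10)}, \ 9.92 + \sqrt{11.857 \times (4.381778/10)} \right).
\]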
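The sample mean, the unbiased sample covariance matrix and the simultaneous intervals can be recomputed directly from the table. The sketch below is in plain Python with illustrative function names. Since no closed form is available for $F_{3,7}$, it assumes a simple numerical route: Simpson's rule on the incomplete beta integral for the upper tail probability, followed by bisection. The integration helper is only intended for degrees of freedom of at least 2.

```python
import math


def f_upper_tail(f, d1, d2, steps=2000):
    """P(F_{d1,d2} > f) = I_y(d2/2, d1/2) with y = d2 / (d2 + d1 f).
    Simpson's rule; assumes d1 >= 2 and d2 >= 2 so the integrand is bounded."""
    a, b = d2 / 2.0, d1 / 2.0
    y = d2 / (d2 + d1 * f)
    h = y / steps

    def g(t):
        return t ** (a - 1.0) * (1.0 - t) ** (b - 1.0)

    total = g(0.0) + g(y)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * g(k * h)
    beta = math.gamma(a) * math.gamma(b) / math.gamma(a + b)
    return total * h / 3.0 / beta


def f_upper_quantile(alpha, d1, d2):
    """Upper-alpha point of F_{d1,d2} by bisection on the tail probability."""
    lo, hi = 0.0, 1000.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if f_upper_tail(mid, d1, d2) > alpha:
            lo = mid  # tail still too heavy: quantile lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)


# (sweat rate, sodium, potassium) for the 10 individuals in the table.
sweat = [(3.7, 48.5, 9.3), (5.7, 65.1, 8.0), (3.8, 47.2, 10.9),
         (3.2, 53.2, 12.0), (3.1, 55.5, 9.7), (4.6, 36.1, 7.9),
         (2.4, 24.8, 14.0), (7.2, 33.1, 7.6), (6.7, 47.4, 8.5),
         (5.4, 54.1, 11.3)]

n, p, alpha = len(sweat), len(sweat[0]), 0.10
xbar = [sum(r[j] for r in sweat) / n for j in range(p)]
S = [[sum((r[i] - xbar[i]) * (r[j] - xbar[j]) for r in sweat) / (n - 1)
      for j in range(p)] for i in range(p)]

f_crit = f_upper_quantile(alpha, p, n - p)
c = p * (n - 1) / (n - p) * f_crit

# Simultaneous intervals: xbar_i -/+ sqrt(c * s_ii / n).
intervals = []
for i in range(p):
    half_width = math.sqrt(c * S[i][i] / n)
    intervals.append((xbar[i] - half_width, xbar[i] + half_width))

print(xbar)
print(S)
print(f_crit, c)
print(intervals)
```

A statistics library with an F quantile function would normally replace the two numerical helpers; they are written out here only to keep the sketch dependency-free.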
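5. Many one-sample and two-sample tests involve the statistic $d^T S^{-1} d$, where $S$ is a $p \times p$ matrix. Suppose $S$ is diagonal. Then
\[
d^T S^{-1} d
= d^T \begin{bmatrix} s_{11} & 0 & \cdots & 0 \\ 0 & s_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & s_{pp} \end{bmatrix}^{-1} d
= d^T \begin{bmatrix} s_{11}^{-1} & 0 & \cdots & 0 \\ 0 & s_{22}^{-1} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & s_{pp}^{-1} \end{bmatrix} d
= \sum_{i=1}^{p} s_{ii}^{-1} d_i^2
= \sum_{i=1}^{p} t_i^2,
\]
where $t_i = d_i / \sqrt{s_{ii}}$.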
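The diagonal case can be checked numerically by solving $S x = d$ with a general linear solver and comparing $d^T x$ with $\sum t_i^2$. This is a minimal sketch in plain Python; the values of `s_diag` and `d` are hypothetical and chosen only for illustration.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


s_diag = [4.0, 9.0, 0.25]  # hypothetical variances s_11, s_22, s_33
d = [1.0, -3.0, 0.5]       # hypothetical difference vector
p = len(d)

S = [[s_diag[i] if i == j else 0.0 for j in range(p)] for i in range(p)]
x = solve(S, d)                               # x = S^{-1} d
quad = sum(d[i] * x[i] for i in range(p))     # d^T S^{-1} d

t = [d[i] / math.sqrt(s_diag[i]) for i in range(p)]
sum_t2 = sum(ti * ti for ti in t)

print(quad, sum_t2)
```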
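Suppose now $p = 2$. Then
\[
S^{-1} = \frac{1}{s_{11} s_{22} - s_{12}^2} \begin{bmatrix} s_{22} & -s_{12} \\ -s_{12} & s_{11} \end{bmatrix}
\]
and
\[
d^T S^{-1} d
= \frac{1}{s_{11} s_{22} - s_{12}^2} \, d^T \begin{bmatrix} s_{22} & -s_{12} \\ -s_{12} & s_{11} \end{bmatrix} d
= \frac{s_{22} d_1^2 + s_{11} d_2^2 - 2 s_{12} d_1 d_2}{s_{11} s_{22} - s_{12}^2}
= \frac{t_1^2 + t_2^2 - 2 t_1 t_2 r_{12}}{1 - r_{12}^2},
\]
where $r_{12} = s_{12} / \sqrt{s_{11} s_{22}}$.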
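The $p = 2$ identity can be checked numerically. The sketch below, in plain Python with illustrative function names, evaluates both sides using the covariance matrix from Problem 1 and $d = \bar{X} - \mu_0 = [-1, -1]^T$.

```python
import math


def quad_form_via_inverse(d, S):
    """d^T S^{-1} d using the explicit 2x2 inverse."""
    det = S[0][0] * S[1][1] - S[0][1] ** 2
    S_inv = [[S[1][1] / det, -S[0][1] / det],
             [-S[0][1] / det, S[0][0] / det]]
    return sum(d[i] * S_inv[i][j] * d[j] for i in range(2) for j in range(2))


def quad_form_via_t_and_r(d, S):
    """Same quantity written in terms of t_1, t_2 and the correlation r_12."""
    t1 = d[0] / math.sqrt(S[0][0])
    t2 = d[1] / math.sqrt(S[1][1])
    r12 = S[0][1] / math.sqrt(S[0][0] * S[1][1])
    return (t1 ** 2 + t2 ** 2 - 2.0 * t1 * t2 * r12) / (1.0 - r12 ** 2)


S = [[8.0, -10.0 / 3.0], [-10.0 / 3.0, 2.0]]
d = [-1.0, -1.0]

direct = quad_form_via_inverse(d, S)
via_t = quad_form_via_t_and_r(d, S)

print(direct, via_t)
```

Multiplying either value by $n = 4$ gives the $T^2$ statistic of Problem 1.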
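Suppose now $p = 3$. Then
\[
S^{-1} = \frac{1}{D} \begin{bmatrix}
s_{33} s_{22} - s_{32} s_{23} & -(s_{33} s_{12} - s_{32} s_{13}) & s_{23} s_{12} - s_{22} s_{13} \\
-(s_{33} s_{12} - s_{32} s_{13}) & s_{33} s_{11} - s_{31} s_{13} & -(s_{23} s_{11} - s_{21} s_{13}) \\
s_{23} s_{12} - s_{22} s_{13} & -(s_{23} s_{11} - s_{21} s_{13}) & s_{22} s_{11} - s_{21} s_{12}
\end{bmatrix},
\]
where $D = s_{11} (s_{33} s_{22} - s_{32} s_{23}) - s_{21} (s_{33} s_{12} - s_{32} s_{13}) + s_{31} (s_{23} s_{12} - s_{22} s_{13})$. So,
\[
d^T S^{-1} d = \frac{1}{D} \Big\{ (s_{33} s_{22} - s_{32} s_{23}) d_1^2 - 2 (s_{33} s_{12} - s_{32} s_{13}) d_1 d_2 + 2 (s_{23} s_{12} - s_{22} s_{13}) d_1 d_3
+ (s_{33} s_{11} - s_{31} s_{13}) d_2^2 - 2 (s_{23} s_{11} - s_{21} s_{13}) d_2 d_3 + (s_{22} s_{11} - s_{21} s_{12}) d_3^2 \Big\},
\]
which, after considerable simplification, reduces to
\[
d^T S^{-1} d = \frac{t_1^2 (1 - r_{23}^2) + t_2^2 (1 - r_{13}^2) + t_3^2 (1 - r_{12}^2) - 2 t_1 t_2 u_{12 \cdot 3} - 2 t_1 t_3 u_{13 \cdot 2} - 2 t_2 t_3 u_{23 \cdot 1}}{1 + 2 r_{12} r_{13} r_{23} - r_{23}^2 - r_{13}^2 - r_{12}^2},
\]
where $u_{12 \cdot 3} = r_{12} - r_{13} r_{23}$, etc.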
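The $p = 3$ reduction can also be checked numerically by comparing the correlation form with a direct solve of $S x = d$. The sketch below is in plain Python; the matrix `S` and vector `d` are hypothetical values chosen for illustration (with `S` symmetric and positive definite), and the function names are illustrative.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def quad_form_via_correlations(d, S):
    """d^T S^{-1} d for p = 3 in terms of t_i, r_ij and u_ij.k."""
    t = [d[i] / math.sqrt(S[i][i]) for i in range(3)]
    r12 = S[0][1] / math.sqrt(S[0][0] * S[1][1])
    r13 = S[0][2] / math.sqrt(S[0][0] * S[2][2])
    r23 = S[1][2] / math.sqrt(S[1][1] * S[2][2])
    u12_3 = r12 - r13 * r23
    u13_2 = r13 - r12 * r23
    u23_1 = r23 - r12 * r13
    num = (t[0] ** 2 * (1 - r23 ** 2) + t[1] ** 2 * (1 - r13 ** 2)
           + t[2] ** 2 * (1 - r12 ** 2)
           - 2 * t[0] * t[1] * u12_3 - 2 * t[0] * t[2] * u13_2
           - 2 * t[1] * t[2] * u23_1)
    den = 1 + 2 * r12 * r13 * r23 - r23 ** 2 - r13 ** 2 - r12 ** 2
    return num / den


S = [[4.0, 2.0, -1.0], [2.0, 9.0, 3.0], [-1.0, 3.0, 16.0]]  # hypothetical
d = [1.0, -2.0, 0.5]                                        # hypothetical

x = solve(S, d)
direct = sum(d[i] * x[i] for i in range(3))
via_r = quad_form_via_correlations(d, S)

print(direct, via_r)
```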