Journal of Scientific Research
Volume XXXIV No. 2 October 2005

ON THE VARIANCE OF THE SAMPLE MEAN FROM FINITE POPULATION

Syed Shakir Ali Ghazali*, Ghausia Masood Gilani**, Muhammad Hussain Tahir***
*Department of Statistics, Govt. S.E. College, Bahawalpur
**Institute of Statistics, Punjab University, Lahore
***Department of Statistics, Islamia University, Bahawalpur

ABSTRACT

An alternate proof of the variance of the sample mean in the case of simple random sampling without replacement (SRSWOR) is obtained. This proof is very simple and avoids the use of expectation.

Key words: Sample mean, simple random sampling, variance, without-replacement sampling.
1. INTRODUCTION

Simple random sampling from a finite population has attracted much attention from researchers and practitioners working in surveys. It is the simplest, most preferable and most widely used probability sampling technique. The variance (or standard error) of the sampling distribution of the mean serves as a basis for efficiency comparisons with other sampling methods such as stratified random sampling, systematic sampling and cluster sampling. The variance of the sample mean under SRSWOR has been discussed by Hansen et al. (1953), Murthy (1967), Sukhatme and Sukhatme (1970), Cochran (1977), Jessen (1978), Singh and Chaudhary (1986), DesRaj and Chandhok (1998), Mukhopadhyay (1998), Govindarajulu (1999), Barnett (2002) and Sampath (2005). In the literature, three different approaches to the variance of the sample mean in simple random sampling without replacement (SRSWOR) are available; they are given below.

APPROACH I:

Barnett (2002, p. 32-35) has given the proof of the variance of the mean as follows. If a sample of size $n$ is drawn from a finite population of size $N$ having units $y_1, y_2, \ldots, y_N$, then there are $K = \binom{N}{n}$ distinct samples, each having the same probability $\binom{N}{n}^{-1}$.

Let $y_i$, $i = 1, 2, \ldots, n$, be the $i$th chosen member. Then the probability of obtaining this ordered sequence is

$$\frac{1}{N} \cdot \frac{1}{N-1} \cdot \frac{1}{N-2} \cdots \frac{1}{N-n+1} = \frac{(N-n)!}{N!}.$$

But the probability of obtaining any particular set of $n$ distinct population members (irrespective of the order) is just

$$\frac{n!\,(N-n)!}{N!} = \binom{N}{n}^{-1}.$$

For proof and correspondence: S. S. A. Ghazali
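As a quick numerical check of the two probabilities above, the following sketch (using a hypothetical design with $N = 5$ and $n = 2$, not taken from the paper) confirms that the ordered-sequence probability $(N-n)!/N!$, multiplied by the $n!$ orderings of a set, equals $\binom{N}{n}^{-1}$:

```python
from math import comb, factorial

# Hypothetical small design for illustration only: N = 5, n = 2.
N, n = 5, 2

# Probability of one particular ordered sequence of n distinct units:
# (1/N)(1/(N-1))...(1/(N-n+1)) = (N-n)!/N!
p_ordered = factorial(N - n) / factorial(N)

# Any particular set of n distinct units arises in n! different orders,
# so its (unordered) probability is n!(N-n)!/N! = 1/C(N, n).
p_set = factorial(n) * p_ordered

print(p_ordered)               # 1/20
print(p_set, 1 / comb(N, n))   # both equal 1/10
```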
Then the estimator of $\bar{Y}$ based on a simple random sample of size $n$ is

$$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i,$$

which is an unbiased estimator of $\bar{Y}$. Now for the variance,

$$\operatorname{Var}(\bar{y}) = \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} y_i\right) = \frac{1}{n^2}\left[\sum_{i=1}^{n}\operatorname{Var}(y_i) + \sum_{i \ne j}\operatorname{Cov}(y_i, y_j)\right]. \qquad (1.1)$$

Using the results of Barnett (2002, p. 34) that

$$E(y_i) = \frac{1}{N}\sum_{i=1}^{N} Y_i = \bar{Y}, \qquad E(y_i^2) = \frac{1}{N}\sum_{i=1}^{N} Y_i^2, \qquad E(y_i y_j) = \frac{2}{N(N-1)}\sum_{i<j}^{N} Y_i Y_j \quad (i \ne j),$$

we get

$$\operatorname{Var}(y_i) = \frac{N-1}{N}S^2 \quad \text{and} \quad \operatorname{Cov}(y_i, y_j) = -\frac{S^2}{N}, \qquad (1.2)$$

where $S^2 = \frac{1}{N-1}\sum_{i=1}^{N}(Y_i - \bar{Y})^2$. Hence by using (1.2) in (1.1), we have

$$\operatorname{Var}(\bar{y}) = \frac{1}{n^2}\left[\frac{n(N-1)S^2}{N} - \frac{n(n-1)S^2}{N}\right] = \frac{N-n}{Nn}S^2.$$

Sampath (2005) and DesRaj and Chandhok (1998) also used the same approach for the derivation of the variance of the sample mean under SRSWOR.

APPROACH II:

Cochran (1977, p. 23-24) has given the proof of the variance of the mean $\bar{y}$ as under. Consider

$$n(\bar{y} - \bar{Y}) = (y_1 - \bar{Y}) + (y_2 - \bar{Y}) + \cdots + (y_n - \bar{Y}). \qquad (1.3)$$

But

$$E\left[(y_1 - \bar{Y})^2 + \cdots + (y_n - \bar{Y})^2\right] = \frac{n}{N}\left[(y_1 - \bar{Y})^2 + \cdots + (y_N - \bar{Y})^2\right]. \qquad (1.4)$$

Also

$$E\left[(y_1 - \bar{Y})(y_2 - \bar{Y}) + (y_1 - \bar{Y})(y_3 - \bar{Y}) + \cdots + (y_{n-1} - \bar{Y})(y_n - \bar{Y})\right]
= \frac{n(n-1)}{N(N-1)}\left[(y_1 - \bar{Y})(y_2 - \bar{Y}) + (y_1 - \bar{Y})(y_3 - \bar{Y}) + \cdots + (y_{N-1} - \bar{Y})(y_N - \bar{Y})\right]. \qquad (1.5)$$

Here the sum of products extends over all pairs of units in the sample and in the population respectively; the sum on the left contains $n(n-1)/2$ terms and that on the right contains $N(N-1)/2$ terms.

Now squaring (1.3) and averaging over all simple random samples, we get by using (1.4) and (1.5)

$$n^2 E(\bar{y} - \bar{Y})^2 = \frac{n}{N}\left[(y_1 - \bar{Y})^2 + \cdots + (y_N - \bar{Y})^2 + \frac{2(n-1)}{N-1}\left\{(y_1 - \bar{Y})(y_2 - \bar{Y}) + \cdots + (y_{N-1} - \bar{Y})(y_N - \bar{Y})\right\}\right].$$

Completing the square on the cross-product term, we have

$$n^2 E(\bar{y} - \bar{Y})^2 = \frac{n}{N}\left\{\left(1 - \frac{n-1}{N-1}\right)\left[(y_1 - \bar{Y})^2 + \cdots + (y_N - \bar{Y})^2\right] + \frac{n-1}{N-1}\left[(y_1 - \bar{Y}) + \cdots + (y_N - \bar{Y})\right]^2\right\}. \qquad (1.6)$$
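The moment results in (1.2) can be verified by brute force. This sketch enumerates every ordered pair drawn without replacement from a hypothetical 5-unit population (the values are illustrative, not from the paper) and compares the exact moments with $(N-1)S^2/N$ and $-S^2/N$:

```python
from itertools import permutations
from statistics import mean

# A hypothetical 5-unit population used only to check (1.2) numerically.
Y = [2.0, 4.0, 6.0, 8.0, 10.0]
N = len(Y)
Ybar = mean(Y)
S2 = sum((y - Ybar) ** 2 for y in Y) / (N - 1)   # population mean square S^2

# Enumerate every ordered pair (y_1, y_2) drawn without replacement;
# each pair is equally likely, so plain averages give exact moments.
pairs = list(permutations(Y, 2))
var_y1 = mean((a - Ybar) ** 2 for a, _ in pairs)
cov_y12 = mean((a - Ybar) * (b - Ybar) for a, b in pairs)

print(var_y1, (N - 1) * S2 / N)   # Var(y_i) = (N-1)S^2/N
print(cov_y12, -S2 / N)           # Cov(y_i, y_j) = -S^2/N
```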
The second term inside the curly brackets vanishes, since the sum of the $y_i$ equals $N\bar{Y}$. Division of (1.6) by $n^2$ gives

$$V(\bar{y}) = E(\bar{y} - \bar{Y})^2 = \frac{N-n}{nN(N-1)}\sum_{i=1}^{N}(y_i - \bar{Y})^2 = \frac{N-n}{Nn}S^2.$$

This approach has also been used by DesRaj and Chandhok (1998), Govindarajulu (1999), Hansen et al. (1953) and Singh and Chaudhary (1986).

APPROACH III:

Cornfield (1944) suggested another proof, which uses results from infinite population theory, as follows. Let $a_i$ be the random variate such that $a_i$ takes the value 1 if the $i$th unit is drawn and 0 otherwise. Then the sample mean can be written as

$$\bar{y} = \frac{1}{n}\sum_{i=1}^{N} a_i y_i,$$

where the sum extends over all $N$ units in the population. Here the $a_i$ are random variables and the $y_i$ are a set of fixed numbers. Clearly

$$\Pr(a_i = 1) = \frac{n}{N}, \qquad \Pr(a_i = 0) = 1 - \frac{n}{N}.$$

Thus $a_i$ is distributed as a binomial variate in a single trial, with $P = n/N$. Hence

$$E(a_i) = P = \frac{n}{N}, \qquad V(a_i) = PQ = \frac{n}{N}\left(1 - \frac{n}{N}\right).$$

The probability that two specific units are both in a sample is equal to $\dfrac{n(n-1)}{N(N-1)}$. Therefore

$$\operatorname{Cov}(a_i, a_j) = E(a_i a_j) - E(a_i)E(a_j) = \frac{n(n-1)}{N(N-1)} - \left(\frac{n}{N}\right)^2 = -\frac{n}{N(N-1)}\left(1 - \frac{n}{N}\right).$$

Now we find the variance as

$$\operatorname{Var}(\bar{y}) = \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{N} a_i y_i\right) = \frac{1}{n^2}\left[\sum_{i=1}^{N} y_i^2\operatorname{Var}(a_i) + 2\sum_{i<j}^{N} y_i y_j\operatorname{Cov}(a_i, a_j)\right]$$

$$= \frac{1-f}{nN}\left[\sum_{i=1}^{N} y_i^2 - \frac{2}{N-1}\sum_{i<j}^{N} y_i y_j\right],$$

where $f = n/N$. Completing the square on the cross-product term gives

$$= \frac{1-f}{nN}\left[\sum_{i=1}^{N} y_i^2 - \frac{1}{N-1}\left\{\left(\sum_{i=1}^{N} y_i\right)^2 - \sum_{i=1}^{N} y_i^2\right\}\right] = \frac{1-f}{nN}\left[\frac{N}{N-1}\sum_{i=1}^{N} y_i^2 - \frac{N^2}{N-1}\bar{Y}^2\right]$$

$$= \frac{1-f}{n(N-1)}\sum_{i=1}^{N}(y_i - \bar{Y})^2,$$

so that

$$\operatorname{Var}(\bar{y}) = \frac{1-f}{n}S^2 = \frac{N-n}{Nn}S^2.$$

This method is also cited in Murthy (1967), Cochran (1977) and Govindarajulu (1999). The advantage of Cornfield's (1944) method is that it is useful in evaluating higher moments of the distribution of $\bar{y}$.

We have developed an alternate method for the variance of the sample mean in the case of simple random sampling without replacement, given in Section 2.
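Cornfield's indicator argument is easy to check by enumerating all $\binom{N}{n}$ equally likely samples. The sketch below (a hypothetical design with $N = 5$ and $n = 2$, not from the paper) confirms $E(a_i) = n/N$, $V(a_i) = PQ$ and the covariance formula:

```python
from itertools import combinations
from statistics import mean

# Hypothetical design for illustration: N = 5 labelled units, n = 2.
N, n = 5, 2
samples = list(combinations(range(N), n))  # all C(N, n) equally likely sets

def a(i, s):
    """Cornfield's indicator: 1 if unit i is in sample s, else 0."""
    return 1.0 if i in s else 0.0

P = n / N
E_a0 = mean(a(0, s) for s in samples)
var_a0 = mean((a(0, s) - P) ** 2 for s in samples)
cov_a01 = mean((a(0, s) - P) * (a(1, s) - P) for s in samples)

print(E_a0, P)                                  # E(a_i) = n/N
print(var_a0, P * (1 - P))                      # V(a_i) = PQ
print(cov_a01, -(n / (N * (N - 1))) * (1 - P))  # Cov(a_i, a_j)
```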
2. ALTERNATE METHOD FOR VARIANCE OF SAMPLE MEAN IN SRSWOR

Suppose a simple random sample of size $n$ is drawn from a population $Y_1, Y_2, \ldots, Y_N$ of size $N$ without replacement. Then the total number of samples of size $n$ is $k = \binom{N}{n}$, and the samples are given as below:

$$\{y_1, y_2, \ldots, y_n\},\ \{y_1, \ldots, y_{n-1}, y_{n+1}\},\ \ldots,\ \{y_{N-n+1}, y_{N-n+2}, \ldots, y_N\}.$$

The variance of all possible sample means without replacement is given as

$$\operatorname{Var}(\bar{y}) = \frac{1}{k}\sum_{j=1}^{k}(\bar{y}_j - \bar{Y})^2 = \frac{1}{k}\left\{(\bar{y}_1 - \bar{Y})^2 + (\bar{y}_2 - \bar{Y})^2 + \cdots + (\bar{y}_k - \bar{Y})^2\right\}$$

$$= \frac{1}{k}\left[\left(\frac{y_1 + y_2 + \cdots + y_n}{n} - \bar{Y}\right)^2 + \left(\frac{y_1 + \cdots + y_{n-1} + y_{n+1}}{n} - \bar{Y}\right)^2 + \cdots + \left(\frac{y_{N-n+1} + y_{N-n+2} + \cdots + y_N}{n} - \bar{Y}\right)^2\right]$$

$$= \frac{1}{n^2 k}\left[(y_1 + \cdots + y_n - n\bar{Y})^2 + (y_1 + \cdots + y_{n-1} + y_{n+1} - n\bar{Y})^2 + \cdots + (y_{N-n+1} + y_{N-n+2} + \cdots + y_N - n\bar{Y})^2\right]$$

$$= \frac{1}{n^2 k}\left[\left\{(y_1 - \bar{Y}) + \cdots + (y_n - \bar{Y})\right\}^2 + \left\{(y_1 - \bar{Y}) + \cdots + (y_{n-1} - \bar{Y}) + (y_{n+1} - \bar{Y})\right\}^2 + \cdots + \left\{(y_{N-n+1} - \bar{Y}) + \cdots + (y_N - \bar{Y})\right\}^2\right].$$

Expanding each square,

$$= \frac{1}{n^2 k}\Bigl[\bigl\{(y_1 - \bar{Y})^2 + \cdots + (y_n - \bar{Y})^2 + 2(y_1 - \bar{Y})(y_2 - \bar{Y}) + \cdots + 2(y_{n-1} - \bar{Y})(y_n - \bar{Y})\bigr\}$$
$$\quad + \bigl\{(y_1 - \bar{Y})^2 + \cdots + (y_{n-1} - \bar{Y})^2 + (y_{n+1} - \bar{Y})^2 + 2(y_1 - \bar{Y})(y_2 - \bar{Y}) + \cdots + 2(y_{n-1} - \bar{Y})(y_{n+1} - \bar{Y})\bigr\}$$
$$\quad + \cdots + \bigl\{(y_{N-n+1} - \bar{Y})^2 + \cdots + (y_N - \bar{Y})^2 + 2(y_{N-n+1} - \bar{Y})(y_{N-n+2} - \bar{Y}) + \cdots + 2(y_{N-1} - \bar{Y})(y_N - \bar{Y})\bigr\}\Bigr].$$

Since over all possible samples each $Y_i$ $(i = 1, 2, \ldots, N)$ appears $\binom{N-1}{n-1}$ times and each pair $Y_i Y_j$ appears $\binom{N-2}{n-2}$ times, therefore

$$\operatorname{Var}(\bar{y}) = \frac{1}{n^2\binom{N}{n}}\left[\binom{N-1}{n-1}\sum_{i=1}^{N}(Y_i - \bar{Y})^2 + 2\binom{N-2}{n-2}\sum_{i}\sum_{j>i}(Y_i - \bar{Y})(Y_j - \bar{Y})\right].$$

Since $\sum_{i=1}^{N}(Y_i - \bar{Y}) = 0$, the cross-product sum satisfies $2\sum_{i<j}(Y_i - \bar{Y})(Y_j - \bar{Y}) = -\sum_{i=1}^{N}(Y_i - \bar{Y})^2$, so

$$\operatorname{Var}(\bar{y}) = \frac{1}{n^2\binom{N}{n}}\left[\binom{N-1}{n-1}\sum_{i=1}^{N}(Y_i - \bar{Y})^2 - \binom{N-2}{n-2}\sum_{i=1}^{N}(Y_i - \bar{Y})^2\right]$$

$$= \frac{1}{n^2\binom{N}{n}}\left[\binom{N-1}{n-1}N\sigma^2 - \binom{N-2}{n-2}N\sigma^2\right] = \frac{\sigma^2}{n}\left(\frac{N-n}{N-1}\right) = \frac{N-n}{Nn}S^2,$$

where $N\sigma^2 = \sum_{i=1}^{N}(Y_i - \bar{Y})^2 = (N-1)S^2$.

3. REMARKS

The alternative method developed here is simple and avoids the use of expectation.
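The alternate method is, in effect, a finite enumeration, so it can be replayed directly in code. This sketch (hypothetical population values with $N = 5$, $n = 3$, not from the paper) averages $(\bar{y}_j - \bar{Y})^2$ over all $k = \binom{N}{n}$ samples and also checks the two counting facts used in the derivation:

```python
from itertools import combinations
from math import comb
from statistics import mean

# Hypothetical population for illustration only.
Y = [2.0, 4.0, 6.0, 8.0, 10.0]
N, n = len(Y), 3
Ybar = mean(Y)
S2 = sum((y - Ybar) ** 2 for y in Y) / (N - 1)  # population mean square S^2

# Average (ybar_j - Ybar)^2 over all k = C(N, n) samples -- the alternate
# method's definition of Var(ybar), with no expectation operator involved.
var_direct = mean((mean(s) - Ybar) ** 2 for s in combinations(Y, n))
print(var_direct, (N - n) * S2 / (N * n))  # both equal (N-n)S^2/(Nn)

# Counting facts: each unit lies in C(N-1, n-1) samples,
# and each pair of units lies in C(N-2, n-2) samples.
count_unit = sum(1 for s in combinations(Y, n) if Y[0] in s)
count_pair = sum(1 for s in combinations(Y, n) if Y[0] in s and Y[1] in s)
print(count_unit, comb(N - 1, n - 1))  # 6 6
print(count_pair, comb(N - 2, n - 2))  # 3 3
```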
REFERENCES

1. Barnett, V. (2002) Sample Survey: Principles and Methods. 3rd Edition, Arnold, London.
2. Cochran, W.G. (1977) Sampling Techniques. 3rd Edition, Wiley, New York.
3. Cornfield, J. (1944) On samples from finite population. J. Amer. Statist. Assoc. 39, 236-239.
4. DesRaj and Chandhok, P. (1998) Sample Survey Theory. Narosa Publishing House, New Delhi.
5. Govindarajulu, Z. (1999) Elements of Sampling: Theory and Methods. Prentice-Hall Inc.
6. Hansen, M.H., Hurwitz, W.N. and Madow, W.G. (1953) Sample Survey Methods and Theory. Volume II: Theory. Wiley, New York.
7. Jessen, R.J. (1978) Statistical Survey Techniques. Wiley, New York.
8. Mukhopadhyay, P. (1998) Theory and Methods of Survey Sampling. Prentice-Hall India Pvt. Ltd., New Delhi.
9. Murthy, M.N. (1967) Sampling Theory and Methods. Statistical Publishing Society, Calcutta.
10. Sampath, S. (2005) Sampling Theory and Methods. 2nd Edition, Narosa Publishing House, New Delhi.
11. Singh, D. and Chaudhary, F.S. (1986) Theory and Analysis of Sample Survey Designs. Wiley Eastern Limited, New Delhi.
12. Sukhatme, P.V. and Sukhatme, B.V. (1970) Sampling Theory of Surveys with Application. Second Revised Edition, Iowa State University Press, Ames, Iowa.