Choosing Sample Sizes 10/13/2011

IRA JOHNSON
SENIOR QUALITY ENGINEER
MOOG SPACE AND DEFENSE GROUP
Why Sample Size Is Important
• To Minimize Risks; Make Good, Durable Decisions
• To Avoid Oversampling (Costs!)
• Consistent, Justifiable Procedures Limit Liability
• An Expectation of Q.E.s in Most Organizations
What Will be Discussed?
• The Rules of Thumb: Why They Work (or Not)
• Data Types, Statistical Terminology
• Common Problems in Sampling
• Attribute Sample Sizes
• Variable Averages and Hypothesis Testing
• Standard Deviations
• Cpk Sample Size
Recommended Sample Sizes Method
• The easiest and most accurate method for selecting sample size is…
Power and Sample Size In Minitab
Most Common Method For Sample Sizes
• But what if you don’t have Minitab? Or don’t have the patience to memorize all those statistical terms?
• Quality professionals have long used methods for selecting sample sizes that do not require Minitab, or any statistical calculations…
The Sample Size Rules of Thumb!
• Detect changes using ±3σ Limits
• 10 samples to estimate an Average
• 20 samples to estimate a Standard Deviation
• 30 samples for a Capability study (Cpk)
• 50 samples to identify a Distribution (shape)
• Use the Standard Normal Table from ANSI/ASQ Z1.4 to select sample sizes
• 59 samples of attribute data to achieve 95/95
• 1,000 samples needed for valid surveys
• DO THESE WORK? Let’s Play Mythbusters!
Rules of Thumb-Convenience or Danger?
• Rules of thumb persist because they are easy to use, and often work
• But will they work for you? Why or why not?
Background/Review
COMMON ASSUMPTIONS
KEY DEFINITIONS
DATA TYPES
COMMON SENSE PRACTICES
Typical Assumptions for R.O.T.
• Samples are randomly and independently selected from the population
• Measurement error is negligible
• Data is normally distributed
• The standard deviation is a known, constant value
• Looking for a moderately large change, such as a difference of 1 sigma or more
The “Common Sense” Checks
Questions That Should ALWAYS Be Asked:
• What are you going to decide using this data?
• How was the data collected and measured?
• How big a change is important?
• Is there any prior knowledge (standard deviation, distribution, etc.)?
• What confidence is needed that a change will be detected? Will there be ongoing sampling or is this a 1-shot check?
• Is the process stable for the short term?
Types of Risks: Key Definitions
• Alpha, α, Type I Error: False Reject; concluding there is a difference when there is not
• Beta, β, Type II Error: False Acceptance; concluding there is no difference when there actually is
• Power: correctly concluding there is a difference; the ability to detect a difference of δ, Power = (1 − β)
• Delta, δ: the magnitude of difference/change that is important to detect
Some Common Types of Data
• Counts: integer values (0, 1, 2, 3, etc.)
• Binomial Data: outcomes that take 1 of 2 conditions (pass/fail, yes/no, etc.)
• Rates: quantity per unit; can range from 0 to ∞, such as defects/hour, complaints/day, repairs/1,000 hours
• Proportions: values between 0 and 1, or 0-100%
• Means and/or standard deviations: Variables Data that can range from −∞ to +∞
Attribute Sample Sizes
DERIVE BINOMIAL SAMPLE SIZE EQUATIONS “BY HAND”
ANSI Z1.4 (MIL-STD-105) NORMAL TABLE
Coin and Dice Toss Examples
Probability of seeing no defects:
Coin: Defect = Tails
• 1 Toss = 1/2 = 50%
• 2 Tosses = 1/4 = 25%
• 3 Tosses = 1/8 = 12.5%
• N Tosses = (1/2)^N
Dice: Defect = Roll a “1”
• 1 Roll = 5/6 = 83.3%
• 2 Rolls = 25/36 = 69.4%
• 3 Rolls = 125/216 = 57.9%
• N Rolls = (5/6)^N
Probability of no “defects” in the sample is P(0) = (1-p)^N, where p is the probability of a defect in a single sample.
• Can that equation be used to calculate sample size?
• In lay terms, what does (1-p) translate to in these examples?
Calculate Sample Size from P(0)=(1-p)N
• Consider “p” the defect rate that you cannot accept
• Select the Confidence Level (C.L.) for detecting a defect rate of “p” or greater, and set P(0) = 1 − C.L. (for 95% confidence, P(0) = 0.05)
• Solve for N
• N = Log(P(0)) / Log(1-p) … then round up
• Reject the lot if one or more defects are found in the sample of size N
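The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: an accept-on-zero-defects plan, and P(0) taken as 1 minus the confidence level. The function name `zero_defect_sample_size` is hypothetical and does not come from the presentation.

```python
import math

def zero_defect_sample_size(p, confidence):
    """Smallest N for which a lot with defect rate p shows at least one
    defect with the given confidence (lot is accepted only on zero defects)."""
    if not (0 < p < 1 and 0 < confidence < 1):
        raise ValueError("p and confidence must be strictly between 0 and 1")
    p_zero = 1.0 - confidence                       # allowed chance of finding no defects
    n_exact = math.log(p_zero) / math.log(1.0 - p)  # solve (1-p)^N = P(0) for N
    return math.ceil(n_exact)                       # round up to a whole sample

print(zero_defect_sample_size(0.05, 0.95))
print(zero_defect_sample_size(0.01, 0.95))
```

With p = 0.05 and 95% confidence the unrounded value is about 58.4, so the function returns 59, the familiar 95/95 attribute sample.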
Binomial Sample Sizes Matrix
Sample size N (before rounding up) by defect level and confidence level:

Defect Level                       Confidence Level
p          %       PPM             50%      95%
0.500000   50.0%   500,000         1.0      4.3
0.166667   16.7%   166,667         3.8      16.4
0.050000   5.0%    50,000          13.5     58.4
0.010000   1.0%    10,000          69.0     298.1

• Can you locate the 59 pc attribute sample for the 95/95 R.O.T.?
Use of Standard Normal Tables
• R.O.T.: “Use the Standard Normal Table from ANSI/ASQ Z1.4 to select sample sizes”.
• Is this a good R.O.T.? Why or why not?
• ANS1: ANSI/ASQ Z1.4 REQUIRES a switch to the Tightened table after too many rejected lots
• ANS2: In 2000, DoD declared Z1.4 obsolete and recommended c=0 plans, which provide equal or greater consumer protection with less overall inspection than Z1.4.
Variable Sample Size Topics
CENTRAL LIMIT THEOREM: NORMALITY OF AVERAGES
STANDARD DEVIATION OF AVERAGES
3σ LIMITS COMPARED TO α/β RISK LIMITS
HYPOTHESIS TEST REVIEW
WHEN STANDARD DEVIATION IS NOT KNOWN
CHANGES IN STANDARD DEVIATIONS
Central Limit Theorem
Effect of N on σ
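• The standard deviation of an average of N readings is σx-bar = σ/√N, so larger samples give tighter averages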
Hypothesis Testing Steps
• Start with a null hypothesis, H0
• Add an alternative hypothesis, HA, that applies if H0 is not true
• Select the allowable risk levels:
  - α for the risk of falsely rejecting the null, typically .05
  - β for the risk of failing to accept the alternative when it is true, typically .05
• Identify the appropriate statistical distribution
• Determine the sample size that satisfies those risk levels
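For the common case of a two-sided Z test on an average with a known standard deviation, the last step can be sketched with the usual normal approximation N = ((z(1-α/2) + z(1-β)) / δ)², with δ expressed in standard deviations. The function name `z_test_sample_size` is hypothetical, and the sketch assumes the typical assumptions listed earlier (normal data, known and constant σ).

```python
import math
from statistics import NormalDist

def z_test_sample_size(delta_in_sigmas, alpha=0.05, beta=0.05):
    """Approximate N for a two-sided Z test on an average, sigma known.

    delta_in_sigmas is the shift to detect, in units of the standard deviation.
    The tiny chance of rejecting on the wrong side of H0 is ignored.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)  # two-sided false-reject limit
    z_beta = z.inv_cdf(1.0 - beta)          # one-sided false-accept limit
    return math.ceil(((z_alpha + z_beta) / delta_in_sigmas) ** 2)

for delta in (0.5, 1.0, 2.0):
    print(delta, z_test_sample_size(delta))
```

Because δ enters as a square in the denominator, halving the change to be detected roughly quadruples the sample size.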
Using 3 σ Limits to Detect a 1 σ shift
• The original distribution is the black curve on the right
• The red curve shows the process shifted 1σ to the left
• Using 3σ limits, >97% of the new distribution overlaps the old distribution.
• What is the Power of a 3σ test, for δ=1σ and sample size n=1?
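One way to check the question above is to compute the chance that a single reading from the shifted process lands outside the ±3σ limits. The sketch below assumes normal data with a known σ of 1; the function name `power_with_k_sigma_limits` is hypothetical.

```python
from statistics import NormalDist

def power_with_k_sigma_limits(shift_in_sigmas, k=3.0):
    """Chance that one reading falls outside the +/- k sigma limits
    after the process average has moved by shift_in_sigmas."""
    shifted = NormalDist(mu=-abs(shift_in_sigmas), sigma=1.0)
    below = shifted.cdf(-k)        # beyond the limit on the side of the shift
    above = 1.0 - shifted.cdf(k)   # beyond the limit on the far side
    return below + above

print(round(power_with_k_sigma_limits(1.0), 4))  # 1 sigma shift
print(round(power_with_k_sigma_limits(0.0), 4))  # no shift: the false-alarm rate
```

Under these assumptions the power for a 1σ shift comes out a little under 2.3%, consistent with more than 97% of the shifted distribution still sitting inside the limits.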
[Figure: Histogram of Ho, Ha. 1 Sigma Shift, 3 Sigma Limits. Density vs. data value, showing the Ho and Ha curves with the -3 Sigma Ho and +3 Sigma Ho limits marked.]
Using α=.05 to Detect a 1 σ shift
• With the process shifted 1σ to the left and using α=0.05, >66% of the Ha distribution would overlap the Ho distribution.
• What is the power of a 1 sample test with α=0.05?
[Figure: Histogram of Ho, Ha. 1 Sigma Shift, alpha=0.05. Density vs. data value, showing the Ho and Ha curves with the -Z(Alpha/2) Ho and +Z(Alpha/2) Ho limits marked.]
Sample Size of 9
• A sample size of 9 reduces the σx-bar by a factor of 3
• Gray area shows the likelihood of detecting the 1σ shift
[Figure: Histogram of Ho and Ha, n=9. 1 Sigma Shift, Alpha=0.05. Density vs. Data, showing the Ho and Ha curves with the values 9.347 and 10.653 marked.]
Sample Size vs. Power
[Figure: Detection Power for a 1 Sigma Change, Normal Distribution. Power vs. Sample Size, with a reference mark at Sample Size = 10.]
N=10 provides Power = 88% for δ=1σ. That is not the 95% typically desired, but it is close. Is the R.O.T. “busted”?
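The power curve for an average can be approximated directly from the normal distribution. The sketch below assumes a two-sided Z test with known σ and α = 0.05, which is what the quoted 88% is consistent with; the function name `z_test_power` is hypothetical.

```python
import math
from statistics import NormalDist

def z_test_power(n, delta_in_sigmas=1.0, alpha=0.05):
    """Power of a two-sided Z test on the average of n readings, sigma known."""
    z = NormalDist()
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)
    shift = abs(delta_in_sigmas) * math.sqrt(n)  # shift in units of sigma / sqrt(n)
    # Reject when the standardized average falls beyond either critical value.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

for n in (1, 9, 10, 13):
    print(n, round(z_test_power(n), 3))
```

Under these assumptions the power for a 1σ change is roughly 0.885 at n = 10 and just over 0.95 at n = 13.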
What δ Can a 10 pc Sample Detect?
[Figure: Power Curve, 10 pc Sample Z Test. Power vs. Difference for Sample Size 10. Assumptions: Alpha = 0.05, StDev = 1, Alternative: Not =. Reference marks at Power = 0.95 and Difference = 1.14.]
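The smallest change a fixed sample can detect follows from rearranging the same normal approximation: δ = (z(1-α/2) + z(power)) · σ / √N. The sketch below is an illustration under the figure's assumptions (two-sided Z test, known σ); the function name `detectable_delta` is hypothetical.

```python
import math
from statistics import NormalDist

def detectable_delta(n, power=0.95, alpha=0.05, sigma=1.0):
    """Approximate smallest shift that a two-sided Z test on n readings
    detects with the requested power (sigma known)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)
    z_power = z.inv_cdf(power)
    return (z_alpha + z_power) * sigma / math.sqrt(n)

print(round(detectable_delta(10), 2))
```

For a 10 pc sample with σ = 1 this gives about 1.14, so a 10 pc sample has 95% power only for changes somewhat larger than 1σ.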
Impact of δ on Sample Size
[Figure: Effect of Delta on Power and Sample Size. Power vs. Sample Size for three curves: 0.5 Sigma Power, 1 Sigma Power, and 2 Sigma Power (Normal). Reference marks at Power = 0.95 and Sample Size = 13.]
• N=13 has 95% Power to detect a 1σ change
• Changes other than 1σ dramatically affect Power
Graphing the Elements of a Hypothesis Test
• Ho is the black distribution
• Ha is the red distribution
• δ = Potential Change to detect
• Yellow area signifies β Risk
• Blue area signifies α Risk
• Gray area signifies Power
• Acceptance Criteria are Dotted Vertical Lines
[Figure: Histogram of Ho, Ha. 1 Sigma Shift, n=13. Density vs. data value, showing the Ho and Ha curves, the -Z(Alpha/2) Ho and +Z(Alpha/2) Ho limits, and δ.]
Change in Average, σ Unknown
• Z-tables are based on the standard normal distribution, where the standard deviation is a known value. That value could come from previous data or an SPC chart.
• What do we use if the standard deviation is not known?
• ANS: Use Student’s t-distribution
• The t-test uses the sample data to estimate both an average and a standard deviation, but loses a little power
Power of t-Test vs. Normal Z
[Figure: Power vs. Sample Size, Z test compared to t-test with a 1 Sigma shift. Two curves: 1 Sigma Power (Z, Normal) and t-test 1 Sigma Power. Reference marks at Power = 0.95 and Sample Size = 10.]
Standard Deviation Sample Sizes
• Use Variance (square of Standard Deviation) and the F-statistic to test for differences
• The F statistic is the ratio of two variances.
• As sample size for both the numerator and denominator ⇒ ∞, the F-ratio ⇒ 1.
• By selecting n2=∞ a single variance can be assessed against the population standard deviation
• Requires Normal Data!!
20 Samples for Standard Deviation R.O.T.
• The F-ratio vs. sample size curve is very steep where N<20.
• With N=20, the true standard deviation is within about ±25% of the sample value
• R.O.T. Confirmed?
[Figure: F-Ratio vs. Sample Size. F-ratio (about 1.0 to 2.8) plotted against sample sizes from 0 to 80.]
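The shape of that curve can be approximated without tables. With n2 = ∞ the F-ratio reduces to a chi-square quantile divided by its degrees of freedom, and its square root bounds the ratio of sample to true standard deviation. The sketch below swaps in the Wilson-Hilferty approximation for the chi-square quantile, which is an assumption made here for convenience and not a method from the presentation; the function name `sigma_ratio_bound` and the one-sided 95% level are also assumptions.

```python
import math
from statistics import NormalDist

def sigma_ratio_bound(n, confidence=0.95):
    """Approximate upper limit on (sample std dev / true std dev) for n normal readings.

    Uses F(n-1, infinity) = chi-square(n-1) / (n-1), with the chi-square
    quantile taken from the Wilson-Hilferty approximation (an assumption
    made here to avoid table lookups).
    """
    df = n - 1
    z = NormalDist().inv_cdf(confidence)
    c = 2.0 / (9.0 * df)
    f_ratio = (1.0 - c + z * math.sqrt(c)) ** 3  # approximate chi-square quantile / df
    return math.sqrt(f_ratio)

for n in (5, 10, 20, 30, 80):
    print(n, round(sigma_ratio_bound(n), 2))
```

Under these assumptions the bound is about 1.26 at N = 20 and falls off slowly beyond that, which matches the roughly 25% figure quoted above.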
Capability Index Confidence Intervals
• The following approximation is commonly used:

$$ C_{pk} = \hat{C}_{pk} \pm z_{1-\alpha/2} \sqrt{\frac{1}{9n} + \frac{\hat{C}_{pk}^{2}}{2(n-1)}} $$

• It is important to note that the sample size should be at least 25 before these approximations are valid*

*Obtained from ITL.NIST.gov
Are 30 Pieces for Cpk OK?
• Green indicates that the lower confidence bound exceeds 1.33.
• For a sample size of 30, the calculated Cpk needs to be >1.7 to be confident the actual Cpk is >1.33
Required Cpk: 1.33     Alpha: 0.05

Lower Confidence Bounds for Cpk
        Calculated/Estimated Cpk
n       1.33       1.4        1.5        1.67       1.75
30      1.025804   1.081489   1.160917   1.29568    1.359004
40      1.067565   1.125226   1.207494   1.347119   1.412742
80      1.14548    1.206816   1.294364   1.443033   1.512937
120     1.179622   1.242564   1.332421   1.485044   1.556818
160     1.19989    1.263785   1.355011   1.509979   1.582864
700     1.267929   1.335018   1.430834   1.593667   1.670274
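The lower bounds in this table can be reproduced from the normal approximation for the Cpk confidence interval. The sketch below assumes a one-sided bound, using z(1-α) rather than z(1-α/2); that choice is inferred from the tabulated numbers and is not stated in the slides. The function name `cpk_lower_bound` is hypothetical.

```python
import math
from statistics import NormalDist

def cpk_lower_bound(cpk_hat, n, alpha=0.05):
    """One-sided lower confidence bound for Cpk (normal approximation)."""
    z = NormalDist().inv_cdf(1.0 - alpha)
    std_err = math.sqrt(1.0 / (9.0 * n) + cpk_hat ** 2 / (2.0 * (n - 1)))
    return cpk_hat - z * std_err

for cpk_hat in (1.33, 1.4, 1.5, 1.67, 1.75):
    print(cpk_hat, round(cpk_lower_bound(cpk_hat, 30), 6))
```

For n = 30 only the 1.75 column clears 1.33, which is why a calculated Cpk above roughly 1.7 is needed at that sample size.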
Which Rules Failed Mythbusters?
• Detect changes using ±3σ Limits
• 10 samples to approximate an Average
• 20 samples to estimate a Standard Deviation
• 30 samples for a Capability study (Cpk)
• 50 samples to identify a Distribution (shape)
• Use the Standard Normal Table from ANSI/ASQ Z1.4 to select sample sizes
• 59 samples of attribute data to achieve 95/95
• 1,000 samples needed for valid surveys
The End
QUESTIONS?
Useful References
• How to Choose the Proper Sample Size, by Gary G. Brush, ASQ Quality Press
• Zero Acceptance Number Sampling Plans, by Nicholas L. Squeglia, ASQ Quality Press
• Online Engineering Statistics Handbook, NIST/SEMATECH, website: itl.nist.gov