
Analyze Phase
The team met to discuss these results. They decided to set all factors that were not
found to be statistically significant to the levels that cost the least to operate, and factors B and D at their midpoints. The process would be monitored at these settings for
a while to confirm that the results were similar to what the team expected based on
the experimental analysis. While this was done, another series of experiments would
be planned to further explore the significant effects uncovered by the screening experiment.
Based on the screening experiment, the linear model for estimating the defect rate
was found from the coefficients in Table 10.10 to be
Defect rate = 70.375 + 4B - 9.25D
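To make the arithmetic concrete, the fitted model can be evaluated at any chosen settings. The Python sketch below is illustrative only; the factor levels shown are hypothetical coded values (-1 to +1), not settings taken from the study.

```python
# Evaluate the fitted screening model at chosen factor settings.
# The levels used here are hypothetical coded values, not values from the study.
def defect_rate(B, D):
    """Predicted defect rate from the screening-experiment model."""
    return 70.375 + 4 * B - 9.25 * D

print(defect_rate(B=0, D=0))    # both factors at their midpoints (coded 0): 70.375
print(defect_rate(B=-1, D=1))   # B at its low level, D at its high level: 57.125
```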
Power and Sample Size
The term power of a statistical test refers to the probability that the test will lead to correctly
rejecting a false null hypothesis, that is, 1 - β, where β is the probability of failing to
reject the false null hypothesis. Generally, the power of a statistical test is improved
when:
• There is a large difference between the null and alternative conditions,
• The population sigma is small,
• The sample size is large, or
• The significance level (α) is large.
Many statistical software packages provide Power and Sample Size calculations.
Minitab's Power and Sample Size option in the Stat menu can estimate these for a variety
of test formats.
Example
Consider a one-way ANOVA test of the hypothesis that four populations have equal
means. A sample of n = 5 is taken from each population whose historical standard deviation is 2.0. If we are interested in detecting a difference of 3 units in the means, the
software can estimate the power of the test after completing the Power and Sample Size
for one-way ANOVA dialog box as:
• Number of levels: 4
• Sample sizes: 5
• Values of the maximum difference between means: 3
• Standard deviation: 2
• Significance level (in the Options dialog): 0.05
The probability the assumption of equal means is rejected is found to be about 39% in
this case. Note that if the sample size is increased to 10 the power is improved to 77%.
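The same power calculation can be sketched outside of Minitab. The Python example below uses scipy and the noncentral F distribution; it assumes the least-favorable configuration of means commonly used for a stated maximum difference (two means separated by the full difference, the rest at their average), which is an assumption about how the input is interpreted.

```python
import numpy as np
from scipy import stats

k, n, sigma, alpha, max_diff = 4, 5, 2.0, 0.05, 3.0   # values from the example above

# Least-favorable configuration: two means separated by max_diff, the rest at the center
means = np.array([-max_diff / 2, max_diff / 2] + [0.0] * (k - 2))
f_sq = np.sum((means - means.mean()) ** 2) / k / sigma ** 2   # Cohen's f squared
nc = n * k * f_sq                                             # noncentrality parameter

df1, df2 = k - 1, k * (n - 1)
f_crit = stats.f.ppf(1 - alpha, df1, df2)
power = 1 - stats.ncf.cdf(f_crit, df1, df2, nc)
print(f"Power with n = {n} per group: {power:.2f}")   # about 0.39; n = 10 gives about 0.77
```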
Testing Common Assumptions
Many statistical tests are only valid if certain underlying assumptions are met. In most
cases, these assumptions are stated in the statistical textbooks along with the descriptions
of the particular statistical technique. This chapter describes some of the more common
assumptions encountered in Six Sigma project work and how to test for them. However,
the subject of testing underlying assumptions is a big one and you might wish to explore
it further with a Master Black Belt.
Continuous versus Discrete Data
Data come in two basic flavors: continuous and discrete, as discussed in Chap. 7. To
review the basic idea, continuous data are numbers that can be expressed to any desired
level of precision, at least in theory. For example, using a mercury thermometer I can
say that the temperature is 75 degrees Fahrenheit. With a home digital thermometer I
could say it's 75.4 degrees. A weather bureau instrument could add additional decimal
places. Discrete data can only assume certain values. For example, the counting numbers can only be integers. Some survey responses force the respondent to choose a particular number from a list (pick a rating on a scale from 1 to 10).
Some statistical tests assume that you are working with either continuous or discrete data. For example, ANOVA assumes that continuous data are being analyzed,
while chi-square and correspondence analysis assume that your data are counts. In
many cases the tests are insensitive to departures from the data-type assumption. For
example, expenditures can only be expressed to two decimal places (dollars and cents),
but they can be treated as if they are continuous data. Counts can usually be treated as
continuous data if there are many different counts in the data set. For example, if the
data are defect counts ranging from 10 to 30 defects, with all 21 possible counts appearing in the
data (10, 11, 12, ..., 28, 29, 30), they can be treated as continuous.
You Have Discrete Data But Need Continuous Data In some cases, however, the data type
matters. For example, if discrete data are plotted on control charts intended for continuous data the control limit calculations will be incorrect. Run tests and other nonparametric tests will also be affected by this. The problem of "discretized" data is often
caused by rounding the data to too few decimal places when they are recorded. This
rounding can be human caused, or it might be a computer program not recording or
displaying enough digits. The simple solution is to record more digits. The problem
may also be caused by an inadequate measurement system. This situation can be identified
by a measurement system analysis (see Chap. 9). The problem can be readily detected
by creating a dot plot of the data.
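A dot plot is easy to sketch in Python; the example below uses numpy and matplotlib with simulated data standing in for real measurements. When data have been rounded too coarsely, the plot shows only a few isolated stacks of points with empty space between them.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
# Simulated measurements rounded to whole units: only three or four distinct values survive
data = np.round(rng.normal(10.0, 0.3, 100))

values, counts = np.unique(data, return_counts=True)
for v, c in zip(values, counts):
    plt.plot([v] * c, range(1, c + 1), "ko", markersize=4)   # stack one dot per observation
plt.yticks([])
plt.xlabel("Measurement")
plt.title("Dot plot of rounded (discretized) data")
plt.show()
```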
You Have Continuous Data But Need Discrete Data Let's say you want to determine if
operator experience has an impact on the defects. One way to analyze this is to use a
technique such as regression analysis to regress Y = defects on X = years of experience.
Another would be to perform a chi-square analysis on the defects by experience level.
To do this you need to put the operators into discrete categories, then analyze the defects
in each category. This can be accomplished by "discretizing" the experience variable.
For example, you might create the following discrete categories:
Experience (years)    Experience Category
Less than 1           New
1 to 2                Moderately experienced
3 to 5                Experienced
More than 5           Very experienced
The newly classified data are now suitable for chi-square analysis or other techniques that require discrete data.
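A sketch of this discretization in Python with pandas is shown below. The years-of-experience values are made up, and the bin edges are one reasonable reading of the table above, assuming experience is recorded in whole years.

```python
import numpy as np
import pandas as pd

# Hypothetical years of experience (whole years) for ten operators
experience = pd.Series([0, 1, 1, 2, 3, 4, 5, 6, 7, 10])

labels = ["New", "Moderately experienced", "Experienced", "Very experienced"]
# Bin edges follow the table above for whole-year data: <1, 1 to 2, 3 to 5, more than 5
category = pd.cut(experience, bins=[0, 1, 3, 6, np.inf], labels=labels, right=False)
print(category.value_counts().sort_index())
```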
Independence Assumption
Statistical independence means that two values are not related to one another. In other
words, knowing what one value is provides no information as to what the other value
is. If you throw two dice and I tell you that one of them is a 4, that information doesn't
help you predict the value on the other die. Many statistical techniques assume that
the data are independent. For example, if a regression model fits the data adequately,
then the residuals will be independent. Control charts assume that the individual
data values are independent; that is, knowing the diameter of piston 100 doesn't help
me predict the diameter of piston 101, nor does it tell me what the diameter of piston
99 was. If I don't have independence, the results of my analysis will be wrong. I will
believe that the model fits the data when it does not. I will tamper with controlled
processes.
Independence can be tested in a variety of ways. If the data are normal (testing the
normality assumption is discussed below) then the run tests described for control charts
can be used.
A scatter plot can also be used. Let Y = X(t-1) and plot X(t) versus Y. You will see random
patterns if the data are independent. Software such as Minitab offers several ways of
examining independence in time series data. Note: lack of independence in time series
data is called autocorrelation.
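The lag plot and an autocorrelation check are both easy to produce in Python; the sketch below uses pandas and statsmodels with simulated data in place of real process measurements.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from pandas.plotting import lag_plot
from statsmodels.graphics.tsaplots import plot_acf

rng = np.random.default_rng(2)
x = pd.Series(rng.normal(10, 1, 200))   # simulated time-ordered data

lag_plot(x)      # plots x(t) versus x(t+1); a shapeless cloud suggests independence
plot_acf(x)      # autocorrelation by lag; spikes outside the bands suggest autocorrelation
plt.show()
```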
If you don't have independence you have several options. In many cases the best
course of action is to identify the reason why the data are not independent and fix the
underlying cause. If the residuals are not independent, add terms to the model. If the
process is drifting, add compensating adjustments.
If fixing the root cause is not a viable option, an alternative is to use a statistical
technique that accounts for the lack of independence. For example, the EWMA control
chart or a time series analysis that can model autocorrelated data. Another is to modify
the technique to work with your autocorrelated data, such as using sloped control limits on the control chart. If data are cyclical you can create uncorrelated data by using a
sampling interval equal to the cycle length. For example, you can create a control chart
comparing performance on Monday mornings.
Normality Assumption
Statistical techniques such as t-tests, Z-tests, ANOVA, and many others assume that the
data are at least approximately normal. This assumption is easily tested using software.
There are two approaches to testing normality: graphical and statistical.
Graphical Evaluation of Normality One graphical approach involves plotting a histogram
of the data, then superimposing a normal curve over the histogram. This approach
works best if you have at least 200 data points, and the more the merrier. For small data
sets the interpretation of the histogram is difficult; the usual problem is seeing a lack of
fit when none exists. In any case, the interpretation is subjective and two people often
reach different conclusions when viewing the same data. Figure 10.28 shows four histograms for normally distributed data with mean = 10, sigma = 1 and sample sizes ranging from 30 to 500.
An alternative to the histogram/normal curve approach is to calculate a
"goodness-of-fit" statistic and a P-value. This gives an unambiguous acceptance
criterion; usually the researcher rejects the assumption of normality if P < 0.05.
FIGURE 10.28 Histograms with normal curves for different sample sizes.
However, it has the disadvantage of being nongraphical. This violates the three
rules of data analysis:
1. Plot the data
2. Plot the data
3. Plot the data
To avoid violating these important rules, the usual approach is to supplement
the statistical analysis with a probability plot. The probability plot is scaled so that
normally distributed data will plot as a straight line. Figure 10.29 shows the probability plots that correspond to the histograms and normal curves in Fig. 10.28. The
table below Fig. 10.29 shows that the P-values are all comfortably above 0.05, leading us to conclude that the data are reasonably close to the normal distribution.
FIGURE 10.29 Normal probability plots and goodness of fit tests.

N      P-Value
30     0.139
100    0.452
200    0.816
500    0.345
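The same graphical-plus-statistical check can be sketched in Python with scipy, shown below with simulated data in place of the samples plotted above and the Shapiro-Wilk test standing in for Minitab's goodness-of-fit statistic.

```python
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

rng = np.random.default_rng(3)
data = rng.normal(loc=10, scale=1, size=100)   # simulated sample; substitute your own data

stat, p = stats.shapiro(data)                  # goodness-of-fit test for normality
print(f"Shapiro-Wilk P-value: {p:.3f}")        # reject the normality assumption if P < 0.05

stats.probplot(data, dist="norm", plot=plt)    # normal probability plot; a straight line is good
plt.show()
```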
What to Do If the Data Aren't Normal When data are not normal, the following steps are
usually pursued:
• Do nothing-Often the histogram or probability plot shows that the normal
model fits the data well "where it counts." If the primary interest is in the tails,
for example, and the curve fits the data well there, then proceed to use the normal model despite the fact that the P-value is less than 0.05. Or if the model fits
the middle of the distribution well and that's your focus, go with it. Likewise, if
you have a very large sample you may get P-values less than 0.05 even
though the model appears to fit well everywhere. I work with clients who
routinely analyze data sets of 100,000+ records. Samples this large will flag
functionally and economically unimportant departures from normality as
"statistically significant," but it isn't worth the time or the expense to do
anything about it.
• Transform the data-It is often possible to make the data normal by performing
a mathematical operation on the data. For example, if the data distribution has
very long tails to the high side, taking the logarithm often creates data that are
normally distributed. Minitab's control chart feature offers the Box-Cox normalizing power transformation, which works with many data distributions
encountered in Six Sigma work (a small sketch of this appears after this list).
The downside to transforming is that data have
to be returned to the original measurement scale before being presented to nontechnical personnel. Some statistics can't be directly returned to their original
units; for example, if you use the log transform then you can't find the mean of
the original data by taking the inverse log of the mean of the transformed
data.
• Use averages-Averages are a special type of transformation because averages
of subgroups always tend to be normally distributed, even if the underlying
data are not. Sometimes the subgroup sizes required to achieve normality can
be quite small.
• Fit another statistical distribution-The normal distribution isn't the only game in
town. Try fitting other curves to the data, such as the Weibull or the exponential. Most statistics packages, such as Minitab, have the ability to do this. If you
have a knack for programming spreadsheets, you can use Excel's solver add-in
to evaluate the fit of several distributions.
• Use a non-parametric technique-There are statistical methods, called nonparametric methods, that don't make any assumptions about the underlying
distribution of the data. Rather than evaluating the differences of parameters
such as the mean or variance, non-parametric methods use other comparisons.
For example, if the observations are paired they may be compared directly to
see if the after is different than the before. Or the method might examine the
pattern of points above and below the median to see if the before and after values are randomly scattered in the two regions. Or ranks might be analyzed.
Non-parametric statistical methods are discussed later in this chapter.
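As noted in the Transform the data item above, here is a minimal Python sketch of a normalizing power transformation using scipy's Box-Cox routine. The skewed data are simulated, and this is scipy's implementation rather than Minitab's.

```python
import numpy as np
from scipy import stats
from scipy.special import inv_boxcox

rng = np.random.default_rng(4)
skewed = rng.lognormal(mean=0.0, sigma=0.8, size=200)   # simulated data with a long right tail

transformed, lam = stats.boxcox(skewed)                 # Box-Cox power transformation
print(f"Estimated lambda: {lam:.2f}")
print(f"Normality P-value before: {stats.shapiro(skewed)[1]:.4f}, "
      f"after: {stats.shapiro(transformed)[1]:.4f}")

# Results must be converted back to the original measurement scale before reporting
original_scale = inv_boxcox(transformed, lam)
```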
Equal Variance Assumption
Many statistical techniques assume equal variances. ANOVA tests the hypothesis that
the means are equal, not that variances are equal. In addition to assuming normality,
ANOVA assumes that variances are equal for each treatment. Models fitted by regression analysis are evaluated partly by looking for equal variances of residuals for different levels of the Xs and Y.
Minitab's test for equal variances is found in Stat > ANOVA > Test for Equal
Variances. You need a column containing the data and one or more columns specifying the factor level for each data point. If the data have already passed the normality test, use the P-value from Bartlett's test to test the equal variances assumption.
Otherwise, use the P-value from Levene's test. The test shown in Fig. 10.30 involved
five factor levels and Minitab shows a confidence interval bar for sigma of each of
the five samples; the tick mark in the center of the bar represents the sample sigma.
These are the data from the sample of 100 analyzed earlier and found to be normally
distributed, so Bartlett's test can be used. The P-value from Bartlett's test is 0.182,
indicating that we can expect this much variability from populations with equal
variances 18.2% of the time. Since this is greater than 5%, we fail to reject the null
hypothesis of equal variances. Had the data not been normally distributed we
would've used Levene's test, which has a P-value of 0.243 and leads to the same
conclusion.
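Both tests are also available in scipy; the sketch below uses simulated data for five factor levels rather than the sample of 100 discussed above.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
groups = [rng.normal(10, 1, 20) for _ in range(5)]   # simulated data for five factor levels

b_stat, b_p = stats.bartlett(*groups)   # use when the data have passed a normality test
l_stat, l_p = stats.levene(*groups)     # more robust choice when normality is in doubt
print(f"Bartlett P-value: {b_p:.3f}")
print(f"Levene P-value:   {l_p:.3f}")   # fail to reject equal variances if P > 0.05
```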
Linear Model Assumption
Many types of associations are nonlinear. For example, over a given range of x values,
y might increase, and for other x values, y might decrease. This curvilinear relationship is
shown in Fig. 10.31.
Here we see that y increases when x increases and is less than 1, and decreases as x
increases when x is greater than 1. Curvilinear relationships are valuable in the design
of robust systems. A wide variety of processes produces such relationships.
It is often helpful to convert these nonlinear forms to linear form for analysis using
standard computer programs or scientific calculators. Several such transformations are
shown in Table 10.11.
FIGURE 10.30 Output from Minitab's test for equal variances. The plot shows 95% confidence intervals for sigma at each of the five factor levels, along with Bartlett's test (test statistic 6.233, P-value 0.182) and Levene's test (test statistic 1.391, P-value 0.243).
FIGURE 10.31 Scatter diagram of a curvilinear relationship.
If the Relationship Is      Plot the Transformed      Convert Straight-Line Constants
of the Form:                Variables                 (b0 and b1) to Original Constants
                            YT         XT             b0         b1
Y = a + b/X                 Y          1/X            a          b
Y = 1/(a + bX)              1/Y        X              a          b
Y = X/(a + bX)              X/Y        X              a          b
Y = ab^X                    log Y      X              log a      log b
Y = ae^(bX)                 log Y      X              log a      b log e
Y = aX^b                    log Y      log X          log a      b
Y = a + bX^n (n known)      Y          X^n            a          b

(From Natrella (1963), pp. 5-31)
TABLE 10.11 Some Linearizing Transformations
Fit the straight line YT = b0 + b1XT using the usual linear regression procedures (see
below). In all formulas, substitute YT for Y and XT for X. A simple method for selecting
a transformation is to program each transformation into a spreadsheet and run
regressions using every transformation. Then select the transformation that gives the
largest value for the statistic R².
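The spreadsheet approach can also be sketched in Python. The example below simulates data that follow Y = aX^b, fits a straight line to a few of the transformed pairs from Table 10.11 with scipy, and reports R² for each; the candidate set and the simulated data are illustrative assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
x = np.linspace(1, 10, 30)
y = 2.0 * x ** 1.5 * rng.lognormal(sigma=0.05, size=x.size)   # simulated Y = aX^b data

# A few candidate transformations from Table 10.11: form -> (YT, XT)
candidates = {
    "Y = a + b/X": (y, 1 / x),
    "Y = ab^X":    (np.log(y), x),
    "Y = aX^b":    (np.log(y), np.log(x)),
}
for form, (yt, xt) in candidates.items():
    fit = stats.linregress(xt, yt)
    print(f"{form:12s}  R^2 = {fit.rvalue ** 2:.4f}")   # keep the form with the largest R^2
```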
There are other ways of analyzing nonlinear responses. One common method is to
break the response into segments that are piecewise linear, and then to analyze each
piece separately. For example, in Fig. 10.31 Y is roughly linear and increasing over the
range 0 < x < 1 and linear and decreasing over the range x > 1. Of course, if the analyst
has access to powerful statistical software, nonlinear forms can be analyzed directly.
Analysis of Categorical Data
Making Comparisons Using Chi-Square Tests
In Six Sigma, there are many instances when the analyst wants to compare the percentage of items distributed among several categories. The things might be operators, methods, materials, or any other grouping of interest. From each of the groups a sample is
taken, evaluated, and placed into one of several categories (e.g., high quality, marginal
quality, reject quality). The results can be presented as a table with m rows representing
the groups of interest and k columns representing the categories. Such tables can be
analyzed to answer the question "Do the groups differ with regard to the proportion of
items in the categories?" The chi-square statistic can be used for this purpose.
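A minimal Python sketch of such a comparison is shown below, using scipy's chi-square test on a made-up 3 x 3 table (three groups of interest by three quality categories).

```python
import numpy as np
from scipy import stats

# Hypothetical counts: rows are three groups, columns are high / marginal / reject quality
table = np.array([
    [40,  8, 2],
    [35, 12, 3],
    [30, 15, 5],
])

chi2, p, dof, expected = stats.chi2_contingency(table)
print(f"Chi-square = {chi2:.2f}, dof = {dof}, P-value = {p:.3f}")
# A small P-value indicates the groups differ in how items fall into the categories
```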