
Attention to multiple forms of error can improve the effectiveness of surveys in evaluation.

Sources of Survey Error: Implications for Evaluation Studies

Marc T. Braverman
Surveys constitute one of the most important data collection tools available in evaluation. For example, sample surveys are used to gauge public opinion in policy research and are used with specialized target populations in needs assessment studies. Self-administered survey questionnaires are used in classroom settings to assess differences between program participants and control groups. Structured survey interviews are used to assess the effectiveness of clinical treatments. Surveys are used in longitudinal studies to track long-term change in targeted populations. Although these examples differ in many ways, they all share a focus on direct questioning, in a variety of manifestations, to learn about people: their behaviors, their attitudes, and other aspects of their lives.
Much of the focus in this volume is on sample surveys, which use systematic procedures for selecting probability samples from populations. Program evaluators probably use sample surveys a good deal more now than in years past, due to recent developments that often identify entire communities as program targets. These developments include greater attention to media campaigns as a form of social program (for example, Flay, Kessler, and Utts, 1991) and the current trend toward comprehensive communitywide interventions (for example, Jackson, Altman, Howard-Pitney, and Farquhar, 1989). Randomized population surveys have been used in the evaluations of several nationally prominent community health intervention projects, such as the Minnesota Heart Health Program (Luepker and others, 1994) and the National Cancer Institute's COMMIT trial (COMMIT Research Group, 1995). Nevertheless, as Henry points out in Chapter One, the potential of population surveys in evaluation research remains largely untapped, and he suggests that surveys can be used to provide context for evaluation recommendations and to communicate information to the public.

I thank Robert Groves for his valuable and insightful comments on an earlier draft of this chapter.
Many program evaluations, particularly those that involve experimental or quasi-experimental designs, do not use population surveys, and their selection criteria for including individuals as participants do not incorporate the technical issues related to sampling from a larger population. However, all evaluations that use questionnaires in any form can benefit from the current research regarding measurement errors, that is, errors of observation that involve the interviewer, respondent, instrument, or mode. My aim in this chapter is to provide a brief overview of some recent advances in survey research, within the context of different kinds of survey error and with particular attention to the applicability of the findings for program evaluation. I have drawn on my own area of evaluation work for examples; thus, most of them concern health promotion programs and educational programs.
Components of Survey Error
Groves (1989) presents a schematic model of survey error attributable to survey design features, which builds on formulations of total survey error by Kish (1965) and Andersen, Kasper, Frankel, and Associates (1979). In these models, the mean square error (the total error associated with a survey statistic) consists of two major families of subcomponents: error due to statistical variance and error due to bias. Variance error arises from the unavoidable variability in the target statistic across samples of the population units or, more generally, from any aspect of the survey design that will vary over replications. Thus, variability will be found across individuals within the population (which produces sampling error), across responses from a single individual to the same question or set of questions (which produces levels of test-retest reliability), across test items sampled from a content domain (which produces levels of parallel forms reliability), and so on. These error components have an expected value of zero; that is, they are unbiased with respect to the population parameter being estimated. By contrast, bias refers to fixed error components that are specific to a particular survey design and lead to sample statistics that can be expected to differ from their respective population parameters by some nonzero, though typically unknown, amount.
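In symbols (my notation, not the chapter's, though consistent with the Kish and Groves formulations), the decomposition for a survey statistic $\hat{\theta}$ estimating a population parameter $\theta$ is:

$$
\mathrm{MSE}(\hat{\theta}) \;=\; E\!\left[(\hat{\theta}-\theta)^2\right] \;=\; \underbrace{\mathrm{Var}(\hat{\theta})}_{\text{variance error}} \;+\; \underbrace{\left(E[\hat{\theta}]-\theta\right)^2}_{\text{squared bias}}
$$

The first term collects the components that vary over replications of the design; the second captures the fixed, design-specific deviation described above.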
Researchers seek to minimize variance error and to eliminate bias error within the constraints of acceptable research costs (see Groves, 1989). In addition, researchers typically seek also to assess the extent of both kinds of error in their data and to calculate what effects these may have on the statistical estimates. Postsurvey procedures can then be applied to compensate for the estimated levels of error.
Within the respective sets of error due to variance and error due to bias, Groves (1989) distinguishes further between errors of nonobservation and errors of observation. The first category includes errors due to coverage, nonresponse, and sampling. The second category includes errors due to interviewers, respondents, instruments, and modes. This model frames the discussion that follows.
Errors of Nonobservation
Broadly speaking, errors of nonobservation are due to the fact that measurements on some eligible persons have not been made. This occurs when some persons are not identified prior to sampling (coverage error), when some persons refuse to be interviewed or cannot be reached (nonresponse error), and when, as is inherent in the concept of sampling, some persons are excluded from the selected sample (sampling error).
Coverage Errors. Coverage errors arise because units of the target population that should be eligible for the survey have not been included in the sampling frame. To illustrate: one of the most widely appreciated examples of coverage error is the fact that general population surveys conducted by telephone exclude households that do not have telephone service. This is a particular concern in surveys of populations known to have relatively high proportions of nontelephone households, including low-income populations and rural populations, among others. Since this exclusion is a fixed component of any telephone survey design, if nontelephone households differ from telephone households on the survey variables being measured (as they frequently do), the resulting error will constitute a bias in the design rather than a component of statistical variance. Poststratification adjustments on demographic variables such as age, sex, and educational attainment are often made to compensate for biases in coverage.
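As a concrete illustration of the poststratification adjustment just mentioned, the following minimal sketch in Python (the demographic cells, respondent counts, and benchmark shares are all invented for illustration) rescales weights so that the weighted sample matches known population proportions:

```python
# Minimal poststratification sketch; all cells, counts, and benchmark
# shares below are hypothetical.
from collections import Counter

# One demographic cell (age group / sex) per respondent.
respondents = ["18-34/F", "18-34/F", "35-64/M", "35-64/M", "35-64/M", "65+/F"]

# Known population shares for the same cells (e.g., census benchmarks).
population_share = {"18-34/F": 0.30, "35-64/M": 0.45, "65+/F": 0.25}

n = len(respondents)
sample_share = {cell: count / n for cell, count in Counter(respondents).items()}

# Poststratification weight for each cell: population share / sample share.
weights = {cell: population_share[cell] / sample_share[cell]
           for cell in population_share}

for cell, weight in sorted(weights.items()):
    print(f"{cell}: weight = {weight:.3f}")
```

Cells that are underrepresented in the sample relative to the benchmark receive weights above 1.0, and overrepresented cells receive weights below 1.0; the adjustment corrects bias only to the extent that the survey variables are related to the weighting variables.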
When lists or registers are used as sampling frames, a different problem arises: ineligible target units may be included. In a general population mail survey that was part of a community-based health program evaluation in Australia and used electoral registers as the frame, Day, Dunt, and Day (1995) found that telephone and personal visit contacts with nonrespondents increased the proportion of units deemed ineligible from 8 percent to 12 percent of the register entries. Attempting to obtain an accurate estimate of ineligibles is important for obtaining accurate calculations of response rates, which otherwise would be artificially low.
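The arithmetic behind this point can be made explicit. A brief sketch (in Python, with invented counts; these are not Day, Dunt, and Day's actual figures) shows how reclassifying frame entries as ineligible raises the computed response rate:

```python
# Response rates rise when frame entries are correctly reclassified as
# ineligible; the counts below are hypothetical, not the study's data.
frame_entries = 1000   # names drawn from the electoral register
completed = 600        # questionnaires returned

for ineligible_rate in (0.08, 0.12):   # 8% vs. 12% of entries ineligible
    eligible = frame_entries * (1 - ineligible_rate)
    print(f"ineligibles at {ineligible_rate:.0%}: "
          f"response rate = {completed / eligible:.1%}")
```

With these invented numbers, the computed rate moves from about 65 percent to about 68 percent solely because the denominator shrinks.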
Nonresponse Errors. Nonresponse errors occur when individuals selected from the frame are not ultimately represented in the data set because they refuse to participate, cannot be reached, or are unable to respond. Survey researchers have expressed concern that nonresponse is becoming a greater problem for population surveys than it has been in the past (for example, Bradburn, 1992). Because of the obviously high potential for bias that results when a large proportion of the selected sample is noncooperative, a great deal of research has been conducted on techniques for minimizing nonresponse and for measuring and adjusting for its effects. Couper and Groves (see Chapter Five) present several theoretical ideas that address why contacted individuals might either grant or refuse a survey request, and they examine those perspectives in light of data from the U.S. census and several national surveys.
Response Rate Issues. Researchers have long been interested in refinements of field techniques to improve response rates. This interest may be greatest in the case of mail questionnaires, since difficulty in obtaining high response rates can limit the attractiveness of this survey mode as compared to telephone or personal interview modes. The most comprehensive set of recommendations for maximizing mail response rates is probably Dillman's (1978) Total Design Method. In addition, numerous research reviews on factors affecting mail survey response rates have appeared in recent years (for example, Church, 1993; Fox, Crask, and Kim, 1988; Yammarino, Skinner, and Childers, 1991). Factors consistently found to predict higher response rates include repeated contacts (preliminary notification and follow-ups), monetary incentives, inclusion of a postage-paid return envelope, certain types of cover letter appeals, and a questionnaire length of four pages or less (Yammarino, Skinner, and Childers, 1991). For survey projects that use household interview and telephone interview modes, detailed practical advice on maximizing response rates can be found in Morton-Williams (1993).
Adjusting for Nonresponse Bias. Apart from maximizing respondent cooperation, researchers need methods to estimate the degree of nonresponse bias and make adjustments in their sample statistics. Groves (1989) provides technical treatments of statistical adjustment approaches, including weighting procedures to adjust for nonresponding individuals and households (a minimal sketch of the weighting idea appears after this paragraph), and imputation procedures to estimate values of missing items. One perennial problem for survey researchers is determining the source of information that will serve as the basis for nonresponse estimates. Lin and Schaeffer (1995) analyze two competing hypotheses about survey nonparticipants that are often used to guide estimation measures. The first assumes that nonparticipants are similar to late responders and, in fact, fall at the end of a "continuum of resistance"; thus, estimates of nonresponse bias are derived from the data collected from late responders. The second hypothesis assumes that there are distinct categories of nonparticipants, such as refusers and hard-to-contact individuals; estimates for refusers are derived from the number of temporary refusers (those who finally did participate), and estimates for the hard to contact are derived from participants who were reached only after several callback attempts. However, Lin and Schaeffer conclude that neither model is entirely sufficient to provide a suitable estimate of bias.
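To make the weighting procedure concrete, here is a minimal weighting-class sketch in Python (my illustration, with invented counts; not a procedure taken from Groves or from Lin and Schaeffer). Respondents in classes with lower response rates receive larger weights so that they stand in for similar nonrespondents:

```python
# Weighting-class nonresponse adjustment with hypothetical counts.
# Each respondent's base weight is inflated by the inverse of the
# response rate in his or her class, so respondents stand in for
# nonrespondents who resemble them on the classing variable.
sampled = {"urban": 500, "rural": 300}     # units selected from the frame
responded = {"urban": 400, "rural": 150}   # completed interviews

for cls in sampled:
    rate = responded[cls] / sampled[cls]
    print(f"{cls}: response rate {rate:.0%}, "
          f"nonresponse adjustment factor {1 / rate:.2f}")
```

The adjustment removes bias only insofar as respondents and nonrespondents within a class are actually alike, which is exactly the assumption the competing hypotheses above call into question.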
Access to Respondents. In addition to the behavioral trends regarding survey refusals, new obstacles to gaining access to respondents have resulted from a number of technological developments. The most significant of these is probably the telephone answering machine. Some researchers have concluded that answering machines do not, as yet, constitute a serious problem for survey research (Oldendick and Link, 1994; Xu, Bates, and Schweitzer, 1993). But
the possibility of serious bias is introduced if answering machines restrict access to some segments of the population in comparison with others. This was, in fact, found by Oldendick and Link, who reported that households using answering machines to screen unwanted calls tend to be characterized by higher family income and higher levels of education. However, Xu, Bates, and Schweitzer found that households with answering machines were more likely to complete an interview. They also found that leaving messages resulted in higher ultimate participation rates for households with machines, perhaps by performing a function similar to the use of advance letters. Most researchers studying this topic believe that answering machines may well become a greater threat to survey research in the future, due to sharp increases in recent years in the number of households owning machines and using them to screen calls.
Sampling Errors. Sampling errors occur because the elements of the frame population (that is, the respondents) differ on the variable of interest, and different samples selected according to the sampling design will include different combinations of those population elements. If other kinds of error (coverage, nonresponse, and observational errors) do not exist, the difference between any given sample statistic and the statistic's true population value will constitute sampling error. As Groves (1989, chap. 6) states, most statistical work on sampling error concerns errors of variance. Certain sampling designs (for example, those involving stratification) are more efficient than others (for example, simple random sampling) in reducing error. However, the danger of sampling bias also exists when selection procedures are incorrectly applied in systematic ways or when some individuals in the sampling frame have a zero chance of selection, as occurs in nonprobability or quota samples. Henry (1990) provides a practical overview of sampling approaches that can be used in research planning.
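The efficiency gain from stratification noted above can be demonstrated with a small simulation, sketched here in Python on an invented two-stratum population (not an example from Henry or Groves). Proportional-allocation stratified sampling removes the between-stratum component from the sampling variance of the estimated mean:

```python
# Compare the sampling variability of an estimated mean under simple
# random sampling (SRS) and proportionally allocated stratified
# sampling; the two-stratum population below is invented.
import random

random.seed(1)
strata = {"A": [random.gauss(10, 2) for _ in range(800)],
          "B": [random.gauss(20, 2) for _ in range(200)]}
population = strata["A"] + strata["B"]
n = 100  # total sample size

def srs_mean():
    return sum(random.sample(population, n)) / n

def stratified_mean():
    # Proportional allocation: each stratum sampled at the same rate,
    # then stratum means combined with population-share weights.
    total = len(population)
    estimate = 0.0
    for units in strata.values():
        k = round(n * len(units) / total)
        estimate += (len(units) / total) * (sum(random.sample(units, k)) / k)
    return estimate

def sd(values):
    m = sum(values) / len(values)
    return (sum((v - m) ** 2 for v in values) / len(values)) ** 0.5

print("SRS standard error:       ", round(sd([srs_mean() for _ in range(2000)]), 3))
print("Stratified standard error:", round(sd([stratified_mean() for _ in range(2000)]), 3))
```

On this invented population the stratified standard error is markedly smaller, because most of the population variance lies between the two strata and stratification eliminates that component from the sampling error.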
Sampling error is probably the most well studied of all of the different forms of error, and a good deal of effort is expended in survey designs to control it. It has received this attention because it is susceptible to control through statistical estimation procedures, and design alternatives can be developed that yield projections of both precision and cost (Groves, 1989). For example, as Groves notes, almost all major surveys use sampling designs that include complex procedures to reduce either sampling error or research costs (procedures such as stratification, clustering, and unequal probabilities of selection). The other forms of survey error are not so easily predicted, measured, or controlled, but in recent years, researchers have begun to energetically address these other design problems that are more resistant to solution.
Errors of Observation
Errors of observation, also referred to by Groves (1989, p. 295) as "measurement errors," are due to the fact that recorded measurements do not reflect the true values of the variables they represent. Such discrepancies are caused by factors residing in the interviewer, the respondent, the wording or organization of the instrument, and the mode of survey administration.
Interviewer Errors. Interviewer errors are due to the effects of individual interviewers in telephone and face-to-face surveys. Stokes and Yeh (1988) specify four reasons why such interviewer effects may occur. First, interviewers may be careless or negligent in following directions and protocols. Second, they may have vocal characteristics or personal mannerisms that affect respondents' answers. Third, their demographic characteristics, such as race, age, and sex, may affect the answers of some respondents. Indeed, a good deal of investigation has focused on the potential biases arising from fixed interviewer characteristics such as gender (Kane and Macaulay, 1993) and vocal characteristics (Oksenberg and Cannell, 1988). Fourth, interviewers may vary in their production of nonresponse, in terms of both overall nonresponse levels and types of persons who refuse their requests to participate. The first three reasons are types of measurement errors; the fourth is an indirect influence of interviewers on total error through their contributions to patterns of nonobservation.
The issue of interviewer-related error has drawn considerable attention from survey researchers, but apart from pointing out the need for thorough interviewer preparation, the evaluation literature has virtually ignored the topic. (One welcome exception is provided by Catania and others, 1990, who, within the context of AIDS prevention programs, discuss the evidence for face-to-face interviewer gender effects on reports of sexual behavior.) There are probably several reasons for this lack of attention, including the facts that few program evaluation projects use a large number of interviewers for data collection and that the size of interviewer effects in a data set is very difficult to estimate if interviewers have not been treated as a randomly assigned independent variable.
Respondent Errors. Respondent errors are due to processes or characteristics inherent in the respondent. In unbiased form, they consist of inconsistency or unreliability in responding. Biases in respondent error can be introduced by such factors as deliberate misreporting, the activation of response sets (such as social desirability; see Chapter Four by Dillman and his colleagues), motivational states, and characteristic kinds of memory retrieval errors for a particular task.
Much of the research on respondent error is informed by cognitive psychological models of the processes that come into play when people answer questions (for example, Sudman, Bradburn, and Schwarz, 1996). The attempt to link survey design with findings from cognitive psychology has been one of the most robust and important directions in survey research since the early 1980s. (See Jobe and Mingay, 1991, for a historical perspective.) Only a few examples can be cited here.
One line of investigation concerns human memory for autobiographical information (Schwarz and Sudman, 1994). For example, Blair and Ganesh (1991) investigated the differences among three different ways to ask about past events: absolute frequency measures ("During the past year, how many times did you charge a purchase to your account?"), rate of frequency measures ("During the past year, about how often did you charge purchases to your account?"), and interval-based frequency measures ("When was the last time you charged a purchase to your account?"). Although these variations on a single question might be expected to yield highly comparable results, results showed that respondents provide substantially higher frequency estimates when answering interval-based measures. One reason for the varied results may be that the different formats are differentially susceptible to respondents' misjudging the time of occurrence of an event, a kind of error called telescoping.
Pearson, Ross, and Dawes (1992) cite studies indicating that people's recall of their own personal characteristics at specified times in the past is distorted to conform with their implicit theories about consistency or change in the domain of interest. Thus, when their current responses are compared with their previous responses on a topic, respondents often do not accurately remember their previous status with regard to attitudes (for example, evaluations of dating partners), behaviors (substance use), and pain (intensity ratings of chronic headaches). Pearson, Ross, and Dawes argue that these differences are based on memory distortions rather than attempts to mislead. Respondents inferentially construct information about their past using their implicit theories, which are often inaccurate, as clues.
In a final example of cognitively based lines of investigation, Krosnick (1991) has examined the factors that might determine the amount of cognitive effort respondents will expend in answering survey questions. In Chapter Three, he and his colleagues present findings from a series of empirical investigations that seek to identify possible determinants of respondents' levels of cognitive effort.
The topic of human cognition in relation to survey methods really stands at the intersection between respondent error and instrument error (discussed in the next section) because it relates the question design task to characteristic patterns and limitations of human thought. As the cited research examples show, these errors stem from particular forms of response to particular manipulations of question presentation.
Instrument Errors. Instrument errors are related to uncertainties in the comprehension and the attributed meaning of questions and are implicated in levels of instrument reliability and validity. In its unbiased form, instrument error will arise because any single scale or instrument can include only a sample of items from the universe of theoretically possible items tapping a skill, attitude, body of knowledge, or other domain. Instrument biases are due to vagaries of question wording, question structure, and question sequence. In mail surveys, instrument error can also be related to options for the design and format of the questionnaire. Several excellent texts exist to provide practical guidance on question wording and instrument construction (for example, Dillman, 1978; Sudman and Bradburn, 1982).
Sometimes, adjustments can be made in the data analysis phase of the research by removing troublesome items from scales in post hoc procedures. Chapter Six describes how the Rasch model can be used in data analysis to identify faulty patterns of obtained response, leading to refinement of the survey measures.
Word Meanings. Groves (1989) reviews research indicating that a respondent will answer a question even when he or she perceives it as ambiguous or is unfamiliar with essential terms. The researcher is often unaware of these interpretive assumptions on the part of the respondent. One must, of course, strive to minimize these possibilities for respondent misunderstanding in the design and piloting phases of the research. Numerous examples of systematic inquiry into question wording are provided by Schuman and Presser (1981), among many other sources.
Context Effects. Context effects (Schwarz and Sudman, 1992) refer to response errors created by variations in the order in which response options are presented on individual closed-ended items or by variations in the order in which the survey questions are presented. Dillman and his colleagues (Chapter Four) provide a fuller description of several forms of response order effects and question order effects within their discussion of survey mode differences. In addition, Krosnick and his colleagues (Chapter Three) describe several forms of response order effects and how they might be related to respondents' cognitive effort.
Multiple Languages. The meaning that respondents will ascribe to words and phrasing in surveys is clearly a complex and subtle area that has sparked prodigious research. One can imagine, then, that when two or more languages are involved, the additional linguistic and cross-cultural complexities create enormous potential for new forms of meaning-related error to accrue. McKay and her colleagues (Chapter Seven) describe the experiences of several large-scale survey projects in producing and implementing survey translations. However, survey translation is an area that has, as yet, seen very little systematic research (see Marin and Marin, 1991, for a discussion of some key issues and concepts). Given the growing linguistic diversity of industrialized societies all over the world, survey translation will undoubtedly become a more important topic of research in years to come.
Sensitive Items. Sensitive items, that is, items on which respondents are likely to feel a degree of personal threat (see Lee, 1993), pose a particular problem in that the likelihood of bias is high. Recognition of this possibility is particularly important in the evaluation context because many programs, especially those for adolescents, deal with sensitive topics such as drug use, sexuality, or risk behavior. Data quality on sensitive questions is dependent on a number of design features, including question wording, timing of data collection, choice of mode, interviewer decisions, and assurances of confidentiality. Singer, von Thurn, and Miller (1995), in a meta-analysis of studies that involved confidentiality assurances, found that such assurances improve response only when the survey items are sensitive, in which case they significantly increase both response rates and data quality. Mail surveys on sensitive
topics sometimes have the option of guaranteeing complete anonymity to respondents through the removal of all respondent identification information (a condition that is obviously more difficult to promise in the telephone mode and essentially impossible in the face-to-face mode), but guarantees of anonymity conflict with the researchers' need to identify nonrespondents for the purpose of targeting follow-up contacts. Biggar and Melbye (1992), in an anonymous sample survey of sexual activity and AIDS attitudes among residents of Copenhagen, Denmark, found that inclusion of a separate, postage-paid "survey completed" card (see Dillman, 1978) did not lower the initial response rate, nor did it appear to influence reports of sexual behavior or attitudes. By making follow-up mailings possible, this procedure enabled the survey project to increase its response rate from 45.8 percent after one mailing to 72.8 percent after three mailings.
A particular concern regarding respondent bias on sensitive items occurs in evaluations using experimental designs. In addition to recognizing the problems that bias might cause for estimation of outcomes, evaluators should seek to avoid differential bias across treatment and control conditions because this bias can seriously affect interpretation of program effects. For example, students who participate in a drug prevention education program may become sensitized to reporting drug use. If they have enjoyed the program or feel close to the presenters, they might be inclined to underreport recent drug use at posttest due to a desire to please program staff or to personal embarrassment at having violated newly established classroom norms. If students in the control group feel no such compunction, then differential bias will occur, and the program will appear more effective than it is. Alternatively, all groups of students might underreport drug use at pretest due to fears about breach of confidentiality. After going through the program, the experimental group students may feel more trusting and therefore report accurate drug use levels while control group students maintain their previous level of underreporting. This second scenario would create bias in the other direction and make the program appear artificially ineffective. Clearly, extra care in developing questionnaires on sensitive topics is strongly warranted.
Mode Errors. Mode errors are created by differences in the circumstances under which the data are collected: personal interview, telephone interview, or self-administered questionnaire. Dillman and his colleagues (Chapter Four) analyze frequently reported telephone versus mail mode effects and describe how these might stem from fundamental differences between the two formats. They also describe the extent of the research evidence that supports the existence of the various effects.
As Dillman and his colleagues state, one reason for the growing research interest in this topic is that many designs mix formats within the same respondent sample. An illustrative example from the evaluation literature is the study cited earlier by Day, Dunt, and Day (1995), which used telephone contacts and household visits as two distinct levels of follow-up to a general population mail survey. Another situation that occurs frequently in evaluation studies entails surveying different population members under different conditions (for example, employing group administrations to youth at school and mailings to their parents at home) with the intent of collecting related or parallel kinds of survey information. In such research, potential for differential bias will exist due to the variations in setting.
Mode effects appear to be particularly important when the survey questions are sensitive. Each of two recent studies that systematically compared reports on substance use questions across telephone interviews, face-to-face interviews, and self-administered (nonmail) questionnaires found the strongest evidence for underreporting to exist for the telephone mode (Aquilino, 1994; Fendrich and Vaughn, 1994). Aquilino suggests that this finding may be due to respondent confidentiality concerns. The telephone mode offers neither the opportunity for building trust inherent in the face-to-face mode nor the response anonymity possible in the self-administered mode.
Conclusion
This review has been intended to illustrate that a total error approach to survey design strives for a balance of design features that can maximize the validity of the survey data. Undue attention to reducing one kind of error will not ultimately prove beneficial if other forms of error are thereby increased. For example, a researcher may attempt to reduce nonresponse by providing high monetary incentives for survey participation. But even if this strategy results in a higher overall response rate, if it also differentially boosts the representation of different subsets of the population (without the application of compensatory weighting schemes during data analysis) or if it also introduces the possibility of increased measurement bias (perhaps by making respondents eager to please the researchers), the total error will not be reduced, and the strategy will be ill advised. Comparable examples could be cited for other combinations of error as well.
Research pertinent to the prediction and control of survey errors is accumulating at a very rapid pace. Careful attention to this growing body of knowledge will help evaluators to increase substantially the precision of their data. On a larger scale, these methodological advances can contribute to the development of powerful new applications of surveys to evaluation settings.
Note

1. Reference is often made to the alternate typology that identifies the two major categories of error as sampling and nonsampling errors (described by Kish, 1965, and Andersen, Kasper, Frankel, and Associates, 1979). I find Groves's (1989) model to be conceptually clearer because, as he points out, the term nonsampling error is a loosely defined category that joins together such very disparate elements as nonresponse error and measurement error. Groves also notes that although his framework is a comprehensive model of survey design features, it is not a model of total survey error because it does not address errors that occur after data have been collected, most notably coding and processing errors.
References
Andersen, R., Kasper, J., Frankel, M. R., and Associates. Total Survey Error: Applications to Improve Health Surveys. San Francisco: Jossey-Bass, 1979.
Aquilino, W. S. "Interview Mode Effects in Surveys of Drug and Alcohol Use: A Field Experiment." Public Opinion Quarterly, 1994, 58, 210-240.
Biggar, R. J., and Melbye, M. "Responses to Anonymous Questionnaires Concerning Sexual Behavior: A Method to Estimate Potential Biases." American Journal of Public Health, 1992, 82, 1506-1512.
Blair, E. A., and Ganesh, G. K. "Characteristics of Interval-Based Estimates of Autobiographical Frequencies." Applied Cognitive Psychology, 1991, 5, 237-250.
Bradburn, N. M. "A Response to the Nonresponse Problem." Public Opinion Quarterly, 1992, 56, 391-397.
Catania, J. A., Gibson, D. R., Marin, B., Coates, T. J., and Greenblatt, R. M. "Response Bias in Assessing Sexual Behaviors Relevant to HIV Transmission." Evaluation and Program Planning, 1990, 13, 19-29.
Church, A. H. "Estimating the Effect of Incentives on Mail Survey Response Rates: A Meta-Analysis." Public Opinion Quarterly, 1993, 57, 62-79.
COMMIT Research Group. "Community Intervention Trial for Smoking Cessation (COMMIT): II. Changes in Adult Cigarette Smoking Prevalence." American Journal of Public Health, 1995, 85, 193-200.
Day, N. A., Dunt, D. R., and Day, S. "Maximizing Response to Surveys in Health Program Evaluation at Minimum Cost Using Multiple Methods: Mail, Telephone, and Visit." Evaluation Review, 1995, 19, 436-450.
Dillman, D. A. Mail and Telephone Surveys: The Total Design Method. New York: Wiley, 1978.
Fendrich, M., and Vaughn, C. M. "Diminished Lifetime Substance Use over Time: An Inquiry into Differential Reporting." Public Opinion Quarterly, 1994, 58, 96-123.
Flay, B. R., Kessler, R. C., and Utts, J. M. "Evaluating Media Campaigns." In S. L. Coyle, R. F. Boruch, and C. F. Turner (eds.), Evaluating AIDS Prevention Programs. Washington, D.C.: National Academy Press, 1991.
Fox, R. J., Crask, M. R., and Kim, J. "Mail Survey Response Rate: A Meta-Analysis of Selected Techniques for Inducing Response." Public Opinion Quarterly, 1988, 52, 467-491.
Groves, R. M. Survey Errors and Survey Costs. New York: Wiley, 1989.
Henry, G. T. Practical Sampling. Newbury Park, Calif.: Sage, 1990.
Jackson, J. C., Altman, D. G., Howard-Pitney, B., and Farquhar, J. W. "Evaluating Community-Level Health Promotion and Disease Prevention Interventions." In M. T. Braverman (ed.), Evaluating Health Promotion Programs. New Directions for Program Evaluation, no. 43. San Francisco: Jossey-Bass, 1989.
Jobe, J. B., and Mingay, D. J. "Cognition and Survey Measurement: History and Overview." Applied Cognitive Psychology, 1991, 5, 175-192.
Kane, E. W., and Macaulay, L. J. "Interviewer Gender and Gender Attitudes." Public Opinion Quarterly, 1993, 57, 1-28.
Kish, L. Survey Sampling. New York: Wiley, 1965.
Krosnick, J. A. "Response Strategies for Coping with the Cognitive Demands of Attitude Measures in Surveys." Applied Cognitive Psychology, 1991, 5, 213-236.
Lee, R. M. Doing Research on Sensitive Topics. Newbury Park, Calif.: Sage, 1993.
Lin, I. F., and Schaeffer, N. C. "Using Survey Participants to Estimate the Impact of Nonparticipation." Public Opinion Quarterly, 1995, 59, 236-258.
Luepker, R. V., Murray, D. M., Jacobs, D. R., Mittelmark, M. B., Bracht, N., Carlaw, R., and others. "Community Education for Cardiovascular Disease Prevention: Risk Factor Changes in the Minnesota Heart Health Program." American Journal of Public Health, 1994, 84, 1383-1393.
Marin, G., and Marin, B. V. Research with Hispanic Populations. Newbury Park, Calif.: Sage, 1991.
Morton-Williams, J. Interviewer Approaches. Brookfield, Vt.: Dartmouth, 1993.
Oksenberg, L., and Cannell, C. "Effects of Interviewer Vocal Characteristics on Nonresponse." In R. M. Groves, P. P. Biemer, L. E. Lyberg, J. T. Massey, W. L. Nicholls, and J. Waksberg (eds.), Telephone Survey Methodology. New York: Wiley, 1988.
Oldendick, R. W., and Link, M. W. "The Answering Machine Generation: Who Are They and What Problem Do They Pose for Survey Research?" Public Opinion Quarterly, 1994, 58.
Pearson, R. W., Ross, M., and Dawes, R. M. "Personal Recall and the Limits of Retrospective Questions in Surveys." In J. M. Tanur (ed.), Questions About Questions: Inquiries into the Cognitive Bases of Surveys. New York: Russell Sage, 1992.
Schuman, H., and Presser, S. Questions and Answers in Attitude Surveys: Experiments on Question Form, Wording, and Context. San Diego: Academic Press, 1981.
Schwarz, N., and Sudman, S. (eds.). Context Effects in Social and Psychological Research. New York: Springer-Verlag, 1992.
Schwarz, N., and Sudman, S. (eds.). Autobiographical Memory and the Validity of Retrospective Reports. New York: Springer-Verlag, 1994.
Singer, E., von Thurn, D. R., and Miller, E. R. "Confidentiality Assurances and Response: A Review of the Experimental Literature." Public Opinion Quarterly, 1995, 59, 66-77.
Stokes, L. A., and Yeh, M. "Searching for Causes of Interviewer Effects in Telephone Surveys." In R. M. Groves, P. P. Biemer, L. E. Lyberg, J. T. Massey, W. L. Nicholls, and J. Waksberg (eds.), Telephone Survey Methodology. New York: Wiley, 1988.
Sudman, S., and Bradburn, N. M. Asking Questions: A Practical Guide to Questionnaire Design. San Francisco: Jossey-Bass, 1982.
Sudman, S., Bradburn, N. M., and Schwarz, N. Thinking About Answers: The Application of Cognitive Processes to Survey Methodology. San Francisco: Jossey-Bass, 1996.
Xu, M., Bates, B. J., and Schweitzer, J. C. "The Impact of Messages on Survey Participation in Answering Machine Households." Public Opinion Quarterly, 1993, 57, 232-237.
Yammarino, F. J., Skinner, S. J., and Childers, T. L. "Understanding Mail Survey Response Behavior: A Meta-Analysis." Public Opinion Quarterly, 1991, 55, 613-639.
MARC T. BRAVERMAN is a Cooperative Extension specialist in the Department of Human and Community Development and director of the 4-H Center for Youth Development at the University of California, Davis.