
Educational Research:
Instruments (“caveat emptor”)
EDU 8603
Educational Research
Richard M. Jacobs, OSA, Ph.D.
Instruments…

tools researchers use to collect data
for research studies (alternatively
called “tests”)
The types of instruments…
1. Cognitive Instruments
2. Affective Instruments
3. Projective Instruments
1. Cognitive instruments...

Measure an individual’s attainment
in academic areas and are typically
used to diagnose strengths and
weaknesses
Types of cognitive instruments...

achievement tests
…provide information about how well
the test takers have learned what
they have been taught in school
…achievement is determined by
comparing the test taker’s performance
to the norm, the performance of a
national group of similar students who
have taken the same test

aptitude tests
…measure the intellect and abilities
not normally taught and often are
used to predict future performance
…typically provide an overall score, a
verbal score, and a quantitative
score
2. Affective instruments...

Measure characteristics of
individuals along a number of
dimensions and assess feelings,
values, and attitudes toward self,
others, and a variety of activities,
institutions, and situations
Types of affective instruments...

attitude scales
…self-reports of an individual’s beliefs,
perceptions, or feelings about self,
others, and a variety of activities,
institutions, and situations
…frequently use Likert, semantic
differential, Thurstone, or Guttman
scales (see the scoring sketch below)

values tests
…measure the relative strength of an
individual’s valuing of theoretical,
economic, aesthetic, social, political,
and religious values

personality inventories
…an individual’s self-report measuring
the degree to which behaviors
characteristic of defined personality
traits describe that individual
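As a concrete illustration of the value-point scoring mentioned for attitude scales above, the short Python sketch below scores one respondent on a Likert scale. The item responses and point values are hypothetical and not drawn from any instrument in this module.

```python
# Minimal sketch of Likert-scale scoring: each response option carries a
# point value, and the attitude score is the sum across items.
# Responses and point values below are hypothetical.
LIKERT_POINTS = {
    "strongly disagree": 1,
    "disagree": 2,
    "undecided": 3,
    "agree": 4,
    "strongly agree": 5,
}

responses = ["agree", "strongly agree", "undecided", "agree"]  # one respondent
score = sum(LIKERT_POINTS[r] for r in responses)
print(score)  # 16 -- higher totals indicate a more favorable attitude
```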
3. Projective instruments...

Measure a respondent’s feelings
or thoughts in response to an
ambiguous stimulus
Primary type of projective test...

associational tests
…participants react to a stimulus
such as a picture, inkblot or word
onto which they project a
description
Selecting an instrument...
1. determine precisely the type of
instrument needed
2. identify and locate appropriate
instruments
3. compare and analyze instruments
4. select best instrument
Instrument sources…
Buros’ Mental Measurements Yearbook
Tests in Print
PRO-ED Publications Test Critiques
Compendium
ETS Test Collection Database
ERIC/AE Test Review Locator
ERIC/Buros Test Publisher Directory
Rules governing the selection of
instruments...
1. the highest validity
2. the highest reliability
3. the greatest ease of administration,
scoring, and interpretation
4. test takers’ lack of familiarity with
the instrument
5. avoids potentially controversial
matters
Administering the instrument...
1. make arrangements in advance
2. ensure ideal testing environment
3. be prepared for all probable
contingencies
Two issues in using instruments...
1. Validity: the degree to which the
instrument measures what it purports
to measure
2. Reliability: the degree to which the
instrument consistently measures
what it purports to measure
Types of validity...
1. Content validity
2. Criterion-related validity
3. Construct validity
1. Content validity: the degree to which
an instrument measures an intended
content area
forms of content validity…
…sampling validity: does the instrument
reflect the total content area?
…item validity: are the items included on
the instrument relevant to the
measurement of the intended content
area?
2. Criterion-related validity: scores on
an instrument are correlated with
scores on a second, independent
measure (the criterion), often to
discriminate between individuals who
possess a certain characteristic and
those who do not
forms of criterion-related validity…
…concurrent validity: the degree to which
scores on one test correlate to scores
on another test when both tests are
administered in the same time frame
…predictive validity: the degree to which a
test can predict how well an individual
will do in a future situation (see the
sketch below)
3. Construct validity: a series of studies
validates that the instrument really
measures what it purports to measure
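Returning to criterion-related validity above: in practice the validity coefficient is simply the correlation between instrument scores and criterion scores. The Python sketch below uses invented scores and assumes Python 3.10+ for statistics.correlation; it illustrates a predictive-validity check, not any particular published study.

```python
# Minimal sketch of a criterion-related (predictive) validity coefficient:
# correlate instrument scores with later criterion scores.
# All values are invented for illustration.
from statistics import correlation  # Pearson's r, Python 3.10+

aptitude_scores = [52, 61, 47, 70, 58, 66, 49, 63]          # instrument
first_year_gpa = [2.6, 3.1, 2.4, 3.7, 2.9, 3.4, 2.5, 3.2]   # criterion

validity_coefficient = correlation(aptitude_scores, first_year_gpa)
print(round(validity_coefficient, 2))  # closer to 1.00 = stronger evidence
```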
Types of reliability...
1. Stability
2. Equivalence
3. Internal consistency
1. Stability (“test-retest”): the degree to
which two scores on the same
instrument are consistent over time
2. Equivalence (“equivalent forms”): the
degree to which identical instruments
(except for the actual items included)
yield identical scores
3. Internal consistency (“split-half”
reliability with the Spearman-Brown
correction formula, Kuder-Richardson
and Cronbach’s alpha reliabilities,
scorer/rater reliability): the degree to
which one instrument yields
consistent results
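A minimal sketch of one internal-consistency procedure named above: split the items into odd and even halves, correlate the half scores, and apply the Spearman-Brown correction for full test length. The item responses are invented, and Python 3.10+ is assumed for statistics.correlation.

```python
# Minimal sketch of split-half reliability with the Spearman-Brown correction.
from statistics import correlation  # Pearson's r, Python 3.10+

# item scores (1 = correct, 0 = incorrect) for five test takers (invented)
items = [
    [1, 1, 0, 1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 1, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0, 1, 0],
    [1, 0, 1, 1, 1, 0, 1, 1],
]

odd_half = [sum(person[0::2]) for person in items]    # items 1, 3, 5, 7
even_half = [sum(person[1::2]) for person in items]   # items 2, 4, 6, 8

r_half = correlation(odd_half, even_half)
r_full = (2 * r_half) / (1 + r_half)   # Spearman-Brown correction
print(round(r_full, 2))
```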
Terms associated with instruments...
Data…
…the pieces of information researchers
collect through instruments to
examine a topic or hypothesis
Constructs…
…abstractions of behavioral factors
that cannot be observed directly and
which researchers invent to explain
behavior
Variable…
…a construct that can take on two or
more values or scores
Raw scores…
…the number of items an individual
answered correctly on an instrument
Measurement scales…
…the representation of variables so
that they can be quantified
Measurement scales...
Qualitative (categorical)
1. nominal variables
Quantitative (continuous)
2. ordinal variables
3. interval variables
4. ratio variables
1. nominal (“categorical”): classifies
persons or objects into two or more
categories
2. ordinal (“order”): classifies persons
or objects and ranks them in terms of
the degree to which those persons or
objects possess a characteristic of
interest
3. interval: ranks, orders, and classifies
persons or objects according to equal
differences with no true zero point
4. ratio: ranks, orders, classifies persons
or objects according to equal
differences with a true zero point
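As a small illustration of the four measurement scales just defined, the sketch below pairs each scale with the kind of data it yields; the variables and values are hypothetical examples only.

```python
# Minimal sketch: one student record with a variable at each scale.
student = {
    "home_state": "PA",   # nominal  - category only, no order
    "class_rank": 12,     # ordinal  - ordered, but intervals are unequal
    "sat_verbal": 540,    # interval - equal intervals, no true zero point
    "days_absent": 3,     # ratio    - equal intervals with a true zero point
}
for variable, value in student.items():
    print(variable, value)
```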
Norm reference…
…provides an indication about how one
individual performed on an
instrument compared to the other
students performing on the same
instrument
Criterion reference…
…involves a comparison against
predetermined levels of performance
Self reference…
…involves measuring how an
individual’s performance changes
over time
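As a rough illustration of the three references just defined, the sketch below uses invented scores to compute a percentile standing (norm reference), a pass/fail decision against a cut score (criterion reference), and a change over time (self reference).

```python
# Minimal sketch contrasting norm, criterion, and self reference.
# All scores and the cut score are invented.
norm_group = [48, 55, 61, 67, 72, 75, 80, 84, 88, 93]
my_score, my_earlier_score, cut_score = 80, 72, 70

# norm reference: standing relative to others taking the same instrument
percentile = 100 * sum(s < my_score for s in norm_group) / len(norm_group)

# criterion reference: comparison against a predetermined level of performance
passed = my_score >= cut_score

# self reference: change in the individual's own performance over time
gain = my_score - my_earlier_score

print(percentile, passed, gain)  # 60.0 True 8
```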
Operationalize…
…the process of defining a construct in
terms of behaviors or processes that
can be observed
Standard error of measurement…
…an estimate of how often a researcher
can expect errors of a given size on
an instrument
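As a worked illustration of how the standard error of measurement relates to reliability, the usual formula is SEM = SD × √(1 − reliability); the Python sketch below uses invented numbers and shows that a higher reliability coefficient yields a smaller SEM.

```python
# Minimal sketch: SEM = SD * sqrt(1 - reliability), values invented.
import math

standard_deviation = 10.0   # spread of scores on the instrument
reliability = 0.91          # e.g., a test-retest coefficient

sem = standard_deviation * math.sqrt(1 - reliability)
print(round(sem, 1))  # 3.0 -- smaller SEM accompanies higher reliability
```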
Mini-Quiz…

True or false…
…a large standard error of
measurement indicates a high
degree of reliability
false

True or false…
…a large standard error of
measurement indicates low
reliability
true

True or false…
…most affective tests are projective
false

True or false…
…the primary source of test
information for educational
researchers is the Buros Mental
Measurements Yearbook
true

True or false…
…research hypotheses are usually
stated in terms of variables
true

True or false…
…similar to a Thurstone scale, a
Guttman scale attempts to
determine whether an attitude is
unidimensional
true

True or false…
…validity requires the collection of
evidence to support the desired
interpretation
true

True or false…
…researchers should first consider
developing an instrument rather than
utilizing a published instrument
false

True or false…
…a researcher’s goal is to achieve
perfect predictive validity
false

True or false…
…predictive validity is extremely
important for instruments that are
used to classify or select
individuals
true

True or false…
…a high validity coefficient is closer
to 1.00 than 0.00
true

True or false…
…norm reference and criterion
reference are synonymous terms
false

True or false…
…“criterion related” refers to
correlating one instrument with a
second instrument; the second
instrument is the criterion against
which the validity of the second
instrument is judged
false

True or false…
…a valid test is always reliable but a
reliable test is not always valid
true

True or false…
…it is difficult to state appropriate
reliability coefficients because
reliability, like validity, is dependent
upon the group being tested, i.e.,
groups with different characteristics
will produce different reliabilities
true

True or false…
…content validity is not compromised
if the instrument covers topics not
taught
false

Fill in the blank…
…the tendency of an individual to
respond continually in a particular
way
response set

Fill in the blank…
…a study which consists of two
quantitative variables
correlational

Fill in the blank…
…a study which consists of one
categorical and one quantitative
variable
experimental or causal-comparative

Fill in the blank…
…a study which consists of two or
more categorical variables
correlational or descriptive

Fill in the blank…
…data collection methods which
emphasize student processes or
products
performance

Fill in the blank…
…data collection methods including
multiple-choice, true-false, and
matching
selection

Fill in the blank…
…data collection methods in which
students fill in the blank, provide a
short answer, or write an essay
supply

Fill in the blank…
…an instrument administered, scored,
and interpreted in the same way no
matter where or when it is
administered
standardized

Fill in the blank…
…the term that includes the general
process of collecting, synthesizing,
and interpreting information,
whether formal or informal
assessment

Fill in the blank…
…a formal, systematic, usually
paper-and-pencil procedure for
gathering information about
peoples’ cognitive and affective
characteristics
test

Fill in the blank…
…the degree to which individuals
seek out or participate in particular
activities, objects, and ideas
interests

Fill in the blank…
…also called “temperament,” the
characteristics representing an
individual’s typical behaviors,
describing what individuals do in
their natural life circumstances
personality

Fill in the blank…
…things individuals feel favorable or
unfavorable about; the tendency to
accept or reject groups, ideas, or
objects
attitudes

Fill in the blank…
…deeply held beliefs about ideas,
persons, or objects
values

Fill in the blank…
…requires administering the
predictor instruments to a different
sample from the same population
and developing a new equation
cross-validation

Which type of test…
…Minnesota Multiphasic
Personality Inventory
personality inventory

Which type of test…
…Stanford-Binet
aptitude test

Which type of test…
…Strong-Campbell
interest inventory

Which type of test…
…SRA Survey of Basic Skills
achievement test

Which type of test…
…Wechsler Intelligence Scales
aptitude test

Which type of test…
…Gates-MacGinitie Reading Test
achievement test

Which type of test…
…Otis-Lennon School Ability Test
aptitude test

Which type of test…
…Kuder Occupational
interest inventory

Which type of test…
…Rorschach Inkblot Test
projective

Which type of test…
…Myers-Briggs Type Indicator
personality inventory

Which type of test…
…Iowa Tests of Basic Skills
achievement test

Which type of test…
…Thematic Apperception Test
projective

Which type of validity…
…compares the content of the test to
the domain being measured
content

Which type of validity…
…Graduate Record Examination
predictive

Which type of validity…
…correlates scores from one
instrument to scores on a criterion
measure, either at the same or
different time
criterion-related

Which type of validity…
…amasses convergent, divergent, and
content-related evidence to
determine that the presumed
construct is what is being measured
construct

Which type of reliability…
…scores on one instrument are
consistent over time
stability (test-retest)

Which type of reliability…
…the extent to which independent
scorers or a single scorer over
time agree on the scoring of an
open-ended instrument
scorer/rater

Which type of reliability…
…scores correlate between similar
versions of an instrument given at
different times
equivalence and stability

Which type of reliability…
…scores correlate between two
versions of a test that are intended
to be equivalent
equivalence (alternate forms)

Which type of reliability…
…the extent to which items included
on an instrument are similar to one
another in content
internal consistency

Which type of response scale…
…an individual gives a quantitative
rating to a topic where each
position on the continuum has an
associated score value
semantic differential

Which type of response scale…
…value points are assigned to a
participant’s responses to a series
of statements
Likert

Which type of response scale…
…participants select, from a list of
statements representing differing
points of view, those with which
they agree
Thurstone
This module has focused on...
instruments
…the tools researchers use to collect
data for research studies
The next module will focus on...
qualitative research
…inquiry that collects and analyzes
non-numerical data, such as words and
observations, in natural settings