W I P A

WHAT IS PREDICTIVE ASSESSMENT?
DISCOVERY EDUCATION ASSESSMENT
New Mexico Benchmark Assessments
Reading and Mathematics, Grades 3-10
WHAT IS PREDICTIVE ASSESSMENT?
What is Predictive Assessment?
EXECUTIVE SUMMARY
1. Are Discovery Education Predictive Assessments reliable?
The median Reading reliability for the 0809 NM benchmark assessments grades 3-8 was .78 with
a median sample size of 574. The median Mathematics reliability for the 0809 NM benchmark
assessments was .83 with a median sample size of 574.
2. Do Discovery Education Predictive Assessments have content validity?
Discovery Education Assessments ensure content validity by using each state’s curriculum
standards for reading and mathematics. The DEA New Mexico reporting categories for Reading
and Mathematics Grades 3-8 are based on the US Reporting Categories The DEA New Mexico
reporting categories for Reading and Mathematics grades 9 & 10 are based on the New Mexico
Short-Cycle Assessments (SCA) requirements.
3. Do Discovery Education Predictive Assessments match state standardized tests?
4. Can Discovery Education Predictive Assessments predict proficiency levels?
Yes, there is a greater than 90% accuracy rate for predicting combined state proficiency
percentages. The median state Proficiency Prediction Score for the Reading tests was 94% and
the median state Proficiency Prediction Score for Mathematics tests was 94%.
5. Can the use of Discovery Education Predictive Assessments improve student learning?
6. Can Discovery Education Predictive Assessments be used to measure growth over time?
Yes. These benchmark assessments are scored on a vertical scale using state-of-the-art Rasch
psychometric modeling. Thus, reliable estimates of student growth can be made over time.
7.
Are Discovery Education Predictive Assessments based on scientifically-based research
advocated by the U. S. Department of Education?
Two matched control group studies—one in Birmingham, Alabama, and the other in Nashville,
Tennessee—support the claim that Discovery Education Predictive Assessments help schools
demonstrate significant improvement in student proficiency.
WHAT IS PREDICTIVE ASSESSMENT?
What is Predictive Assessment?
NEW MEXICO
An Overview of Standards and Scientifically-Based Evidence
Supporting the Discovery Education Assessment Test Series
Since its inception in 2000 by Vanderbilt University, ThinkLink Learning, now a part of
Discovery Education, has focused on the use of formative assessments to improve K-12 student
learning and performance. Bridging the gap between university research and classroom practice,
Discovery Education Assessment offers effective and user-friendly assessment products that
provide classroom teachers and students with the feedback needed to strategically adapt their
teaching and learning activities throughout the school year.
Discovery Education Assessment has pioneered a unique approach to formative assessments
using a scientifically research-based continuous improvement model that maps diagnostic
assessments to each state’s high stakes test. Discovery Education Assessment’s Predictive StateSpecific Benchmark tests are aligned to the content assessed by each state test allowing teachers
to track student progress toward the standards and objectives used for accountability purposes.
Furthermore, Discovery Education Assessment subscribes to the Standards for Educational and
Psychological Testing articulated by the consortium of the American Educational Research
Association, the American Psychological Association, and the National Council on Measurement
in Education. This document, “What is Predictive Assessment?” outlines how Discovery
Education Assessment addresses the following quality testing standards:
1. Are Discovery Education Predictive Assessments reliable?
Test reliability provides evidence that test questions are consistently measuring a given
construct, such as mathematics ability or reading comprehension. Furthermore, high test
reliability indicates that the measurement error for a test is low.
2. Do Discovery Education Predictive Assessments have content validity?
Content validity evidence shows that test content is appropriate for the particular constructs that
are being measured. Content validity is measured by agreement among subject matter experts
about test material and alignment to state standards, by highly reliable training procedures for
item writers, by thorough reviews of test material for accuracy and lack of bias, and by
examination of depth of knowledge of test questions.
3. Do Discovery Education Predictive Assessments match state standardized tests?
Criterion validity evidence demonstrates that test scores predict scores on an important criterion
variable, such as a state’s standardized test.
4. Can Discovery Education Predictive Assessments predict proficiency levels?
WHAT IS PREDICTIVE ASSESSMENT?
Proficiency predictive validity evidence supports the claim that a test can predict a state’s
proficiency levels. High accuracy levels show that a high degree of confidence can be placed in
the vendor’s prediction of student proficiency.
5. Can the use of Discovery Education Predictive Assessments improve student learning?
Consequential validity outlines how the use of these predictive assessments facilitates important
consequences, such as the improvement of student learning and student performance on state
standardized tests.
6. Can Discovery Education Predictive Assessments be used to measure growth over time?
Growth models depend on a highly rigorous and valid vertical scale to measure student
performance over time. A vendor’s vertical scales should be constructed using advanced
statistical methodologies such as Rasch measurement models and other state-of-the-art
psychometric techniques.
7. Are Discovery Education Predictive Assessments based on scientifically-based research
advocated by the U. S. Department of Education?
In the No Child Left Behind Act of 2001, the U.S. Department of Education outlined six major
criteria for “scientifically-based research” to be used by consumers of educational measurements
and interventions. Accordingly, a vendor’s test
(i)
employs systematic, empirical methods that draw on observation and experiment;
(ii) involves rigorous data analyses that are adequate to test the stated hypotheses and justify
the general conclusions drawn;
(iii) relies on measurements or observational methods that provide reliable and valid data
across evaluators and observers, across multiple measurements and observations, and
across studies by the same or different investigators;
(iv) is evaluated using experimental or quasi-experimental designs in which individuals,
entities, programs or activities are assigned to different conditions and with
appropriate controls to evaluate the effects of the condition of interest, with a
preference for random-assignment experiments, or other designs to the extent that
those designs contain within-condition or across-condition control.
(v) ensures experimental studies are presented in sufficient detail and clarity to allow for
replication or, at a minimum, offer the opportunity to build systematically on their
finding; has been accepted by a peer-reviewed journal or approved by a panel of
independent experts through a comparably rigorous, objective and scientific review;
WHAT IS PREDICTIVE ASSESSMENT?
TEST RELIABILITY
1. Are Discovery Education Predictive Assessments reliable?
Test reliability provides evidence that test questions are consistently measuring a given
construct, such as mathematics ability or reading comprehension. Furthermore, high test
reliability indicates that the measurement error for a test is low. Reliabilities are calculated using
Cronbach’s alpha.
Table 1 and Table 2 present test reliabilities and sample sizes for Discovery Education Predictive
Assessments in the subject areas of Reading and Mathematics for New Mexico grades 3-8.
Reliabilities for grades 3-8 are based on Discovery Education Assessments given in the fall,
winter and spring 2008-2009 school year.
The median Reading reliability was .78 with a median sample size of 574. The median
Mathematics reliability was .83 with a median sample size of 574.
Table 1: Test reliabilities for the 0809 DEA Reading Benchmark assessments, grades 3-8.
Reading Reliabilities
Grade 3
Grade 4
Grade 5
Grade 6
Grade 7
Grade 8
Median
Fall 2008
N
Reliability
459
0.77
535
0.78
589
0.78
575
0.79
223
0.78
250
0.77
497
0.78
Winter 2009
N
Reliability
563
0.82
596
0.78
643
0.75
627
0.83
228
0.76
254
0.75
580
0.77
Spring 2009
N
Reliability
546
0.83
601
0.80
645
0.76
621
0.79
227
0.79
243
0.78
574
0.79
Table 2: Test reliabilities for the 0809 DEA Mathematics Benchmark assessments, grades 3-8.
Mathematics Reliabilities
Grade 3
Grade 4
Grade 5
Grade 6
Grade 7
Grade 8
Median
Fall 2008
N
Reliability
460
0.82
532
0.84
585
0.84
572
0.79
215
0.75
253
0.84
496
0.83
Winter 2009
N
Reliability
564
0.81
597
0.83
648
0.76
615
0.74
237
0.80
245
0.80
581
0.80
Spring 2009
N
Reliability
549
0.83
598
0.84
646
0.80
634
0.82
227
0.78
237
0.83
574
0.83
Table 3 and Table 4 present test reliabilities and sample sizes for Discovery Education Predictive
Assessments in the subject areas of Reading and Mathematics for New Mexico grades 9 and 10.
Reliabilities for grades 9 and 10 are based on Discovery Education Assessments given in the fall,
winter and spring 2009-2010 school year.
WHAT IS PREDICTIVE ASSESSMENT?
Table 3: Test reliabilities for the 0910 DEA Reading Benchmark assessments, grades 9 & 10.
Reading Reliabilities
Fall 2009
N
Reliability
Grade 9
Grade 10
283
279
0.79
0.85
Winter 2010
N
Reliability
284
273
0.80
0.84
Spring 2010
N
Reliability
263
228
0.78
0.88
Table 4: Test reliabilities for the 0910 DEA Reading Benchmark assessments, grades 9 & 10.
Mathematics Reliabilities
Grade 9
Grade 10
Fall 2009
N
Reliability
290
0.49
300
0.66
Winter 2010
N
Reliability
285
0.68
284
0.57
Spring 2010
N
Reliability
252
0.71
231
0.79
WHAT IS PREDICTIVE ASSESSMENT?
CONTENT VALIDITY
2. Do Discovery Education Predictive Assessments have content validity?
Content validity evidence shows that test content is appropriate for the particular constructs that
are being measured. Content validity is measured by agreement among subject matter experts
about test material and alignment to state standards, by highly reliable training procedures for
item writers, by thorough reviews of test material for accuracy and lack of bias, and by
examination of depth of knowledge of test questions.
To ensure content validity of all tests, Discovery Education Assessment carefully aligns the
content of its assessments to a given state’s content standards and the content sampled by the
respective high stakes test. Discovery Education Assessment hereby employs one of the leading
alignment research methodologies, the Webb Alignment Tool (WAT), which has continually
supported the alignment of our tests to state specific content standards both in breadth (i.e.,
amount of standards and objectives sampled) and depth (i.e., cognitive complexity of standards
and objectives). All Discovery Education Assessment tests are thus state specific and feature
matching reporting categories of a given state’s large-scale assessment used for accountability
purposes.
The following reporting categories are used on Discovery Education Predictive Assessments.
The DEA New Mexico reporting categories for Reading and Mathematics are based on the
United States Reporting Categories for grades 3-8. The DEA New Mexico 9th and 10th grade
reporting categories for reading and math are based on the New Mexico Short-Cycle Assessments
We continually update our assessments to reflect the most current version of a state’s standards.
Reading US Reporting Categories: Grades 3-8
Text & Vocabulary Understanding
Sentence Construction
Interpretation
Composition
Extend Understanding
Proofreading/ Editing
Comprehension Strategies
Reading New Mexico Reporting Categories: Grades 9 & 10
Reading and Listening for Comprehension
Literature and Media
Writing and Speaking for Expression
Math US Reporting Categories: Grades 3-8
Number and Number Relations
Geometry and Spatial Sense
Computation and Numerical Estimation
Data Analysis, Statistics and Probability
Operation Concepts
Patterns, Functions, Algebra
Measurement
Problem Solving and Reasoning
WHAT IS PREDICTIVE ASSESSMENT?
Math New Mexico Reporting Categories: Grades 9 & 10
Data Analysis and Probability
Algebra
Geometry
WHAT IS PREDICTIVE ASSESSMENT?
CRITERION VALIDITY
3. Do Discovery Education Predictive Assessments match state standardized tests?
WHAT IS PREDICTIVE ASSESSMENT?
PROFICIENCY PREDICTIVE VALIDITY
4. Can Discovery Education Predictive Assessments predict state proficiency levels?
Proficiency predictive validity supports the claim that a test can predict a state’s proficiency
levels. High accuracy levels show that a high degree of confidence can be placed in our test
predictions of student proficiency. Two measures of predictive validity are calculated. If only
summary data for a school or district are available, the Proficiency Prediction Score is tabulated.
When individual student level data is available, then an additional index, the Proficiency Success
Rate, is also calculated. Both measures are explained in the following sections with examples
drawn from actual data from the states.
Proficiency Prediction Score
The Proficiency Prediction Score is used to determine the accuracy of predicted proficiency
status. Under the NCLB legislation, it is important that states and school districts help students
progress from a “Not Proficient” status to one of “Proficient”. The Proficiency Prediction Score is
based on the percentage of correct proficiency classifications (Not Proficient/Proficient). If a state
uses two or more classifications for “Proficient” (such as “Proficient” and “Advanced”), the
percentage of students in these two or more categories would be added together. Also, if a state
uses two or more categories for “Not Proficient” (such as “Below Basic” and “Basic”), the
percentage of students in these two or more categories would be added together. To see how to
use this score, let’s assume a school district had the following data based on its annual state test
and a Discovery Education Assessment Spring benchmark assessment. Let’s use data from a
Grade 4 Mathematics Test as an example:
Predicted Percent Proficient or higher = 70%
Actual Percent Proficient or higher on the State Test = 80%
The error rate for these predictions is as follows:
Error Rate = /Actual Percent Proficient - Predicted Percent Proficient/
Error Rate = 80% - 70% = 10%
In this example, Discovery Education Assessment underpredicted the percent of students
proficient by 10%. The absolute value (the symbols / / ) of the error rate is used to account for
cases where Discovery Education Assessment overpredicts the percent of students proficient and
the calculation is negative (e.g., Actual - Predicted = 70% - 80% = -10%; absolute value is 10%).
The Proficiency Prediction Score is calculated as follows:
Proficiency Prediction Score = 100% - Error Rate
In this example, the score is as follows:
Proficiency Prediction Score = 100% - 10% = 90%.
A higher Proficiency Prediction Score indicates a larger number or percentage of correct
proficiency predictions. In this example, Discovery Education Assessment had a score of 90%,
WHAT IS PREDICTIVE ASSESSMENT?
which indicates 9 correct classifications for every 1 misclassification. Discovery Education
Assessment uses information from these scores to improve its benchmark assessments every year.
The state of New Mexico uses a four level proficiency scale: Beginning, Near Proficient,
Proficient and Advanced. The table following show the proficiency prediction scores for the
Reading and Mathematics assessments. On average, Discovery Education Assessments had a
94% proficiency prediction score on the reading tests and a 94% proficiency prediction score on
the math tests. The figures on the following pages compare the percentages of the student
proficiency levels predicted by Discovery Education Assessment 0809B test and the 2009 NM
State assessments.
Proficiency Prediction Score for New Mexico
Grade 3
Grade 4
Grade 5
Grade 6
Grade 7
Grade 8
Grade 11
Reading
94.10
98.00
98.50
94.70
93.00
84.40
94.90
Math
94.30
98.60
97.00
91.30
95.20
81.80
97.20
WHAT IS PREDICTIVE ASSESSMENT?
Grade 3 Reading Proficiency Level Percentages
Percent of Students
60.00
50.00
40.00
DEA
30.00
NM State
20.00
10.00
0.00
Beginning
Near
Proficient
Proficient
Advanced
Grade 3 Math Proficiency Level Percentages
Percent of Students
50.00
40.00
30.00
DEA
20.00
NM State
10.00
0.00
Beginning
Near
Proficient
Proficient
Advanced
Grade 4 Reading Proficiency Level Percentages
Percent of Students
60.00
50.00
40.00
DEA
30.00
NM State
20.00
10.00
0.00
Beginning
Near
Proficient
Proficient
Advanced
WHAT IS PREDICTIVE ASSESSMENT?
Grade 4 Math Proficiency Level Percentages
Percent of Students
60.00
50.00
40.00
DEA
30.00
NM State
20.00
10.00
0.00
Beginning
Near
Proficient
Proficient
Advanced
Grade 5 Reading Proficiency Level Percentages
Percent of Students
60.00
50.00
40.00
DEA
30.00
NM State
20.00
10.00
0.00
Beginning
Near
Proficient
Proficient
Advanced
Grade 5 Math Proficiency Level Percentages
Percent of Students
60.00
50.00
40.00
DEA
30.00
NM State
20.00
10.00
0.00
Beginning
Near
Proficient
Proficient
Advanced
WHAT IS PREDICTIVE ASSESSMENT?
Grade 6 Reading Proficiency Level Percentages
Percent of Students
50.00
40.00
30.00
DEA
20.00
NM State
10.00
0.00
Beginning
Near
Proficient
Proficient
Advanced
Grade 6 Math Proficiency Level Percentages
Percent of Students
60.00
50.00
40.00
DEA
30.00
NM State
20.00
10.00
0.00
Beginning
Near
Proficient
Proficient
Advanced
Grade 7 Reading Proficiency Level Percentages
Percent of Students
60.00
50.00
40.00
DEA
30.00
NM State
20.00
10.00
0.00
Beginning
Near
Proficient
Proficient
Advanced
WHAT IS PREDICTIVE ASSESSMENT?
Grade 7 Proficiency Level Percentages
Percent of Students
60.00
50.00
40.00
DEA
30.00
NM State
20.00
10.00
0.00
Beginning
Near
Proficient
Proficient
Advanced
Grade 8 Proficiency Level Percentage
Percent of Students
60.00
50.00
40.00
DEA
30.00
NM State
20.00
10.00
0.00
Beginning
Near
Proficient
Proficient
Advanced
Grade 8 Math Proficiency Level Percentages
Percent of Students
60.00
50.00
40.00
DEA
30.00
NM State
20.00
10.00
0.00
Beginning
Near
Proficient
Proficient
Advanced
WHAT IS PREDICTIVE ASSESSMENT?
Grade 10/11 Reading Proficiency Level
Percentages
Percent of Students
60.00
50.00
40.00
DEA Grade
10
30.00
NM State
Grade 11
20.00
10.00
0.00
Beginning
Near
Proficient
Proficient
Advanced
Grade 10/11 Math Proficiency Level Percentages
Percent of Students
40.00
30.00
20.00
DEA Grade
10
10.00
NM State
Grade 11
0.00
Beginning
Near
Proficient
Proficient
Advanced
WHAT IS PREDICTIVE ASSESSMENT?
CONSEQUENTIAL VALIDITY
5. Can the use of Discovery Education Predictive Assessments improve student learning?
Consequential validity outlines how the use of benchmark assessments facilitates important
consequences, such as the improvement of student learning and student performance on state
standardized tests.
WHAT IS PREDICTIVE ASSESSMENT?
GROWTH MODELS
6. Can Discovery Education Predictive Assessments be used to measure growth over time?
Growth models depend on a highly rigorous and valid vertical scale to measure student
performance over time. Discovery Education Assessment vertical scales are constructed using
Rasch measurement models with state-of-the-art psychometric techniques.
The accurate measurement of student achievement over time is becoming increasingly important
to parents, teachers, and school administrators. Student “growth” within a grade and across
grades has also been sanctioned by the U. S. Department of Education as a reliable way to
measure student proficiency in Reading and Mathematics and to satisfy the requirements of
Adequate Yearly Progress (AYP) under the No Child Left Behind Act. Accurate measurement
and recording of individual student achievement can also help with issues of student mobility: as
students move within a district or state, records of individual student achievement can help new
schools administer to the needs of this mobile population.
The assessment of student achievement over time is even more important with the use of
benchmarks tests. Discovery Education Assessment Benchmark tests provide a snapshot of
student progress toward state standards at up to four points during the school year. These
benchmark tests are scientifically linked, so that the reporting of student proficiency levels is both
reliable and valid.
How is the growth score created?
Discovery Education Assessment has added a scientifically based vertical scaled growth score to
its family of benchmark tests in 2007-08. These growth scores are based on the Rasch
measurement model, a state-of-the-art psychometric technique for scaling ability (e.g., Wright &
Stone, 1979; Wright & Masters, 1982; Linacre 1999; Smith & Smith, 2004; Wilson, 2005). To
accomplish vertical scaling, common items are embedded across assessments to enable the
psychometric linking of tests at different points in time. For example, a Grade 3 mathematics
benchmark test administered mid-year might contain below grade level and above grade level
items. Performance on these off grade level items provides an accurate measurement of how
much growth occurs across grades. Furthermore, benchmark tests within a grade are also linked
with common items, once again to assess change at different points in time within a grade.
Discovery Education Assessment is using established psychometric procedures to build calibrated
item banks and linked tests (i.e., Ingebo, 1997; Kolen & Brennan, 2004).
Why use such a rigorous vertical scale?
Isn’t student growth similar across grades? Don’t students change as much from Grade 3 to Grade
4 as they do from Grade 7 to Grade 8? Previous research on the use of vertical scales has
demonstrated that student growth is not linear; that is, growth in student achievement is
different from grade to grade (see Young 2006). For instance, Figure 9 on the next page shows
preliminary Discovery Education Assessment vertically scaled growth results. This graph shows
growth from Grades 3 to 10 in Mathematics as measured by Discovery Education Assessment’s
Spring benchmark tests. Typically, students have larger gains in mathematics achievement in
elementary grades with growth somewhat slowing in middle and high school, as published by
other major testing companies.
WHAT IS PREDICTIVE ASSESSMENT?
Figure 11: Vertically Scaled Growth Results for Discovery Education Assessment Mathematics
Tests.
Discovery/ThinkLink Growth Score: Mathematics
Median Scale Score
1700
1600
1500
1400
1300
Grade 3
Grade 4
Grade 5
Grade 6
Grade 7
Grade 8
Grade 9
Grade 10
What is unique about the Discovery Education Assessment vertical growth scores?
Student growth can now be accurately measured at four points in time in each grade level.
Discovery Education Assessment benchmark tests are administered up to four times yearly:
Early Fall, Late Fall, Winter, and Spring. For each time period, we report scale scores and
accompanying statistics. Most testing companies only allow the measurement of student growth
at two points in time: Fall and Spring. Discovery Education Assessment benchmark tests
provide normative information to assess student growth multiple times each year. Figure 10
illustrates this growth for Grade 4 Mathematics using our benchmark assessments.
Figure 12: Within-Year Growth Results for Discovery Education Assessment Mathematics
Tests.
Discovery/ThinkLink Growth in Grade 4 Mathematics
1520
Median Scale Score
1500
1480
1460
Math
1440
1420
1400
1380
Early Fall
Late Fall
Winter
Think Link Benchmark Tests
Spring
WHAT IS PREDICTIVE ASSESSMENT?
New Mexico Growth Scale
The following tables and figures illustrate the Test Difficulty on the Discovery Education
Assessment vertical growth scale for the 0809 Reading and Mathematics tests between three
periods, Fall, Winter and Spring.
Reading Average Scale Scores
Grade 3
Grade 4
Grade 5
Grade 6
Grade 7
Grade 8
Grade 9
Grade 10
Fall 2008
1395
1412
1485
1532
1557
1572
1604
1672
Winter 2009
1397
1429
1504
1529
1562
1585
1636
1657
Spring 2009
1430
1472
1502
1578
1601
1605
1626
1581
Math Average Scale Scores
Grade 3
Grade 4
Grade 5
Grade 6
Grade 7
Grade 8
Grade 9
Grade 10
Fall 2008
1398
1440
1478
1516
1557
1595
1617
1638
Winter 2009
1352
1449
1487
1545
1560
1616
1611
1621
Spring 2009
1404
1456
1516
1584
1597
1630
1682
1638
Fall
Winter
Spring
Fall
Winter
Spring
Fall
Winter
Spring
Fall
Winter
Spring
Fall
Winter
Spring
Fall
Winter
Spring
Fall
Winter
Spring
Fall
Winter
Spring
Avg Student Scale Score
Fall
Winter
Spring
Fall
Winter
Spring
Fall
Winter
Spring
Fall
Winter
Spring
Fall
Winter
Spring
Fall
Winter
Spring
Fall
Winter
Spring
Fall
Winter
Spring
Avg. Student Scale Score
WHAT IS PREDICTIVE ASSESSMENT?
0809 NM Reading Average Student Scale Scores
1700
1600
1500
1400
Scale
Score
1300
Grade Grade Grade Grade Grade Grade Grade Grade
3
4
5
6
7
8
9
10
1700
0809 NM Math Average Student Scale Scores
1600
1500
1400
Scale
Score
1300
Grade Grade Grade Grade Grade Grade Grade Grade
3
4
5
6
7
8
9
10
WHAT IS PREDICTIVE ASSESSMENT?
NCLB SCIENTIFICALLY-BASED RESEARCH
7. Are Discovery Education Predictive Assessments based on scientifically-based research
advocated by the U. S. Department of Education?
Discovery Education Assessment has also adhered to the criteria for “scientifically-based
research” put forth in the No Child Left Behind Act of 2001. “What is Predictive Assessment?”
has outlined how Discovery Education Predictive Assessments test reliability and validity meets
the following criteria for scientifically-based research set forth by NCLB:
(i)
employs systematic, empirical methods that draw on observation and
experiment;
(ii)
involves rigorous data analyses that are adequate to test the stated hypotheses
and justify the general conclusions drawn;
(iii)
relies on measurements or observational methods that provide reliable and valid
data across evaluators and observers, across multiple measurements and
observations, and across studies by the same or different investigators;
Discovery Education Assessment also provides evidence of meeting the following scientificallybased research criterion:
(iv)
is evaluated using experimental or quasi-experimental designs in which
individuals, entities, programs or activities are assigned to different conditions
and with appropriate controls to evaluate the effects of the condition of interest,
with a preference for random-assignment experiments, or other designs to the
extent that those designs contain within-condition or across-condition control.
Case Study One: Birmingham, Alabama City Schools
Larger schools and school districts typically do not participate in experimental or quasiexperimental studies due to logistical and ethical concerns. However, a unique situation in
Birmingham, Alabama afforded Discovery Education Assessment with the opportunity to
investigate the efficacy of its benchmark assessments in respect to a quasi-control group. In
2003-2004, approximately one-half of the schools in Birmingham City used Discovery Education
Predictive Assessments whereas the other half did not. At the end of the school year,
achievement results for both groups were compared revealing a significant improvement on the
SAT10 for those schools that used the Discovery Education Predictive Assessments as opposed to
those that did not. Discovery Education Assessment subsequently compiled a brief report titled
the “Birmingham Case Study”. Excerpts from the case study are included below:
This study is based on data from elementary and middle schools in the City of Birmingham,
Alabama. In 2002-03, no Birmingham Schools used Discovery Education’s Predictive
Assessment Series. Starting in 2003-04, 20 elementary and 9 middle schools used the Discovery
Education Assessment program. All Birmingham schools took the Stanford Achievement Test
Tenth Edition (SAT10) at the end of both school years. The SAT10 is administered yearly as part
of the State of Alabama’s School Accountability Program. The State of Alabama uses
improvement in SAT10 percentiles to gauge school progress and as part of its NCLB reporting.
WHAT IS PREDICTIVE ASSESSMENT?
National percentiles on the SAT10 are reported by subject and grade level. A single national
percentile is reported for all students within a subject and grade level (this analysis is
subsequently referred as ALL STUDENTS). Furthermore, national percentiles are disaggregated
by various subgroups within a school. For the comparisons that follow, the national percentiles
for students
classified as utilizing free and reduced lunch (referred to below as POVERTY) were used. All
percentiles have been converted to Normal Curve Equivalents (NCE) to allow for averaging of
results.
The Discovery Education Assessment schools comprise the experimental group in this study.
The Birmingham schools that did not use Discovery Education Assessment comprise the matched
comparison group. The following charts show SAT10 National Percentile changes for DEA
Schools vs. Non-DEA Schools in two grades levels (Grades 5 and 6) for three subjects
(Language, Mathematics, and Reading) for two groups of students (ALL STUDENTS and
POVERTY students). In general, there was a significant decline or no improvement in SAT10
scores from 2002-03 to 2003-04 for most non-DEA schools. This trend however did not happen
in the schools using Discovery Education Assessment: instead, there was a marked improvement
with most grades scoring increases in language, math and reading. In grade levels where there
was a decline in Discovery Education Assessment schools, it was a much lower decline in scores
when compared to those schools that did not use Discovery Education Assessment.
As a result of the improvement that many of these schools made in school year 2003-04, the
Birmingham City Schools selected Discovery Education Assessment to be used with all of the
schools in school year 2004-05. The Birmingham City Schools also chose to provide
professional development in each school to help all teachers become more familiar with the
concepts of standardized assessment and better utilize data to focus instruction.
SAT 10 Math Comparisons: DEA vs. Non DEA Schools
in Birmingham, AL
45
44
NCE
43
42
41
40
39
38
SAT 10 SAT 10 SAT 10 SAT 10 SAT 10 SAT 10 SAT 10 SAT 10
02-03
03-04 02-03
03-04
02-03
03-04 02-03
03-04
DEA Schools
Non-DEA
Schools
All Schools
DEA Schools
Non-DEA
Schools
Poverty Schools
WHAT IS PREDICTIVE ASSESSMENT?
SAT 10 Grade 5 Reading Comparison: DEA vs. Non
DEA Schools in Birmingham, AL
46
45
NCE
44
43
42
41
40
SAT 10 SAT 10
02-03
03-04
SAT 10 SAT 10 SAT 10 SAT 10
02-03
03-04
02-03
03-04
DEA Schools
Non-DEA
Schools
DEA Schools
All Schools
SAT 10 SAT 10
02-03
03-04
Non-DEA
Schools
Poverty Schools
SAT 10 Grade 6 Language Comparison: DEA vs Non
DEA Schools
48
46
NCE
44
42
40
38
36
SAT 10
02-03
SAT 10
03-04
DEA Schools
SAT 10
02-03
SAT 10
03-04
Non-DEA Schools
All Schools
SAT 10
02-03
SAT 10
03-04
DEA Schools
SAT 10
02-03
SAT 10
03-04
Non-DEA Schools
Poverty Schools
The following pie graph shows the Lunch Status percentages provided by Birmingham,
AL school system for grades 5th and 6th.
WHAT IS PREDICTIVE ASSESSMENT?
Grades 5 & 6 Free Lunch Status Percentages
Reduced
6%
Not
Free/Reduced
11%
Free
Not Free/Reduced
Reduced
Free
83%
Case Study Two: Metro Nashville, Tennessee City Schools
Metro Nashville schools that used Discovery Education Assessment made greater
improvements in AYP than Metro Nashville schools that didn’t use Discovery Education
Assessment. During the 2004-2005 school year, sixty-five elementary and middle schools in
Metro Nashville, representing over 20,000 students, used Discovery Education assessments.
Fifty-two elementary and middle schools, representing over 10,000 students, did not use
Discovery Education assessments. The improvement in the percent of students at the
Proficient/Advanced level from 2004 to 2005 is presented in the graph below. The results
compare DEA schools versus non-DEAschools in Metro Nashville. Discovery Education
Assessment schools showed more improvement in AYP status from 2004 to 2005 when schools
are combined and analyzed separately at the elementary and middle school level.
% Improvement
Improvement in Reading TCAP Scores 2005 over
2004
14
12
10
8
6
4
2
0
DEA
Schools
Non DEA
Schools
Com bined
Improvement
(n=20190)
Elementary
Im provement
(n=5217)
Middle School
Improvem ent
(n=14948)
WHAT IS PREDICTIVE ASSESSMENT?
% Improvement
Improvement in Math TCAP Scores 2005 over
2004
14
12
10
8
6
4
2
0
DEA
Schools
Non DEA
Schools
Combined
Improvem ent
(n=21549)
Elem entary
Improvement
(n=5765)
Middle School
Im provement
(n=15759)
The following pie charts display the frequency percents of the NCLB data
provided to DEA from the Metro Nashville Public School System for the Elementary
Schools.
Gender Frequency Percentages: Elementary
No Gender
Provided
5%
Male
Male
49%
Female
No Gender Provided
Female
46%
Ethnicity Frequency Percentages: Elementary
American Indian
0.13%
Hispanic
15.99%
Other
0.08%
Asian
Black
White
31.54%
Hispanic
American Indian
Other
White
Black
43.45%
No Ethnicity Provided
Asian
3.46%
No Ethnicity
Provided
5.34%
WHAT IS PREDICTIVE ASSESSMENT?
Free Lunch Status Freq Percentages:
Elementary
No Status
Provided
6%
Not
Free/Reduced
27%
Not Free/Reduced
Yes: Free/Reduced
No Status Provided
Yes:
Free/Reduced
67%
English Language Learners Freq Perce ntages:
Elementary
5.61%
12.96%
Non ELL Students
ELL Student
No Status Provided
81.43%
Special Education Freq Percentages:
Elementary
6%
11%
Not in Special Ed.
Yes: Special Education
No Status Provided
83%
The following pie charts display the frequency percents of the NCLB data
provided to DEA from the Metro Nashville Public School System for the Middle
Schools.
WHAT IS PREDICTIVE ASSESSMENT?
Gender Frequency Percentages: Middle School
No Gender
Provided
5%
Male
Male
48%
Female
No Gender Provided
Female
47%
Ethinicity Frequency Percentages: Middle School
No Ethnicity
Provided
4.91%
Asian
3.33%
Asian
Black
White
30.71%
Hispanic
Black
46.43%
American Indian
0.18%
American Indian
Other
White
No Ethnicity Provided
Other
0.05%
Hispanic
14.40%
Free Lunch Status Freq Percentages: Middle
School
No Status
Provided
5%
Not
Free/Reduced
29%
Not Free/Reduced
Yes: Free/Reduced
No Status Provided
Yes:
Free/Reduced
66%
WHAT IS PREDICTIVE ASSESSMENT?
English Language Learners Freq Percentages:
Middle School
8.52%
5.09%
Non ELL Students
ELL Student
No Status Provided
86.39%
Special Education Freq Percentages: Middle
School
5%
10%
Not in Special Ed.
Yes: Special Education
No Status Provided
85%
(v)
ensures experimental studies are presented in sufficient detail and clarity to
allow for replication or, at a minimum, offer the opportunity to build
systematically on their finding;
Consumers are encouraged to request additional data or further details for the examples listed in
this overview. Discovery Education Assessment also compiles Technical Manuals specific to
each school district and/or state. Accumulated data are of sufficient detail to permit adequate
psychometric analyses, and their results have been consistently replicated across school districts
and states. Past documents of interest include among others: “A Multi-State Comparison of
Proficiency Predictions for Fall 2006” and “A Multi-State Look at ‘What is Predictive
Assessment?’.” Furthermore, the “What is Predictive Assessment?” series of documents is
WHAT IS PREDICTIVE ASSESSMENT?
available for multiple states. Please check the Discovery Education website
www.discoveryeducation.com for document updates.
(vi)
has been accepted by a peer-reviewed journal or approved by a panel of
independent experts through a comparably rigorous, objective and scientific
review;
Discovery Education Assessment tests and results have been incorporated and analyzed in the
following peer- reviewed manuscripts and publications:
Jacqueline Shrago and Dr. Michael Smith of ThinkLink Learning contributed a chapter on
formative assessment to “Online Assessment and Measurement: Case Studies from Higher
Education, K-12, and Corporate” by Scott Howell and Mary Hicko in 2006.
Dr. Elizabeth Vaughn-Neely, Associate Professor, Dept. of Leadership & Counselor Education,
Ole Miss University ([email protected]) and Dr. Marjorie Reed, Associate Professor, Dept. of
Psychology, Oregon State University presented their peer-reviewed findings based on their joint
research and work with schools using Discovery Education Assessment at:
Society for Research & Child Development, Atlanta, GA. April 2005
Kappa Delta Pi Conference, Orlando, FL November 2005
Society on Scientific Study of Reading, July 2006
Two dissertations for Ed.S studies have also been published:
Dr. Juanita Johnson, Union University ([email protected])
Dr. Monica Eversole, Richmond KY ([email protected])
Please contact us for other specific information requests. We welcome your interest in the
evidence supporting the efficacy of our Discovery Education Assessment tests.
WHAT IS PREDICTIVE ASSESSMENT?
Procedures for Item and Test Review
Discovery Education Assessment has established policies to review items in each
benchmark assessment for appropriate item statistics and for evidence of bias.
Furthermore, the collection of items that form a specific test is reviewed for alignment to
state or national standards and for appropriate test difficulty. This document outlines in
more detail how these processes are implemented.
Item Statistics
P-values and item discrimination indices are calculated for each item based on the
number of students that completed a benchmark assessment.
P-values represent the percentage of students tested who answered the item correctly.
Based on p-values alone, the following item revision procedures are followed:
Items with p-values of .90 or above are considered too “easy” and are revised or
replaced.
Items with p-values of .20 or below are considered too “hard” and are revised or
replaced.
Item discrimination indices (biserial correlations) are calculated for each item. Items with
low biserial correlations (less than .20) are revised or replaced.
Test Content Validity and Statistics
The blueprints of state tests are monitored for changes in state standards. When state
blueprints have changed, Discovery Education Assessment revises the blueprints for
benchmarks tests used in that state. Items may be revised or replaced where necessary to
match any revised blueprint and to maintain appropriate test content validity.
P, A, B, and C tests within the same year are compared to verify that all tests are of
comparable difficulty. When necessary, items are revised to maintain the necessary level
of difficulty within each test.
The predictability of Discovery Education benchmark assessments is examined closely.
When necessary, items are revised or replaced to improve the predictability of benchmark
assessments. In order to maintain high levels of predictability, Discovery Education
Assessment strives to revise or replace less than 20% of the items in any benchmark
assessment each year.
WHAT IS PREDICTIVE ASSESSMENT?
Differential Item Functioning
Differential Item Functioning (DIF) analyses are performed on items from tests where the
overall sample size is large (n = 1000 or more per test) and the sample size for each
subgroup meets minimum standards (usually 300 or more students per subgroup). DIF
analyses for Gender (males and females) and Ethnicity (Caucasian vs. African-American,
or Caucasian vs. Hispanic) are routinely performed when sample size minimums are met.
DIF analyses are conducted using Rasch modeling through the computer program
WINSTEPS. This program calculates item difficulty parameters for each DIF subgroup
and compares these logit estimates. DIF Size is calculated using industry standard
criteria (see p.1070 “An Adjustment for Sample Size in DIF Analysis”, Rasch
Measurement Transactions, 20:3, Winter 2006).
The following criteria are used to determine DIF Size:
Negligible:
Moderate:
Large:
0 logits to .42 logits (absolute value)
.43 logits to .63 logits (absolute value)
.64 logits and up (absolute value)
Items with Large DIF Size are marked for the content team to review. Subject matter
content experts analyze an item to determine the cause of Gender or Ethnic DIF. If the
content experts can determine this cause, the item is revised to remove the gender or
ethnicity bias. If the cause cannot be determined, the item is replaced with a different
item that measures the same sub-skill.
WHAT IS PREDICTIVE ASSESSMENT?
TEST AND QUESTION STATISTICS, RELIABILITY, AND PERCENTILES
The following section reports test and question statistics, reliability, and percentiles for the New
Mexico benchmark tests, for grades 3-10, Reading and Mathematics. The 3-8 benchmark tests
were administered in New Mexico in spring of 2009. The 9-10th grade benchmarks were
administered winter of 2010. These benchmark tests are representative samples of over 1000
benchmark tests developed by Discovery Education Assessment. Benchmark tests are revised
each year based on test and question statistics, particularly low item discrimination indices and
significant DIF.
The following statistics are reported:
Number of Students:
Number of Items:
Number of Proficiency
Items:
Number of students used for calculation of test statistics.
Number of items in each benchmark test (including common items
used for scaling purposes).
Number of items used to calculate the proficiency level.
Mean:
Test mean in terms of number correct.
Standard Deviation:
Test standard deviation.
Reliability:
Cronbach’s alpha.
SEM:
Standard Error of Measurement (SEM) for the test.
Percentiles:
Discovery Education Assessment Scale Score for each number
correct (Scale scores are vertically scaled using Rasch
measurement. Scale scores from grades K-12 range from 1000 to
2000).
Percentage of students below each number correct score.
Stanines:
Scale scores that range from 1 to 9.
Question P-values:
The proportion correct for each item.
Biserial:
Item discrimination using biserial correlation.
Rasch Item Difficulty:
Rasch item difficulty parameter calculated using WINSTEPS.
DIF Gender:
Rasch item difficulty difference (Male vs. Female).
DIF Ethnicity:
Rasch item difficulty difference (White vs. Black).
Scale Score:
DIF Size
Negligible:
0 logits to .42 logits (absolute value).
Moderate:
.43 logits to .63 logits (absolute value).
Large:
.64 logits and up (absolute value).
WHAT IS PREDICTIVE ASSESSMENT?
Reading Grade 3
0809 Spring New Mexico
Test Statistics
Number of Students
546
Number of Items
28
Mean
17.40
Std. Deviation
5.58
Reliability
0.83
Std. Error of Measurement
2.30
Question Statistics
Item
No.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
P-Value
0.70
0.88
0.58
0.79
0.72
0.55
0.30
0.48
0.74
0.61
0.73
0.81
0.75
0.61
0.61
0.55
0.58
0.64
0.61
0.69
0.58
0.34
0.59
0.70
0.55
0.64
0.40
0.66
Biserial
0.45
0.37
0.29
0.41
0.47
0.46
0.31
0.34
0.50
0.38
0.54
0.36
0.51
0.48
0.45
0.38
0.44
0.46
0.43
0.47
0.40
0.24
0.40
0.48
0.42
0.36
0.41
0.41
Rasch Item
Difficulty
-0.38
-1.77
0.23
-0.99
-0.50
0.43
1.71
0.74
-0.66
0.11
-0.55
-1.13
-0.73
0.12
0.10
0.42
0.23
-0.07
0.11
-0.32
0.24
1.52
0.21
-0.41
0.39
-0.04
1.17
-0.18
Scale Scores
No.
Correct Scale Score
0
1010
1
1104
2
1162
3
1197
4
1224
5
1246
6
1265
7
1282
8
1298
9
1312
10
1326
11
1339
12
1352
13
1364
14
1377
No.
Correct
15
16
17
18
19
20
21
22
23
24
25
26
27
28
Scale
Score
1389
1402
1415
1428
1442
1456
1472
1489
1508
1530
1557
1593
1650
1744
WHAT IS PREDICTIVE ASSESSMENT?
Math Grade 3
0809 Spring New Mexico
Test Statistics
Number of Students
549
Number of Items
32
Mean
18.76
Std. Deviation
5.98
Reliability
0.83
Std. Error of Measurement
2.47
Question Statistics
Item
No.
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
PValue
0.41
0.53
0.66
0.55
0.57
0.62
0.71
0.69
0.79
0.62
0.62
0.53
0.75
0.50
0.52
0.65
0.85
0.50
0.68
0.61
0.50
0.32
0.66
0.43
0.59
0.33
0.88
0.52
0.72
0.62
0.48
0.34
Biserial
0.38
0.29
0.30
0.36
0.46
0.50
0.41
0.42
0.42
0.49
0.35
0.45
0.35
0.49
0.36
0.32
0.29
0.28
0.38
0.41
0.29
0.31
0.49
0.45
0.34
0.20
0.37
0.41
0.50
0.50
0.37
0.36
Scale Scores
Rasch
Item
Difficulty
0.92
0.29
-0.33
0.21
0.11
-0.16
-0.64
-0.52
-1.08
-0.13
-0.14
0.32
-0.83
0.46
0.35
-0.28
-1.59
0.43
-0.45
-0.09
0.44
1.39
-0.32
0.8
0.01
1.32
-1.87
0.35
-0.65
-0.12
0.53
1.26
No.
Correct
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
Scale
Score
1000
1080
1138
1173
1200
1221
1240
1256
1271
1285
1298
1310
1322
1334
1345
1356
1367
No.
Correct
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
Scale
Score
1378
1389
1400
1411
1423
1435
1448
1462
1476
1492
1510
1532
1557
1592
1648
1741
WHAT IS PREDICTIVE ASSESSMENT?
Reading Grade 4
0809 Spring New Mexico
Test Statistics
Number of Students
601
Number of Items
28
Mean
16.39
Std. Deviation
5.05
Reliability
0.80
Std. Error of Measurement
2.26
Scale Scores
Question Statistics
Item
No.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
P-Value
0.51
0.63
0.82
0.25
0.66
0.87
0.75
0.50
0.75
0.70
0.46
0.26
0.75
0.66
0.36
0.42
0.63
0.61
0.45
0.61
0.67
0.64
0.59
0.19
0.37
0.73
0.77
0.80
Biserial
0.42
0.31
0.45
0.3
0.43
0.31
0.41
0.5
0.51
0.51
0.31
0.19
0.54
0.38
0.39
0.34
0.55
0.36
0.43
0.3
0.49
0.45
0.31
0.13
0.17
0.44
0.46
0.49
Rasch Item
Difficulty
0.41
-0.16
-1.36
1.77
-0.33
-1.76
-0.84
0.44
-0.85
-0.56
0.65
1.73
-0.87
-0.32
1.13
0.83
-0.17
-0.09
0.69
-0.11
-0.38
-0.24
0
2.17
1.11
-0.74
-0.97
-1.17
No.
Correct
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
Scale
Score
1060
1154
1212
1249
1277
1300
1319
1337
1353
1368
1383
1397
1410
1423
1436
No.
Correct
15
16
17
18
19
20
21
22
23
24
25
26
27
28
Scale
Score
1449
1463
1476
1490
1504
1519
1535
1553
1572
1595
1622
1659
1716
1811
WHAT IS PREDICTIVE ASSESSMENT?
Math Grade 4
0809 Spring New Mexico
Test Statistics
Number of Students
598
Number of Items
32
Mean
17.57
Std. Deviation
6.33
Reliability
0.84
Std. Error of Measurement
2.53
Question Statistics
Item
No.
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
P-Value
0.67
0.87
0.65
0.72
0.57
0.65
0.58
0.39
0.74
0.49
0.58
0.45
0.67
0.23
0.49
0.52
0.69
0.61
0.72
0.53
0.59
0.50
0.34
0.44
0.56
0.41
0.48
0.31
0.40
0.74
0.59
0.4
Biserial
0.39
0.41
0.43
0.30
0.48
0.26
0.31
0.34
0.42
0.51
0.45
0.27
0.47
0.23
0.52
0.42
0.37
0.49
0.49
0.42
0.46
0.48
0.47
0.44
0.52
0.45
0.43
0.10
0.25
0.51
0.52
0.47
Rasch Item
Difficulty
-0.60
-2.00
-0.50
-0.91
-0.09
-0.49
-0.14
0.80
-0.98
0.29
-0.13
0.50
-0.60
1.76
0.31
0.17
-0.71
-0.28
-0.87
0.14
-0.17
0.28
1.09
0.55
-0.04
0.71
0.34
1.24
0.74
-1.03
-0.16
0.74
Scale Scores
No.
Correct
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
Scale
Score
1056
1150
1207
1242
1268
1289
1308
1324
1339
1353
1366
1378
1390
1401
1412
1423
1434
No.
Correct
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
Scale
Score
1445
1457
1468
1479
1491
1504
1517
1531
1546
1562
1581
1603
1630
1665
1722
1817
WHAT IS PREDICTIVE ASSESSMENT?
Reading Grade 5
0809 Spring New Mexico
Test Statistics
Number of Students
645
Number of Items
28
Mean
16.10
Std. Deviation
4.81
Reliability
0.76
Std. Error of Measurement
2.36
Question Statistics
Item
No.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
P-Value
0.94
0.82
0.71
0.41
0.38
0.55
0.55
0.65
0.55
0.74
0.50
0.51
0.39
0.41
0.50
0.67
0.30
0.81
0.61
0.33
0.66
0.65
0.76
0.42
0.73
0.60
0.63
0.29
Biserial
0.36
0.41
0.41
0.37
0.27
0.45
0.40
0.56
0.36
0.23
0.40
0.39
0.27
0.24
0.35
0.43
0.33
0.48
0.30
0.26
0.32
0.38
0.49
0.40
0.43
0.43
0.46
0.16
Rasch Item
Difficulty
-2.67
-1.36
-0.65
0.83
0.99
0.18
0.18
-0.31
0.17
-0.82
0.43
0.39
0.96
0.83
0.42
-0.40
1.39
-1.26
-0.09
1.23
-0.38
-0.33
-0.92
0.79
-0.76
-0.07
-0.20
1.44
Scale Scores
No.
Correct
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
Scale
Score
1093
1188
1247
1284
1312
1336
1356
1374
1391
1407
1422
1436
1450
1464
1477
No.
Correct
15
16
17
18
19
20
21
22
23
24
25
26
27
28
Scale
Score
1490
1504
1517
1531
1546
1561
1577
1594
1613
1636
1663
1698
1755
1849
WHAT IS PREDICTIVE ASSESSMENT?
Math Grade 5
0809 Spring New Mexico
Test Statistics
Number of Students
646
Number of Items
32
Mean
17.79
Std. Deviation
5.64
Reliability
0.80
Std. Error of Measurement
2.52
Question Statistics
Item
No.
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
P-Value
0.76
0.74
0.57
0.68
0.17
0.52
0.53
0.37
0.70
0.43
0.64
0.57
0.58
0.62
0.68
0.77
0.67
0.76
0.54
0.46
0.57
0.70
0.84
0.46
0.37
0.17
0.56
0.51
0.54
0.31
0.60
0.41
Biserial
0.44
0.51
0.38
0.32
0.13
0.33
0.28
0.26
0.43
0.31
0.50
0.44
0.41
0.44
0.39
0.38
0.37
0.46
0.36
0.34
0.49
0.31
0.27
0.40
0.41
0.17
0.44
0.33
0.47
0.36
0.40
0.26
Rasch Item
Difficulty
-1.10
-0.94
-0.06
-0.61
2.12
0.19
0.14
0.93
-0.72
0.59
-0.42
-0.06
-0.09
-0.31
-0.62
-1.14
-0.54
-1.04
0.09
0.46
-0.03
-0.71
-1.67
0.47
0.91
2.17
-0.01
0.22
0.08
1.22
-0.20
0.69
Scale Scores
No.
Correct
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
Scale
Score
1112
1205
1262
1298
1324
1346
1364
1381
1396
1409
1422
1435
1447
1459
1470
1481
1492
No.
Correct
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
Scale
Score
1504
1515
1526
1538
1550
1563
1576
1590
1606
1622
1641
1663
1690
1725
1782
1876
WHAT IS PREDICTIVE ASSESSMENT?
Reading Grade 6
0809 Spring New Mexico
Test Statistics
Number of Students
621
Number of Items
28
Mean
17.33
Std. Deviation
4.94
Reliability
0.79
Std. Error of Measurement
2.26
Question Statistics
Item
No.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
P-Value
0.57
0.60
0.74
0.62
0.41
0.78
0.34
0.75
0.54
0.41
0.77
0.80
0.74
0.78
0.64
0.85
0.87
0.54
0.69
0.36
0.63
0.25
0.68
0.83
0.35
0.82
0.39
0.60
Biserial
0.29
0.32
0.38
0.37
0.31
0.53
0.48
0.34
0.30
0.44
0.43
0.39
0.39
0.54
0.47
0.45
0.44
0.42
0.46
0.30
0.44
0.26
0.49
0.45
0.21
0.42
0.23
0.36
Rasch Item
Difficulty
0.34
0.19
-0.58
0.06
1.10
-0.85
1.46
-0.69
0.45
1.10
-0.79
-0.97
-0.58
-0.86
-0.04
-1.39
-1.60
0.48
-0.31
1.35
0.01
1.95
-0.28
-1.25
1.42
-1.12
1.20
0.19
Scale Scores
No.
Correct
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
Scale
Score
1155
1250
1307
1343
1370
1393
1412
1430
1446
1461
1475
1489
1502
1515
1528
No.
Correct
15
16
17
18
19
20
21
22
23
24
25
26
27
28
Scale
Score
1541
1555
1568
1582
1596
1611
1628
1645
1665
1688
1716
1752
1811
1906
WHAT IS PREDICTIVE ASSESSMENT?
Math Grade 6
0809 Spring New Mexico
Test Statistics
Number of Students
634
Number of Items
32
Mean
19.12
Std. Deviation
5.78
Reliability
0.82
Std. Error of Measurement
2.45
Question Statistics
Item
No.
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
P-Value
0.61
0.69
0.82
0.69
0.84
0.57
0.28
0.81
0.26
0.67
0.56
0.44
0.66
0.77
0.48
0.49
0.45
0.61
0.76
0.69
0.65
0.67
0.76
0.50
0.55
0.50
0.35
0.80
0.52
0.75
0.35
0.58
Biserial
0.46
0.22
0.21
0.43
0.38
0.46
0.29
0.28
0.25
0.39
0.45
0.34
0.46
0.31
0.48
0.35
0.40
0.50
0.43
0.45
0.40
0.35
0.39
0.38
0.40
0.43
0.53
0.46
0.29
0.39
0.21
0.39
Rasch Item
Difficulty
-0.04
-0.44
-1.30
-0.43
-1.46
0.18
1.65
-1.17
1.75
-0.35
0.22
0.82
-0.28
-0.89
0.60
0.55
0.74
-0.02
-0.84
-0.44
-0.22
-0.31
-0.83
0.50
0.29
0.50
1.25
-1.11
0.41
-0.78
1.28
0.14
Scale Scores
No.
Correct
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
Scale
Score
1172
1265
1321
1355
1381
1402
1420
1436
1450
1464
1476
1488
1500
1511
1522
1533
1544
No.
Correct
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
Scale
Score
1555
1566
1577
1588
1600
1612
1624
1638
1652
1668
1686
1707
1733
1767
1823
1916
WHAT IS PREDICTIVE ASSESSMENT?
Reading Grade 7
0809 Spring New Mexico
Test Statistics
Number of Students
227
Number of Items
28
Mean
17.57
Std. Deviation
5.05
Reliability
0.79
Std. Error of Measurement
2.32
Question Statistics
Item
No.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
P-Value
0.67
0.75
0.61
0.72
0.83
0.63
0.60
0.22
0.78
0.88
0.65
0.60
0.70
0.78
0.53
0.41
0.54
0.45
0.68
0.61
0.72
0.61
0.80
0.68
0.40
0.49
0.63
0.59
Biserial
0.26
0.30
0.39
0.37
0.49
0.38
0.32
0.22
0.47
0.47
0.42
0.49
0.43
0.41
0.45
0.24
0.31
0.29
0.48
0.43
0.40
0.31
0.40
0.39
0.40
0.42
0.39
0.46
Rasch Item
Difficulty
-0.17
-0.63
0.12
-0.47
-1.22
0.01
0.19
2.17
-0.80
-1.68
-0.06
0.19
-0.37
-0.83
0.51
1.11
0.47
0.92
-0.25
0.12
-0.44
0.15
-0.99
-0.22
1.18
0.70
0.06
0.23
Scale Scores
No.
Correct
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
Scale
Score
1189
1283
1340
1375
1401
1423
1442
1458
1474
1488
1501
1514
1526
1539
1551
No.
Correct
15
16
17
18
19
20
21
22
23
24
25
26
27
28
Scale
Score
1563
1575
1587
1600
1613
1627
1642
1658
1677
1698
1724
1758
1815
1908
WHAT IS PREDICTIVE ASSESSMENT?
Math Grade 7
0809 Spring New Mexico
Test Statistics
Number of Students
227
Number of Items
32
Mean
18.44
Std. Deviation
5.04
Reliability
0.78
Std. Error of Measurement
2.36
Question Statistics
Item
No.
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
P-Value
0.82
0.92
0.79
0.32
0.62
0.36
0.66
0.67
0.52
0.63
0.18
0.92
0.62
0.80
0.86
0.57
0.63
0.30
0.63
0.09
0.78
0.80
0.26
0.12
0.55
0.35
0.61
0.53
0.69
0.77
0.37
0.70
Biserial
0.30
0.22
0.42
0.40
0.48
0.55
0.26
0.53
0.47
0.40
0.37
0.26
0.33
0.36
0.39
0.36
0.20
0.33
0.49
-0.05
0.29
0.35
0.33
0.41
0.50
0.16
0.37
0.50
0.14
0.46
0.32
0.34
Rasch Item
Difficulty
-1.35
-2.31
-1.16
1.32
-0.18
1.09
-0.36
-0.43
0.31
-0.25
2.24
-2.31
-0.18
-1.19
-1.72
0.05
-0.23
1.42
-0.21
3.08
-1.07
-1.22
1.63
2.72
0.16
1.16
-0.12
0.27
-0.55
-1.02
1.02
-0.60
Scale Scores
No.
Correct
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
Scale
Score
1190
1284
1341
1376
1402
1424
1442
1458
1473
1487
1499
1512
1523
1535
1546
1557
1567
No.
Correct
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
Scale
Score
1578
1589
1600
1612
1623
1635
1648
1661
1676
1692
1710
1731
1757
1792
1848
1941
WHAT IS PREDICTIVE ASSESSMENT?
Reading Grade 8
0809 Spring New Mexico
Test Statistics
Number of Students
243
Number of Items
28
Mean
17.07
Std. Deviation
4.82
Reliability
0.78
Std. Error of Measurement
2.26
Question Statistics
Item
No.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
P-Value
0.86
0.48
0.93
0.58
0.85
0.74
0.39
0.36
0.84
0.49
0.67
0.50
0.49
0.74
0.56
0.73
0.66
0.41
0.36
0.58
0.46
0.76
0.44
0.69
0.36
0.49
0.89
0.75
Biserial
0.30
0.22
0.25
0.50
0.38
0.37
0.31
0.30
0.35
0.26
0.48
0.42
0.35
0.36
0.40
0.53
0.35
0.40
0.40
0.41
0.39
0.50
0.38
0.39
0.49
0.37
0.20
0.36
Rasch Item
Difficulty
-1.52
0.72
-2.39
0.23
-1.41
-0.58
1.17
1.34
-1.34
0.68
-0.19
0.62
0.68
-0.58
0.35
-0.56
-0.17
1.09
1.34
0.25
0.82
-0.71
0.92
-0.33
1.34
0.70
-1.77
-0.69
Scale Scores
No.
Correct
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
Scale
Score
1170
1265
1324
1361
1390
1414
1434
1453
1470
1486
1501
1516
1530
1544
1558
No.
Correct
15
16
17
18
19
20
21
22
23
24
25
26
27
28
Scale
Score
1571
1585
1598
1612
1627
1642
1658
1675
1695
1717
1744
1780
1837
1931
WHAT IS PREDICTIVE ASSESSMENT?
Math Grade 8
0809 Spring New Mexico
Test Statistics
Number of Students
237
Number of Items
32
Mean
18.14
Std. Deviation
5.90
Reliability
0.83
Std. Error of Measurement
2.43
Question Statistics
Item
No.
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
P-Value
0.65
0.85
0.44
0.71
0.79
0.55
0.32
0.77
0.28
0.76
0.55
0.35
0.56
0.65
0.58
0.62
0.38
0.51
0.47
0.69
0.86
0.84
0.65
0.37
0.37
0.43
0.37
0.64
0.66
0.66
0.52
0.29
Biserial
0.38
0.33
0.24
0.37
0.41
0.48
0.36
0.34
0.15
0.43
0.44
0.34
0.32
0.31
0.31
0.33
0.47
0.55
0.37
0.53
0.51
0.41
0.47
0.49
0.24
0.52
0.28
0.42
0.55
0.52
0.40
0.36
Rasch Item
Difficulty
-0.40
-1.65
0.67
-0.72
-1.23
0.13
1.30
-1.06
1.50
-1.00
0.11
1.12
0.09
-0.40
-0.04
-0.20
0.94
0.31
0.52
-0.61
-1.73
-1.55
-0.38
1.01
1.01
0.69
1.03
-0.33
-0.45
-0.42
0.27
1.47
Scale Scores
No.
Correct
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
Scale
Score
1203
1299
1359
1397
1425
1448
1467
1485
1500
1515
1529
1542
1554
1566
1578
1589
1601
No.
Correct
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
Scale
Score
1612
1624
1635
1647
1660
1673
1686
1701
1717
1734
1754
1777
1806
1844
1906
2000
WHAT IS PREDICTIVE ASSESSMENT?
Reading Grade 9
0910 Spring New Mexico
Test Statistics
Number of Students
284
Number of Items
31
Mean
16.18
Std. Deviation
5.61
Reliability
.80
Std. Error of Measurement
2.51
Question Statistics
Scale Scores
Item
No.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
P-Value
0.49
0.61
0.75
0.42
0.55
0.73
0.47
0.23
0.45
0.51
0.62
0.68
0.44
0.88
0.58
0.51
Biserial
0.18
0.30
0.43
0.38
0.46
0.52
0.41
0.11
0.37
0.25
0.43
0.43
0.3
0.29
0.54
0.44
Rasch Item
Difficulty
0.2
-0.42
-1.17
0.52
-0.1
-1.02
0.28
1.54
0.37
0.07
-0.45
-0.79
0.43
-2.08
-0.26
0.05
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
0.39
0.62
0.40
0.61
0.61
0.45
0.34
0.44
0.37
0.46
0.39
0.33
0.81
0.50
0.51
0.23
0.45
0.54
0.47
0.53
0.38
0.03
0.42
0.42
0.31
0.41
0.28
0.36
0.44
0.50
0.64
-0.43
0.59
-0.42
-0.43
0.35
0.93
0.4
0.74
0.32
0.64
0.93
-1.56
0.1
0.07
No.
Correct
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
Scale
Score
1251
1345
1402
1438
1464
1486
1505
1521
1536
1550
1563
1576
1587
1599
1610
1621
No.
Correct
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
Scale
Score
1643
1655
1666
1678
1690
1702
1716
1730
1746
1763
1784
1810
1844
1899
1992
1643
WHAT IS PREDICTIVE ASSESSMENT?
0910A NM Reading 9th Grade Assessment
NM Item
Number
Also found
in:
Question #
1
KY / 9th
2
FL / 9th
3
Biserial
DIF
Gender
DIF Eth
10
0.3
0.56
0.27
11
0.38
0.09
0.22
-
-
-
-
-
4
IL / 9th
16
0.32
0.06
0.04
5
MS / 9th
17
0.33
NA
NA
6
-
-
-
-
-
7
-
-
-
-
-
8
-
-
-
-
-
9
FL / 9th
30
0.5
0.09
0.09
10
-
-
-
-
-
11
MS / 9th
8
0.18
NA
NA
12
FL / 9th
19
0.45
0.08
0.44
13
-
-
-
-
-
14
IL / 9th
5
0.16
0.39
0.46
15
IL / 9th
7
0.43
0.08
0.36
16
IL / 9th
18
0.47
0.02
0.03
17
-
-
-
-
-
18
-
-
-
-
-
19
-
-
-
-
-
20
-
-
-
-
-
21
IL / 9th
14
0.39
0.02
0.35
22
-
-
-
-
-
23
-
-
-
-
-
24
-
-
-
-
-
25
-
-
-
-
-
26
FL / 9th
28
0.39
0.21
0.11
27
-
-
-
-
-
28
KY / 9th
23
0.45
0.29
0.04
29
-
-
-
-
-
30
-
-
-
-
-
31
-
-
-
-
-
WHAT IS PREDICTIVE ASSESSMENT?
NM Item #
0910B NM Reading 9\th Grade Assessment
Also found
in:
Item #
Biserial
DIF Eth
DIF Gen
1
KY / 9th
3
0.31
0.35
0.04
2
WI / 9th
2
0.24
NA
NA
3
-
-
-
-
-
4
KY / 9th
6
0.45
0.51
0.08
5
-
-
-
-
-
6
-
-
-
-
-
7
IL / 9th
10
0.43
0.08
0.24
8
-
-
-
-
-
9
IL / 9th
11
0.49
0.1
0.17
10
FL / 9th
19
0.4
0.25
0.04
11
-
-
-
-
-
12
IL / 9th
22
0.44
0.07
0.08
13
MS / 9
23
0.38
0.01
0.15
14
-
-
-
-
-
15
IL / 9th
14
0.52
0.09
0.08
16
FL / 9th
13
0.49
0.16
0.08
17
-
-
-
-
-
18
FL / 9th
12
0.48
0.0
0.03
19
US / HS
20
0.49
NA
NA
20
FL / 9th
9
0.44
0.49
0.05
21
-
-
-
-
-
22
IL / 9th
24
0.38
0.23
0.12
23
-
-
-
-
-
24
-
-
-
-
-
25
-
-
-
-
-
26
IL / 9th
26
0.46
0.02
0.16
27
-
-
-
-
-
28
-
-
-
-
-
29
-
-
-
-
-
30
KY / 9th
32
0.41
0.03
0.04
31
KY / 9th
33
0.52
0.12
0.32
WHAT IS PREDICTIVE ASSESSMENT?
Reading Grade 10
0910 Spring New Mexico
Test Statistics
Number of Students
273
Number of Items
31
Mean
17.81
Std. Deviation
6.14
Reliability
.84
Std. Error of Measurement
2.43
Question Statistics
Scale Scores
Item
No.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
P-Value
0.56
0.65
0.61
0.81
0.63
0.67
0.61
0.91
0.71
0.77
0.54
0.47
0.40
0.75
0.50
0.58
Biserial
0.38
0.41
0.47
0.50
0.61
0.43
0.52
0.44
0.5
0.42
0.33
0.28
0.42
0.38
0.34
0.55
Rasch Item
Difficulty
0.13
-0.34
-0.15
-1.37
-0.24
-0.46
-0.13
-2.3
-0.67
-1.03
0.2
0.55
0.94
-0.94
0.42
0.04
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
0.70
0.70
0.61
0.34
0.40
0.53
0.53
0.46
0.75
0.46
0.59
0.23
0.50
0.48
0.39
0.41
0.57
0.62
0.32
0.17
0.52
0.33
0.41
0.53
0.38
0.47
0.28
0.41
0.40
0.23
-0.63
-0.65
-0.13
1.25
0.94
0.26
0.26
0.62
-0.96
0.62
-0.05
1.9
0.42
0.53
0.97
No.
Correct
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
Scale
Score
1260
1353
1409
1444
1469
1490
1508
1524
1539
1552
1565
1577
1589
1600
1611
1622
No.
Correct
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
Scale
Score
1633
1644
1655
1667
1678
1690
1703
1717
1731
1747
1765
1786
1812
1847
1903
1997
WHAT IS PREDICTIVE ASSESSMENT?
0910A NM Reading 10th Grade Assessment
NM Item
Number
Also Found
in:
Question #
Biserial
DIF
Gender
DIF Eth
1
IL / 10th
7
0.51
0.2
0.08
2
-
-
-
-
-
3
-
-
-
-
-
4
IL / 10th
10
0.2
0.27
0.38
5
DC / 10th
2
0.46
0.19
0.26
6
-
-
-
-
-
7
KY / 10th
6
0.49
0.35
0.02
8
-
-
-
-
-
9
-
-
-
-
-
10
-
-
-
-
-
11
IL / 10th
2
0.41
0.1
0.0
12
US / HS
22
0.44
0.24
0.16
13
-
-
-
-
-
14
TN / HS
13
0.44
0.05
0.05
15
-
-
-
-
-
16
DC / 10th
34
0.42
0.03
1.2
17
-
-
-
-
-
18
-
-
-
-
-
19
-
-
-
-
-
20
FL / 10th
17
0.6
0.48
0.38
21
FL / 10th
18
0.54
0.56
0.25
22
-
-
-
-
-
23
TN / HS
17
0.41
0.15
0.07
24
FL / 10th
19
0.53
0.29
0.09
25
FL / 10th
20
0.39
0.0
0.03
26
-
-
-
-
-
27
IL / 10th
21
0.35
0.47
0.46
28
-
-
-
-
-
29
-
-
-
-
-
30
-
-
-
-
-
31
-
-
-
-
-
WHAT IS PREDICTIVE ASSESSMENT?
0910B NM Reading 10th Grade Assessment
NM Item #
Also found in:
Item #
Biserial
DIF Eth
DIF Gen
1
-
-
-
-
-
2
IL / 10th
2
0.45
0.31
0.3
3
IL / 10th
3
0.39
0.08
0.01
4
IL / 10th
4
0.47
0.27
0.29
5
-
-
-
-
-
6
TN / HS
13
0.46
0.24
0.12
7
-
-
-
-
-
8
KY / 10th
4
0.46
0.08
0.19
9
KY / 10th
6
0.5
0.12
0.02
10
-
-
-
-
-
11
KY / 10th
8
0.28
0.27
0.13
12
IL / 10th
10
0.29
0.0
0.05
13
KY / 10th
18
0.45
0.02
0.13
14
-
-
-
-
-
15
KY / 10th
20
0.39
0.19
0.26
16
DC / 10th
11
0.52
0.18
0.3
17
KY / 10th
21
0.52
0.08
0.23
18
-
-
-
-
-
19
DC / 10th
6
0.46
0.42
0.06
20
DC / 10th
4
0.2
0.81
0.14
21
DC / 10th
5
0.3
0.91
0.21
22
KY / 10th
24
0.48
0.48
0.39
23
-
-
-
-
-
24
DC / 10th
25
0.44
0.11
0.23
25
DC / 10th
26
0.43
0.46
0.44
26
TN / HS
18
0.33
0.7
0.45
27
TN / HS
19
0.29
0.4
0.22
28
-
-
-
-
-
29
-
-
-
-
-
30
IL / 10th
23
0.47
0.04
0.01
31
KY / 10th
29
0.36
0.28
0.46
WHAT IS PREDICTIVE ASSESSMENT?
Math Grade 9
0910 Winter New Mexico
Test Statistics
Number of Students
290
Number of Items
32
Mean
10.79
Std. Deviation
3.54
Reliability
0.49
Std. Error of Measurement
2.53
th
0910A New Mexico Test & Item Stats for 9 Grade Math
NM Item
Number
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
NM Item
Average
NM Biserial
Rasch Item
Difficulty
0.43
0.65
0.65
0.35
0.14
0.23
0.19
0.26
0.46
0.45
0.24
0.37
0.37
0.23
0.28
0.37
0.18
0.22
0.14
0.40
0.32
0.18
0.43
0.34
0.51
0.35
0.29
0.38
0.31
0.50
0.36
0.20
0.34
0.26
0.36
0.24
-0.06
0.32
0.05
0.12
0.12
0.42
0.21
0.24
-0.02
0.07
0.06
0.56
0.34
-0.06
0.16
0.24
0.37
0.01
0.32
0.37
0.21
0.37
0.29
0.21
0.36
0.41
0.42
0.20
-0.5
-1.45
-1.45
-0.11
1.16
0.50
0.77
0.33
-0.60
-0.57
0.46
-0.19
-0.22
0.48
0.22
-0.2
0.84
0.59
1.16
-0.36
0.01
0.84
-0.48
-0.07
-0.84
-0.11
0.20
-0.27
0.08
-0.79
-0.16
0.72
Also Found
in:
IL / 9th
IL / 9th
AL / HS
KY / 10th
IL / 10th
IL / 9th
US / HS
IL / HS
IL / 10th
IL / 10th
AL / HS
DC / 10th
FL / 9th
IL / 9th
IL / 9th
IL / 9th
FL / 9th
IL / 9th
FL / 9th
FL / 9th
FL / 10th
KY / 10th
FL / 10th
#
76
42
4
56
51
73
2
28
Biserial
DIF Gen
DIF Eth
0.35
0.35
0.39
0.31
0.02
0.23
0.1
0.51
0.22
0.21
0.05
0.03
0.39
0.12
0.42
0.4
0.07
0.00
0.39
0.03
0.86
0.28
0.01
0.22
-
-
-
-
-
-
-
-
70
72
0.29
0.36
0.2
0.29
0.01
0.18
-
-
-
-
37
0.28
0.31
0.81
-
-
-
-
5
58
0.51
0.42
0.17
0.02
1.57
0
-
-
-
-
-
-
-
-
74
0.32
0.17
0.05
-
-
-
-
-
-
-
-
63
56
46
65
45
65
0.4
0.37
0.39
0.51
0.4
0.26
0.21
0.12
0.29
0.21
0.14
0.27
0.04
0.04
0.06
0.2
0.24
0.11
-
-
-
-
49
72
67
0.47
0.42
0.36
0.02
0.05
0.2
0.06
0.62
0.11
WHAT IS PREDICTIVE ASSESSMENT?
Math Grade 10
0910 Winter New Mexico
Test Statistics
Number of Students
300
Number of Items
32
Mean
12.01
Std. Deviation
4.16
Reliability
0.66
Std. Error of Measurement
2.43
0910A New Mexico Test & Item Stats for 10th Grade Math
NM Item
Number
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
NM Item
Average
NM Biseral
Rasch Item
Difficulty
0.04
0.57
0.35
0.4
0.09
0.77
0.23
0.57
0.54
0.14
0.48
0.87
0.36
0.29
0.53
0.58
0.31
0.09
0.28
0.28
0.38
0.25
0.51
0.14
0.26
0.57
0.27
0.46
0.48
0.2
0.15
0.58
0.09
0.48
0.21
0.32
-0.15
0.45
0.31
0.39
0.48
0.2
0.11
0.2
0.38
0.27
0.37
0.43
0.29
-0.01
0.41
0.1
0.38
0.16
0.42
-0.01
0.42
0.32
0.38
0.28
0.5
0.14
0.22
0.36
2.66
-0.98
0.02
-0.25
1.84
-1.99
0.67
-1.00
-0.85
1.34
-0.61
-2.78
-0.05
0.29
-0.81
-1.05
0.20
1.80
0.36
0.38
-0.16
0.51
-0.75
1.34
0.46
-0.98
0.42
-0.49
-0.58
0.87
1.20
-1.05
Also Found
in:
FL / 10th
IL / 10th
AL / HS
AL / HS
IL / 10th
IL / HS
US / HS
IL / 10th
US / HS
FL / 10th
IL / 10th
KY / 10th
AL / HS
KY / 10th
KY / 10th
KY / 10th
IL / 10th
DC / 10th
DC / 10th
-
#
Biserial
DIF Gen
DIF Eth
-
-
-
-
50
0.55
0.29
0.44
-
-
-
-
55
18
22
0.33
0.18
0.39
0.05
0.04
0.12
0.35
0.26
0.45
-
-
-
-
54
2
22
42
0.40
0.37
0.23
0.19
0.35
0.34
0.14
0.22
0.15
0.06
0.31
0.26
-
-
-
-
-
-
-
-
6
0.26
0.23
0.05
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
61
0.52
0.00
0.24
-
-
-
-
61
62
15
0.33
0.31
0.33
0.14
0.2
0.00
0.08
0.04
0.1
-
-
-
-
70
41
42
75
11
19
0.47
0.37
0.4
0.30
0.42
0.29
0.05
0.1
0.09
0.12
0.15
0.11
0.23
0.47
0.01
0.23
0.46
0.23
-
-
-
-
-
-
-
-