
Science Performance Assessment and English
Learners: An Exploratory Study
Jerome M. Shaw
University of California Santa Cruz
Email: [email protected]
This study was conducted with support from National Science Foundation grant ESI-0196127.
Abstract
This study examined the science achievement of English Learners as measured by three
performance assessments embedded within inquiry-based units of instruction at the fifth grade
level. Participants in a multi-district, NSF-funded science education reform effort developed the
assessments and their associated rubrics. Scores from a member district yielded a sample of 589
students identified as either English Learners or Non-English Learners, with the former category
further subdivided into the mutually exclusive groups of EXIT (recently reclassified), LEP
(limited English proficient), and NEP (non-English proficient). Comparison of means from the
teacher-generated scores showed that, on the whole, English Learners underperformed in relation
to Non-English Learners. However, contrary to previous studies using scores from traditional
assessments, these differences were not found to be significant. In addition, except for the NEP
subgroup on one of the assessments, all student groups performed at the proficient level on each
of the three assessments. Dissimilar reading comprehension demands are discussed as a potential
source of variance for student performance. While issues such as the small sample size of the
English Learner subgroups and lack of inter-rater reliability information on the scores limit the
strength of these findings, this study highlights the need for continued research on the use of
performance assessments as viable measures of inquiry-based science for English Learners.
Key words: literacy/science, school practice, performance assessment, English learners
Introduction
This study was motivated by the confluence of three significant factors on the US K-12
educational landscape: an emphasis on inquiry-based science instruction, the value of
performance assessment for measuring student learning in such contexts, and the growth of
English Learners as a segment of the school age population. The first two factors speak to a
desirable alignment between instruction and assessment while the third acknowledges current
and projected demographic realities.
Seminal documents such as the National Science Education Standards (National Research
Council, 1996) tout the need for science teaching that centrally involves “guiding students in
active and extended scientific inquiry” (p. 52). Such inquiry-based science instruction fosters
student attainment of conceptual understanding and sophisticated process skills. Given the
propensity of traditional assessments, such as multiple-choice tests, to favor “recognition and
recall,” the authors of Inquiry and the National Science Education Standards (National Research
Council, 2000) note their potential to “pose a serious obstacle to inquiry-based teaching”
(p. 75). Conversely, performance assessment’s ability to tap into complex thinking and skills
leads many to regard it as being more closely aligned with the learning outcomes associated with
inquiry-based science (Kim, Park, Kang & Noh, 2000; Lee, 1999; National Research Council,
2000). On balance, it should be recognized that constructed response assessments such as
performance tasks typically measure different learning outcomes than selected response items.
Consideration of the students being assessed in inquiry-based science classrooms leads to
the issue of linguistic diversity. English Learners (non-native speakers of English who are
developing their proficiency in English) constitute a sizeable and growing portion of the US
K-12 student population. Data from the National Clearinghouse for English Language Acquisition
and Language Instruction Educational Programs (NCELA) show the following regarding English
Learners in US public elementary and secondary schools: their number has more than doubled in
the past fifteen years (school years 1989/1990-2004/2005); their rate of enrollment has increased
at nearly seven times the rate of total enrollment; and in 2004/2005 (the most recent year for which data
are listed) their estimated number totaled nearly 5,120,000 – approximately 10.5% of the total
public school enrollment – a figure that represents a 56.2% increase over that reported for school
year 1994/1995 (NCELA, 2007). With continued immigration and the ongoing growth of the US
Hispanic population (the largest group of English Learners), these trends are expected to
continue into the near future.
The rhetoric of current reform proclaims that scientific literacy is for all students, including
English Learners (American Association for the Advancement of Science, 1989; National
Research Council, 1996). Institutions such as the National Science Foundation have supported
this vision by funding large-scale efforts designed to implement inquiry-based science, an
important contributor to the development of scientific literacy, particularly with diverse or
underserved student populations. These initiatives strive to increase alignment between
instruction and assessment, often by linking inquiry-based curriculum and instruction with
performance-based assessment. Given the above demographic data and the inclusive nature of
many science education reform efforts, one can infer that in contexts where inquiry-based
science instruction is measured via performance assessment, English Learners are increasingly
engaging in such assessments.
The presumed value of using performance assessments bears concomitant caveats. As
discussed above, many consider performance assessments to be more congruent with the
teaching practices called for in current conceptions of educational reform, particularly for
measuring outcomes such as the ability to gather and interpret experimental data. Noting the bias
in traditional assessments, proponents of performance assessment argue its potential to narrow
achievement gaps between ethnic, socioeconomic, and gender groups (Lee, 1999). Nevertheless,
challenges associated with the use of performance assessments include difficulty and costliness
in development and implementation (Baker, 1997; Stecher, 1995). Moreover, studies have found
that while girls tend to have higher overall mean scores than boys, patterns of achievement gaps
among ethnic and socioeconomic groups are generally the same with both performance
assessment and traditional tests (Klein et al., 1997; Lee, 1999; Lee, 2005; Lynch, 2000; Stecher
& Klein, 1997). It has yet to be determined whether or not, or the degree to which, performance
assessment provides more equitable measurement of diverse students’ knowledge and skills
(Lee, 1999).
A similar lack of clarity is apparent when focusing on assessment of English Learners in
content areas such as science and mathematics. In fact, there is evidence indicating inequitable
aspects of assessments, performance and otherwise, when used with this student population.
Numerous studies have documented significant links between students’ level of
proficiency in English and their performance on content-based assessments (Abedi, 2002; Abedi
& Lord, 2001; Abedi, Lord, Hofstetter, & Baker, 2000; Johnson & Monroe, 2004; Shaftel,
Belton-Kocher, Glasnapp & Poggio, 2006). Understandably, low levels of proficiency in English
have been shown to correspond with low levels of achievement for assessments given in English.
However, confounding variables include the linguistic demands of the assessment, which may be
mitigated by the use of accommodations such as customized dictionaries (Abedi et al., 2000).
When looking exclusively at science performance assessment with English Learners, the
research base is quite sparse. Nevertheless, there are indications of similar linkages between
English proficiency and student scores (see for example Shaw, 1997). Once again, the limited
findings are insufficient to determine a consensus on the relative merit of using performance
assessments to measure the achievement of English Learners in inquiry-based science
classrooms.
The present study was conducted to address these gaps in the literature. While not
comparative in nature (i.e., traditional versus performance), the purpose of the study was to
investigate a context in which performance assessments were indeed being used to measure the
achievement of English Learners who in fact were taught science via inquiry-based instruction.
Accordingly, this study examined the learning of fifth grade English Learners in inquiry-based
science classrooms as measured by a set of three curriculum-embedded performance
assessments. Specifically, this post-hoc analysis sought answers to the following questions:
1. What are the patterns of performance for all students, for English Learners in
general, and for English Learners at different levels of proficiency in English?
2. Are there statistically significant differences between the performance of English
Learners and Non-English Learners?
As an educator, I come to this study with prior personal experience in teaching science to
English Learners. Such background knowledge sensitizes me to the difficulties inherent in
educating and assessing this segment of the student population. My experiences as a researcher
have bolstered my informed skepticism regarding the ability of any assessment, performance or
otherwise, to accurately measure the achievement of English Learners. Both sets of experiences
motivate me to conduct studies such as this in the interest of identifying equitable educational
practices for English Learners.
Research Design
Context. This study is based on the work of students taught science by teachers who were
participants in a recently completed multi-year, multi-district, NSF-funded science education
reform initiative known as STEP-uP (Science Teacher Enhancement Program unifying the Pikes
Peak region). STEP-uP efforts to improve student learning included the development of
performance assessments integrated with inquiry-based curriculum units taught at each grade
level, K-5. A prominent feature of the STEP-uP model is the engagement of participating
teachers in professional development on the curriculum units as well as the assessments.
This study focused on the scores of 5th grade students in the Abacus School District
(pseudonym), a member of the STEP-uP consortium, during the 2004-2005 school year. This
timeframe was selected in part due to the fact that it was the first year in which all three
assessments were available for implementation in their final or near final form (see discussion of
the assessment development process in the next section). The grade level was selected in the
interest of minimizing the potentially negative effect on student scores due to lack of previous
exposure to this type of assessment. Attrition and migration rates aside, students were more likely
to have had prior experience with STEP-uP developed performance assessments by the time they
reached the 5th grade. A single district was selected to minimize variations in assessments and
student classification – STEP-uP districts chose their own scope and sequence of science units
and standardized statewide procedures for classifying English Learners were not yet in effect.
Assessments. STEP-uP-developed performance assessments were used to measure
student learning for the three science units taught during the 2004/2005 school year at the 5th
grade level: Ecosystems, Food Chemistry, and Microworlds. These names correspond to titles of
curriculum units or kits developed by the Science and Technology for Children program and
linked to learning gains for diverse students (Amaral, Garrison, & Klentschy, 2002). In actuality,
the STEP-uP performance assessments are “refined and revised” versions of embedded
assessments existing within those units of study (Kuerbis & Mooney, 2008).
As might be inferred from their titles, the three kits cover a range of grade-level
appropriate science content from the biological and the physical sciences. While divergent in
content coverage, all three assessments share a common focus on science process skills, and,
according to project documents, assess student understanding of scientific investigation and
design, including appropriate communication of the results from scientific investigations (STEP-uP, 2003; STEP-uP, 2004; STEP-uP, 2005).
The performance assessments were developed as part of an “embedded assessment
package” that includes rubrics for scoring student responses to the assessments. The
accompanying administration manuals include correlations of the tasks to state standards, guidelines for
administering the tasks, and samples of student responses collected during the development
process. All of the assessments and supporting documents are provided in English only.
The assessments were developed by Design Teams that included two to three classroom
teachers with prior experience teaching the particular kit and a university scientist
knowledgeable in the kit’s science content. While inclusion of the latter member speaks in a
limited fashion to content validity, no formal studies were conducted in relation to the technical
quality of the assessments (e.g., validity or reliability). Actual student papers were not available
for the researcher to score and check reliability.
As part of the development process, Design Teams enrolled in a college level course led
by an assessment development expert. Design Teams created the assessments and their
accompanying administration manuals as part of the course. Overall, STEP-uP assessment
development was guided by the dual philosophies of assessment of learning – to provide
evidence of achievement for public reporting and for the purpose of accountability – and
assessment for learning – to help students and teachers for the purpose of promoting greater
learning (STEP-uP, 2003, p. 1). In particular, performance assessments were envisioned as
summative assessments of learning and targeted to measure “enduring” scientific understandings,
as opposed to important facts, in the manner described by Wiggins and McTighe’s (1998)
backward design approach. Nevertheless, STEP-uP performance assessments do include aspects
of formative assessment by calling for students to assess their own work with rubrics as part of
the assessment process.
Each performance assessment was created according to a three-year development cycle
that included initial design (stage 1), pilot/field test (stage 2), and implementation (stage 3). At
the time of the study, the assessments for Ecosystems and Food Chemistry had reached stage 3;
the Microworlds assessment was undergoing field tests. Pilot and field-testing occurred in
project-affiliated schools from all five of the participating STEP–uP districts. Efforts were made
to have test sites reflect the student diversity of the participating districts in terms of ethnicity,
socioeconomic status, special education, and English Learners. Contrary to these intentions, the
latter group was distinctly under-represented.
As a group, the STEP-uP assessments share several features. They all invite
students to apply previously learned knowledge and skills to novel situations, for example,
researching and reporting on the components of a not yet studied ecosystem. They incorporate
content and procedures presented in previous lessons and are integrated into the units as part of
the standard course of instruction. An important component of the STEP-uP assessment
development process was the creation of scoring guides or rubrics for judging the quality of
student work. Further details on the STEP-uP assessment development process are provided in
Kuerbis and Mooney (2008). Student instructions for each of the assessments are provided in
Appendices A, C, and E.
While all the assessments are to be implemented over multiple class sessions, they vary
in other design features and administration. For example, Space Snack and The World Down
Under (the performance assessments for the Food Chemistry and Microworlds units,
respectively) incorporate hands-on laboratory investigations during which students gather data to
serve as evidence for supporting a claim while Relationships in an Ecosystem (the performance
assessment for the Ecosystems unit) is an extended research project during which students are
expected to use resources including the Internet to gather information about a particular
ecosystem (STEP-uP, 2003, p. 18). However, the latter assessment does share the “make and
defend a claim” aspect of the previous two by requiring the students to predict the consequences
of disrupting one of the organisms in their ecosystem, although this is not the primary focus of
the task.
Other ways in which the assessments differ include placement within the science unit and
allotment of class time. Space Snack and The World Down Under occur as a final lesson while
Relationships in an Ecosystem begins halfway through the Ecosystems unit, with students
presenting their research at the end of the unit. Understandably, Relationships in an Ecosystem
has the longest suggested total time for completion (over six hours in three sessions) while The
World Down Under has the shortest (a maximum of two hours over two sessions). At roughly
three hours, Space Snack is relatively close to Relationships in an Ecosystem with respect to total
time but has the greatest number of class sessions, namely four.
Variation exists also in how students are organized to complete the tasks as well as the
final products. While all three assessments are primarily group activities with some individual
events (e.g., students do lab work together but record observations in their own science
notebooks), the guidelines suggest forming small groups for Space Snack and pairs for
Relationships in an Ecosystem and The World Down Under. Similarly, all three assessments
have a written component (poster, chart or data table, lab report). Relationships in an Ecosystem
and Space Snack both include oral presentations (to the whole class versus to the teacher,
respectively) while The World Down Under does not. Key features of the assessments as
discussed above are presented in Table 1. Note that in subsequent tables and sections in this
paper, the assessments are referred to by unit name, for example, Microworlds for The World
Down Under.
In sum, the three performance assessments in this study measure student attainment of
scientific concepts and skills associated with inquiry-based science instruction. However, they
employ varying approaches to having students gather evidence and present their findings as
described in the following thumbnail portrayals: Relationships in an Ecosystem is a long-term
research task for which student pairs present their graphically displayed information orally to the
whole class, Space Snack is an extended lab investigation wherein groups of students orally
present their findings to the teacher, and The World Down Under is a focused lab investigation
in which pairs of students submit written documentation of their findings for teacher appraisal.
TABLE 1
Key features of the three assessments

Ecosystems – Relationships in an Ecosystem
  Task Description (Response format): Research, prepare, and present a poster that shows interrelationships among organisms in an ecosystem (graphic and oral)
  Placement in Unit: Middle to end
  Class Sessions (suggested time in minutes): 1. Task Introduction (45); 2. Poster Creation (180-240); 3. Presentations (60-120)
  Rubrics: Visual Display; Oral Presentation
  Student Organization: Pairs

Food Chemistry – Space Snack
  Task Description (Response format): Test 5 snack foods for presence of 4 nutrients, decide which is best for space travel, defend choice (written and oral)
  Placement in Unit: End
  Class Sessions (suggested time in minutes): 1. Planning (50); 2. Testing (40-60); 3. Preparation (30); 4. Presentation (60)
  Rubrics: Chart; Testing; Interview
  Student Organization: Groups of 3 to 4

Microworlds – The World Down Under
  Task Description (Response format): Examine 3 water samples for presence of microbes, decide which is safest to drink, defend choice (written)
  Placement in Unit: End
  Class Sessions (suggested time in minutes): 1. Lab Work (50-60); 2. Report Writing (30-60)
  Rubrics: Lab Work; Final Report
  Student Organization: Pairs
Students. The study sample consists of 589 fifth grade students in the Abacus School
District, each of whom took all three performance assessments during the 2004-2005 school year.
While the complete data set provided by the district included over 800 students, some had scores
for only one or two of the assessments. The original data set was restricted to those students with
scores on each assessment in order to support more accurate comparisons of student performance
across the assessments.
Based on local application of district guidelines, school officials assigned English
Learners to one of the following mutually exclusive subgroups (listed in order of increasing
proficiency in English): Non-English Proficient (NEP), Limited English Proficient (LEP),
and reclassified as Fluent English Proficient from less than one up to three years prior (EXIT).¹ Of the
total sample, 44 (7.5%) are classified as English Learners. The distribution of English Learners
by subgroup is 6 (1.0%) NEP, 29 (4.9%) LEP, and 9 (1.5%) EXIT. These and other student
demographic data, such as gender and ethnicity, are presented in Table 2. As that table shows,
the total sample was predominantly students of color (396 or 67.3%) – with Hispanics (203 or
34.4%) being the largest ethnic group – and low SES (399 or 67.7%), essentially evenly matched
with respect to gender (50.2% female and 49.7% male) and with smaller numbers of students
classified as gifted (28 or 4.7%) and special education (62 or 10.5%).² Information on non-English Learner demographic groups is provided for contextual purposes; their performance is
reported elsewhere (Shaw & Nagashima, 2007).
1. In the data provided by the district, the EXIT category is further subdivided based on length of time without receiving special English language support services (e.g., EXIT1 meaning one year without such services). Due to the low numbers of these subgroups, they were all collapsed into the single variable “EXIT.”
2. There was little overlap between English Learners and Special Education students – one student each in the LEP and NEP categories was also classified as Special Education, and none in EXIT.
TABLE 2
Demographic characteristics of the student sample (N = 589)

STUDENT GROUP                          N    Percent
ENGLISH LEARNER
  EXIT                                 9     1.5%
  LEP                                 29     4.9%
  NEP                                  6     1.0%
Non-English Learner                  545    92.5%
ETHNICITY
  American Indian/Alaskan Native      13     2.2%
  Asian                               23     3.9%
  Black                              157    26.6%
  Hispanic                           203    34.4%
  White                              193    32.7%
GENDER
  Female                             296    50.2%
  Male                               293    49.7%
GIFTED AND TALENTED
  Gifted                              28     4.7%
SOCIO-ECONOMIC STATUS
  Free/Reduced Lunch                 399    67.7%
SPECIAL EDUCATION
  Special Educational Needs           62    10.5%
Scores. Teachers used STEP-uP developed rubrics to score their own students’ responses.
These assessment-specific rubrics use a common 4-point scale with 4 = Advanced, 3 =
Proficient, 2 = Partially Proficient, 1 = Unsatisfactory.³ The administration manual for each
assessment (STEP-uP, 2003; STEP-uP, 2004; STEP-uP, 2005) provides teachers with rubrics and
samples of student work at each of the score levels. Although assigned to individuals, the scores
reflect the collaborative nature of the assessments either directly or indirectly. As examples of
the former, the Proficient level of Relationships in an Ecosystem’s Visual Display rubric states,
“The web on our poster provides evidence of our conceptual understanding…” (emphasis added,
see Appendix B) and Food Chemistry’s Testing rubric declares, “Our group performed…”
(emphasis added, see Appendix D). While The World Down Under’s Lab Work rubric stresses
individual accountability – “Without assistance I was able to…” (emphasis added, see Appendix
F) – the work is actually conducted in pairs (see “Student Organization” row of Table 1).

3. Administration manual documents direct teachers to assign a score of “Not Scored” = 0 for students who “missed 50% or more of science kit instruction time” (STEP-uP, 2003, p. 50).
Although each assessment employed at least two different rubrics (see “Rubrics” row in
Table 1), teachers assigned students a single whole number score on each assessment. While
presumably covered in more detail during STEP-uP assessment professional development
settings, the administration manuals provide teachers with minimal guidance for determining
this synthesized score: “After you have marked the rubrics assign an ‘overall’ level of
‘Advanced’, ‘Proficient’, ‘Partially Proficient’ or ‘Unsatisfactory’ to the student” (STEP-uP,
2005, p. 18). Only overall scores were reported to the district; they are the scores used in this
study.
Additional Data. As part of a related study, copies of the administration manual for each
assessment were obtained and interviews were conducted with STEP-uP staff including the
principal investigators and the director of STEP-uP assessment development work. These
sources provided contextual information such as details on the assessment development process
presented above.
Methods. Given the relatively small sample sizes, especially with respect to the English
Learner subgroups (see Table 2), this study focused on describing the science achievement of
English Learners as measured by the three performance assessments. Accordingly, individual
student scores were used to generate a variety of descriptive statistics – including means,
standard deviations, and ranges – for the total sample, for English Learners as a whole, and for
the three English Learner subgroups (EXIT, LEP, NEP). In addition, student performance was
examined by subtracting mean scores for different pairs of groups on each assessment (e.g.,
English Learner minus Non-English Learner on Ecosystems). Pearson correlation coefficients
were calculated to show the relationships among the three assessments.
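To make these computations concrete, the following Python sketch shows how the descriptive statistics, group mean differences, and inter-assessment correlations could be generated from a score file. The file name and column names are hypothetical stand-ins for illustration, not artifacts of the study.

```python
import pandas as pd

# Hypothetical layout (illustrative names, not from the study): one row per
# student with an English Learner subgroup label and three assessment scores.
scores = pd.read_csv("step_up_scores.csv")
# columns: subgroup in {"EXIT", "LEP", "NEP", "Non-EL"},
#          ecosystems, food_chemistry, microworlds (each scored 0-4)

assessments = ["ecosystems", "food_chemistry", "microworlds"]
is_el = scores["subgroup"].isin(["EXIT", "LEP", "NEP"])

# Descriptive statistics by subgroup (cf. Table 4)
print(scores.groupby("subgroup")[assessments].agg(["mean", "std", "min", "max"]))

# Mean-score differences relative to Non-English Learners (cf. Table 6)
non_el_means = scores.loc[~is_el, assessments].mean()
print("EL - Non-EL:", (scores.loc[is_el, assessments].mean() - non_el_means).round(2).to_dict())
for sub in ["EXIT", "LEP", "NEP"]:
    sub_means = scores.loc[scores["subgroup"] == sub, assessments].mean()
    print(f"{sub} - Non-EL:", (sub_means - non_el_means).round(2).to_dict())

# Pearson correlations among the three assessments (cf. Table 3)
print(scores[assessments].corr(method="pearson").round(3))
```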
Although the two groups are widely divergent in size, the existence of significant differences between all
English Learners (n=44) and Non-English Learners (n=545) was investigated by conducting
three unpaired t-tests, one for each assessment. While uneven sample size is not problematic for
such a procedure, it was felt that such analyses would not yield meaningful results for the
appreciably smaller English Learner subgroup populations, especially the single digit values for
EXIT (n=9) and NEP (n=6).
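A minimal sketch of this significance testing, under the same hypothetical data layout as above: SciPy’s independent-samples t-test with pooled variance reproduces the degrees of freedom reported later in Table 5 (44 + 545 - 2 = 587).

```python
import pandas as pd
from scipy import stats

scores = pd.read_csv("step_up_scores.csv")  # hypothetical file; see previous sketch
is_el = scores["subgroup"].isin(["EXIT", "LEP", "NEP"])

# Unpaired t-test per assessment, English Learners vs. Non-English Learners.
# equal_var=True gives the pooled-variance test whose degrees of freedom,
# n1 + n2 - 2 = 44 + 545 - 2 = 587, match those reported in Table 5.
for col in ["ecosystems", "food_chemistry", "microworlds"]:
    el, non_el = scores.loc[is_el, col], scores.loc[~is_el, col]
    t, p = stats.ttest_ind(el, non_el, equal_var=True)
    print(f"{col}: t = {t:.2f}, df = {len(el) + len(non_el) - 2}, p = {p:.2f}")
```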
Findings
Student performance findings are presented using two complementary frames of
reference: student group affiliation and score differences. The former includes comparisons from
the total sample to English Learners in the aggregate down to the English Learner subgroups.
The latter uses Non-English Learners as a common reference point for comparisons with English
Learners as a whole and the English Learner subgroups. Both sets of findings include
performance on each of the three assessments, Ecosystems, Food Chemistry, and Microworlds.
Findings related to the English Learner subgroups should be considered with caution given the
small sizes of those samples. To provide contextual information on the functioning of the
instruments themselves, the section begins with the presentation of correlational data among the
three assessments.
Inter-Assessment Correlations
Correlations were calculated for the three possible pairings of the assessments in this
study: Ecosystems and Food Chemistry, Ecosystems and Microworlds, Food Chemistry and
Microworlds. The resulting coefficients were .400, .326, and .509, respectively. All three
correlations were direct (i.e., in a positive direction) and each was significant at the p < .01 level
(see Table 3).
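For readers wishing to verify the reported significance levels, the p-value of a Pearson r with n = 589 can be recovered from the standard t transformation; a minimal sketch (SciPy assumed available):

```python
from math import sqrt
from scipy import stats

def pearson_p(r: float, n: int = 589) -> float:
    """Two-tailed p-value for a Pearson correlation via t = r*sqrt(n-2)/sqrt(1-r^2)."""
    t = r * sqrt(n - 2) / sqrt(1 - r ** 2)
    return 2 * stats.t.sf(abs(t), df=n - 2)

for r in (.400, .326, .509):
    print(f"r = {r}: p = {pearson_p(r):.2e}")  # all well below .01
```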
Findings Based on Student Group Affiliation
This set of findings includes three increasingly specific levels of comparison. First are
findings for English Learners in relation to the total sample. Second are more focused
comparisons between English Learners and Non-English Learners, including the results of
significance tests. Third are comparisons between the English Learner subgroups (EXIT, LEP,
and NEP) and the Non-English Learner population.
TABLE 3
Correlations among the three assessments (n = 589)

                  Ecosystems   Food Chemistry   Microworlds
Ecosystems            —
Food Chemistry      .400**           —
Microworlds         .326**         .509**            —

** Significant at p < .01
English Learners and the Total Sample. English Learners scored slightly lower (an
average .05 of a point) than the total sample on each of the three assessments (see Table 4).
Mean scores for the total sample, including English Learners, ranged from a high of 2.87 (SD =
.778) for Food Chemistry to a low of 2.83 (SD = .860) for Ecosystems with Microworlds falling
in between the two with a mean of 2.86 (SD = .729). For English Learners taken as a whole, the
means were even closer in value: 2.80 (SD = .765), 2.80 (SD = .904), and 2.82 (SD = .657) for
Ecosystems, Food Chemistry, and Microworlds, respectively. Both groups exhibited the full
range of scores (0-4) on each assessment with the exception of English Learners on Microworlds
where the range was 2-4. Rounding the means to the nearest whole number yields a “universal”
mean of 3.
English Learners versus Non-English Learners. English Learners scored slightly lower
than Non-English Learners (total sample minus English Learners; see Table 4). The highest score
for English Learners was on Microworlds at 2.82 (SD = .657). The lowest score for English
Learners was on both Ecosystems and Food Chemistry at 2.80 (SD = .765 and .904,
respectively). The highest score for Non-English Learners was on both Ecosystems and
Microworlds at 2.87 (SD = .780 and .735, respectively). The lowest score for Non-English
Learners was on Food Chemistry at 2.84 (SD = .857). As with the total sample comparison, both
groups exhibited the full range of scores (0-4) on each assessment with the exception of English
Learners on Microworlds where the range was 2-4. Rounding mean scores to the nearest whole
number for both groups again yields the universal mean of 3 on all three assessments. T-test
analyses showed no significant differences in the mean scores between English Learners and
Non-English Learners on any of the assessments (see Table 5).

TABLE 4
Student performance on the three assessments

                            Ecosystems          Food Chemistry      Microworlds
STUDENT GROUP (n)           M (SD)       Range  M (SD)       Range  M (SD)       Range
Total Sample (589)          2.83 (.860)   0-4   2.87 (.778)   0-4   2.86 (.729)   0-4
English Learner (44)        2.80 (.765)   0-4   2.80 (.904)   0-4   2.82 (.657)   2-4
Non-English Learner (545)   2.87 (.780)   0-4   2.84 (.857)   0-4   2.87 (.735)   0-4
EXIT (9)                    3.11 (.334)   3-4   3.22 (.667)   2-4   3.00 (.707)   2-4
LEP (29)                    2.79 (.675)   2-4   2.69 (.930)   0-4   2.83 (.658)   2-4
NEP (6)                     2.33 (1.36)   0-4   2.67 (.916)   1-4   2.50 (1.25)   2-3

Note. Maximum score = 4.
TABLE 5
Results of t-tests comparing English Learners and Non-English Learners on all three assessments

ASSESSMENT         t      df     p
Ecosystems        0.24    587   .80
Food Chemistry    0.04    587   .96
Microworlds       0.23    587   .81
English Learner Subgroups. Except for EXIT, English Learner subgroups scored lower
than Non-English Learners on all three assessments (see Table 4). Mean scores for English
Learner subgroups (EXIT, LEP, NEP) show more variation than the previous two sets of
comparisons. With values of 3.22 (SD = .667) on Food Chemistry, 3.11 (SD = .334) on
Ecosystems, and 3.00 (SD = .707) on Microworlds, the EXIT subgroup had the highest mean
scores of all English Learner subgroups as well as any group within the sample. LEP subgroup
means ranged from a high of 2.83 (SD = .658) on Microworlds to a low of 2.69 (SD = .930) on
Food Chemistry, with a mean of 2.79 (SD = .675) on Ecosystems. In descending order, NEP subgroup
means were 2.67 (SD = .916) on Food Chemistry, 2.50 (SD = 1.25) on Microworlds, and
2.33 (SD = 1.36) on Ecosystems. Rounding mean scores for English Learner subgroups reveals
a continuation of the universal mean of 3 except for NEP students on Ecosystems where the
resultant mean is 2.
English Learner subgroups’ score ranges also exhibited greater variation than groups
previously presented. LEP and NEP students showed the full range of scores (0-4) on Food
Chemistry and Ecosystems, respectively. NEP students had a three-point spread of scores on
Food Chemistry (1-4), while two-point spreads (2-4) were exhibited by EXIT students on
Food Chemistry and Microworlds and by LEP students on Ecosystems and Microworlds. A one-point
spread was seen with EXIT students on Ecosystems (3-4) and NEP students on
Microworlds (2-3).
Findings Based on Score Differences
This set of findings includes comparisons based on differences in mean scores between
English Learners (and English Learner subgroups) and Non-English Learners on each of the three
assessments. Absolute values for these paired differences range from .04 to .54. For English
Learners, the difference with Non-English Learners was .07 on Ecosystems, .04 on Food
Chemistry, and .05 on Microworlds. Corresponding values for the English Learner subgroups
were: .24, .38, and .13 for EXIT; .08, .15, and .04 for LEP; and .54, .17, and .37 for NEP. These
comparisons are summarized in Table 6.
Discussion
This study examined 5th grade students’ scores on three science performance assessments
embedded within inquiry-based units of instruction. Students were taught the units, which
included taking the assessments, by teachers in schools of one of five districts that participated in
a multi-year, National Science Foundation supported science education reform project. Teachers
in these districts received professional development on the science units as well as the
performance assessments. Given this context, this study set out to (a) document patterns of
performance for students as a whole and English Learners, including those at different levels of
proficiency in English, and (b) ascertain whether or not statistically significant differences
existed between the performance of English Learners and their Non-English Learner peers.
TABLE 6
English Learner versus Non-English Learner mean score differences by assessment

COMPARISON GROUPS                           Ecosystems   Food Chemistry   Microworlds
English Learner vs. Non-English Learner       -.07           -.04            -.05
EXIT vs. Non-English Learner                   .24            .38             .13
LEP vs. Non-English Learner                   -.08           -.15            -.04
NEP vs. Non-English Learner                   -.54           -.17            -.37

Note. Maximum score difference = 4.00.
The latter objective was answered by t-test results: no significant differences were
found between English Learners and Non-English Learners on any of the three assessments.
With respect to the former objective, two overarching patterns emerged from the data: (1)
student groups shared a common mean score, and (2) English Learners underperformed in
relation to Non-English Learners. Both patterns – English Learner underperformance and the
homogeneity of means – were evident across all three assessments in the study. Details regarding
these patterns, including exceptions and possible explanations, are discussed below. As
previously mentioned, statements referring to the performance of English Learner subgroups
(i.e., EXIT, LEP, and NEP) need to be considered cautiously due to the small size of those
samples. Given the limited scope of the study – a single grade level within one district for one
school year – all suggested interpretations warrant further investigation.
English Learner Underperformance
Taken as a whole or considered in subgroups, English Learners scored lower than their
Non-English Learner counterparts regardless of assessment. The one exception to this pattern
was the stronger performance of EXIT English Learners in comparison to Non-English Learners, as
well as the other two English Learner subgroups, on all three assessments. Lee and colleagues
(2006) noted a similar pattern with their “exited” English Learners. The additional language
support received by these students prior to reclassification may make recently reclassified
English Learners, as a group, better prepared to function in academic contexts than the typical
Non-English Learner.
The magnitude of English Learner/Non-English Learner differential performance merits
further consideration. Using absolute values, score differences between English Learners/English
Learner subgroups and Non-English Learners ranged from a high of .54 of a point for NEP
students on the Ecosystems assessment to a low of .04 of a point shared by English Learners on
Food Chemistry and LEP students on Microworlds (see Table 6). On average, English Learners
scored .19 of a point lower than Non-English Learners. Given the assessments’ four-point scale,
it is reasonable to assign a minimum .50 threshold to identify differences having “practical”
significance (differences of a half point or more would change a student’s placement on the
assessment-affiliated rubrics). Applying that criterion, only the NEP difference on Ecosystems
warrants further consideration (see Literacy Considerations below).
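Applying the half-point criterion to the differences in Table 6 can be expressed directly; a small sketch with the absolute values transcribed from that table:

```python
# Absolute mean-score differences versus Non-English Learners (from Table 6)
diffs = {
    ("EL",   "Ecosystems"): .07, ("EL",   "Food Chemistry"): .04, ("EL",   "Microworlds"): .05,
    ("EXIT", "Ecosystems"): .24, ("EXIT", "Food Chemistry"): .38, ("EXIT", "Microworlds"): .13,
    ("LEP",  "Ecosystems"): .08, ("LEP",  "Food Chemistry"): .15, ("LEP",  "Microworlds"): .04,
    ("NEP",  "Ecosystems"): .54, ("NEP",  "Food Chemistry"): .17, ("NEP",  "Microworlds"): .37,
}

THRESHOLD = 0.50  # half a point on the 4-point rubric scale
flagged = [pair for pair, d in diffs.items() if d >= THRESHOLD]
print(flagged)  # [('NEP', 'Ecosystems')] -- the only practically significant gap
```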
Both the general pattern and its exception support a mutual underlying principle:
increased levels of English proficiency correlate with higher achievement scores. Non-English
Learners scored higher than English Learners, and EXIT English Learners scored higher than
LEP English Learners, who in turn scored higher than NEP English Learners. While not
surprising, this common sense outcome lends credence to the notion that the assessments in some
respect measure proficiency in English as well as science content.
Homogeneity of Means
Raw score means for all student subgroups in the study show a limited range of variance.
As the t-test results showed, the differences in mean scores of English Learners and Non-English
Learners failed to reach statistical significance. A useful metric to determine practical
significance is the translation of raw score means to the performance levels shared by all the
rubrics for each of the three assessments (4 = Advanced, 3 = Proficient, 2 = Partially Proficient,
1 = Unsatisfactory). As previously mentioned, rounding the means to the nearest whole number
yields a universal mean of 3 or “Proficient.” The lone exception to this pattern was with NEP
students on the Ecosystems assessment; rounding in this case results in a score of two or
“Partially Proficient.”
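As a sketch of this translation (the level labels come from the STEP-uP rubrics; the rounding rule is the simple nearest-whole-number convention used above):

```python
LEVELS = {4: "Advanced", 3: "Proficient", 2: "Partially Proficient", 1: "Unsatisfactory"}

def performance_level(mean_score: float) -> str:
    """Map a raw mean on the 4-point scale to the nearest rubric level."""
    return LEVELS[min(4, max(1, round(mean_score)))]

print(performance_level(2.86))  # Proficient (total sample, Microworlds)
print(performance_level(2.33))  # Partially Proficient (NEP, Ecosystems)
```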
This homogeneity of means may be attributed in part to the shared nature of the scores.
While reported for individuals, the scores likely reflect the inter-student collaboration built into
the assessments. For each task, students work either in pairs or small groups. Moreover, language
on the rubrics for Ecosystems and Food Chemistry shows an explicit sharing of scores among all
group members – “We label and are able to define…” (Appendix B) and “Our group tested and
recorded…” (Appendix D). While the rubrics for Microworlds contain more individualistic
language – “I was able to …” (Appendix F) – students worked in pairs to generate the
information on which their individual responses were based. Thus, how representative this
study’s findings are for the groups examined depends in part on the composition of the student
groups sharing the scores. The post-hoc design of this study precluded attaining such
information.
Summary. The results of this study indicate that when used to measure student learning in
inquiry-based science instructional contexts, performance assessments may contribute to a
leveling of the playing field for students with respect to proficiency in English. Whether an
English Learner or not, all students save a small group of NEP English Learners scored at the
proficient level. Although, as noted above, Non-English Learners scored higher than English
Learners on all three assessments, this is considered to have little practical significance; it
does not result in a different performance level designation on the four-point rubric. Contrary to
prior studies using scores from traditional assessments, English Learners did not exhibit
statistically significant differences in relation to their Non-English Learner peers (Amaral,
Garrison, & Klentschy, 2002; Lee et al., 2005).
Literacy Considerations
Literacy factors may account for some of the differences observed in student
performance. As mentioned above, at a very basic level student scores are related to their degree
of proficiency in English. As the Standards for Educational and Psychological Testing assert,
“For all test takers, any test that employs language is, in part, a measure of their language skills”
(American Educational Research Association, American Psychological Association, & National
Council on Measurement in Education, 1999, p. 91). Closer examination of the results for the
Ecosystems assessment is instructive in this regard.
As noted previously, the Ecosystems assessment was challenging for NEP students in
two ways. First, it was the one assessment on which their mean score translated to “partially
proficient” in rubric parlance. No other group scored at this level or lower on any of the other
assessments. Second, it was the one assessment on which the difference in scores between an
English Learner subgroup and its Non-English Learner comparison group passed the .50 of a
point threshold for practical significance. Looking more broadly, Ecosystems also had the lowest
mean score for the total sample and for English Learners (a distinction it shared with Food
Chemistry for the latter group). Finally, Ecosystems was the common assessment in the two
lowest correlations, indicating the measurement of something other than the science process
skills shared across all three assessments.
These combined findings suggest that the source of the relatively low student
performance associated with Ecosystems may be related to the nature of the assessment itself,
namely its unique high reliance on reading comprehension. The Ecosystems assessment is
essentially a research project that requires students to understand and synthesize information
from a variety of sources such as library books and the Internet. There are no such “external”
reading demands for either the Food Chemistry or Microworlds assessments. The Ecosystems
assessment also lacks the laboratory investigation aspect of the other two assessments. This
hands-on engagement aspect of Food Chemistry and Microworlds may make those tasks more
accessible to students, especially English Learners, thus contributing to the higher performance
levels observed with those two assessments. Granting merit to the higher reading demands
supposition, differences in student performance may be related to Shaftel et al.’s (2006) notion of
a “threshold of language proficiency” (p. 123), a point beyond which an assessment’s language
features no longer function as a barrier to performance. Due to greater
reading comprehension demands, Ecosystems may have a higher language proficiency threshold
than the other two assessments.
Study Limitations
Narrow scope and small sample sizes were noted previously as constraints to the power
and generalizability of the findings from this study. In addition, results and their interpretations
need to be considered in light of the following limitations: (a) comprehensive evidence
establishing the validity of the interpretation of scores from these assessments is lacking, (b)
student responses were scored by their respective teachers with no information gathered on
inter-rater reliability, and (c) demographic data do not include information on students’ native
languages.
With respect to point (a), Messick (1996) outlines six key aspects of a “unified concept of
validity”: content, substantive, structural, generalizability, external, and consequential. While the
results of the present study contribute evidence with respect to consequential validity (e.g.,
apparent lack of adverse consequences for English Learners due to bias in scoring), more
information needs to be gathered for this and other aspects of validity.
Point (b), inter-rater reliability, speaks to scoring issues such as appropriate application of
the rubric to individual student responses. Rater familiarity with language production features
common to non-native speakers of English is an issue when scoring responses from English
Learners. It has been addressed by the development of LEP-specific scoring procedures such as
those put forth by the Council of Chief State School Officers (CCSSO) (see, for example, Kopriva &
Sexton, 1999). However, such specialized training may not have been called for in this setting
due to rater prior knowledge of the students’ writing and speaking abilities in English. Student
responses were scored by their own teachers, persons with direct experience with each student’s
communication patterns.
Access to data on students' native languages, the subject of point (c), would allow finer-grained, less
hegemonic analysis of the multifaceted group known as English Learners. For example, English
Learners may perform differently based on the degree of difference between the alphabets of
their native language and English. Similarly, factors such as country of origin and length of time
in the United States might influence English Learners’ performance.
Bearing such limitations in mind, the curriculum-embedded nature of the assessments
may be a contributing factor to the positive findings noted above. Other than a novel context or
story line, the assessments used in this study ask students to perform in ways akin to what they
would already have done if they participated in and learned from the immediate prior instruction,
i.e., classroom-based experiences. As Lee (1999) states: “Achievement gaps among ethnic,
socioeconomic, and gender groups tend to be larger on items that call on outside-of-school
knowledge and experiences” (p. 100). The results reflect the converse of this pattern; that is,
assessments based on inside-of-school knowledge and experiences tend to narrow achievement
gaps among English Learners in particular. The findings of this exploratory study point to the
need for further examination of the achievement of English Learners as measured by
performance assessments, in science and other content areas.
Concluding Remarks
Scientific literacy involves more than memorization and regurgitation of facts. It includes
the application of scientific knowledge and skills to the investigation of phenomena and
the communication of the results. Performance assessments such as those in this study have the
potential to allow all students the opportunity to demonstrate such competencies in an active and
engaging manner. As one of the few studies documenting English Learners’ achievement in the
context of inquiry-based instruction as measured by performance assessments, this paper
provides valuable insights on such purportedly equitable practices in science education. Bearing
sample size and score reliability issues in mind, the findings indicate that when their inquiry-based science learning is measured with performance assessments, English Learners may exhibit
levels of achievement comparable to their native English-speaking peers. Contextual factors such
as a coherent curriculum, assessments aligned with and embedded within this curriculum, and
coordinated teacher professional development on the curriculum, instruction and assessment are
likely important contributing factors to this positive outcome.
These findings connect to the larger issue of the relationship between scientific literacy
and literacy in English. Although related, the former should not be considered as dependent on
the latter, or vice versa. While they are in fact interdependent in the context of US classrooms,
where the predominant language of instruction is English, a student's literacy skills in English
should not negatively impact her or his score on measures of scientific literacy where proficiency
in use of the English language is not the central focus of the assessment. Further research with
larger sample sizes (see Lee et al., 2008), especially for English Learner subgroups, and
incorporating analysis of qualitative data, such as student understanding of assessment prompts
(see Martiniello, 2008), is required to disentangle these interconnected elements so as to provide
more detailed guidance to the educators of America’s increasingly diverse student population.
References
Abedi, J. (2002). Standardized achievement tests and English language learners: Psychometric
issues. Educational Assessment, 8(3), 231-257.
Abedi, J., & Lord, C. (2001). The language factor in mathematics tests. Applied Measurement in
Education, 14(3), 219-234.
Abedi, J., Lord, C., Hofstetter, C., & Baker, E. (2000). Impact of accommodation strategies on
English language learners’ test performance. Educational Measurement: Issues and
Practice, 19(3), 16-26.
Amaral, O.M., Garrison, L., & Klentschy, M. (2002). Helping English learners increase
achievement through inquiry-based science instruction. Bilingual Research Journal
26(2), 213-239.
American Association for the Advancement of Science. (1989). Science for all Americans. New
York: Oxford University Press.
American Educational Research Association, American Psychological Association, & National
Council on Measurement in Education. (1999). Standards for educational and
psychological testing. Washington DC: American Psychological Association.
Baker, E. (1997). Model-based performance assessment. Theory Into Practice, 36(4), 247-254.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ:
Erlbaum.
Johnson, E., & Monroe, B. (2004). Simplified language as an accommodation on math tests.
Assessment for Effective Intervention, 29, 35-45.
Kim, E., Park, H., Kang, H., & Noh, S. (2000, April). Developing a framework for science
performance assessment. Paper presented at the annual meeting of the National
Association for Research in Science Teaching, New Orleans.
Klein, S. P., Jovanovic, J., Stecher, B. M., McCaffrey, D., Shavelson, R. J., Haertel, E., Solano-Flores, G., & Comfort, K. (1997). Gender and racial/ethnic differences on performance
assessments in science. Educational Evaluation and Policy Analysis, 19(2), 83-87.
Kopriva, R., & Sexton, U. M. (1999). Guide to scoring LEP student responses to open-ended
science items. Washington, DC: Council of Chief State School Officers.
Kuerbis, P. J., & Mooney, L. B. (2008). Using assessment design as a model of professional
development. In J. Coffey, R. Douglas, & C. Stearns (Eds.), Assessing science learning:
Perspectives from research and practice, (pp. 409-426). Arlington, VA: National Science
Teachers Association Press.
Lee, O. (1999). Equity implications based on the conceptions of science achievement in major
reform documents. Review of Educational Research, 69(1), 83-115.
Lee, O. (2005). Science education with English language learners: Synthesis and research
agenda. Review of Educational Research, 75(4), 491-530.
Lee, O., Buxton, C., Lewis, S., & LeRoy, K. (2006). Science inquiry and student diversity:
Enhanced abilities and continuing difficulties after an instructional intervention. Journal
of Research in Science Teaching, 43(7), 607-636.
Lee, O., Deaktor, R. A., Hart, J. E., Cuevas, P., & Enders, C. (2005). An instructional
intervention’s impact on the science and literacy achievement of culturally and
linguistically diverse elementary students. Journal of Research in Science Teaching,
42(8), 857-887.
Lee, O., Maerten-Rivera, J., Penfield, R. D., LeRoy, K., & Secada, W. G. (2008). Science
achievement of English language learners in urban elementary schools: Results of a first-year professional development intervention. Journal of Research in Science Teaching,
45(1), 31-52.
Lynch, S. (2000). Equity and science education reform. Mahwah, NJ: Lawrence Erlbaum
Associates.
Martiniello, M. (2008). Language and the performance of English-language learners in math
word problems. Harvard Educational Review, 78(2), 333-368.
Messick, S. (1996). Validity of performance assessments. In G. W. Phillips (Ed.), Technical
issues in large-scale performance assessment (pp. 1-18). Washington, DC: US
Government Printing Office, Report No. NCES-96-802. Available from ERIC
Document Reproduction Service, no. ED 399 300.
National Clearinghouse for English Language Acquisition and Language Instruction Educational
Programs (no date). Retrieved March 9, 2007 from http://www.ncela.gwu.edu/.
National Research Council. (1996). National science education standards. Washington, DC:
National Academy Press.
National Research Council. (2000). Inquiry and the national science education standards: A
guide for teaching and learning. Washington, DC: National Academy Press.
Shaftel, J., Belton-Kocher, E., Glasnapp, D., & Poggio, J. (2006). The impact of language
characteristics in mathematics test items on the performance of English language learners
and students with disabilities. Educational Assessment, 11(2), 105-126.
Shaw, J. M. (1997). Threats to the validity of science performance assessments for English
language learners. Journal of Research in Science Teaching, 34(7), 721-743.
Shaw, J. M., & Nagashima, S. (2007, April). Fifth Graders’ Science Achievement in an
Elementary Reform Initiative: Findings from Curriculum-Embedded Performance
Assessments. Paper presented at the annual meeting of the National Association for
Research in Science Teaching (New Orleans).
Stecher, B. M. (1995, April). The cost of performance assessment in science: The RAND
perspective. Paper presented at the annual meeting of the National Council on
Measurement in Education, San Francisco.
Stecher, B. M., & Klein, S. P. (1997). The cost of science performance assessments in large-scale testing programs. Educational Evaluation and Policy Analysis, 19(1), 1-14.
STEP-uP (Science Teacher Enhancement Program unifying the Pikes Peak Region). (2003).
Embedded assessment package for ecosystems. Colorado Springs, CO: Author.
STEP-uP (Science Teacher Enhancement Program unifying the Pikes Peak Region). (2004).
Embedded assessment package for food chemistry. Colorado Springs, CO: Author.
STEP-uP (Science Teacher Enhancement Program unifying the Pikes Peak Region). (2005).
Embedded assessment package for microworlds. Colorado Springs, CO: Author.
Wiggins, G. P., & McTighe, J. (1998). Understanding by Design. Alexandria, VA: Association
for Supervision and Curriculum Development.
APPENDICES
A. Ecosystems/Relationships in an Ecosystem Student Task Sheet
B. Ecosystems/Relationships in an Ecosystem Visual Display Rubric
C. Food Chemistry/Space Snack Student Task Sheet
D. Food Chemistry/Space Snack Testing Rubric
E. Microworlds/The World Down Under Student Task Sheet
F. Microworlds/The World Down Under Lab Work Rubric
APPENDIX A. Ecosystems/Relationships in an Ecosystem Student Task Sheet
RELATIONSHIPS IN AN ECOSYSTEM
Background of Task
When you visit a regional, state, or national park, you often find Visitors’ Centers that have
interesting displays or replications of the features of their particular area. These displays
accurately highlight the most important features of the area. They are attractive and
clearly labeled to help visitors learn and appreciate the natural components of their
surroundings. Visitors’ Centers employ naturalists who are able to answer your questions
and provide more information about the area.
Your Task
You and your ecocolumn partner will become naturalists for a different ecosystem other
than a river bank. Your teacher will assign you an ecosystem. As naturalists, you and your
partner will create a poster of the ecosystem and present it to the class. This poster will
be displayed in our classroom or possibly somewhere in our school so that others may learn
from your work.
1. Read and analyze the data sheet provided by your teacher.
2. Create a poster identifying and explaining the interrelationships among the
producers, consumers, scavengers/decomposers, and non-living components in this
ecosystem through the use of a web. Use pictures, drawings and labels to illustrate
the interrelationships.
Reminder: Think of the living/nonliving webs you made in your notebooks for ecocolumns.
(Refer to Lesson 5 and Lesson 7).
3. Orally present the poster to the class in order to educate your classmates about
your particular ecosystem. Your oral presentation should include a description of the
relationships in your ecosystem, using specific examples and answers to these
questions:
• What do you think is happening in this ecosystem that you can’t see?
• If you were to take out the ______ in this ecosystem, what might be the effects of not having ______ as part of the ecosystem?
(Your teacher will decide which living or non-living thing will be removed from your
ecosystem, and you will determine the impact this would have on your ecosystem).

As young naturalists, your visual display and oral presentation will be judged in the
following way:
• Ability to gather comprehensive information from data sheets
• Content of the visual display
• Communication skills in and content of oral presentation
APPENDIX B. Ecosystems/Relationships in an Ecosystem Visual Display Rubric
APPENDIX C. Food Chemistry/Space Snack Student Task Sheet
SPACE SNACKS
Background of Task
In the future, space travel will occur more frequently. Space travel will last longer and
require planning. Snack foods will be important to help space travelers stay healthy.
Quality snacks will be in demand for space travel.
Your Task
You have knowledge of the following things because of your recent work:
• Starch, glucose, fat, and protein tests
• The relationship of starch, glucose, fat, and protein to eating a nutritious diet
• Charting predictions and results
Because of your knowledge, NASA has approached you to help select appropriate snacks for
a space trip. The snack will be on a menu of available snacks. NASA has identified several
foods that may work, but they are not sure which snack will provide the best choice.
As an apprentice nutritionist, what one snack would you recommend for an astronaut to
take on their journey?
You will be evaluated on the following steps and requirements:
• Make a chart on which you can record test predictions and results.
• Include your investigation question on your chart.
• Tell about reasons for the predictions you make.
• Perform your tests carefully and accurately, and record your test results on your chart.
• Keep track of tests you do more than once and the reasons for doing so.
• Make a snack recommendation, thinking of reasons and evidence for your decision. Use your test results and your knowledge from the reading selections to help you.
• Think of errors you may have made during the testing and be ready to tell your teacher about them.
• Bring all charts, documents, and notebooks from testing to your interview and be ready to discuss their contents.
• Be sure to know how you think your snack is healthy for an astronaut and be ready to defend your recommendation.
NASA is counting on you to keep their astronauts healthy. Good luck!
APPENDIX D. Food Chemistry/Space Snack Testing Rubric
APPENDIX E. Microworlds/The World Down Under Student Task Sheet
Scientist’s Name
The World Down Under
Background of Task
Our class has received a letter from the Water Department of the Colorado Springs
Utilities. The department has been monitoring the streams in our community. They have
found several streams they believe might be contaminated. Due to a shortage of local
scientists, they are in need of students’ assistance. The Water Department has sent us
samples of stream water and the grasses that grow along the banks.
Your Task
Using all your newly acquired expertise in collecting samples, preparing slides, and using
microscopes, the Water Department feels that you are prepared to help them conduct tests
on the water and grass samples. You and your partner(s) are going to conduct tests on the
samples provided and report your findings to the Water Department. Based on the student
reports, the Water Department will determine if the stream is contaminated. Your task is
to:
1. Select which type of slides to use to test each sample (use your science notebook
for reference).
2. Using a microscope, make a detailed observation of each water sample.
3. Make a detailed recognizable drawing of any microbe and indicate which sample it is
from.
4. Write a description of all things seen under the microscope including shape, color,
movement, and the approximate number. State the type of slide you used.
5. Review the data you collect and decide which water sample you think would be the
safest to drink.
6. Complete Final Lab Report Form
As a scientist, your work will be judged in the following way:
1. Use of microscope
2. Appropriate selection and preparation of slides
3. Drawings of observations
4. Written descriptions of things observed
5. Completeness of the Final Lab Report
6. Work habits and safety
APPENDIX F. Microworlds/The World Down Under Lab Work Rubric