A FOURTH PATH—A NEW DIRECTION

Among high-impact educational practices, few are more controversial than assessment. Institutions addressing the assessment issue usually pursue one of three generally unsuccessful paths. This article examines those paths and raises the potential of a fourth: a new path to high impact, largely the result of rigorous research conducted over the last eight years by the research team at Chalk & Wire.

The scenario

The verdict handed down in an accreditation review is that the assessment of outcomes and learning is not adequately addressed across the institution. This is the leading reason cited by reviewing agencies for withholding the revered 10-year reaffirmation. Five years has become the norm, and some institutions are even required to report annually on progress in data-driven improvement efforts.

What do you do first?

1. Work on the General Education program to ensure it covers the Gen Ed outcomes and relates well to the institution's educational mission. These outcomes usually amount to 8-12 big statements that virtually no one could argue against.
2. Map the big outcomes to key assignments in a broad range of courses that students will take.
3. Start assessing with some sort of assessment system and run reports as the data rolls in.

For most, this process does not go well. Inter-rater reliability is consistently low. Put simply, well-intentioned efforts do not seem to succeed. Why does this happen?

The all-inclusive outcome is usually too big

The problems begin with outcome statements, which tend to read something like this: "The student demonstrates communication competency in writing and speaking standard English, in critical reading and listening, and in using information and research resources effectively." Read that aloud a few times and ask yourself how one would collect data to demonstrate learning progress over time against such a broadly stated goal.
It is daunting, because the researcher in us knows that any valid statement of progress over time requires consistently measured criteria, applied by multiple faculty teaching in different programs and disciplines, and collected from the moment a student crosses the threshold until he or she graduates. Stated as is, the outcome actually encompasses multiple competencies: writing, speaking, comprehending written and oral texts, and using sources effectively. What are the chances of all faculty agreeing on the quality of student work against an outcome statement that mixes several skills across so many contexts and over so many years? The question is, of course, rhetorical. But the odds are not good.

The paths chosen all have fatal flaws

At some point, many institutions trade their present assessment circumstances for one of the options below:

1. Each instructor uses his or her own rubric for each assignment.
2. A committee meets to hammer out a common rubric for a given assignment type.
3. The institution adopts someone else's 'rubrics with a pedigree', deemed acceptable by virtue of the authoring organization.

The first two options are problematic. Chalk & Wire has been investigating validity in standards-based assessment for many years, and for just as long we have watched people build their own rubrics. Almost without exception, the language in these rubrics is subjective, meaning that scores depend on what the assessor thinks is important and how he or she interprets the work. The result is invalid data.

The third option is one many institutions are sprinting toward, even though its long-term efficacy remains unexamined. A recent example is the use of the Association of American Colleges & Universities (AAC&U) VALUE Rubrics. What are they? Between 2007 and 2010, AAC&U developed a set of criteria called Valid Assessment of Learning in Undergraduate Education (VALUE).
The rubrics cover 16 dimensions of learning identified by university faculty as desirable skills for undergraduates. Several of the individual rubrics have been widely used by colleges and universities; case studies describe their successful implementation, and they have generated valuable conversations around assessment and learning.

On the one hand, the AAC&U has sparked a process that moves the higher education community forward in understanding and implementing standards-based competency assessment. On the other hand, there are unavoidable validity issues with their use. The VALUE Rubrics' one claim to validity is face validity: they appear to do what they are supposed to do. The 16 VALUE Rubrics are meant to measure skills critical to career readiness, and they appear to do that, at least to the institutions that have chosen to use them. Unfortunately, face validity is not legitimate evidence of valid and reliable measurement.

Taking another approach, the edTPA™ rubrics developed by the Stanford Center for Assessment, Learning and Equity (SCALE) are in use to assess teachers in traditional preparation programs at the national level. Unfortunately, the edTPA™ rubrics have not been shown to be reliable or valid despite their 'pedigree by association'. In fact, the edTPA™ rubrics are not released for general use, quite possibly because their reliability cannot be established.

In one respect, the U.S. Department of Education has hit the nail on the head with its recently released proposal for changing the way teacher preparation programs are assessed. The premise: if you want to know whether a teacher is effective, look at what that teacher's students have accomplished. Unfortunately, the plan for gathering the student-learning information needed to support this kind of high-stakes decision-making is flawed. Standardized test scores do not give an accurate picture of what students are able to do.
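Unlike face validity, inter-rater reliability can be checked directly against scoring data. As a minimal illustration (the rater scores below are hypothetical, and the function is written from the standard formula rather than taken from any rubric publisher), Cohen's kappa measures how much two raters agree beyond what chance alone would produce:

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters (Cohen's kappa)."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    # Observed agreement: fraction of artifacts given the same score.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement if both raters scored independently at their own base rates.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    labels = set(rater_a) | set(rater_b)
    expected = sum((freq_a[lv] / n) * (freq_b[lv] / n) for lv in labels)
    return (observed - expected) / (1 - expected)

# Hypothetical scores from two faculty assessing the same 10 artifacts on a 4-level rubric.
a = [1, 2, 2, 3, 4, 2, 3, 3, 1, 4]
b = [1, 2, 3, 3, 4, 2, 2, 3, 2, 4]
print(round(cohen_kappa(a, b), 2))  # prints 0.59
```

A kappa near 1 indicates strong chance-corrected agreement; values near 0 suggest the raters might as well be scoring independently, which is the pattern that subjective rubric language tends to produce.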
In response, and as mandated by accrediting agencies requiring evidence of learning, many institutions of higher education have acquired standards-based assessment software, often with the underlying assumption that some magic in the software will tell them about students' developing skills.

Both edTPA™ and AAC&U suggest that the way to raise inter-rater reliability is to eliminate scoring levels: assessment scores are more likely to agree if there are only three choices, 1, 2, or 3. While this may be true, what happens in this scenario is that the majority of students come in as a '2' and leave as a '3', and we still do not know much about skill development.

The Fourth Path

Fortunately, a new choice is emerging that may lead to better assessment practice, based on Chalk & Wire's extensive research: the Fourth Path.

Comprehensive Standards

Validity theory tells us that the language in the criteria used to evaluate student work must directly reflect the intent of a comprehensive set of standards. It became apparent years ago, as we studied the assessment systems of hundreds of universities, that there was a fundamental need for such a comprehensive standard to serve as an anchor for the whole process of generating robust scoring criteria, unencumbered by language already owned by stakeholders.

Chalk & Wire has developed such a set of comprehensive outcomes, adapted from the Information Literacy Standards for Higher Education, the National Educational Technology Standards, and the AAC&U VALUE Rubrics. Dubbed the Critical Literacy Standards (CLS), these describe essential life skills for active participation in "an economy dependent upon innovation and creativity" and define an effective approach to lifelong learning and problem solving. They include a broad, comprehensive set of rubrics that can be used across campus, from Gen Ed through specialized degree programs.
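The trade-off noted earlier, where fewer scoring levels buy higher agreement at the cost of information, can be simulated in a few lines. Both score lists and the 5-to-3 collapsing scheme below are invented for illustration:

```python
def percent_agreement(x, y):
    """Raw agreement: fraction of artifacts given the same score by both raters."""
    return sum(a == b for a, b in zip(x, y)) / len(x)

def collapse(scores, bins):
    """Map fine-grained levels onto coarser bands, e.g. 1-5 onto 1-3."""
    return [bins[s] for s in scores]

# Hypothetical scores from two raters on a 5-level rubric.
r1 = [1, 2, 2, 3, 4, 5, 3, 4, 2, 5]
r2 = [2, 2, 3, 3, 5, 4, 3, 3, 1, 5]

# Collapse to three levels: {1,2} -> 1, {3} -> 2, {4,5} -> 3.
bins = {1: 1, 2: 1, 3: 2, 4: 3, 5: 3}
print(percent_agreement(r1, r2))                                   # prints 0.4
print(percent_agreement(collapse(r1, bins), collapse(r2, bins)))   # prints 0.8
```

Agreement doubles once the scale is collapsed, yet no rater has become any more consistent; the merged levels have simply erased the distinctions that would describe skill development.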
The Critical Literacy Standards are currently undergoing validation at several major universities. They cover a learning progression from novice to expert and can be used to document developing skills, knowledge, and dispositions over a student's entire educational career, from Gen Ed through specialization, even if he or she changes schools.

Additionally, the Critical Literacy Standards subsume all other outcome sets. Chalk & Wire has cross-walked all the common national standards, the SPAs, and many other professional standards (Mental Health Counseling, Nursing, Engineering, Business, etc.) with the Critical Literacy Standards. This means that by using the Critical Literacy Standards, it is possible to measure all other outcomes required for accreditation with a single set of rubrics.

Core Validity

The foundation of a valid standards-based assessment system is appropriately titled 'Core Validity'. Chalk & Wire's Core Validity Toolkit is available to client schools to use either independently or with the assistance of Chalk & Wire staff. Starting with system set-up, Chalk & Wire advises institutions on developing reliable rubrics and linking them to the right outcomes in order to generate the desired customizable reports. Faculty are freed from grading, and rubrics are made more robust and reliable through the natural process of assessment, with no extra work.

The virtues of the Core Validity approach are that it is research-based, uses data to shape sound decisions that organically improve validity, and does not tangibly increase anyone's workload. Chalk & Wire has the expertise to know what information you need to show that your students are learning, and we have the tools to help you get that information.

Admittedly, not many people have taken the Fourth Path yet. It is new. But try it. The other three have proven unreliable; of that we are certain.
© Copyright 2024