
Final Technical Report E.3.1
Rosetta Stone™ Evaluation
Executive Report
Date of report: February 29, 2008
Information current as of: February 29, 2008
Contact Information:
Katie Nielson
[email protected]
www.casl.umd.edu
Authors:
Katie Nielson
Catherine Doughty
CDRL: A021
DID: DI-MISC 80508A
Contract No. MDA904-03-C0543
TTO 2101: Final Technical Report E.3.1
Table of Contents
Executive Summary
Introduction
Review and Evaluation
    Software Design
    Manufacturer’s Claims
The Empirical Study
    The Study Protocol
    Findings
Conclusion
Submitted: 03.07.2008
Executive Summary
This report discusses self-study with Rosetta Stone™, under typical agency workplace conditions, as a
way to demonstrate potential for success in language learning. Five questions addressed outcomes and
guidelines for online training in Arabic, Chinese, and Spanish.
Is self-study with Rosetta Stone™ an appropriate solution for USG employees who need to
demonstrate potential for success in language training? The tremendous demand for language
training expressed to CASL researchers by client agencies was clearly reflected in the full-capacity
enrollment of 50 study volunteers per language. However, of 120 volunteers who then obtained Rosetta
Stone™ licenses, only a single participant managed to complete Rosetta Stone™ Level 1.
What language gain can be expected from absolute beginners who use Level 1 Rosetta Stone™
according to manufacturer’s instructions? Due to the severe attrition from the self-study program, it
is difficult to answer this question. Of the 21 subjects remaining in self-study at the first assessment, those
who completed the requisite 50 hours did well on a test of the specific Rosetta Stone™ lesson
content (vocabulary and brief sentences). The single volunteer who completed the self-study and took an
oral proficiency interview commented: “While Rosetta Stone™ does teach a lot of words, they are not
always the words you need to have an actual conversation.”
Does Rosetta Stone™ work equally well with Arabic, Chinese, and Spanish? Again, due to the lack
of continued participation, we are not able to answer this question conclusively. Arabic and Chinese
students commented that it was difficult to grasp the script without an instructor, and that they had
trouble figuring out which words went with which pictures. This suggests that, at least initially, learning
these three languages with Rosetta Stone™ is not equal.
Should supervisors and/or trainees be given any guidelines for online self-study? Clearly finding
time to use the software in the workplace is a significant problem. Supervisors should consider
providing release time when making online training programs available.
How much supervision do trainees need with online self-study? Even with two technical points of
contact (the lead researcher and the agency technical specialist) there were significant online access
difficulties. Also, even with periodic monitoring and interim assessments, only about 25% of the study
volunteers who obtained accounts actually used Rosetta Stone™ for more than ten hours. Supervisors
should consider monitoring self-study and providing technical support, milestones and mentors.
Introduction
The U.S. Government (USG) has invested significantly in the online foreign-language training program
Rosetta Stone™. This particular package has been attractive to agencies and to the Armed Services as a
convenient way to meet immediate agency language-training needs and to help fulfill the objectives of
Goal 1 of the Defense Language Transformation Roadmap.[1] Rosetta Stone™ is available in thirty
different languages, including Less Commonly Taught Languages (LCTLs), such as Arabic, Chinese,
Farsi, and Pashto, and is designed to teach all four skills: reading, writing, listening, and speaking.
Students are not required to use a book—all of the lessons are entirely online. The program is
purportedly easy to access: Students can log into their accounts from any computer that can connect to
the Internet, so there are no CDs or DVDs to distribute. Finally, the software manufacturers have
marketed the package as a state-of-the-art solution using “the Dynamic Immersion Method,” which is
claimed to facilitate learning new languages very quickly and in the same communicative and interactive
way native languages are learned.
Despite its widespread use, little is known about how well this program works or of what can be
expected in terms of learners’ foreign language ability upon completion of the online self-study
program. Accordingly, the University of Maryland Center for Advanced Study of Language (CASL), a
trusted agent, was tasked with the evaluation of Rosetta Stone™, in a specific context of USG use for
three critical languages (Arabic, Chinese, and Spanish). The client group wished to gauge employees’
initial success with the program in order to determine eligibility for future language training.[2] In
addition, they were generally interested in the logistics of online language training in the workplace. In
order to evaluate how well Rosetta Stone™ meets these USG language-training needs, CASL
researchers performed an expert review of Level 1 Rosetta Stone™ and conducted an empirical study to
test the software in use. In this executive report, we first briefly explain the software design and consider
[1] U.S. Department of Defense document. January 2005. http://www.defenselink.mil/news/Mar2005/d20050330roadmap.pdf
[2] At these agencies, there is no in-house language training, so managers must decide whom to fund and provide release time for language study.
the content of the software in conjunction with the claims made by the manufacturers. Then, we present
the empirical study protocol and main findings, and finally we discuss a few important implications.[3]
Review and Evaluation
Software Design
Rosetta Stone™ consists of reading, writing, listening, and speaking exercises that require students to
match pictures (photographs or illustrations) to audio and/or text material. In each lesson of every unit,
the user is presented with a text, sound or image to match against four possibilities; the program keeps
track of progress and provides students with a cumulative lesson score as they complete the exercises.
The entire course is in the target foreign language without translations. There are no additional resources
provided for languages that do not use the Roman alphabet, and students are expected to learn reading
and writing without any explicit instruction of non-Roman scripts. The images used by Rosetta Stone™
are the same regardless of the language being studied, and, at times, they are not culturally relevant.[4]
Every lesson has a “test,” which is identical to the exercises, except that scores are not revealed until the
test is finished, and users have only one chance to match each picture. The tests can be done either
before or after the lessons. The software keeps track of student performance on all tests so that learners
can refer to previous test scores to track their progress in units. In all units, the last lesson is a review of
the previous lessons, with each lesson represented by one group of images.
Manufacturer’s Claims
In a document titled “Research Basis for the Dynamic Immersion Method,” the developers of Rosetta
Stone™ cite first and second language acquisition research in an attempt to explain and justify the
approach and scope of their program. First, the authors discuss research that indicates that a grammar-
[3] For more details on the evaluation of the software as well as the empirical study, please see the full version of this report: Nielson, K., Doughty, C., & Freynik, S. (2008). TTO 2101 Technical Report: Rosetta Stone™ Evaluation. College Park, MD: University of Maryland Center for Advanced Study of Language.
[4] We reviewed Version 2 for all three languages. Since our review, Version 3 has been published in Arabic, English, French, German, Italian, Portuguese, Russian, and Spanish (all other languages continue to use Version 2). We are conducting a systematic comparison of the two versions in Arabic and Spanish, to be made available in an upcoming technical report.
based approach to L2 learning is unproductive and that a more communicative approach responsive to
the learners’ “internal syllabus” is best.[5] This is widely accepted in the scientific literature. However,
while there is no overt grammar explanation of any kind in Rosetta Stone™, our review of the lessons
revealed a covert grammatical syllabus. Furthermore, the sequence of these materials is not based on a
learner’s own incremental acquisition of language, but rather is pre-determined by the program
developers. The fact that all Level 1 Rosetta Stone™ lessons are translations of one another suggests that
little thought went into how best to structure input to learners for optimal L2 acquisition.
The authors claim that “by combining genuine immersion teaching methods with interactive multimedia
technology, Rosetta Stone™ replicates the environment in which learners naturally acquire new
language” (pp. 2-3). This claim is patently false. The Rosetta Stone™ interface simply presents learners
with matching activities in which they guess or use a process of elimination to determine which words
or phrases go with particular pictures. This pales in comparison with an actual “immersion
environment,” in which learners would negotiate meaning with native speakers to accomplish real-life
tasks.
Finally, contrary to the developers’ emphasis on “communication,” upon completion of the self-study
program, students have not been exposed to fundamental material that one would typically expect, such
as basic greetings, how to introduce themselves, the word for bathroom, etc. Depending upon the
language, the verb tenses and/or vocabulary words chosen are not necessarily reflective of native-speaker usage, and few of the exercises seem to build actual fluency. Rather, this course provides
significant exposure to isolated vocabulary and phrases, but does not provide the tools for even the most
basic conversation.
The Empirical Study
The Study Protocol
After analysis of the underlying pedagogical structure of the Rosetta Stone™ self-study material, CASL
designed a study for 150 volunteers from two USG agencies in order to determine empirically the
[5] The authors cite the seminal work of S. P. Corder (1981). Error Analysis and Interlanguage. Oxford: Oxford University Press.
effectiveness of the program and to understand the conditions of software use. The study aimed to
answer the following five questions posed by agency clients:
1. Is self-study with Rosetta Stone™ an appropriate solution for USG employees who need to
demonstrate potential for success in language training?
2. What language gain can be expected from absolute beginners who use Level 1 Rosetta Stone™
according to manufacturer’s instructions?
3. Does Rosetta Stone™ work equally well with Arabic, Chinese, and Spanish?
4. Should supervisors and/or trainees be given any guidelines for online self-study?
5. How much supervision do trainees need with online self-study?
The study was open to personnel from all levels and positions within their agencies, and participants
decided to volunteer for a variety of personal and/or career-related reasons. They were offered the
choice of studying Arabic, Chinese or Spanish, but were required to be absolute beginners. Upon
enrollment in the study, some volunteers requested and received release time from their job duties for
language study, while the majority completed the training entirely on their own time. This emulated the
conditions under which Rosetta Stone™ is typically made available to USG personnel.
All study volunteers gave informed consent and agreed to the following conditions:
• Use the software for 10 hours per week for 20 weeks.
• Distribute learning by studying no more than 3 hours in a particular day.
• Complete a weekly electronic learner log, indicating how many hours they studied each day, reporting any issues with the software (technical or otherwise), and noting any outside foreign-language use.
• Complete a listening/speaking telephone assessment every 5 weeks or 50 hours of study.
• Complete an oral proficiency interview (OPI) after finishing the 200-hour course.
The lead researcher worked in tandem with agency technology personnel to provide Rosetta Stone™
licenses and access, and she served as the point of contact for any issues study participants encountered
while using the Rosetta Stone™ materials. In addition, she provided encouragement, monitored
progress, and sent regular group biweekly emails to all participants plus individual periodic reminder
emails to those who did not regularly login to the software. The study protocol included interim
assessments every five weeks, and a more comprehensive ACTFL[6] OPI as the final assessment of language
gains. These were administered over the telephone by very proficient speakers of Arabic, Chinese and
Spanish, and, in the case of the OPI, by an ACTFL-certified tester.
Findings
The executive report of findings is organized by the five questions asked by the USG client group:
Is self-study with Rosetta Stone™ an appropriate solution for USG employees who need to
demonstrate potential for success in language training?
The tremendous demand for language training expressed to CASL researchers by client agencies was
clearly reflected in the full-capacity enrollment of 50 study volunteers per language.[7] Despite this
significant initial interest, there was a precipitous drop-off once the study was underway. The
participation-record table below documents the severe attrition from the self-study program:
Table 1: Record of Participation

                                                      Arabic  Chinese  Spanish  Total
Volunteered and signed consent forms                    50      50       50      150
Obtained Rosetta Stone™ accounts                        50      37       33      120
Actually accessed accounts                              38      19       17       73
Spent more than 10 hours using Rosetta Stone™           18      13        5       32
Completed the first assessment (50 hours of use)        13       5        3       21
Completed the second assessment (100 hours of use)       5       0        1        6
Completed third and fourth assessments, and OPI          1       0        0        1
(200 hours of use)
The rapid decline in participation indicates that self-study under the conditions that mirror the USG
workplace and follow the manufacturer’s recommendations was not tenable. Despite considerable
[6] American Council on the Teaching of Foreign Languages
[7] In fact, there was even more interest than reflected in these numbers: enrollment had to be closed at 50 given the number of available licenses.
encouragement, fewer than half of the initial volunteers actually accessed their accounts at all, which
might indicate that merely distributing online accounts is not sufficient for language training. In fact,
the learner logs indicated that many had difficulty with microphone and software downloads.
Furthermore, of the 73 people who did access the accounts, over half did not persist with Rosetta
Stone™ for more than 10 hours. The learner-log comments indicated that: the Rosetta Stone™ material
itself was not compelling enough for continued study; participants had difficulty with the non-Roman
scripts; they had trouble finding time to use the program; and they experienced program freezing and
crashing. Finally, the fact that only a single person was able to complete the full 200-hour course of
study under the recommended conditions further suggests that simply making this package available to
USG personnel in these or similar work environments is not an ideal solution.
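The attrition percentages discussed in this section reduce to simple shares of the 120 issued accounts. As an illustration only (the stage labels below are shortened paraphrases of the Table 1 row headings, and the counts are the combined totals reported there), the drop-off can be tabulated as follows:

```python
# Combined participation totals as reported in Table 1 (all three languages).
milestones = {
    "signed consent forms": 150,
    "obtained accounts": 120,
    "accessed accounts": 73,
    "used more than 10 hours": 32,
    "first assessment (50 h)": 21,
    "second assessment (100 h)": 6,
    "full course and OPI (200 h)": 1,
}

accounts = milestones["obtained accounts"]
for stage, count in milestones.items():
    # Express each milestone as a share of the 120 account holders.
    print(f"{stage:28s} {count:3d}  ({100 * count / accounts:5.1f}%)")
```

Run as written, this reproduces the figures quoted in the report: roughly 27% of account holders (the "about 25%" cited elsewhere in this report) used the program for more than ten hours, and fewer than 1% finished the course.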
What language gain can be expected from absolute beginners who use Level 1 Rosetta Stone™
according to manufacturer’s instructions?
Three methods were employed to evaluate participants’ language gains: the Rosetta Stone™ unit tests,
four CASL-designed, interim assessments, and a final ACTFL oral proficiency interview. The unit tests
allowed learners to track their own progress. The interim assessments were specifically designed to
measure speaking ability based on the vocabulary, phrases, and sentences that comprised the Rosetta
Stone™ lessons for the prior period of self study. They were administered over the telephone by very
proficient speakers of Arabic, Chinese, and Spanish. Just before beginning the interview, participants
accessed a website with a series of eight images. They then completed a picture-description task in
which they were given a general prompt about each image (i.e., “Describe this image”) with follow-up
questions (e.g., “Is the man eating?” “Where is the ball?” and “How many horses are in the picture?”).
They were scored on three criteria per prompt: Accuracy of vocabulary, pronunciation, and whether or
not they spoke in complete sentences. Table 2 provides the scores of the first two interim assessments.
The overall pattern demonstrated by the 21 participants who completed the first assessment was that
speaking in complete sentences and pronunciation were more difficult than providing isolated
vocabulary words. Nevertheless, nearly all of the participants who actually used the program for fifty
hours prior to the first assessment were able to answer at least some of the questions successfully, and
four of the 21 received perfect scores (see Table 2). Thus, those who used Rosetta Stone™ did, in fact,
learn the specific vocabulary content of the lessons, as well as how to formulate some of the simple
sentences.
Table 2: Interim Assessments of Rosetta Stone™ Content

First assessment
Language/Student     Score    Logged Hours
Arabic – 1           96%      50
Arabic – 2           96%      49
Arabic – 3           92%      40
Arabic – 4           88%      29
Arabic – 5           63%      40
Arabic – 6           63%      43
Arabic – 7           54%      60
Arabic – 8           38%      19
Arabic – 9           25%      41
Arabic – 10          25%      25
Arabic – 11          21%      unknown
Arabic – 12           4%      unknown
Arabic – 13           0%      8
Chinese – 1          58%      10
Chinese – 2           0%      unknown
Chinese – 3           0%      77
Chinese – 4           0%      11
Chinese – 5           0%      11
Spanish – 1          67%      14
Spanish – 2          42%      13
Spanish – 3          29%      21

Second assessment
Language/Student     Score    Logged Hours
Arabic – 1           100%     70
Arabic – 2           100%     51
Arabic – 3           100%     23
Arabic – 4            67%     31
Arabic – 5            54%     26
Chinese              --       --
Spanish – 1           75%     3
Due to the unexpected severe attrition, it is not possible to assess with confidence what can be expected
from absolute beginners who complete Level 1 Rosetta Stone™ according to manufacturer’s
instructions. Only a single study participant completed the entire Level 1 program. He received perfect
or nearly perfect scores on all interim assessments that focused on Rosetta Stone™ content, and his
speaking proficiency as measured by the ACTFL oral proficiency interview was rated Novice-High (ILR 0+), which
is described in Table 3.
After completing the oral proficiency interview, this study participant commented: “While
Rosetta Stone™ does teach a lot of words, they are not always the words you need to have an
actual conversation.” It is important to report that this person who persisted in the self-study of
Arabic was a career linguist who could already speak Korean, Italian, Spanish, and Portuguese,
thus not the typical learner of interest to our clients, that is to say, someone embarking on the
initial study of a foreign language.
Table 3: Novice-High
Speakers at the Novice-High level are able to handle a variety of tasks pertaining to the Intermediate
level, but are unable to sustain performance at that level. They are able to manage successfully a number
of uncomplicated communicative tasks in straightforward social situations.
Conversation is restricted to a few of the predictable topics necessary for survival in the target language
culture, such as basic personal information, basic objects and a limited number of activities, preferences
and immediate needs.
Novice-High speakers respond to simple, direct questions or requests for information; they are able to
ask only a very few formulaic questions when asked to do so.
Novice-High speakers are able to express personal meaning by relying heavily on learned phrases or recombinations of these and what they hear from their interlocutor. Their utterances, which consist mostly
of short and sometimes incomplete sentences in the present, may be hesitant or inaccurate. On the other
hand, since these utterances are frequently only expansions of learned material and stock phrases, they
may sometimes appear surprisingly fluent and accurate.
These speakers’ first language may strongly influence their pronunciation, as well as their vocabulary
and syntax when they attempt to personalize their utterances. Frequent misunderstandings may arise but,
with repetition or rephrasing, Novice-High speakers can generally be understood by sympathetic
interlocutors used to non-natives.
When called on to handle simply a variety of topics and perform functions pertaining to the Intermediate
level, a Novice-High speaker can sometimes respond in intelligible sentences, but will not be able to
sustain sentence-level discourse.[8]
[8] From www.languagetesting.com. February 25, 2008.
Does Rosetta Stone™ work equally well with Arabic, Chinese, and Spanish?
Again, due to the lack of continued participation, it is difficult to answer this question conclusively. We
can say that the three students who studied Spanish were able to score fairly well on the interim
assessments with significantly fewer than 50 hours of study time (Spanish participants felt they were
ready before the requisite 50 hours to take the assessments). Arabic students, on the other hand, needed
to spend 50 hours in order to be able to perform well on the interim assessments. The five Chinese
students simply did not do well, regardless of amount of time studied. The learner logs contained many
complaints about the program from those students studying Arabic and Chinese; participants commented
that it was difficult to grasp the script without an instructor, and that they had trouble figuring out which
words went with which pictures. This suggests that, at least initially, learning these three languages with
Rosetta Stone™ is not equal.
Should supervisors or trainees be given any guidelines for online foreign-language self-study?
How much supervision do trainees need with online foreign-language self-study?
These last two questions were of interest since clients had reported that, even though online language training is made available to personnel, very few people check it out and use it. For this reason,
additional encouragement and support for study volunteers were built into the study design via the
mechanisms of a proactive and supportive point of contact and required learner logs.[9] Nonetheless, there
was significant attrition from the self-study program. Of the 119 people who dropped out of the study,
72 did not provide a reason, even when asked directly (46 did not even begin using the software, and 26
emailed the researcher to drop out without stating why). Thirty-five people dropped out because they were
deployed, their work situation changed, or they simply did not have enough time. Nine people dropped
out for personal or family reasons and five because of technical or IT problems. Two people stated that
they dropped out because they did not like the program.
These data indicate that finding time to use the software is a significant problem. Participation in our
study would have probably been higher if subjects had all received release time from their job duties,
[9] We also offered to start in-person user groups, since an interest in that was expressed at the initial volunteer meeting. No one took us up on that offer.
and this is something that supervisors should consider when making online training programs available.
In other words, supervisors and trainees should be forewarned that online self-study requires a
significant time commitment. In addition, supervisors should consider monitoring people participating in
self-study and providing milestones and mentors for them. Even with the periodic monitoring and the
interim assessments, only about 25% of the study volunteers who obtained accounts actually used
Rosetta Stone™ for more than ten hours. Use of Rosetta Stone™ might be more frequent if users are
required to check in with an instructor weekly and report progress to their supervisors.
Finally, supervisors should be aware of the technology challenges that our participants faced. While only
five people explicitly indicated that they dropped out because of technology problems, there were
complaints from 20 different participants about technical problems in the first week of the study alone,
and this was despite continuous support from the lead researcher and full engagement of the agencies’
technology specialists. Among other issues, participants had trouble with system crashes, were unable to
use microphones in a secure environment, were unable to access the program website using wireless
Internet, and could not download Shockwave to their work computers. Supervisors should familiarize
themselves with all of the system requirements for any online training and should ensure that their
employees have proper equipment before distributing software licenses. Further, agencies should expect
to provide significant technical support.
Conclusion
The most striking finding of this study was the severe attrition from self-study with Rosetta Stone™ under
typical agency workplace conditions, even with additional support not ordinarily available with the
software. It seems that supervisors cannot assume that providing access to online foreign-language
materials will in any way guarantee usage.