TTO 2101: Final Technical Report E.3.1
Rosetta Stone™ Evaluation: Executive Report

Date of report: February 29, 2008
Information current as of: February 29, 2008

Contact Information:
Katie Nielson
[email protected]
www.casl.umd.edu

Authors:
Katie Nielson
Catherine Doughty

CDRL: A021
DID: DI-MISC 80508A
Contract No. MDA904-03-C0543

Table of Contents

Executive Summary
Introduction
Review and Evaluation
    Software Design
    Manufacturer’s Claims
The Empirical Study
    The Study Protocol
    Findings
Conclusion
Submitted: 03.07.2008

Executive Summary

This report discusses self-study with Rosetta Stone™, under typical agency workplace conditions, as a way to demonstrate potential for success in language learning. Five questions addressed outcomes and guidelines for online training in Arabic, Chinese, and Spanish.

Is self-study with Rosetta Stone™ an appropriate solution for USG employees who need to demonstrate potential for success in language training?

The tremendous demand for language training expressed to CASL researchers by client agencies was clearly reflected in the full-capacity enrollment of 50 study volunteers per language. However, of the 120 volunteers who then obtained Rosetta Stone™ licenses, only a single participant managed to complete Rosetta Stone™ Level 1.

What language gain can be expected from absolute beginners who use Level 1 Rosetta Stone™ according to the manufacturer’s instructions?

Due to the severe attrition from the self-study program, it is difficult to answer this question. Of the 21 subjects remaining in self-study at the first assessment, those who completed the requisite 50 hours did well on a test of the specific Rosetta Stone™ lesson content (vocabulary and brief sentences). The single volunteer who completed the self-study and took an oral proficiency interview commented: “While Rosetta Stone™ does teach a lot of words, they are not always the words you need to have an actual conversation.”

Does Rosetta Stone™ work equally well with Arabic, Chinese, and Spanish?

Again, due to the lack of continued participation, we are not able to answer this question conclusively. Arabic and Chinese students commented that it was difficult to grasp the script without an instructor and that they had trouble figuring out which words went with which pictures. This suggests that, at least initially, learning these three languages with Rosetta Stone™ is not equal.
Should supervisors and/or trainees be given any guidelines for online self-study?

Clearly, finding time to use the software in the workplace is a significant problem. Supervisors should consider providing release time when making online training programs available.

How much supervision do trainees need with online self-study?

Even with two technical points of contact (the lead researcher and the agency technical specialist), there were significant online access difficulties. Also, even with periodic monitoring and interim assessments, only about 25% of the study volunteers who obtained accounts actually used Rosetta Stone™ for more than ten hours. Supervisors should consider monitoring self-study and providing technical support, milestones, and mentors.

Introduction

The U.S. Government (USG) has invested significantly in the online foreign-language training program Rosetta Stone™. This particular package has been attractive to agencies and to the Armed Services as a convenient way to meet immediate agency language-training needs and to help fulfill the objectives of Goal 1 of the Defense Language Transformation Roadmap.[1] Rosetta Stone™ is available in thirty different languages, including Less Commonly Taught Languages (LCTLs) such as Arabic, Chinese, Farsi, and Pashto, and is designed to teach all four skills: reading, writing, listening, and speaking. Students are not required to use a book; all of the lessons are entirely online. The program is purportedly easy to access: students can log into their accounts from any computer that can connect to the Internet, so there are no CDs or DVDs to distribute. Finally, the software manufacturers have marketed the package as a state-of-the-art solution using “the Dynamic Immersion Method,” which is claimed to facilitate learning new languages very quickly and in the same communicative and interactive way native languages are learned.
Despite its widespread use, little is known about how well this program works or about what can be expected of learners’ foreign language ability upon completion of the online self-study program. Accordingly, the University of Maryland Center for Advanced Study of Language (CASL), a trusted agent, was tasked with evaluating Rosetta Stone™ in a specific context of USG use for three critical languages (Arabic, Chinese, and Spanish). The client group wished to gauge employees’ initial success with the program in order to determine eligibility for future language training.[2] In addition, they were generally interested in the logistics of online language training in the workplace.

In order to evaluate how well Rosetta Stone™ meets these USG language-training needs, CASL researchers performed an expert review of Level 1 Rosetta Stone™ and conducted an empirical study to test the software in use. In this executive report, we first briefly explain the software design and consider the content of the software in conjunction with the claims made by the manufacturers. Then, we present the empirical study protocol and main findings, and finally we discuss a few important implications.[3]

[1] U.S. Department of Defense document, January 2005. http://www.defenselink.mil/news/Mar2005/d20050330roadmap.pdf
[2] At these agencies, there is no in-house language training, so managers must decide whom to fund and provide release time for language study.

Review and Evaluation

Software Design

Rosetta Stone™ consists of reading, writing, listening, and speaking exercises that require students to match pictures (photographs or illustrations) to audio and/or text material. In each lesson of every unit, the user is presented with a text, sound, or image to match against four possibilities; the program keeps track of progress and provides students with a cumulative lesson score as they complete the exercises.
The entire course is in the target foreign language, without translations. No additional resources are provided for languages that do not use the Roman alphabet; students are expected to learn reading and writing without any explicit instruction in non-Roman scripts. The images used by Rosetta Stone™ are the same regardless of the language being studied, and, at times, they are not culturally relevant.[4]

Every lesson has a “test,” which is identical to the exercises except that scores are not revealed until the test is finished and users have only one chance to match each picture. The tests can be done either before or after the lessons. The software keeps track of student performance on all tests, so learners can refer to previous test scores to track their progress through the units. In all units, the last lesson is a review of the previous lessons, with each lesson represented by one group of images.

Manufacturer’s Claims

In a document titled “Research Basis for the Dynamic Immersion Method,” the developers of Rosetta Stone™ cite first and second language acquisition research in an attempt to explain and justify the approach and scope of their program. First, the authors discuss research indicating that a grammar-based approach to L2 learning is unproductive and that a more communicative approach, responsive to the learners’ “internal syllabus,” is best.[5] This is widely accepted in the scientific literature. However, while there is no overt grammar explanation of any kind in Rosetta Stone™, our review of the lessons revealed a covert grammatical syllabus. Furthermore, the sequence of these materials is not based on a learner’s own incremental acquisition of language but rather is predetermined by the program developers. The fact that all Level 1 Rosetta Stone™ lessons are translations of one another suggests that little thought went into how best to structure input to learners for optimal L2 acquisition.

The authors claim that “by combining genuine immersion teaching methods with interactive multimedia technology, Rosetta Stone™ replicates the environment in which learners naturally acquire new language” (pp. 2-3). This claim is patently false. The Rosetta Stone™ interface simply presents learners with matching activities in which they guess or use a process of elimination to determine which words or phrases go with particular pictures. This pales in comparison with an actual “immersion environment,” in which learners would negotiate meaning with native speakers to accomplish real-life tasks. Finally, contrary to the developers’ emphasis on “communication,” upon completion of the self-study program students have not been exposed to fundamental material that one would typically expect, such as basic greetings, how to introduce themselves, or the word for bathroom. Depending upon the language, the verb tenses and/or vocabulary words chosen are not necessarily reflective of native-speaker usage, and few of the exercises seem to build actual fluency. Rather, the course provides significant exposure to isolated vocabulary and phrases but does not provide the tools for even the most basic conversation.

[3] For more details on the evaluation of the software as well as the empirical study, please see the full version of this report: Nielson, K., Doughty, C., & Freynik, S. (2008). TTO 2101 Technical Report: Rosetta Stone™ Evaluation. Center for Advanced Study of Language. College Park: University of Maryland.
[4] We reviewed Version 2 for all three languages. Since our review, Version 3 has been published in Arabic, English, French, German, Italian, Portuguese, Russian, and Spanish (all other languages continue to use Version 2). We are conducting a systematic comparison of the two versions in Arabic and Spanish, to be made available in an upcoming technical report.
[5] The paper’s authors cite the seminal work of Corder, S. P. (1981). Error Analysis and Interlanguage. Oxford: Oxford University Press.

The Empirical Study

The Study Protocol

After analysis of the underlying pedagogical structure of the Rosetta Stone™ self-study material, CASL designed a study for 150 volunteers from two USG agencies in order to determine empirically the effectiveness of the program and to understand the conditions of software use. The study aimed to answer the following five questions posed by agency clients:

1. Is self-study with Rosetta Stone™ an appropriate solution for USG employees who need to demonstrate potential for success in language training?
2. What language gain can be expected from absolute beginners who use Level 1 Rosetta Stone™ according to the manufacturer’s instructions?
3. Does Rosetta Stone™ work equally well with Arabic, Chinese, and Spanish?
4. Should supervisors and/or trainees be given any guidelines for online self-study?
5. How much supervision do trainees need with online self-study?

The study was open to personnel from all levels and positions within their agencies, and participants decided to volunteer for a variety of personal and/or career-related reasons. They were offered the choice of studying Arabic, Chinese, or Spanish, but were required to be absolute beginners. Upon enrollment in the study, some volunteers requested and received release time from their job duties for language study, while the majority completed the training entirely on their own time. This emulated the conditions under which Rosetta Stone™ is typically made available to USG personnel. All study volunteers gave informed consent and agreed to the following conditions:

- Use the software for 10 hours per week for 20 weeks.
- Distribute learning by studying no more than 3 hours in a particular day.
- Complete a weekly electronic learner log, indicating how many hours were studied each day, reporting any issues with the software (technical or otherwise), and noting any outside foreign-language use.
- Complete a listening/speaking telephone assessment every 5 weeks or 50 hours of study.
- Complete an oral proficiency interview (OPI) after finishing the 200-hour course.

The lead researcher worked in tandem with agency technology personnel to provide Rosetta Stone™ licenses and access, and she served as the point of contact for any issues study participants encountered while using the Rosetta Stone™ materials. In addition, she provided encouragement, monitored progress, and sent biweekly group emails to all participants, plus periodic individual reminder emails to those who did not regularly log in to the software. The study protocol included interim assessments every five weeks and a more comprehensive final assessment of language gains, the ACTFL[6] OPI. These were administered over the telephone by highly proficient speakers of Arabic, Chinese, and Spanish and, in the case of the OPI, by an ACTFL-certified tester.

Findings

The executive report of findings is organized by the five questions asked by the USG client group:

Is self-study with Rosetta Stone™ an appropriate solution for USG employees who need to demonstrate potential for success in language training?

The tremendous demand for language training expressed to CASL researchers by client agencies was clearly reflected in the full-capacity enrollment of 50 study volunteers per language.[7] Despite this significant initial interest, there was a precipitous drop-off once the study was underway.
The participation-record table below documents the severe attrition from the self-study program:

Table 1: Record of Participation

                                                         Arabic   Chinese   Spanish   Total
Volunteered and signed consent forms                       50        50        50      150
Obtained Rosetta Stone™ accounts                           50        37        33      120
Actually accessed accounts                                 38        19        17       73
Spent more than 10 hours using Rosetta Stone™              18        13         5       32
Completed the first assessment (50 hours of use)           13         5         3       21
Completed the second assessment (100 hours of use)          5         0         1        6
Completed the third and fourth assessments and the
OPI (200 hours of use)                                      1         0         0        1

The rapid decline in participation indicates that self-study under conditions that mirror the USG workplace and follow the manufacturer’s recommendations was not tenable. Despite considerable encouragement, fewer than half of the initial volunteers accessed their accounts at all, which suggests that merely distributing online accounts is not sufficient for language training. In fact, the learner logs indicated that many had difficulty with microphone setup and software downloads. Furthermore, of the 73 people who did access their accounts, over half did not persist with Rosetta Stone™ for more than 10 hours. The learner-log comments indicated that the Rosetta Stone™ material itself was not compelling enough for continued study; that participants had difficulty with the non-Roman scripts; that they had trouble finding time to use the program; and that they experienced program freezing and crashing. Finally, the fact that only a single person was able to complete the full 200-hour course of study under the recommended conditions further suggests that simply making this package available to USG personnel in these or similar work environments is not an ideal solution.

[6] American Council on the Teaching of Foreign Languages
[7] In fact, there was even more interest than these numbers reflect: enrollment had to be closed at 50 per language, given the number of available licenses.
What language gain can be expected from absolute beginners who use Level 1 Rosetta Stone™ according to the manufacturer’s instructions?

Three methods were employed to evaluate participants’ language gains: the Rosetta Stone™ unit tests, four CASL-designed interim assessments, and a final ACTFL oral proficiency interview. The unit tests allowed learners to track their own progress. The interim assessments were specifically designed to measure speaking ability based on the vocabulary, phrases, and sentences that comprised the Rosetta Stone™ lessons for the prior period of self-study. They were administered over the telephone by highly proficient speakers of Arabic, Chinese, and Spanish. Just before beginning the interview, participants accessed a website with a series of eight images. They then completed a picture-description task in which they were given a general prompt about each image (i.e., “Describe this image”) with follow-up questions (e.g., “Is the man eating?” “Where is the ball?” “How many horses are in the picture?”). They were scored on three criteria per prompt: accuracy of vocabulary, pronunciation, and whether or not they spoke in complete sentences.

Table 2 provides the scores of the first two interim assessments. The overall pattern demonstrated by the 21 participants who completed the first assessment was that speaking in complete sentences and pronunciation were more difficult than providing isolated vocabulary words. Nevertheless, nearly all of the participants who actually used the program for fifty hours prior to the first assessment were able to answer at least some of the questions successfully, and four of the 21 received perfect scores (see Table 2). Thus, those who used Rosetta Stone™ did, in fact, learn the specific vocabulary content of the lessons, as well as how to formulate some of the simple sentences.
Table 2: Interim Assessments of Rosetta Stone™ Content

First assessment (after 50 hours of study):

                Arabic               Chinese              Spanish
Student #   Score   Hours        Score   Hours        Score   Hours
1            96%     50           58%     10           67%     14
2            96%     49            0%   unknown        42%     13
3            92%     40            0%     77           29%     21
4            88%     29            0%     11
5            63%     40            0%     11
6            63%     43
7            54%     60
8            38%     19
9            25%     41
10           25%     25
11           21%   unknown
12            4%   unknown
13            0%      8

Second assessment (after 100 hours of study):

                Arabic               Chinese              Spanish
Student #   Score   Hours        Score   Hours        Score   Hours
1           100%     70           --      --           75%      3
2           100%     51
3           100%     23
4            67%     31
5            54%     26

Due to the unexpected, severe attrition, it is not possible to assess with confidence what can be expected from absolute beginners who complete Level 1 Rosetta Stone™ according to the manufacturer’s instructions. Only a single study participant completed the entire Level 1 program. He received perfect or nearly perfect scores on all interim assessments, which focused on Rosetta Stone™ content, and his speaking proficiency as measured by the ACTFL oral proficiency rating was 0+, or Novice-High, which is described in Table 3. After completing the oral proficiency interview, this study participant commented: “While Rosetta Stone™ does teach a lot of words, they are not always the words you need to have an actual conversation.”

It is important to report that this person, who persisted in the self-study of Arabic, was a career linguist who could already speak Korean, Italian, Spanish, and Portuguese, and thus was not the typical learner of interest to our clients, that is, someone embarking on the initial study of a foreign language.

Table 3: Novice-High

Speakers at the Novice-High level are able to handle a variety of tasks pertaining to the Intermediate level, but are unable to sustain performance at that level. They are able to manage successfully a number of uncomplicated communicative tasks in straightforward social situations.
Conversation is restricted to a few of the predictable topics necessary for survival in the target language culture, such as basic personal information, basic objects, and a limited number of activities, preferences, and immediate needs. Novice-High speakers respond to simple, direct questions or requests for information; they are able to ask only a very few formulaic questions when asked to do so. Novice-High speakers are able to express personal meaning by relying heavily on learned phrases or recombinations of these and what they hear from their interlocutor. Their utterances, which consist mostly of short and sometimes incomplete sentences in the present, may be hesitant or inaccurate. On the other hand, since these utterances are frequently only expansions of learned material and stock phrases, they may sometimes appear surprisingly fluent and accurate. These speakers’ first language may strongly influence their pronunciation, as well as their vocabulary and syntax when they attempt to personalize their utterances. Frequent misunderstandings may arise but, with repetition or rephrasing, Novice-High speakers can generally be understood by sympathetic interlocutors used to non-natives. When called on to handle simply a variety of topics and perform functions pertaining to the Intermediate level, a Novice-High speaker can sometimes respond in intelligible sentences, but will not be able to sustain sentence-level discourse.[8]

[8] From www.languagetesting.com, February 25, 2008.

Does Rosetta Stone™ work equally well with Arabic, Chinese, and Spanish?

Again, due to the lack of continued participation, it is difficult to answer this question conclusively.
We can say that the three students who studied Spanish were able to score fairly well on the interim assessments with significantly fewer than 50 hours of study time (Spanish participants felt ready to take the assessments before the requisite 50 hours). Arabic students, on the other hand, needed to spend the full 50 hours in order to perform well on the interim assessments. The five Chinese students simply did not do well, regardless of the amount of time studied. The learner logs contained many complaints about the program from students studying Arabic and Chinese; participants commented that it was difficult to grasp the script without an instructor and that they had trouble figuring out which words went with which pictures. This suggests that, at least initially, learning these three languages with Rosetta Stone™ is not equal.

Should supervisors or trainees be given any guidelines for online foreign-language self-study? How much supervision do trainees need with online foreign-language self-study?

These last two questions were of interest because clients had reported that, even though online language training is made available to personnel, very few people check it out and use it. For this reason, additional encouragement and support for study volunteers were built into the study design via a proactive and supportive point of contact and required learner logs.[9] Nonetheless, there was significant attrition from the self-study program. Of the 119 people who dropped out of the study, 72 did not provide a reason, even when asked directly (46 never began using the software, and 26 emailed the researcher to drop out without stating why). Thirty-five people dropped out because they were deployed, their work situation changed, or they simply did not have enough time. Nine people dropped out for personal or family reasons, and five because of technical or IT problems.
Two people stated that they dropped out because they did not like the program.

These data indicate that finding time to use the software is a significant problem. Participation in our study would probably have been higher if all subjects had received release time from their job duties, and this is something that supervisors should consider when making online training programs available. In other words, supervisors and trainees should be forewarned that online self-study requires a significant time commitment.

In addition, supervisors should consider monitoring people participating in self-study and providing milestones and mentors for them. Even with the periodic monitoring and the interim assessments, only about 25% of the study volunteers who obtained accounts actually used Rosetta Stone™ for more than ten hours. Use of Rosetta Stone™ might be more frequent if users were required to check in with an instructor weekly and report progress to their supervisors.

Finally, supervisors should be aware of the technology challenges that our participants faced. While only five people explicitly indicated that they dropped out because of technology problems, 20 different participants complained about technical problems in the first week of the study alone, despite continuous support from the lead researcher and full engagement of the agencies’ technology specialists. Among other issues, participants experienced system crashes, were unable to use microphones in a secure environment, were unable to access the program website over wireless Internet, and could not download Shockwave to their work computers.

[9] We also offered to start in-person user groups, since interest in them was expressed at the initial volunteer meeting. No one took us up on the offer.
Supervisors should familiarize themselves with all of the system requirements for any online training and should ensure that their employees have the proper equipment before distributing software licenses. Further, agencies should expect to provide significant technical support.

Conclusion

The most striking finding of this study was the severe attrition from self-study with Rosetta Stone™ under typical agency workplace conditions, even with additional support not ordinarily available with the software. It seems that supervisors cannot assume that providing access to online foreign-language materials will in any way guarantee usage.