Towards Application of User-Tailored Machine Translation in Localization
Andrejs Vasiļjevs, Raivis Skadiņš, Inguna Skadiņa
TILDE
JEC 2011, Luxembourg, October 14, 2011

MT in Localization Work
• The localization process generally involves the translation and cultural adaptation of software, video games and websites, and less frequently other written translation
• Translation Memory (TM) is commonly used
• Although there are a number of MT use cases in the practical localization process, MT is not yet widely adopted

Previous Work
• Evaluation with a keyboard-monitoring program and Choice Network Analysis for measuring the effort involved in post-editing MT output (O'Brien, 2005)
• Productivity tests have been performed in translation and localization industry settings at Microsoft (Schmidtke, 2008):
  • An SMT system from Microsoft Research, trained on the Microsoft technical domain, was used for the Office Online 2007 localization task in three languages: Spanish, French and German
  • By applying MT to all new words, a productivity gain of 5-10% was achieved on average

Previous Work – Adobe
Two experiments were performed at Adobe (Flournoy and Duran, 2009):
• A small test set of 800-2,000 words was machine translated and post-edited
• Then, based on the positive results, about 200,000 words of new text were localized
• Rule-based MT was used for translation into Russian (PROMT) and SMT for Spanish and French (Language Weaver)
• Productivity increased by between 22% and 51%

Previous Work – Autodesk
Evaluation of the Autodesk Moses SMT system (Plitt and Masselot, 2010):
• Translation from English into French, Italian, German and Spanish, with three translators per language pair
• A special workbench was designed to capture keyboard and pause times for each sentence in order to measure translation time
• MT allowed translators to improve their throughput by 74% on average
• The increase in productivity varied from 20% to 131%
• Optimum throughput was reached for sentences of around 25 words in length

LetsMT! project
• User-driven cloud-based MT factory built on open-source MT tools
• Services for data collection, MT generation, customization and running of a variety of user-tailored MT systems
• Application in localization is among the key usage scenarios
• Strong synergy with the FP7 project ACCURAT to advance data-driven machine translation for under-resourced languages and domains

Partnership with Complementing Competencies
• Tilde (Project Coordinator) – Latvia
• University of Edinburgh – UK
• University of Zagreb – Croatia
• Copenhagen University – Denmark
• Uppsala University – Sweden
• Moravia – Czech Republic
• SemLab – Netherlands
The consortium combines language expertise, research excellence and industry experience.

LetsMT! Main Features
• Online collaborative platform for building MT from user-provided data
• Repository of parallel and monolingual corpora for MT generation
• Automated training of SMT systems from specified collections of data
• Users can specify particular training data collections and build customised MT engines from these collections
• Users can also use the LetsMT! platform to tailor an MT system to their needs using their non-public data

User requirement analysis
• 21 interviews were conducted with localization/translation agencies
• Only 7 of the respondents replied that they use fully automatic MT systems in their translation practice, and only one LSP organization employs MT as the primary translation method
User survey
[Pie chart] IPR of text resources in interviewee organizations; categories: no reply, interviewee has IPR, interviewee has restricted/partial IPR, interviewee has no IPR; shares: 23%, 37%, 18%, 22%

User survey
[Pie chart] Organizations' ability to share data; categories: no reply/interviewee has no data now, perhaps, yes, no; shares: 16%, 40%, 21%, 23%

Sharing of training data
[Architecture diagram] SMT training and usage flow: SMT Resource Directory and SMT Resource Repository; Giza++ and the Moses SMT toolkit for training; SMT Multi-Model Repository (trained SMT models); SMT System Directory and Moses decoder for translation; system management, user authentication and access rights control; anonymous and authenticated access; upload, processing and evaluation of data; access via web pages, a web page translation widget, web browser plug-ins, a web service and CAT tools

LetsMT! architecture
Multitier architecture:
• Interface Layer – user interfaces (webpage UI, web service API): browser-based web page UI, translation widget, web browser plug-ins and CAT tool plug-ins, accessed over http/https via REST and SOAP (public API)
• Application Logic Layer
• Data Storage Layer (Resource Repository) – stores MT training data and trained models; resource repository adapter, file share, SVN and system DB
• High-Performance Computing (HPC) Cluster – executes all computationally heavy tasks (SMT training, MT service, processing and aligning of training data, etc.); HPC frontend with SGE-managed CPU nodes

LetsMT! architecture
Webpage interface where users can:
• view, upload and manage corpora
• view, train and manage user-tailored SMT systems
• translate texts and files
Web service API:
• integration with CAT tools
• integration into web pages and web browsers

LetsMT! architecture
Resource Repository:
• stores SMT training data
• supports different formats – TMX, XLIFF, PDF, DOC, plain text
• converts them to a unified format
• performs format conversions and alignment

SMT training
• Based on the Moses SMT toolkit
• Training tasks are managed with the Moses Experiment Management System
• Training tasks are executed in an HPC cluster (Oracle Grid Engine)
• Hosted in a web services infrastructure which provides easy access to on-demand computing resources
• New Moses features:
  • incremental training
  • distributed language models
  • interpolated language models for domain adaptation
  • randomized language models to train on huge corpora
  • translation of formatted texts
  • running the Moses decoder in server mode

Bilingual corpora for the English-Latvian system (parallel units):
• Localization TM: ~1.29 M
• DGT-TM: ~1.06 M
• OPUS EMEA: ~0.97 M
• Fiction: ~0.66 M
• Dictionary data: ~0.51 M
• Web corpus: ~0.9 M
• Total: 5.37 M

Latvian monolingual corpora (words):
• Latvian side of parallel corpus: 60 M
• News (web): 250 M
• Fiction: 9 M
• Total, Latvian: 319 M

The MT System
• Tilde English-Latvian MT
• Tools: Moses, Giza++, SRILM, Tilde Latvian morphological analyzer
• Factored phrase-based SMT system:
  • translation factors: surface form → surface form + morphology tag
  • 2 language models: (1) a 5-gram model over surface forms, (2) a 7-gram model over morphology tags
[Diagram] Factored representation (English → Latvian): surface form, lemma, morphology, syntax, stem, suffix
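To make the factored setup above concrete, here is a minimal sketch (in Python, not the actual Tilde/LetsMT! pipeline) of how target-side training sentences could be annotated with the surface-form and morphology-tag factors in the pipe-separated token format that Moses expects for factored corpora. The analyze_lv function is a hypothetical stand-in for the Tilde Latvian morphological analyzer.

```python
# Minimal sketch: build a factored target-side corpus line for Moses
# (factors separated by "|" per token). `analyze_lv` is a hypothetical
# stand-in for Tilde's Latvian morphological analyzer, returning
# (lemma, morphology_tag) pairs for each token.

from typing import List, Tuple


def analyze_lv(tokens: List[str]) -> List[Tuple[str, str]]:
    """Hypothetical morphological analysis: token -> (lemma, morph_tag)."""
    # A real system would call the Tilde Latvian morphological analyzer here.
    return [(tok.lower(), "UNK") for tok in tokens]


def to_factored_line(sentence: str) -> str:
    """Annotate each token as surfaceform|morphtag for factored SMT training."""
    tokens = sentence.split()
    analyses = analyze_lv(tokens)
    return " ".join(f"{tok}|{tag}" for tok, (_lemma, tag) in zip(tokens, analyses))


if __name__ == "__main__":
    # The 5-gram language model would be trained over the surface-form factor
    # and the 7-gram model over the morphology-tag factor of lines like this.
    print(to_factored_line("Liels paldies par palīdzību !"))
```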
The MT system
Development and evaluation data:
• Development: 1,000 sentences
• Evaluation: 500 sentences
• Balanced across topics (see distribution below)
• BLEU score: 35.0

Topic distribution of the evaluation data:
• General information about the European Union: 12%
• Specifications, instructions and manuals: 12%
• Popular scientific and educational: 12%
• Official and legal documents: 12%
• News and magazine articles: 24%
• Information technology: 18%
• Letters: 5%
• Fiction: 5%

Localization workflow at Tilde
Evaluate original / assign Translator and Editor → Import into SDL Trados → Analyze against TMs → Translate using translation suggestions from TMs → Evaluate translation quality / Edit (optional) → Fix errors → Hand over to customer

MT Integration into Localization Workflow
Evaluate original / assign Translator and Editor → Analyze against TMs → MT-translate new sentences → Translate using translation suggestions from TMs and MT → Evaluate translation quality / Edit → Fix errors → Ready translation

Evaluation of Productivity
• The key interest of the localization industry is to increase the productivity of the translation process while maintaining the required quality level
• Productivity was measured as the translation output of an average translator in words per hour
• 5 translators participated in the evaluation, including both experienced and new translators

Evaluation of Quality
• Performed by human editors as part of their regular QA process
• The result of the translation process was evaluated; editors did not know whether or not MT had been applied to assist the translator
• Comparison to a reference translation is not part of this evaluation
• Tilde's standard QA assessment form was used, covering the following text quality areas:
  • Accuracy
  • Spelling and grammar
  • Style
  • Terminology

Error Score: QA Evaluation Form
For each error category the form records the error weight, the number of errors found and the resulting negative points. Error weights:
1. Accuracy
  1.1 Understanding of the source text – weight 3
  1.2 Understanding the functionality of the product – weight 3
  1.3 Comprehensibility – weight 3
  1.4 Omissions/unnecessary additions – weight 2
  1.5 Translated/untranslated – weight 1
  1.6 Left-overs – weight 1
2. Language quality
  2.1 Grammar – weight 2
  2.2 Punctuation – weight 1
  2.3 Spelling – weight 1
3. Style
  3.1 Word order, word-for-word translation – weight 1
  3.2 Vocabulary and style choice – weight 1
  3.3 Style Guide adherence – weight 2
  3.4 Country standards – weight 1
4. Terminology
  4.1 Glossary adherence – weight 2
  4.2 Consistency – weight 2
The form also records a total per category, additional plus points for style (if applicable), a grand total, and the negative points per 1000 words.

Quality: QA Grades
Error score (sum of weighted errors) → resulting quality evaluation:
• 0…9: Superior
• 10…29: Good
• 30…49: Mediocre
• 50…69: Poor
• >70: Very poor
The Tilde Localization QA assessment was applied in the evaluation.

Evaluation data
► 54 documents in the IT domain
► 950-1050 adjusted words in each document
► Each document was split in half:
  ► the first half was translated using suggestions from TM only
  ► the second half was translated using suggestions from both TM and MT

Integration of MT in SDL Trados

Evaluation Results
► Average translation productivity:
  ► baseline with TM only: 550 words/hour
  ► with TM and MT: 731 words/hour
► 32.9% productivity increase
► High variability in individual performance: changes ranged from a 64% increase to a 5% decrease in productivity
► The error score increased from 20.2 to 28.6 points but stays at the "Good" level (<30 points)

Average error scores by error type:
• Accuracy: baseline 6, MT scenario 9
• Language quality: baseline 6, MT scenario 10
• Style: baseline 3, MT scenario 4
• Terminology: baseline 5, MT scenario 7
The biggest quality degradation is in language quality, due to an increase in grammatical errors.
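As a quick sanity check on the figures above, here is a small illustrative Python sketch (not part of the original evaluation tooling) that reproduces the reported 32.9% productivity gain from the 550 and 731 words-per-hour throughputs and maps the reported error scores onto the QA grade bands from the grades slide.

```python
# Illustrative sketch: reproduce the reported productivity gain and map
# error scores (negative points per 1000 words) onto the QA grade bands
# 0…9 / 10…29 / 30…49 / 50…69 / >70 from the slides above.

def productivity_gain(baseline_wph: float, mt_assisted_wph: float) -> float:
    """Relative throughput increase, in percent."""
    return (mt_assisted_wph - baseline_wph) / baseline_wph * 100.0


def qa_grade(error_score: float) -> str:
    """Map an error score to the QA grade bands used in the evaluation."""
    if error_score < 10:
        return "Superior"
    if error_score < 30:
        return "Good"
    if error_score < 50:
        return "Mediocre"
    if error_score < 70:
        return "Poor"
    return "Very poor"


if __name__ == "__main__":
    print(f"{productivity_gain(550, 731):.1f}% productivity increase")  # ~32.9%
    print(qa_grade(20.2), qa_grade(28.6))  # both scores fall in the "Good" band
```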
Conclusions
• The results of our experiment clearly demonstrate that it is feasible to integrate current state-of-the-art SMT systems for highly inflected languages into the localization process.
• Translation quality decreases in all error categories, but the degradation is not critical and the result is acceptable for production purposes.
• Significant performance differences between individual translators hint at the role of human factors such as training, work procedures and personal inclinations.
• Extended experiments are planned (i) involving more translators, (ii) translating texts in different domains and (iii) covering other language pairs.

Application in localization workflow
Let's MT!
Thank you!
The research within the LetsMT! project leading to these results has received funding from the ICT Policy Support Programme (ICT PSP), Theme 5 – Multilingual web, grant agreement no 250456.