Comparative Analysis of Errors in MT Output and Computer-as- sisted Translation: Effect of the Human Factor - Association for Computational ...

Page created by Karen Chang
 
CONTINUE READING
Comparative Analysis of Errors in MT Output and Computer-as-
           sisted Translation: Effect of the Human Factor
          Irina Ovchinnikova                                          Daria Morozova
   Sechenov First Moscow State Medical                                  HEC Paris /
       University, Moscow, Russia /                                1 rue de la Libération,
      Trubetskaya str., 8, building 2,                          78351 Jouy-en-Josas, France
         119992 Moscow, Russia                                 daria.morozova@hec.edu

           Lichi Translations LTD /
                 P.O. Box 18,
          Be'er Ya'akov, 70300 Israel
          Ira.ovchi@gmail.com

                                                                   in translation on Smartcat performed by
                        Abstract                                   professional translators uncover insuffi-
                                                                   ciency of CAT tools for the language pair
    The paper presents a comparative analysis
                                                                   as well as peculiar problems in applying
    of errors in outputs of MT and Computer-
                                                                   CAT tools while translating from Hebrew
    assisted Translation (CAT) platforms in
                                                                   into Russian.
    translation from Hebrew into Russian. A
    MT system, shared translation memory
    (TM), and dictionaries are available on
                                                               1    Introduction
    CAT platforms. The platforms allow for                     The objective of the study is to analyze the
    editing and improving any MT output as                     peculiar errors in translation from Hebrew into
    well as performing manual translation.                     Russian on a CAT platform as compared to the
    Evaluation of the efficiency of the plat-                  errors in the MT output and reveal their sources to
    forms in comparison with the MT systems                    provide developers with the translators’ feedback.
    shows advantages of the CAT platforms in                   The comparison is efficient from the practical
    the translation industry. The comparison                   point of view since a target text, being translated
    reveals the impact of the human factor on                  by a MT system or human translator, must deliver
    the CAT output providing developers with                   the source message and has to be relevant to the
    the feedback from translation industry.                    target culture. In the translation industry, revised
    The research was conducted on docu-                        MT outputs compete with human translations,
    ments translated from Hebrew into Rus-                     including those performed on CAT platforms.
    sian (approximately 35,000 words, 3118                     Therefore, awareness of the peculiar errors in
    segments) on Smartcat. Errors in MT out-                   translation on the platforms will provide basis for
    put for Russian as a target language show                  improving Hebrew-Russian MT and for choosing
    almost equal shares of fluency and accu-                   the way to translate the particular project applying
    racy errors in PBSTM and prevalence of                     MT or hiring a human translator who has access
    the accuracy errors in NMT. Errors on the                  to a CAT platform.
    Smartcat platform reveal difficulties in                      The research was conducted on the material of
    mastering semantic and stylistic coher-                    translation projects on Smartcat.1 In the paper, we
    ence of the whole document. In general,                    discuss Hebrew-Russian translation of a tourist
    however, the translation is accurate and                   guide (9 files in Smartcat; 35,0002 word forms
    readable. The influence of English as lin-                 approximately; 3118 segments; on average, 10
    gua franca appears in peculiar ortho-                      word forms in a segment) by a team of
    graphic and punctuation errors. The errors                 professional translators. The errors in the MT

©2019 The authors. This article is licensed under a Creative   2
                                                                 It is hard to determine the exact number of word forms be-
Commons 4.0 license, no derivative works, attribution,         cause the author amended the text in Hebrew owing to the
CCBY-ND.                                                       necessity to provide the accurate data and information. Nev-
1                                                              ertheless, the number of segments was constant.
  Smartcat Platform Inc. 2019 https://www.smartcat.ai/

Proceedings of MT Summit XVII, volume 2                                         Dublin, Aug. 19-23, 2019 | p. 88
output are considered as a baseline for analysis of    substantially surpasses that of PBSMT (Shter-
errors on Smartcat.                                    ionov et al., 2018).
   To the best of our knowledge, translation from         Researchers differentiate between errors in flu-
Hebrew into Russian on a CAT platform was not          ency and accuracy of translation. Fluency errors
analyzed in the aspect of the errors type as           reduce the readability of the target text, while ac-
compared to the MT output. Meanwhile, the              curacy errors distort its content. According to the
comparison allows for developers and users to          data from different evaluation systems and differ-
revise tools on the platforms. Translators will get    ent languages, fluency errors are more prevalent
the insight into specific advantages of MT systems     than accuracy errors (Aranberri et al., 2016). The
depending on the language pair. Being aware of         most typical fluency errors are grammatical errors
the advantages, translators can improve the            (close to 80%: Aranberri et al., 2016: 1880). They
quality of target texts combining MT and human         include morphological, word order and syntax er-
translation.                                           rors. In general, NMT systems outperform
   All tools of computer-assisted translation in one   PBSMT in fluency (Bentivogli et al., 2016).
place (CAT platforms) provide translators with an         The target language affects the kind of morpho-
opportunity to quickly deliver a readable and          logical information learned by the NMT system.
accurate output. The platforms support the cycle       Words of the source text are better represented in
of translation projects: selecting translators,        a morphologically poorer target language, while a
teamwork, project management, delivery of the          morphologically rich language (e.g., Hebrew and
final product, and payment transfer. A CAT             Russian) needs character-based representation of
platform includes a MT system, access to shared        less frequent words in the NMT to enhance the
TM, dictionaries, thesauruses and other necessary      quality of translation (Belinkov et al., 2017). Bi-
resources. Collaborators have the opportunity to       lingual post-editors handle the errors in the MT
discuss options, comment on the source text, and       output.
share information required to understand the
content. Nevertheless, translators and revisers        2.1   Errors in Hebrew in MT Output
cannot avoid errors while using all of the             Hebrew as a source or target language of MT has
advantages.                                            undeservingly received very sparse researchers’
   Classification of translators’ errors varies        attention. As a morphologically rich language,
according to domains. In the industry, the             Hebrew features grammatical affixes, endings and
classification is very simple and pragmatically        cliticization. The inflections and addition of the
oriented. In academia, the errors are classified       subordinate elements to the main word evokes
with respect to the target text functioning in the     difficulties in processing morphology that were
target culture, mental mechanisms of bilingualism      overcome in SMT thanks to pre-processing
and code switching. Since a human bilingual            techniques based on morphological analysis and
translator operates the platform, the output reveals   disambiguation (Singh and Habash, 2012).
particular errors. Thus, we take into consideration       In general, NMT outperform PBSMT in He-
the classification of translation errors in both       brew-Arabic / Arabic-Hebrew translation
domains (See: Hansen, 2009: 316).                      (Belinkov and Glass, 2016). For better results, He-
                                                       brew needs a character-based encoding / decoding
2    Related Work: Classification of Typi-             model that improves identification of word struc-
     cal Errors in MT Output                           ture for less frequent words, while words that are
                                                       more frequent are possible to be identified in the
On Smartcat, a translator can use different MT
                                                       word-based model (Belinkov et al., 2017; Rich-
systems evaluating and editing their output. Thus,
                                                       ardson et al., 2016). Nowadays, the most suitable
we consider the errors distribution for phrase-
                                                       solution for MT translation from Modern Hebrew
based statistical and neural MT systems (PBSMT
                                                       is Google’s Multilingual NMT that involves Eng-
and NMT, respectively).
                                                       lish as an interlanguage (Johnson et al., 2017). In
   Distribution of the errors in MT outputs is usu-
                                                       translation from Hebrew (Modern and Archaic)
ally described in the aspect of quality difference
                                                       into English, the omissions and additions occur
between statistical and neural MT systems
                                                       due to high degree of compression in Hebrew
(Bentivogli, et al., 2016). Human and automatic
                                                       (Cheesman and Roos, 2017: 11).
quality evaluations of outputs of MT systems
show different results; however, NMT quality

Proceedings of MT Summit XVII, volume 2                               Dublin, Aug. 19-23, 2019 | p. 89
2.2   Errors in Russian in MT Output                   segment in the aspect of the target language norms
                                                       and usage.
Errors in Hebrew-Russian MT output have not
                                                          In general, the target text delivers its message
been described or explained in publications. In the
                                                       and performs the adequate function in the target
translation industry, translators often prefer to
                                                       culture thanks to the accuracy of its discourse and
apply a Hebrew-English-Russian MT, as in the
                                                       pragmatic features, and their correspondence to
case of Google’s Multilingual NMT. Due to this
                                                       those of the source text. For different target lan-
practice, we need to consider errors in English-
                                                       guages, peculiar MT systems were developed to
Russian MT output. According to human
                                                       translate English texts of various domains (Specia
evaluation, NMT English-Russian output
                                                       et al., 2017). Since every source text is semanti-
received marks “Near native or Native” for 75%
                                                       cally coherent and has contiguity, pragmatic pur-
segments, whilst PBSMT got the same marks for
                                                       poses, and discourse peculiarities, application of a
60% of segments in the output (Castilho et al.,
                                                       relevant MT system affects the corresponding
2017b: 121). The most frequent errors are
                                                       quality of the MT output. Meanwhile, the peculiar
morphological (42% for PBSMT, 38% for NMT),
                                                       MT systems do not exist for Hebrew-Russian or
wrong word order occurs in 12% and 9% of the
                                                       Hebrew-English-Russian. Therefore, every He-
segments for PBSMT and NMT, respectively
                                                       brew-Russian MT output needs post-editing in the
(Castilho et al., 2017b: 124).
                                                       aspect of its discourse and pragmatic peculiarities.
   The distribution of the accuracy errors varies in
                                                          The discourse and pragmatic characteristics de-
different domains and genres (Castilho et al.,
                                                       scribe the whole document, while the object of the
2017a). The category of accuracy errors includes
                                                       MT output evaluation is a text segment. Thus, the
additions, omissions, mistranslations, and
                                                       evaluation of the MT output does not consider dis-
terminology (Burchardt et al., 2017). The class of
                                                       course-pragmatic errors. Eliminating these errors,
terminology errors contains wrong choice in
                                                       the evaluation of MT output considers the seg-
terminology, while mistranslations concern
                                                       ment of the target text but skips the evaluation of
general lexicon (Lommel, 2014). In English-
                                                       the correspondence between the source and target
Russian output, omissions occur in 12% of
                                                       messages. Rules for software localization envis-
segments, equally for NMT and PBSMT; almost
                                                       age consideration of the discourse and pragmatic
the same frequency describes the additions (11%
                                                       issues in the MT output (Specia et al., 2017: 61).
equally for the both) (Castilho et al., 2017b: 124).
                                                       The CAT platforms acquire tools for localization
Meanwhile, mistranslations cover 23% in PBSMT
                                                       of the target text. Therefore, in the evaluation of
and 30% in NMT (Castilho et al., 2017b: 125). In
                                                       Smartcat output, we take into consideration all
Russian-English       output,     PBSMT         also
                                                       types of errors described in (Hansen 2009: 316).
outperforms NMT in accuracy of lexical choice
                                                       We apply the data of the errors distribution in the
(Toral and Sánchez-Cartagena 2017). In
                                                       MT output as the baseline to consider whether a
translations into Russian as a language with rich
                                                       human translator offers a better option than a raw
morphology, NMT systems lead to less accurate
                                                       or even post-edited output of MT systems. Errors
output as compared to the best of PBSMT; the
                                                       and mistakes in translation on Smartcat disclose
PBSMT contained fewer mistranslations
                                                       the value of the human factor as a contributor to
(Castilho et al., 2017b: 125).
                                                       the quality of the final product.
2.3   Classification of Errors
                                                       3     Results: Description of Errors in
In the MT output evaluation, the category of                 Translation on Smartcat
fluency       errors      includes     grammatical
(morphological, word order, syntax), orthographic      3.1    Working on Smartcat
and punctuation errors. The category of accuracy
                                                          CAT platforms transform the translators’ envi-
errors      contains       omissions,     additions,
                                                       ronment into computer-mediated communication
mistranslations, and wrong terminology choice.
                                                       (CMC) with colleagues and customers. In CMC
The classification does not account for discourse
                                                       and in the translation industry, English functions
and pragmatic errors because to detect and prevent
                                                       as lingua franca. CMC restricts the feedback to
these errors, additional tools are needed (Khadivi
                                                       comments in a chat window on the platform. Fa-
et al., 2017). A reviser of the MT output evaluates
                                                       cilitating decoding and encoding, working on a
semantic correlation between two segments (the
                                                       CAT platform exposes a translator / editor / reviser
source and the target) and adequacy of the target
                                                       to the effect of text formatting in the working win-
                                                       dow with segments of the source text. Under the

Proceedings of MT Summit XVII, volume 2                               Dublin, Aug. 19-23, 2019 | p. 90
effect, even a competent translator experiences in-         to the difference in the errors classification be-
terference of different languages in CMC.                   tween the industry and academia. Some of the dis-
Smartcat provides tools for monitoring task per-            course-pragmatic errors are considered as mis-
formance, navigating in the document, tracking              translations in the MT output.
revisions of target segments, and quality assur-            2)       In Smartcat, omissions, additions and
ance. The CAT platform enhances the efficacy of             wrong lexical choice account for 15% of all errors,
the translator’s work, on the one hand; on the other        while in the MT output the accuracy errors occur
hand, it makes possible the mixed influence of the          in 46% of segments for PBSMT and 53% for
human factor on the final product: a post-editor            NMT.4
revises the output enhancing the target text, alt-          3)       Style-shifting usually manifests in a
hough it is an opportunity to miss errors. Transla-         wrong choice of a word from the synset. The er-
tors and post-editors rarely use a particular post          rors are represented on Smartcat as a 10% share
editor’s tool or environment to identify the errors         included in the category of discourse-pragmatic
(Blagodarna 2018: 16). In addition, they often ne-          errors. In the MT output, the style-shifting is prob-
glect the MT and shared TM in the process of                ably identified as mistranslations. Therefore, the
translation (Zaretskaya, Pastor, Seghiri 2015).             difference in the distributions of accuracy errors
                                                            between Smartcat and MT could appear less es-
3.2        Distribution of the Errors: Comparison           sential.
           between MT and Smatrcat                          4)       Smartcat output is almost error-free from
We analyze the completed translation of a tourist           grammatical errors. Nevertheless, errors in or-
guide that was accepted by the customer as the              thography and punctuation diminish the fluency
first draft of the book to be edited by a                   of the target text.
professional writer. Three professional revisers
performed the manual error evaluation. An expert,           4       Discussion of the Errors in Hebrew-
the professional linguist,3 annotated the errors.                   Russian Translation on Smartcat
Such expert evaluation of the final product
                                                            4.1       Reasons for Errors of Different Types
appears to be a common practice in the industry.
In the revised Hebrew-Russian Smartcat                      Even after revisions, the discourse-pragmatic
translation of the tourist guide, 11 segments with          errors (unnecessary style-shifting and provoca-
various errors include approximately 1080 word              tive intertextual associations) occur regularly. The
forms (3% of the word forms of the source text).            stylistic errors (included in the category of dis-
The distribution of the errors reflects particular          course-pragmatic errors) reflect the well-known
characteristics of the translation and target text          peculiarities in Hebrew-Russian translation
revision on Smartcat (See Table 1).                         caused by the rich network of synonyms in the
                                                            Russian vocabulary in comparison with the He-
    Type   Dis-    Ortho- Pun-    Ter-      Gram- Omis-     brew lexicon, and usage of the distinctive syntac-
           cour-   gra-   ctua-   mino-     ma-ti- sion /   tic constructions in Russian texts according to the
           se –    phic   tion    logy /    cal    Addi-
           prag-                  lexical          tion
                                                            particular style. For example, in the following
           matic                  choice                    sentence, official and high literary styles are
    %      40      18     18      14        9      1        mixed: Шахматная держава, национальная и
                                                            международная,                    прославившаяся
   Table 1. Errors distribution in the Hebrew-Rus-          достижениями как в юношеской, таки и во
sian translation on Smartcat (percentage to all er-         взрослой категориях (literal translation: Chess
rors in the draft).                                         empire, national and international, famous for
                                                            achievements in both youth and adult categories).
   The distribution differs from that in the MT out-        Besides that, the meaning of the lexeme держава
put for Russian.                                            (empire) semantically contradicts the attribute
1)       The most typical of the Smartcat transla-          международная (international). However, the
tion failures are the discourse-pragmatic errors.           content of the sentence is most seriously damaged
We are not able to compare our data with the vol-           by the association generated by Chess empire: the
ume of the discourse errors in the MT output due            phrase associates with Ostap Bender, a popular

3                                                           4
 The expert is Professor, PhD in Russian Linguistics from       See the data in 2.2.
Saint-Petersburg State University.

Proceedings of MT Summit XVII, volume 2                                          Dublin, Aug. 19-23, 2019 | p. 91
adventurer from Russian satirical novels. The as-        translators overused commas (,) after comple-
sociation adds an ironic estimation to the city de-      ments in the beginning of the sentence and often
scribed as Chess empire. The irony ruins the prag-       use a colon (:) instead of an em dash (–). Due to
matic purpose of the city guide translation.             the signs of text formatting on the platform, trans-
   The percent of the orthographic and punctu-           lators miss marks in compound sentences. Almost
ation errors is surprisingly high as Smartcat pre-       30% of the errors show insufficient competence in
supposes automatic spelling and grammar check-           Russian punctuation norms.
ing to prevent the errors. In table 2, we provide           Terminology and lexical errors in Smartcat
examples of the orthographic errors (marked by           are similar to those in MT; they reveal misunder-
bold).                                                   standing of terminology and wrong lexical
                                                         choices. For example, instead of блуждающие
Description   of   %    Example                          пески (wondering sand) the translator used
error                                                    зыбучие пески (quicksand). The most typical of
Skipping spaces    27   в концешестидесятых годов
between words
                                                         the lexical choice errors concerns wrong selection
Overuse of capi-   59   Война     за    Независимость    within the synset ignoring collocations and se-
talisation              Израиля                          mantic restrictions. MT systems outperform hu-
Wrong spelling     11   На      территории      центра   man translators in the lexical choice associated
and misprint            действуют            городская
                        консерватория        "Акадма",   with peculiar semantic restriction. For example, to
                        балетная школа и студия танца,   refer to people or other entities in Russian, speak-
                        местные ансамбли исполнителей    ers need to choose between two different words;
                        и городские оркестры, а так же
                        великолепный музеей искусств,    имя (name) is appropriate only for people, while
                        известный по всей стране и за    objects are referred to by their название (name).
                        рубежом.
                        Ультра-ортодоксальный
                                                         NMT systems are able to process the semantic dif-
Erratum in com- 3
pound                                                    ference offering the relevant Russian word in He-
                                                         brew-English-Russian translation.
Table 2. Description of the orthographic errors in          The grammatical errors are akin to those in
                  the target text                        Russian colloquial speech. Translators and a post-
                                                         editor recognized the specific errors similar to
   The orthographic errors reveal interference           those in the MT output, but they sometimes failed
with the English language norms and gaps in              to identify word forms and idioms that belong to
technological competence and fluency in the              official style, which is irrelevant for the tourist
target language. The effect of text formating in the     guide. The most typical of the grammatical errors
working field of Smartcat appears because of the         belong to the morphological class when a wrong
use of signs preserving the formatting of the            inflection generates wrong syntactic dependencies
source text. The signs mask the space between            in long clauses. In addition, adverbial participles
words, so what is displayed on the work screen is        regularly occur in impersonal sentences that is
not what will be transferred into the final output       prohibited in Russian: Проведя (Adv. Participle-
in the target text.                                      past-perf.) время в парке, рекомендуется (Verb-
   The misuse of capitalization shows the effect of      pres.-imper.-impersonal) продолжить прогулку
the English language norm on the Russian output.         в южном направлении по прекрасной
In Hebrew, capitalization is not in use. In Russian,     прогулочной дороге (literal translation: After
the norms usually prescribe to capitalize the first      spending time in the park, it is recommended to
word in compound names of organizations and              continue walking in the south direction along the
events. The translator made mistakes under the in-       beautiful walking road).
fluence of English as lingua franca.                        Translating into Russian, professional transla-
   The two reasons – display of the translated text      tors attempt to shorten target segments and some-
and the influence of English as lingua franca –          times this leads to omissions (Kunilovskaya, Mor-
explain 86% of the orthographic errors on                goun, Pariy 2018). Meanwhile, omissions and ad-
Smartcat. Another 11% of orthographic errors are         ditions in the MT output from Hebrew appear due
caused by gaps in the translator’s target language       to a concise character of the language (Cheesman,
competence.                                              Roos 2017: 11). The omission, as well as the ad-
   Similar reasons cause the punctuation errors.         dition, can be useful for semantic coherence of the
Under the influence of the English language,             whole document as means to avoid repetition in
                                                         contact segments and establish cohesion for dis-
                                                         tant segments. Thus, an omission of information

Proceedings of MT Summit XVII, volume 2                                 Dublin, Aug. 19-23, 2019 | p. 92
in the process of Hebrew-Russian translation rep-      those that occur in the MT output. Incorrect use of
resents an error in accuracy in the segment, but       pronouns can be recognized in the process of post-
can be purposeful in the whole text perspective.       editing the target text.
Nevertheless, in the human-revised Hebrew-Rus-            Peculiar errors reveal the problems associated
sian translation on Smartcat, some of the omis-        with the source text segmentation into sentences.
sions and additions lead to the distortion of infor-   This can trigger a translator to preserve the sen-
mation in the target segment: Исследователи            tence boundaries and use a complicated Russian
приводят два возможных варианта жителей                compound sentence leading to punctuation errors.
крепости, руины которой находятся на холме
(literal translation: The researchers raised two       5    Conclusion
possible options of the inhabitants of the fortress
                                                       Our study of the set of errors in Hebrew-Russian
whose remains were found on the hill). In the
                                                       translation on the CAT platform found that CAT
source segment in Hebrew, the author mentioned
                                                       platforms provide users with good translation
two different theories explaining the origin of the
                                                       quality. The quality is better than the MT output
fortress inhabitants.
                                                       for this pair of languages. The negative impact of
4.2   The Human Factor as the Ground for               the human factor is associated with the mismatch
      Errors on Smartcat                               of the capabilities of the CAT tools and the degree
                                                       of their use by translators. Our analysis found that
In summary, orthographic and punctuation errors        the particular errors are caused by the effect of
reveal insufficient command of the CAT tools and       English as lingua franca in the translation industry
gaps in the target language competence of              and CMC. These errors diminish the fluency,
translators. On the one hand, it is necessary to       while the discourse-pragmatic errors decrease the
train skills to master CAT beforehand; on the other    accuracy of the target text. In this aspect, the
hand, due to the errors, CAT platform developers       translation on the Smartcat is similar to the NMT
can foresee particular problems of implementing        output for Russian in that the fluency of the target
text-formatting instruments in the platform. The       text is better than the accuracy. The discourse-
orthographic and punctuation errors uncover            pragmatic errors are not recognised in the MT
interlanguage interference and impact of English       output evaluation because the contiguity of the
in Hebrew-Russian translation as English strongly      whole text does not appear as an object of the MT
affects CMC (Jiménez-Crespo 2010). The                 quality evaluation. By combining human
discourse-pragmatic errors are caused by               competence and computer tools, translation on the
neglecting the target language usage, the target       CAT platforms enables acceptable translation
cultural context and the purpose of the text (the      quality to be quickly generated.
message itself). Alongside with lexical errors,           The comparison of errors in MT and on the
they break the contiguity of the target text and its   CAT platform for the Hebrew-Russian language
semantic coherence.                                    pair provides a basis for training MT systems to
   Compared to the errors in the MT outputs, the       achieve the acceptable quality. The distribution of
translation errors on the CAT platform disclose a      the errors in translation on Smartcat shows the di-
skillful mastery of the target language grammar        rection for translator and post-editor training.
and more accurate lexical choice. However, the         These results are also of importance to developers
MT provides post-editors with the translation that     of CAT platforms as enhancement of user inter-
is almost free of orthographic errors. Smartcat im-    faces considering the human factor-triggered er-
proves the technological environment for transla-      rors might contribute to greater accuracy and effi-
tors and overcomes disadvantages of MT thanks          ciency of translations.
to the opportunity to use different tools according
to the particular source segment.
                                                       References
4.3   Errors Associated with Design of CAT             Aranberri, Nora, Eleftherios Avramidis, Aljoscha
      Platform                                           Burchardt, Ondřej Klejch, Martin Popel, and Maja
                                                         Popovic. 2016. Tools and Guidelines for Principled
The source text segmentation and working
                                                         Machine Translation Development. Proceedings of
window formatting on the CAT platform provoke            the 10th Conference on Language Resources and
difficulties in expressing the coherence and the         Evaluation (LREC). Portorož, Slovenia 1877–1882.
anaphora resolution in distant semantically
coherent segments. The problems are similar to         Belinkov, Yonatan, Nadir Durrani, Fahim Dalvi, Has-
                                                         san Sajjad, and James Glass. 2017. What do neural

Proceedings of MT Summit XVII, volume 2                               Dublin, Aug. 19-23, 2019 | p. 93
machine translation models learn about morphol-           Macduff Hughes, and Jeffrey Deanet. 2017.
  ogy? Proceedings of the 55th Annual Meeting of the        Google’s multilingual neural machine translation
  Association for Computational Linguistics (ACL).          system: Enabling zero-shot translation. Transac-
  Vancouver, Canada 861–872.                                tions of the Association for Computational Linguis-
                                                            tics, vol. 5, 339–351.
Belinkov, Yonatan, and James Glass. 2016. Large-scale
  machine translation between Arabic and Hebrew:          Khadivi, Shahram, Patrick Wilken, Leonard Dahl-
  Available corpora and initial results. Proceedings of     mann, and Evgeny Matusov. 2017. Neural and Sta-
  the Workshop on Semitic Machine Translation, Aus-         tistical Methods for Leveraging Meta-information in
  tin, Texas, USA 7–12.                                     Machine Translation. Proceedings of Machine
                                                            Translation Summit XVI, vol. 1: Research Track.
Bentivogli, Luisa, Arianna Bisazza, Mario Cettolo and
                                                            Nagoya, Japan 41–54.
  Marcello Federico. 2016. Neural versus phrase-
  based machine translation quality: a case study. Pro-   Kunilovskaya, Maria, Natalia Morgoun, and Alexey
  ceedings of the 2016 Conference on Empirical              Pariy. 2018. Learner vs. professional translations
  Methods in Natural Language Processing, Austin,           into Russian: Lexical profiles. Translation & Inter-
  Texas, USA 257–267.                                       preting, 10(1), 33–52
Blagodarna, Olena. 2018. Insights into post-editors'      Lommel, Arle R., Aljoscha Burchardt, and Hans Usz-
  profiles and post-editing practices. Revista              koreit. 2014. Multidimensional Quality Metrics
  Tradumàtica. Tecnologies de la Traducció, 16: 35–         (MQM): A Framework for Declaring and Describ-
  51.                                                       ing Translation Quality Metrics. Revista tra-
                                                            dumàtica: traducció i tecnologies de la informació i
Burchardt, Aljoscha, Vivien Macketanz, Jon Dehdari,
                                                            la comunicació, 12, 455–463.
  Georg Heigold, Jan-Thorsten Peter, and Philip Wil-
  liams. 2017. A linguistic evaluation of rule-based,     Richardson, John, Taku Kudo, Hideto Kazawa, and Sa-
  phrase-based, and neural MT engines. The Prague           dao Kurohashi. 2016. A generalized dependency
  Bulletin of Mathematical Linguistics, 108(1), 159–        tree language model for SMT. Information and Me-
  170.                                                      dia Technologies, 11, 213–235.
Castilho, Sheila, Joss Moorkens, Federico Gaspari,        Shterionov, Dimitar, Riccardo Superbo, Pat Nagle,
  Iacer Calixto, John Tinsley, and Andy Way. 2017a.         Laura Casanellas, Tony O’Dowd, and Andy Way.
  Is neural machine translation the new state of the        2018. Human versus automatic quality evaluation of
  art?. The Prague Bulletin of Mathematical Linguis-        NMT and PBSMT. Machine Translation, 32(3),
  tics, 108(1), 109–120.                                    217–235.
Castilho, Sheila, Joss Moorkens, Federico Gaspari,        Singh, Nimesh, and Nadir Habash. 2012. Hebrew mor-
  Rico Sennrich, Vilelmini Sosoni, Panayota Geor-            phological preprocessing for statistical machine
  gakopoulou, Pintu Lohar, Andy Way, Antonio Bar-            translation. Proceedings of The European Associa-
  one, and Maria Gialama, M. 2017b. A comparative            tion for Machine Translation (EAMT12), Trento, It-
  quality evaluation of PBSMT and NMT using pro-             aly 43–50.
  fessional translators. Proceedings of Machine
                                                          Specia, Lucia, Kim Harris, Frederic Blain, Vivien
  Translation Summit XVI, vol. 1: Research Track.
                                                            Macketanz, Aljoscha Burchardt, Inguna Skadina,
  Nagoya, Japan 116–132.
                                                            Matteo Negri, and Marko Turchi. 2017. Translation
Cheesman, Tom, and Avraham Roos. 2017. Version              Quality and Productivity: A Study on Rich Morphol-
  Variation Visualization (VVV): Case Studies on the        ogy Languages. Proceedings of Machine Transla-
  Hebrew Haggadah in English. Journal of Data Min-          tion Summit XVI, vol. 1: Research Track. Nagoya,
  ing and Digital Humanities, July, 5, 2017, Special        Japan 55-71.
  Issue on Computer-Aided Processing of Intertextu-
                                                          Toral, Antonio, and Victor M. Sánchez-Cartagena.
  ality in Ancient Languages. 1–12.
                                                            2017. A multifaceted evaluation of neural versus
Hansen, Gyde. 2009. A classification of errors in           phrase-based machine translation for 9 language di-
  translation and revision. CIUTI-Forum 2008:               rections. Proceedings of the 15th Conference of the
  Enhancing Translation Quality: Ways, Means,               European Chapter of the Association for Computa-
  Methods. Peter Lang, Bern, Switzerland. 313–326.          tional Linguistics, vol. 1: Long Papers, Valencia,
                                                            Spain 1063–1073.
Jiménez-Crespo, Miguel A. 2010. Localization and
   writing for a new medium: a review of digital style
   guides. Tradumàtica: traducció i tecnologies de la
   informació i la comunicació, 8, 1–9.
Johnson, Melvin, Mike Schuster, Quoc V. Le, Maxim
  Krikun, Yonghui Wu, Zhifeng Chen, Nikhil Thorat,
  Fernanda Viégas, Martin Wattenberg, Greg Corrado,

Proceedings of MT Summit XVII, volume 2                                  Dublin, Aug. 19-23, 2019 | p. 94
You can also read