Comparative Analysis of Errors in MT Output and Computer-as- sisted Translation: Effect of the Human Factor - Association for Computational ...
←
→
Page content transcription
If your browser does not render page correctly, please read the page content below
Comparative Analysis of Errors in MT Output and Computer-as- sisted Translation: Effect of the Human Factor Irina Ovchinnikova Daria Morozova Sechenov First Moscow State Medical HEC Paris / University, Moscow, Russia / 1 rue de la Libération, Trubetskaya str., 8, building 2, 78351 Jouy-en-Josas, France 119992 Moscow, Russia daria.morozova@hec.edu Lichi Translations LTD / P.O. Box 18, Be'er Ya'akov, 70300 Israel Ira.ovchi@gmail.com in translation on Smartcat performed by Abstract professional translators uncover insuffi- ciency of CAT tools for the language pair The paper presents a comparative analysis as well as peculiar problems in applying of errors in outputs of MT and Computer- CAT tools while translating from Hebrew assisted Translation (CAT) platforms in into Russian. translation from Hebrew into Russian. A MT system, shared translation memory (TM), and dictionaries are available on 1 Introduction CAT platforms. The platforms allow for The objective of the study is to analyze the editing and improving any MT output as peculiar errors in translation from Hebrew into well as performing manual translation. Russian on a CAT platform as compared to the Evaluation of the efficiency of the plat- errors in the MT output and reveal their sources to forms in comparison with the MT systems provide developers with the translators’ feedback. shows advantages of the CAT platforms in The comparison is efficient from the practical the translation industry. The comparison point of view since a target text, being translated reveals the impact of the human factor on by a MT system or human translator, must deliver the CAT output providing developers with the source message and has to be relevant to the the feedback from translation industry. target culture. In the translation industry, revised The research was conducted on docu- MT outputs compete with human translations, ments translated from Hebrew into Rus- including those performed on CAT platforms. sian (approximately 35,000 words, 3118 Therefore, awareness of the peculiar errors in segments) on Smartcat. Errors in MT out- translation on the platforms will provide basis for put for Russian as a target language show improving Hebrew-Russian MT and for choosing almost equal shares of fluency and accu- the way to translate the particular project applying racy errors in PBSTM and prevalence of MT or hiring a human translator who has access the accuracy errors in NMT. Errors on the to a CAT platform. Smartcat platform reveal difficulties in The research was conducted on the material of mastering semantic and stylistic coher- translation projects on Smartcat.1 In the paper, we ence of the whole document. In general, discuss Hebrew-Russian translation of a tourist however, the translation is accurate and guide (9 files in Smartcat; 35,0002 word forms readable. The influence of English as lin- approximately; 3118 segments; on average, 10 gua franca appears in peculiar ortho- word forms in a segment) by a team of graphic and punctuation errors. The errors professional translators. The errors in the MT ©2019 The authors. This article is licensed under a Creative 2 It is hard to determine the exact number of word forms be- Commons 4.0 license, no derivative works, attribution, cause the author amended the text in Hebrew owing to the CCBY-ND. necessity to provide the accurate data and information. Nev- 1 ertheless, the number of segments was constant. Smartcat Platform Inc. 2019 https://www.smartcat.ai/ Proceedings of MT Summit XVII, volume 2 Dublin, Aug. 19-23, 2019 | p. 88
output are considered as a baseline for analysis of substantially surpasses that of PBSMT (Shter- errors on Smartcat. ionov et al., 2018). To the best of our knowledge, translation from Researchers differentiate between errors in flu- Hebrew into Russian on a CAT platform was not ency and accuracy of translation. Fluency errors analyzed in the aspect of the errors type as reduce the readability of the target text, while ac- compared to the MT output. Meanwhile, the curacy errors distort its content. According to the comparison allows for developers and users to data from different evaluation systems and differ- revise tools on the platforms. Translators will get ent languages, fluency errors are more prevalent the insight into specific advantages of MT systems than accuracy errors (Aranberri et al., 2016). The depending on the language pair. Being aware of most typical fluency errors are grammatical errors the advantages, translators can improve the (close to 80%: Aranberri et al., 2016: 1880). They quality of target texts combining MT and human include morphological, word order and syntax er- translation. rors. In general, NMT systems outperform All tools of computer-assisted translation in one PBSMT in fluency (Bentivogli et al., 2016). place (CAT platforms) provide translators with an The target language affects the kind of morpho- opportunity to quickly deliver a readable and logical information learned by the NMT system. accurate output. The platforms support the cycle Words of the source text are better represented in of translation projects: selecting translators, a morphologically poorer target language, while a teamwork, project management, delivery of the morphologically rich language (e.g., Hebrew and final product, and payment transfer. A CAT Russian) needs character-based representation of platform includes a MT system, access to shared less frequent words in the NMT to enhance the TM, dictionaries, thesauruses and other necessary quality of translation (Belinkov et al., 2017). Bi- resources. Collaborators have the opportunity to lingual post-editors handle the errors in the MT discuss options, comment on the source text, and output. share information required to understand the content. Nevertheless, translators and revisers 2.1 Errors in Hebrew in MT Output cannot avoid errors while using all of the Hebrew as a source or target language of MT has advantages. undeservingly received very sparse researchers’ Classification of translators’ errors varies attention. As a morphologically rich language, according to domains. In the industry, the Hebrew features grammatical affixes, endings and classification is very simple and pragmatically cliticization. The inflections and addition of the oriented. In academia, the errors are classified subordinate elements to the main word evokes with respect to the target text functioning in the difficulties in processing morphology that were target culture, mental mechanisms of bilingualism overcome in SMT thanks to pre-processing and code switching. Since a human bilingual techniques based on morphological analysis and translator operates the platform, the output reveals disambiguation (Singh and Habash, 2012). particular errors. Thus, we take into consideration In general, NMT outperform PBSMT in He- the classification of translation errors in both brew-Arabic / Arabic-Hebrew translation domains (See: Hansen, 2009: 316). (Belinkov and Glass, 2016). For better results, He- brew needs a character-based encoding / decoding 2 Related Work: Classification of Typi- model that improves identification of word struc- cal Errors in MT Output ture for less frequent words, while words that are more frequent are possible to be identified in the On Smartcat, a translator can use different MT word-based model (Belinkov et al., 2017; Rich- systems evaluating and editing their output. Thus, ardson et al., 2016). Nowadays, the most suitable we consider the errors distribution for phrase- solution for MT translation from Modern Hebrew based statistical and neural MT systems (PBSMT is Google’s Multilingual NMT that involves Eng- and NMT, respectively). lish as an interlanguage (Johnson et al., 2017). In Distribution of the errors in MT outputs is usu- translation from Hebrew (Modern and Archaic) ally described in the aspect of quality difference into English, the omissions and additions occur between statistical and neural MT systems due to high degree of compression in Hebrew (Bentivogli, et al., 2016). Human and automatic (Cheesman and Roos, 2017: 11). quality evaluations of outputs of MT systems show different results; however, NMT quality Proceedings of MT Summit XVII, volume 2 Dublin, Aug. 19-23, 2019 | p. 89
2.2 Errors in Russian in MT Output segment in the aspect of the target language norms and usage. Errors in Hebrew-Russian MT output have not In general, the target text delivers its message been described or explained in publications. In the and performs the adequate function in the target translation industry, translators often prefer to culture thanks to the accuracy of its discourse and apply a Hebrew-English-Russian MT, as in the pragmatic features, and their correspondence to case of Google’s Multilingual NMT. Due to this those of the source text. For different target lan- practice, we need to consider errors in English- guages, peculiar MT systems were developed to Russian MT output. According to human translate English texts of various domains (Specia evaluation, NMT English-Russian output et al., 2017). Since every source text is semanti- received marks “Near native or Native” for 75% cally coherent and has contiguity, pragmatic pur- segments, whilst PBSMT got the same marks for poses, and discourse peculiarities, application of a 60% of segments in the output (Castilho et al., relevant MT system affects the corresponding 2017b: 121). The most frequent errors are quality of the MT output. Meanwhile, the peculiar morphological (42% for PBSMT, 38% for NMT), MT systems do not exist for Hebrew-Russian or wrong word order occurs in 12% and 9% of the Hebrew-English-Russian. Therefore, every He- segments for PBSMT and NMT, respectively brew-Russian MT output needs post-editing in the (Castilho et al., 2017b: 124). aspect of its discourse and pragmatic peculiarities. The distribution of the accuracy errors varies in The discourse and pragmatic characteristics de- different domains and genres (Castilho et al., scribe the whole document, while the object of the 2017a). The category of accuracy errors includes MT output evaluation is a text segment. Thus, the additions, omissions, mistranslations, and evaluation of the MT output does not consider dis- terminology (Burchardt et al., 2017). The class of course-pragmatic errors. Eliminating these errors, terminology errors contains wrong choice in the evaluation of MT output considers the seg- terminology, while mistranslations concern ment of the target text but skips the evaluation of general lexicon (Lommel, 2014). In English- the correspondence between the source and target Russian output, omissions occur in 12% of messages. Rules for software localization envis- segments, equally for NMT and PBSMT; almost age consideration of the discourse and pragmatic the same frequency describes the additions (11% issues in the MT output (Specia et al., 2017: 61). equally for the both) (Castilho et al., 2017b: 124). The CAT platforms acquire tools for localization Meanwhile, mistranslations cover 23% in PBSMT of the target text. Therefore, in the evaluation of and 30% in NMT (Castilho et al., 2017b: 125). In Smartcat output, we take into consideration all Russian-English output, PBSMT also types of errors described in (Hansen 2009: 316). outperforms NMT in accuracy of lexical choice We apply the data of the errors distribution in the (Toral and Sánchez-Cartagena 2017). In MT output as the baseline to consider whether a translations into Russian as a language with rich human translator offers a better option than a raw morphology, NMT systems lead to less accurate or even post-edited output of MT systems. Errors output as compared to the best of PBSMT; the and mistakes in translation on Smartcat disclose PBSMT contained fewer mistranslations the value of the human factor as a contributor to (Castilho et al., 2017b: 125). the quality of the final product. 2.3 Classification of Errors 3 Results: Description of Errors in In the MT output evaluation, the category of Translation on Smartcat fluency errors includes grammatical (morphological, word order, syntax), orthographic 3.1 Working on Smartcat and punctuation errors. The category of accuracy CAT platforms transform the translators’ envi- errors contains omissions, additions, ronment into computer-mediated communication mistranslations, and wrong terminology choice. (CMC) with colleagues and customers. In CMC The classification does not account for discourse and in the translation industry, English functions and pragmatic errors because to detect and prevent as lingua franca. CMC restricts the feedback to these errors, additional tools are needed (Khadivi comments in a chat window on the platform. Fa- et al., 2017). A reviser of the MT output evaluates cilitating decoding and encoding, working on a semantic correlation between two segments (the CAT platform exposes a translator / editor / reviser source and the target) and adequacy of the target to the effect of text formatting in the working win- dow with segments of the source text. Under the Proceedings of MT Summit XVII, volume 2 Dublin, Aug. 19-23, 2019 | p. 90
effect, even a competent translator experiences in- to the difference in the errors classification be- terference of different languages in CMC. tween the industry and academia. Some of the dis- Smartcat provides tools for monitoring task per- course-pragmatic errors are considered as mis- formance, navigating in the document, tracking translations in the MT output. revisions of target segments, and quality assur- 2) In Smartcat, omissions, additions and ance. The CAT platform enhances the efficacy of wrong lexical choice account for 15% of all errors, the translator’s work, on the one hand; on the other while in the MT output the accuracy errors occur hand, it makes possible the mixed influence of the in 46% of segments for PBSMT and 53% for human factor on the final product: a post-editor NMT.4 revises the output enhancing the target text, alt- 3) Style-shifting usually manifests in a hough it is an opportunity to miss errors. Transla- wrong choice of a word from the synset. The er- tors and post-editors rarely use a particular post rors are represented on Smartcat as a 10% share editor’s tool or environment to identify the errors included in the category of discourse-pragmatic (Blagodarna 2018: 16). In addition, they often ne- errors. In the MT output, the style-shifting is prob- glect the MT and shared TM in the process of ably identified as mistranslations. Therefore, the translation (Zaretskaya, Pastor, Seghiri 2015). difference in the distributions of accuracy errors between Smartcat and MT could appear less es- 3.2 Distribution of the Errors: Comparison sential. between MT and Smatrcat 4) Smartcat output is almost error-free from We analyze the completed translation of a tourist grammatical errors. Nevertheless, errors in or- guide that was accepted by the customer as the thography and punctuation diminish the fluency first draft of the book to be edited by a of the target text. professional writer. Three professional revisers performed the manual error evaluation. An expert, 4 Discussion of the Errors in Hebrew- the professional linguist,3 annotated the errors. Russian Translation on Smartcat Such expert evaluation of the final product 4.1 Reasons for Errors of Different Types appears to be a common practice in the industry. In the revised Hebrew-Russian Smartcat Even after revisions, the discourse-pragmatic translation of the tourist guide, 11 segments with errors (unnecessary style-shifting and provoca- various errors include approximately 1080 word tive intertextual associations) occur regularly. The forms (3% of the word forms of the source text). stylistic errors (included in the category of dis- The distribution of the errors reflects particular course-pragmatic errors) reflect the well-known characteristics of the translation and target text peculiarities in Hebrew-Russian translation revision on Smartcat (See Table 1). caused by the rich network of synonyms in the Russian vocabulary in comparison with the He- Type Dis- Ortho- Pun- Ter- Gram- Omis- brew lexicon, and usage of the distinctive syntac- cour- gra- ctua- mino- ma-ti- sion / tic constructions in Russian texts according to the se – phic tion logy / cal Addi- prag- lexical tion particular style. For example, in the following matic choice sentence, official and high literary styles are % 40 18 18 14 9 1 mixed: Шахматная держава, национальная и международная, прославившаяся Table 1. Errors distribution in the Hebrew-Rus- достижениями как в юношеской, таки и во sian translation on Smartcat (percentage to all er- взрослой категориях (literal translation: Chess rors in the draft). empire, national and international, famous for achievements in both youth and adult categories). The distribution differs from that in the MT out- Besides that, the meaning of the lexeme держава put for Russian. (empire) semantically contradicts the attribute 1) The most typical of the Smartcat transla- международная (international). However, the tion failures are the discourse-pragmatic errors. content of the sentence is most seriously damaged We are not able to compare our data with the vol- by the association generated by Chess empire: the ume of the discourse errors in the MT output due phrase associates with Ostap Bender, a popular 3 4 The expert is Professor, PhD in Russian Linguistics from See the data in 2.2. Saint-Petersburg State University. Proceedings of MT Summit XVII, volume 2 Dublin, Aug. 19-23, 2019 | p. 91
adventurer from Russian satirical novels. The as- translators overused commas (,) after comple- sociation adds an ironic estimation to the city de- ments in the beginning of the sentence and often scribed as Chess empire. The irony ruins the prag- use a colon (:) instead of an em dash (–). Due to matic purpose of the city guide translation. the signs of text formatting on the platform, trans- The percent of the orthographic and punctu- lators miss marks in compound sentences. Almost ation errors is surprisingly high as Smartcat pre- 30% of the errors show insufficient competence in supposes automatic spelling and grammar check- Russian punctuation norms. ing to prevent the errors. In table 2, we provide Terminology and lexical errors in Smartcat examples of the orthographic errors (marked by are similar to those in MT; they reveal misunder- bold). standing of terminology and wrong lexical choices. For example, instead of блуждающие Description of % Example пески (wondering sand) the translator used error зыбучие пески (quicksand). The most typical of Skipping spaces 27 в концешестидесятых годов between words the lexical choice errors concerns wrong selection Overuse of capi- 59 Война за Независимость within the synset ignoring collocations and se- talisation Израиля mantic restrictions. MT systems outperform hu- Wrong spelling 11 На территории центра man translators in the lexical choice associated and misprint действуют городская консерватория "Акадма", with peculiar semantic restriction. For example, to балетная школа и студия танца, refer to people or other entities in Russian, speak- местные ансамбли исполнителей ers need to choose between two different words; и городские оркестры, а так же великолепный музеей искусств, имя (name) is appropriate only for people, while известный по всей стране и за objects are referred to by their название (name). рубежом. Ультра-ортодоксальный NMT systems are able to process the semantic dif- Erratum in com- 3 pound ference offering the relevant Russian word in He- brew-English-Russian translation. Table 2. Description of the orthographic errors in The grammatical errors are akin to those in the target text Russian colloquial speech. Translators and a post- editor recognized the specific errors similar to The orthographic errors reveal interference those in the MT output, but they sometimes failed with the English language norms and gaps in to identify word forms and idioms that belong to technological competence and fluency in the official style, which is irrelevant for the tourist target language. The effect of text formating in the guide. The most typical of the grammatical errors working field of Smartcat appears because of the belong to the morphological class when a wrong use of signs preserving the formatting of the inflection generates wrong syntactic dependencies source text. The signs mask the space between in long clauses. In addition, adverbial participles words, so what is displayed on the work screen is regularly occur in impersonal sentences that is not what will be transferred into the final output prohibited in Russian: Проведя (Adv. Participle- in the target text. past-perf.) время в парке, рекомендуется (Verb- The misuse of capitalization shows the effect of pres.-imper.-impersonal) продолжить прогулку the English language norm on the Russian output. в южном направлении по прекрасной In Hebrew, capitalization is not in use. In Russian, прогулочной дороге (literal translation: After the norms usually prescribe to capitalize the first spending time in the park, it is recommended to word in compound names of organizations and continue walking in the south direction along the events. The translator made mistakes under the in- beautiful walking road). fluence of English as lingua franca. Translating into Russian, professional transla- The two reasons – display of the translated text tors attempt to shorten target segments and some- and the influence of English as lingua franca – times this leads to omissions (Kunilovskaya, Mor- explain 86% of the orthographic errors on goun, Pariy 2018). Meanwhile, omissions and ad- Smartcat. Another 11% of orthographic errors are ditions in the MT output from Hebrew appear due caused by gaps in the translator’s target language to a concise character of the language (Cheesman, competence. Roos 2017: 11). The omission, as well as the ad- Similar reasons cause the punctuation errors. dition, can be useful for semantic coherence of the Under the influence of the English language, whole document as means to avoid repetition in contact segments and establish cohesion for dis- tant segments. Thus, an omission of information Proceedings of MT Summit XVII, volume 2 Dublin, Aug. 19-23, 2019 | p. 92
in the process of Hebrew-Russian translation rep- those that occur in the MT output. Incorrect use of resents an error in accuracy in the segment, but pronouns can be recognized in the process of post- can be purposeful in the whole text perspective. editing the target text. Nevertheless, in the human-revised Hebrew-Rus- Peculiar errors reveal the problems associated sian translation on Smartcat, some of the omis- with the source text segmentation into sentences. sions and additions lead to the distortion of infor- This can trigger a translator to preserve the sen- mation in the target segment: Исследователи tence boundaries and use a complicated Russian приводят два возможных варианта жителей compound sentence leading to punctuation errors. крепости, руины которой находятся на холме (literal translation: The researchers raised two 5 Conclusion possible options of the inhabitants of the fortress Our study of the set of errors in Hebrew-Russian whose remains were found on the hill). In the translation on the CAT platform found that CAT source segment in Hebrew, the author mentioned platforms provide users with good translation two different theories explaining the origin of the quality. The quality is better than the MT output fortress inhabitants. for this pair of languages. The negative impact of 4.2 The Human Factor as the Ground for the human factor is associated with the mismatch Errors on Smartcat of the capabilities of the CAT tools and the degree of their use by translators. Our analysis found that In summary, orthographic and punctuation errors the particular errors are caused by the effect of reveal insufficient command of the CAT tools and English as lingua franca in the translation industry gaps in the target language competence of and CMC. These errors diminish the fluency, translators. On the one hand, it is necessary to while the discourse-pragmatic errors decrease the train skills to master CAT beforehand; on the other accuracy of the target text. In this aspect, the hand, due to the errors, CAT platform developers translation on the Smartcat is similar to the NMT can foresee particular problems of implementing output for Russian in that the fluency of the target text-formatting instruments in the platform. The text is better than the accuracy. The discourse- orthographic and punctuation errors uncover pragmatic errors are not recognised in the MT interlanguage interference and impact of English output evaluation because the contiguity of the in Hebrew-Russian translation as English strongly whole text does not appear as an object of the MT affects CMC (Jiménez-Crespo 2010). The quality evaluation. By combining human discourse-pragmatic errors are caused by competence and computer tools, translation on the neglecting the target language usage, the target CAT platforms enables acceptable translation cultural context and the purpose of the text (the quality to be quickly generated. message itself). Alongside with lexical errors, The comparison of errors in MT and on the they break the contiguity of the target text and its CAT platform for the Hebrew-Russian language semantic coherence. pair provides a basis for training MT systems to Compared to the errors in the MT outputs, the achieve the acceptable quality. The distribution of translation errors on the CAT platform disclose a the errors in translation on Smartcat shows the di- skillful mastery of the target language grammar rection for translator and post-editor training. and more accurate lexical choice. However, the These results are also of importance to developers MT provides post-editors with the translation that of CAT platforms as enhancement of user inter- is almost free of orthographic errors. Smartcat im- faces considering the human factor-triggered er- proves the technological environment for transla- rors might contribute to greater accuracy and effi- tors and overcomes disadvantages of MT thanks ciency of translations. to the opportunity to use different tools according to the particular source segment. References 4.3 Errors Associated with Design of CAT Aranberri, Nora, Eleftherios Avramidis, Aljoscha Platform Burchardt, Ondřej Klejch, Martin Popel, and Maja Popovic. 2016. Tools and Guidelines for Principled The source text segmentation and working Machine Translation Development. Proceedings of window formatting on the CAT platform provoke the 10th Conference on Language Resources and difficulties in expressing the coherence and the Evaluation (LREC). Portorož, Slovenia 1877–1882. anaphora resolution in distant semantically coherent segments. The problems are similar to Belinkov, Yonatan, Nadir Durrani, Fahim Dalvi, Has- san Sajjad, and James Glass. 2017. What do neural Proceedings of MT Summit XVII, volume 2 Dublin, Aug. 19-23, 2019 | p. 93
machine translation models learn about morphol- Macduff Hughes, and Jeffrey Deanet. 2017. ogy? Proceedings of the 55th Annual Meeting of the Google’s multilingual neural machine translation Association for Computational Linguistics (ACL). system: Enabling zero-shot translation. Transac- Vancouver, Canada 861–872. tions of the Association for Computational Linguis- tics, vol. 5, 339–351. Belinkov, Yonatan, and James Glass. 2016. Large-scale machine translation between Arabic and Hebrew: Khadivi, Shahram, Patrick Wilken, Leonard Dahl- Available corpora and initial results. Proceedings of mann, and Evgeny Matusov. 2017. Neural and Sta- the Workshop on Semitic Machine Translation, Aus- tistical Methods for Leveraging Meta-information in tin, Texas, USA 7–12. Machine Translation. Proceedings of Machine Translation Summit XVI, vol. 1: Research Track. Bentivogli, Luisa, Arianna Bisazza, Mario Cettolo and Nagoya, Japan 41–54. Marcello Federico. 2016. Neural versus phrase- based machine translation quality: a case study. Pro- Kunilovskaya, Maria, Natalia Morgoun, and Alexey ceedings of the 2016 Conference on Empirical Pariy. 2018. Learner vs. professional translations Methods in Natural Language Processing, Austin, into Russian: Lexical profiles. Translation & Inter- Texas, USA 257–267. preting, 10(1), 33–52 Blagodarna, Olena. 2018. Insights into post-editors' Lommel, Arle R., Aljoscha Burchardt, and Hans Usz- profiles and post-editing practices. Revista koreit. 2014. Multidimensional Quality Metrics Tradumàtica. Tecnologies de la Traducció, 16: 35– (MQM): A Framework for Declaring and Describ- 51. ing Translation Quality Metrics. Revista tra- dumàtica: traducció i tecnologies de la informació i Burchardt, Aljoscha, Vivien Macketanz, Jon Dehdari, la comunicació, 12, 455–463. Georg Heigold, Jan-Thorsten Peter, and Philip Wil- liams. 2017. A linguistic evaluation of rule-based, Richardson, John, Taku Kudo, Hideto Kazawa, and Sa- phrase-based, and neural MT engines. The Prague dao Kurohashi. 2016. A generalized dependency Bulletin of Mathematical Linguistics, 108(1), 159– tree language model for SMT. Information and Me- 170. dia Technologies, 11, 213–235. Castilho, Sheila, Joss Moorkens, Federico Gaspari, Shterionov, Dimitar, Riccardo Superbo, Pat Nagle, Iacer Calixto, John Tinsley, and Andy Way. 2017a. Laura Casanellas, Tony O’Dowd, and Andy Way. Is neural machine translation the new state of the 2018. Human versus automatic quality evaluation of art?. The Prague Bulletin of Mathematical Linguis- NMT and PBSMT. Machine Translation, 32(3), tics, 108(1), 109–120. 217–235. Castilho, Sheila, Joss Moorkens, Federico Gaspari, Singh, Nimesh, and Nadir Habash. 2012. Hebrew mor- Rico Sennrich, Vilelmini Sosoni, Panayota Geor- phological preprocessing for statistical machine gakopoulou, Pintu Lohar, Andy Way, Antonio Bar- translation. Proceedings of The European Associa- one, and Maria Gialama, M. 2017b. A comparative tion for Machine Translation (EAMT12), Trento, It- quality evaluation of PBSMT and NMT using pro- aly 43–50. fessional translators. Proceedings of Machine Specia, Lucia, Kim Harris, Frederic Blain, Vivien Translation Summit XVI, vol. 1: Research Track. Macketanz, Aljoscha Burchardt, Inguna Skadina, Nagoya, Japan 116–132. Matteo Negri, and Marko Turchi. 2017. Translation Cheesman, Tom, and Avraham Roos. 2017. Version Quality and Productivity: A Study on Rich Morphol- Variation Visualization (VVV): Case Studies on the ogy Languages. Proceedings of Machine Transla- Hebrew Haggadah in English. Journal of Data Min- tion Summit XVI, vol. 1: Research Track. Nagoya, ing and Digital Humanities, July, 5, 2017, Special Japan 55-71. Issue on Computer-Aided Processing of Intertextu- Toral, Antonio, and Victor M. Sánchez-Cartagena. ality in Ancient Languages. 1–12. 2017. A multifaceted evaluation of neural versus Hansen, Gyde. 2009. A classification of errors in phrase-based machine translation for 9 language di- translation and revision. CIUTI-Forum 2008: rections. Proceedings of the 15th Conference of the Enhancing Translation Quality: Ways, Means, European Chapter of the Association for Computa- Methods. Peter Lang, Bern, Switzerland. 313–326. tional Linguistics, vol. 1: Long Papers, Valencia, Spain 1063–1073. Jiménez-Crespo, Miguel A. 2010. Localization and writing for a new medium: a review of digital style guides. Tradumàtica: traducció i tecnologies de la informació i la comunicació, 8, 1–9. Johnson, Melvin, Mike Schuster, Quoc V. Le, Maxim Krikun, Yonghui Wu, Zhifeng Chen, Nikhil Thorat, Fernanda Viégas, Martin Wattenberg, Greg Corrado, Proceedings of MT Summit XVII, volume 2 Dublin, Aug. 19-23, 2019 | p. 94
You can also read