STUDYING THE USEFULNESS AND RELIABILITY OF ENGLISH TO CHINESE MACHINE TRANSLATION - Master IDTM

Université Clermont Auvergne
UFR Langues, Cultures et Communication

            STUDYING THE
      USEFULNESS AND RELIABILITY
        OF ENGLISH TO CHINESE
        MACHINE TRANSLATION

                              Marion Batisse

                   Master 2 Langues Étrangères Appliquées
            Ingénierie de la Documentation Technique Multilingue

                    Directed by Dacia Dressen-Hammouda
                                 2017-2018
Abstract
The purpose of this study is to identify whether Google Translate is reliable, and to what extent, and
whether it is useful in a working environment. The emphasis is placed on English to Chinese
translation, as there is very little research in this direction. Through the analysis of a corpus
translated from English to Chinese by machine translators, it is shown that Google Translate is more
accurate than expected, with ratings higher than average. Interviews with managers who deal with
translations daily were also conducted in order to define several situations where machine translators
would be useful, such as checking the general meaning of a text or of some words and short sentences,
cross-referencing an agency’s translations, or seeing what a target-language translation would look
like in a layout. However, it has to be kept in mind that machine translators should be used with
caution, as there is no way to assess the quality of a translation without knowledge of the language.
The use of this tool is recommended only for users who have mastered the language, and post-editing
and proofreading phases are mandatory.

Keywords: Chinese, Google Translate, machine translation, reliability, translation project, usefulness,
working environment

Résumé
Le but de cette étude est de définir si Google Traduction est fiable et à quel niveau, et s'il est utile
dans le cadre de l’entreprise. L'accent a été mis sur la traduction de l'anglais vers le chinois car il y a
très peu de recherches dans ce sens. Grâce à l'analyse d'un corpus de textes traduit de l'anglais vers
le chinois par des traducteurs automatiques, il est démontré que Google Traduction est plus
performant qu’imaginé avec des notes supérieures à la moyenne. Des interviews avec des managers
qui s'occupent quotidiennement de traductions ont également été menées afin de définir plusieurs
situations où les traducteurs automatiques seraient utiles. Par exemple, pour vérifier le sens général
du texte ou certains mots et petites phrases, pour intégrer les traductions d'une agence dans le
document d’origine et donc faire des références croisées, ou encore pour voir à quoi ressemblerait
une traduction dans la langue cible au niveau de la mise en page du document. Cependant, il faut
garder à l'esprit que les traducteurs automatiques doivent être utilisés avec précaution car il n'y a
aucun moyen d'évaluer la qualité des traductions lorsque l’utilisateur ne possède pas de
connaissances suffisantes en langues étrangères. L'utilisation de cet outil n'est donc recommandée
que pour la maîtrise de la langue, et les phases de post-édition et de relecture sont obligatoires.

Mots-clés : cadre de travail, chinois, fiabilité, Google Traduction, projet de traduction, traduction
automatique, utilité

1. Introduction
A little more than 7000 languages are currently spoken in the world (Fennig & Simmons, 2018). Some
are spoken by the majority of the world population and some are used by a minority of people. Either
way, the need to understand others’ languages arose very early with the development of different
civilizations. The word translation comes from the Latin translatio, which means “to carry across”
(Vélez, 2016). It is an old practice that probably dates back to antiquity. As civilizations evolved
and the world continued to develop, translation became essential for the diffusion of information.
As a consistently growing nation, China attracts a great number of international companies. English
to Chinese translation has therefore become a necessity to communicate effectively in the global
market.

Technological advances have led to big changes in the translation process (Li, 2015; Doherty,
2016). With the development of the internet, more content can be found more easily than ever, and in
the language that the user speaks. However, a closer look at internet content shows that the English
language dominates. In 2018, 25.3% of internet users spoke English and 52.4% of the available content
was in English. This is a large share of internet content, which seems logical as English is the most
spoken and understood language in the world (Crystal, 2008). However, although 19.4% of internet
users speak Chinese, only 1.8% of the content is in Chinese. The available content on the internet is
clearly not proportional to the number of users speaking the language. Since Chinese is the language
with the second highest number of internet users, its speakers should have access to a greater
amount of information and content. A similar imbalance affects other languages. Although the current
internet content for each language may be satisfactory for its speakers, the diffusion of
information has a language barrier to overcome. Moreover, these figures do not grow quickly. For
Chinese content, they are even falling because more content is being created in other languages.
Therefore, technological tools are needed to improve this situation. Machine translation could
answer this need.

However, machine translation systems carry many risks, whether they are used for business or in
personal daily life (MacKenzie, 2014). While machine translation is cheap and supposedly “effortless”,
it can lower the value and quality of the content. Research, discussed in more detail later, has
identified several ways to reduce the error rate, such as using controlled languages or post-editing
(Vivien, 2013). The aim of this study is to see whether machine translation can be considered a
reliable tool to use in a company and to find in which cases it can be useful. Therefore, this
paper’s main questions are: Is a perfect translation really important for a company? And what is a
perfect translation? To what extent is it acceptable to rely on machine translation? In which
situations can machine translation be useful in a working environment?

To answer those questions, a corpus of texts translated from English to Chinese by a machine
translator was analyzed. This analysis makes it possible to rate whether machine translation is
reliable for this language pair. Moreover, two interviews were conducted with a French project
manager and a Chinese product manager in order to identify the problems related to translation in a
company. These interviews allowed me to define different situations where machine translation
could be useful and how to use it. My hypothesis was that although machine translation quality has
greatly improved recently, it is hard to consider it reliable, and that the results for the
translated corpus would be very wide-ranging. I expected that neither of my interviewees would use
machine translation, since its reputation is quite bad (MacKenzie, 2014), but that they would suggest
many situations where the use of a machine translator could be useful.

First, an overview of machine translation will be given, with its dangers and the solutions that
researchers have found to reduce the number of mistakes, followed by more details on English and
Chinese translation and on the reliability of machine translators. Secondly, the methodology of
this experiment will be explained before giving the results of the analysis and interviews. Lastly,
the results will be discussed and the final part will define in which situations machine translation
could be useful in a working environment. Several lines of reflection for possible further study will
also be suggested.

2. Theoretical background
2.1 Machine translation
Machine translation is the process of translating a text from a source language to a target language
by using a computer (Hutchins, 1995). According to this definition, a human should not need to be
involved in the process, but we will see later that this is not necessarily true. There are several
types of technologies used for translation: statistical, rule-based, example-based, hybrid and neural
machine translation. This paper focuses on one of the most used free machine translators in the world:
Google Translate.

Google Translate was launched in 2006. It originally relied on statistical technology, which
consisted in translating the source language into English and then translating it into the target
language using bilingual references called parallel corpora (Koehn, 2010). However, this approach
puts languages that lack human-translated documents and resources at a big disadvantage. Since 2016,
Google has used a new technology: Google Neural Machine Translation. According to Google, this system
is capable of learning by itself how to produce more fluent translations (Turovsky, 2016). The change
is that instead of translating pieces of sentences and putting them together, the new system
translates the sentence as a whole without needing to translate it into English as a first step.
Without going into technical details, this system imitates the way the human brain works by
analyzing the meaning, the grammar and finally the context of the sentence. According to Turovsky,
Google Translate is the most used machine translator in the world, with more than 500 million daily
users.

2.2 Problems and some solutions
Machine translators such as Google Translate are used for their many advantages. They work much
faster than manual translation and save a lot of time. According to Boitet (2008), one hour and five
minutes can be saved by translating a text with a machine translator and post-editing it, instead of
translating it manually. Moreover, it is cheaper than hiring a professional translator, and a certain
confidentiality can be kept. However, machine translators can easily introduce problems of accuracy
and context which can, in certain situations, cause a lot of damage.

MacKenzie (2014) explains that people mostly use Google Translate when shopping on e-commerce
websites or reading blogs or articles from overseas. In addition, Google can suggest a page
translation of a website, and more and more businesses are relying on those methods to translate
their website. According to the author, this is a dangerous decision. She affirms that translations
are not entirely accurate, as machine translators still cannot understand and use the local nuances,
peculiarities and idiosyncrasies present in all the world’s languages. In addition, since grammar can
differ greatly within a language pair such as English/Chinese, the final meaning or syntax is often
wrong. That is why machine translation is a risky tool for businesses. While it is cheap and
effortless, it can lower the value and quality of the content. For the author, businesses in technical
industries such as manufacturing, engineering or chemicals should definitely avoid machine
translators, as there is a vast amount of important terminology and grammar that should be translated
carefully.

Previous research has developed and improved solutions to ease the translation process when using
machine translators. Ferret (2015) studied whether the pre-translation of a text could improve the
quality of machine translation. He demonstrated that pre-translation could avoid recurrent errors in
translations and simplify post-editing, but that it is not sufficient on its own. Therefore, he
decided to combine pre-translation with controlled languages, which are sets of generally accepted
rules that make it possible to avoid machine translation mistakes as much as possible. His results
showed that today, there is no better alternative than the controlled language approach, and that
pre-translation does not yet show good enough results. However, given how fast machine translation is
developing, it is quite possible that Ferret’s method could become more efficient in the near
future. The controlled language approach was studied in more detail by Vivien (2013). She
investigated the quality of machine translation with the English-French language pair. She explains
that the quality of machine translation and the number of mistakes it makes require mandatory
post-editing by a human. In order to reduce this post-editing phase as much as possible, Vivien
created a template for translatable content using principles from controlled languages. After her
experiment on nine texts, she observed that using the template does not remove the post-editing
phase but that the translation quality is better and post-editing is clearly faster and easier. She
concludes that while her template is not perfect because human participation is still needed, it
would definitely help people who wish to offer a good online language experience to their target
readers.

2.3 English and Chinese translation
Chinese and English are very different languages, which makes translation between the two even
harder. The grammar is not the same and there are no letters, plural forms or tenses in Chinese.
While in English the verb is conjugated to indicate the tense, in Chinese this is done by adding words
to the sentence. The order of the words in a sentence can also be very different, which can cause
confusion. In addition, some Chinese characters can correspond to several English words and vice
versa. In her research, Brazill (2016) identified the problems of Chinese to English translation, and
one of them was machine translators. Because machine translators cannot translate idiomatic
expressions or understand the context, cultural awareness is important to improve the translation
quality. She thinks that a machine translator is best used for rough translations in order to
understand the essence of the text even if some parts are not accurate. It is important to have a
professional translator at least proofread the translated texts. What we can understand from this is
that machine translation should not be used by people with no knowledge of the target language.
However, it could be different if we could establish that the quality is good enough for a general
understanding.

2.4 Machine translators and reliability
Assessing machine translation quality can be quite difficult. Many studies have tried to evaluate
whether machine translators are reliable. However, translation quality can be rated as high “according
to some standards and be a bad translation according to others” (Görög, 2014). Görög points out:

      “A translator's work might be excellent in terms of fluency (meaning it sounds natural or
      intuitive), but how about the adequacy of the translation (and its fidelity to the source text) or
      errors made based on an error typology (such as terminology, country standards and
      formatting)?”.

He considers grading translations the way one would rate a hotel with stars. With comprehensibility
as the standard, a text rated less than one star would not even be considered a translation, since
the meaning would be completely different. A five-star translation could be either very fluent or
accurate to the source text but done under tight deadlines. A similar system for rating machine
translation could be very useful. For this reason, it is necessary to be able to analyze whether
machine translators are reliable, and to what extent.

Roig Allué (2017) studied the reliability of Google Translate by analyzing the mistakes made in the
translation process. She defined three variables for her corpus compilation: the languages
(English/Spanish), the direction of the translation (English into Spanish and vice versa) and the
genre of the texts (tourism: tourist texts; sport: football match reports). Her hypothesis was that
Google Translate would be reliable enough to give a general idea of what the text meant, but not to
produce a professional translation. After her analysis, she found that lexicogrammatical and
syntactic mistakes are frequent in the translated texts for both the tourism and sport genres.
Sometimes the meaning is still understandable, but at other times the texts contain many mistakes,
which leads to misunderstandings. Translations of tourist texts are better when done from Spanish to
English. She supposes that this is due to the fact that the more frequent a genre is online, the
better the quality of its translations will be. She concludes that Google Translate cannot be
considered reliable, as it does not fulfill the user’s need for understanding, although the quality
will presumably improve over the next few years. However, this does not rule out that machine
translators could be useful in several situations. The objective is to identify the right situations
in which to use them.

3. Methodology
3.1 Research design
The purpose of this study was to determine whether machine translators can be useful for English to
Chinese translations, in particular in working settings. A number of authors have written extensively
about Chinese to English translation, highlighting its difficulties (Brazill, 2016; Vilar et al., 2006).
A qualitative approach was used. In order to study reliability, a corpus of twelve different texts
was built and analyzed after the texts had been translated from English to Chinese by a machine
translator. The translated texts were categorized using a grid. This method is based on the research
of Roig Allué (2017), who evaluated the reliability of Google Translate using three variables: the
languages being translated, the direction of the translation and the genre of the texts.
In order to gain further insight into how machine translators could be useful in a work environment,
two people working at the company where I am currently doing my internship were interviewed. Hearing
their points of view on translation and machine translation for projects in a company setting, and
looking for solutions to the translation problems they encountered, was a good way to see whether
machine translators would be useful in different situations.

3.2 Participants
A native Chinese speaker helped with the analysis of the translated corpus. She is a
twenty-two-year-old student currently living and studying in Beijing. She also speaks English, Korean
and Japanese at a level sufficient to follow university classes, so she has an affinity with
languages. She graded the translated Chinese texts, as my level of Chinese was not high enough to do
it.

Two participants took part in the interviews. One is a French project manager who creates
e-learning modules for a French company’s commercial delegates. The other is a Chinese product
manager currently working in China for the Chinese branch of the same company. I had the opportunity
to meet her during a training session where we discussed translation problems for sales
representatives and employees in China. Both had encountered several situations where they had to
translate documents or e-learning modules from English to Chinese and had difficulty doing so.

It should be noted that I personally know the people who were interviewed. I needed to have an idea
of their level in Chinese so that the interviews would be relevant to my study. However, this should
not affect the results of the study, as the analysis method is completely objective.

3.3 Materials and procedure for data collection
To evaluate whether machine translation is reliable and useful in different situations, a 1,835-word
text corpus of different genres was built from three different websites (see Appendix A). The three
genres (recipes, news articles and medical notices for children) were chosen because they are quite
different from one another and because the level of translation accuracy could have a different
impact in each. They were also selected because the use of Google Translate is plausible for all
three. Someone could simply be searching for a recipe but have problems with the names of ingredients
or instructions. News articles are also often translated, since readers would like to know what is
happening in the rest of the world. Regarding medical notices for children, machine translation could
be very useful if the user is in a country whose language is not the language used in the notice.
However, it is important to note that a wrongly translated cooking recipe technically has less impact
than a mistranslated medicine notice for children.

For each genre, different text extracts were selected, as shown in Table 1. Three texts were chosen
for News (585 words) and for Medicine (635 words), while six texts were selected for Recipes (615
words) because the individual texts were shorter. Please note that the corpus was kept relatively
small because each text was translated by two different machine translators.

                          Table 1: Number of words used for the text corpus

                              TOPIC           TEXT      NUMBER OF WORDS
                             Recipes            1              49
                                                2              82
                                                3             221
                                                4             126
                                                5              57
                                                6              80
                                             TOTAL            615
                               News             1             267
                                                2             197
                                                3             121
                                             TOTAL            585
                             Medicine           1             247
                                                2             104
                                                3             284
                                             TOTAL            635
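
The totals in Table 1 and the overall corpus size of 1,835 words can be checked with a short script. This is only an illustrative sketch: the word counts are copied from Table 1, and the variable names are assumptions introduced here, not part of the original study.

```python
# Word counts per text, copied from Table 1 (names are illustrative).
corpus = {
    "Recipes": [49, 82, 221, 126, 57, 80],
    "News": [267, 197, 121],
    "Medicine": [247, 104, 284],
}

# Sum the word counts of each genre, then of the whole corpus.
genre_totals = {genre: sum(counts) for genre, counts in corpus.items()}
corpus_total = sum(genre_totals.values())

for genre, total in genre_totals.items():
    print(f"{genre}: {total} words")
print(f"Corpus: {corpus_total} words")
```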

All the source texts were modified according to Vivien’s (2013) controlled language template. This
approach has been shown to reduce the number of translation mistakes and was chosen for the present
study so that the machine translators would be as reliable as possible.

Each text was then translated by two machine translators to compare which one was more reliable
and faithful to the source text. Google Translate and Baidu Fanyi were chosen because the former is
the most used machine translator in the world (Turovsky, 2016) and the latter comes from a Chinese
company that is currently the leader of the online search market in China (Incitez China, 2015). Then,
following Görög’s (2014) approach to rating a translation, a Chinese assistant helped to rate the
translated texts from one to five for their reliability in terms of the meaning of the translation:
         1.   The text makes no sense
         2.   A large portion of the text is incomprehensible
         3.   Some elements are wrong or missing and the meaning of the text is hard to understand
              (some important information is left out)
         4.   Some elements are wrong or missing but the meaning of the text is clear
              (you get the essential information)
         5.   The text was faithfully translated

For my co-workers’ interviews, a list of ten questions, mostly inspired by previous discussions on the
subject, was prepared in advance (see Appendix B). The questions were also largely based on research
by Brazill (2016), who carried out several interviews and built surveys in order to identify and solve
Chinese to English translation problems. Because my French colleague was working at the same site,
it was possible to interview her directly. The interview was recorded in order to explore and analyze
her remarks. For my Chinese colleague, the distance made a direct interview impossible, so she
recorded herself answering the questions.

4. Results
First, the analysis of the translated corpus will be presented, followed by the interview outcomes.

4.1 Translated corpus analysis
Table 2 shows that most of the translations were considered better by a native Chinese speaker when
translated by Google Translate. This implies that the meaning of the translated text was closer to the
original and that less information was lost. Only the translations of Recipe 5 and Recipe 6 were
better when produced by Baidu Fanyi. Because Google Translate produced the better translation for 83%
of the texts (10 out of 12), this suggests that it is typically more reliable than Baidu Fanyi.

                 Table 2: Machine translator which had the best translation results

              TOPIC            TEXT        GOOGLE TRANSLATE                   BAIDU
             Recipes             1                 X
                                 2                 X
                                 3                 X
                                 4                 X
                                 5                                               X
                                 6                                               X
              News               1                 X
                                 2                 X
                                 3                 X
            Medicine             1                 X
                                 2                 X
                                 3                 X
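
The 83% share quoted for Google Translate follows directly from Table 2. The sketch below recomputes it; the list encoding of the table and the variable names are assumptions made for illustration only.

```python
from collections import Counter

# Translator judged best for each of the twelve texts, copied from Table 2.
best_by_text = (
    ["Google Translate"] * 4 + ["Baidu Fanyi"] * 2  # Recipes 1-6
    + ["Google Translate"] * 3                      # News 1-3
    + ["Google Translate"] * 3                      # Medicine 1-3
)

# Count how often each translator was preferred and convert to rounded percentages.
wins = Counter(best_by_text)
shares = {name: round(100 * count / len(best_by_text)) for name, count in wins.items()}

for name, share in shares.items():
    print(f"{name}: best for {wins[name]} of {len(best_by_text)} texts ({share}%)")
```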

Concerning the rating of the texts, the results are quite average, as shown in Table 3. No
translated text scored 1, 2 or 5 out of 5: 50% of the texts scored 3 and 50% scored 4. This
indicates that in all texts, some elements were wrong or missing. In Recipes 1/2/3 and Medicine
1/2/3, the meaning of the translated text is clear enough to get the essential information. In
Recipes 4/5/6 and News 1/2/3, the meaning of the text was hard to understand because important
information was left out. Analyzing the scores by genre shows that Medicine was the best translated,
with an average score of 4 out of 5. Just behind is Recipes with an average of 3.5 out of 5, and in
last place News with 3 out of 5.

                          Table 3: Rating the meaning of translated texts

                                 TOPIC          TEXT     RATING (1 TO 5)
                                Recipes           1             4
                                                  2             4
                                                  3             4
                                                  4             3
                                                  5             3
                                                  6             3
                                  News            1             3
                                                  2             3
                                                  3             3
                               Medicine           1             4
                                                  2             4
                                                  3             4
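
The per-genre averages and the 50/50 split between scores of 3 and 4 can be reproduced from Table 3. This is a minimal sketch under the assumption that all texts in a genre are weighted equally; the variable names are illustrative.

```python
from statistics import mean

# Ratings per text, copied from Table 3 (scale from 1 to 5).
ratings = {
    "Recipes": [4, 4, 4, 3, 3, 3],
    "News": [3, 3, 3],
    "Medicine": [4, 4, 4],
}

# Unweighted average rating per genre.
averages = {genre: mean(scores) for genre, scores in ratings.items()}

# Share of all twelve texts that received each possible score.
all_scores = [score for scores in ratings.values() for score in scores]
share_by_score = {s: all_scores.count(s) / len(all_scores) for s in range(1, 6)}

for genre, average in averages.items():
    print(f"{genre}: {average} out of 5")
```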

4.2 Interviews results
A total of ten questions were answered by the two participants (see Appendix B).

    1. While taking into account that only a few people in the company present in China can
        speak English, do you think it is important to translate contents intended for commercial
        delegates in their native language (in Chinese)? Why?

Both interviewees think that translating content intended for commercial delegates into their native
language is very important. They mentioned that in their company, very few Chinese salespeople
actually speak English. The Chinese product manager said that sometimes the workers will make the
effort to ask someone who speaks English to translate, and sometimes they will just ignore the
document. This situation clearly has an influence on the business. The French project manager
stated that delivering content is not the only important part: making people want to read it and
keeping them engaged are also necessary. If not, the translated texts will not be effective.
Engagement lies in the ease of navigation through the content, and language is a way to smooth this
navigation so that the information is eventually assimilated better. She added that although language
is important, cultural differences are also fundamental. How people perceive a subject differs
greatly from one country to another. She recounted a previous experience in which she was told that
she was being too familiar with German employees although she was speaking the same way as she would
in France. Therefore, the content needs not only to be translated but also to be adapted to Chinese
sales representatives.

    2. By which party should this translation be done? France (with the source language), China
        (with target language), internally (by the services responsible for the documents…) or
        externally (translation agency…)?

Both interviewees think that translations should be done both by a translation agency and internally.
The Chinese product manager mentioned that for translations related to product knowledge,
company employees will do a better job because translation agencies do not know the company’s
products, wording or corporate codes. According to the French project manager, the translation
agency could be very useful for a first pass over the text. This would ease the process by
translating the general sentences, which take the most time. The proofreading would then be done by a
native speaker who knows the company’s specificities. She said that although this solution is the
most complete, it is not the most efficient because of the wording used in the company.
Standardizing the vocabulary and manner of speaking could be more efficient in the long run. This
solution takes longer to carry out but is more relevant and fitting.

    3. Generally, are translations done directly from French language to Chinese, or from French
        to English to Chinese?

The source language chosen for the translation depends on the initial request. The interviewees said
that generally, translations are done from English to Chinese even though the company is French. This
is because the first training module or document created is most of the time the English one.

    4. What do you think is the biggest challenge when you translate English to Chinese?

The manager from China believes that the biggest challenges when translating English to Chinese
are language habits: the grammar differs between the two languages, and long sentences feel more
complicated in Chinese than in English. The French interviewee mentioned that the biggest challenge
is replacing one language with the other in a document by matching up the corresponding segments.
The problem is that she usually sends files in a language that she masters but receives the translated
files in a language that she does not understand. Thus, proofreading is mandatory after replacing the
source text with the translated text in the training module. Moreover, this situation causes some
layout issues since English and Chinese are written in completely different ways. Indeed, Chinese
sentences written in characters take less space and can even be written from top to bottom.

    5. Some words are quite difficult/impossible to translate (for ex: product name, brand…) In
        those cases, who decides about the final name/translation?

There is also the issue of some words being impossible to translate into Chinese, like product names
or brands. In this case, the company will translate them based on the pronunciation. According to the
Chinese manager, they sometimes just choose a nice name which sounds good in Chinese and is not
necessarily related to the English name. The French manager added that the global marketing
director is usually the one who has the final say. He makes his decision with each country manager
because there are specificities in each country and market.

    6. To your knowledge, were there any cases of mistranslation? What was the impact?

To my interviewees’ knowledge, there have been several cases of mistranslation in the
company. According to the Chinese product manager, such mistranslations are usually not serious
mistakes. The employees can still understand the meaning of the text, but it can make people laugh
and deliver its message less effectively. There were of course mistranslations which resulted
in simple misunderstandings and, in the most serious cases, employees understood the text in a
completely different way than what was intended. The French project manager related another
experience, in which the training team tried to produce an e-learning module intended for
the company’s Chinese sales representatives. The final document was never delivered. The training
team had received a text translated by a translation agency, but no one could cross-reference it
with the source text. China did not have the necessary resources to do it and France did not have
the knowledge of Chinese needed to finish it. She added that it was not a lack of vision or
willingness but a lack of resources. Another case was mentioned where a poor-quality German
translation was delivered even though it had been proofread by a native speaker. Although the
training was important and necessary for the German team, the number of learners who participated
online was unusually low. Further feedback informed the training department that the translation
was of very low quality despite the proofreading. She concluded that within the same language,
nuances and regional intricacies can have a big impact on the comprehension, attendance and
efficiency of a training course.

    7. Do you think that some translation mistakes due to the fact that the translator did not
        speak a perfect English or Chinese have a lot of importance?

The Chinese product manager answered that even a translator who is a native speaker can make
mistakes. She related the example of a former Chinese product manager who could speak both
languages very well but did not have any knowledge of the target fields, which were chemistry and
physics. Therefore, speaking perfect Chinese or English is not the only thing that matters. The
project manager from France added that it is not a problem of vocabulary but more a problem of
culture and phrasing. In her opinion, a translation agency will translate word-for-word in order to stay
as close as possible to the text. She added that it is a pity to follow the original structure of the
sentence because this prevents the use of everyday idioms and phrasing. Even if the
translation is right, the sentence is wrong with regard to acculturation. This is what machine
translators are actually lacking. She also mentioned the DeepL online translator, which she thinks does
a better job in this respect.

    8. Do you think that if the person making the document (ex: e-learning) knew Chinese, it
         could make the translation process easier?
    9. Do you think that if the person doing the translated text integration in the document knew
         Chinese, it could make the translation process easier?

When asked whether it would help if the person making the document (e-learning) or integrating the
translated text knew Chinese, they both answered that it could make the translation process easier. If
you have the codes of the Chinese language in mind when creating a document in English, the document
can be optimized so that the text is easier to translate. They added that knowing Chinese
would also make the text integration easier and could avoid a situation like the one described
previously, where the training module was never delivered.

    10. What do you think about the use of machine translators in the company?

Finally, both interviewees think that machine translators are not really reliable in a working setting
but could still be useful depending on how they are used. The Chinese manager would not rely on them
for full sentences. She said that they would be useful for translating single words or short phrases. The
French interviewee added that she uses machine translators for the languages that she masters or has
some knowledge of. She uses them when she has a doubt or wants to be reassured about the right word,
but she knows where she wants to go with the text. The mistake would be to use a machine translator
without any knowledge of the language. For her, machine translators can be used if somebody already
has some notions of the language, because this allows critical thinking. Without this knowledge, the
result will be an approximate translation that makes neither the reading flow nor the learning easier.
She concluded that machine translators make sense only when the user’s language proficiency is
sufficient.

5. Discussion
5.1 Reliability is a matter of content
It was no surprise that Google Translate had the best translation results. In a non-scientific article
written in 2018, a test was conducted in order to see which machine translator had the best quality
output (Fu, 2018). The results showed that Google Translate performed well at conveying the main
idea of a text. In this article, the machine translators were classified by tiers. While Google Translate
was listed as a first-tier machine translator, Baidu was listed as second tier. Another fact to be noted
is that the official translation used by the Chinese government was very similar to the translation
from Google Translate. The author supposes that they may have used Google Translate as a
reference. To this day, no scientific study comparing machine translators for the English-Chinese
language pair has been found. Therefore, it would be interesting to see which machine translator is
the most reliable.

In another study in 2016, the authors analyzed whether Neural Machine Translation (NMT) was really
better than Statistical Machine Translation (SMT) in terms of quality (Junczys-Dowmunt et al., 2016).
For all language pairs involving Chinese, NMT always had better results than SMT, and by a wide
margin (see Figure 1).

             Figure 1: Comparison between NMT and SMT systems for 6 language pairs

  Source: Junczys-Dowmunt, M., Dwojak, T., & Hoang, H. (2016). Is Neural Machine Translation Ready for Deployment?
                                    A Case Study on 30 Translation Directions.

Therefore, we can see that the NMT system is better adapted to Asian languages than the SMT
system.

One of my hypotheses was that some genres of text would be more faithfully translated than others. I
thought that Recipes would have the best translations, then News and finally Medicine. The reasoning
was that Recipes did not contain very difficult words to translate, that News would contain some
mistakes mainly because of names that would be hard to translate into Chinese, and that Medicine
would suffer from its difficult lexical field. However, this was not the case. When we study the reasons
why the results were not exactly what was expected, we can note several points. The medicine leaflets
were quite easy to follow and the instructions were straight to the point. Therefore, the
translation process was easier for the machine translator. One of the hardest points could have been
the translation of disease and medicine names, but they were also translated correctly. We did not give
a rating of 5 out of 5 because although the sentences were understandable, they were not always
grammatically correct. Recipes received a similar rating for the same reasons. The most recurrent
mistake was that words were reversed, which often changed the meaning of the whole sentence.
We suppose that there was too much important information in a short sentence. We also assume
that the News texts had the lowest score because of their writing style, which is quite formal and more
difficult to translate than instructions, and because of the names of people or places and the quotations
they contain. Some words were not even translated. Thanks to those results, we can say that a
translation will be more or less reliable depending on how easy the genre is to translate. In addition,
the way the source text is written will have an impact on the translation reliability: for example, if a
company wants to translate a technical datasheet for a product, it should be careful with words like
product or brand names because those kinds of words do not have a match in other languages. The
translation is bound to contain mistakes if a machine translator is used on such a text as it stands. This
is why the author of the document should think about the content and how it will be translated. An
example would be to use standard wording and avoid cultural references or slang because they do not
translate very well into other languages and cultures (Rimalower, 2009). Therefore, it would greatly
help if the author of the document had knowledge of the target language, which is often not the case.
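
The advice on product and brand names can be illustrated with a short sketch. It is only an illustration under stated assumptions, not a method used in this study: the glossary entries are invented, the helper names `protect_terms` and `restore_terms` are hypothetical, and the sketch assumes that the machine translator leaves the placeholder tokens untouched, which would have to be checked for any given tool.

```python
import re

# Hypothetical glossary (invented for illustration): source term -> name approved
# by the company for the Chinese market.
GLOSSARY = {
    "AquaShield": "AquaShield",       # assumed decision: brand kept in Latin script
    "SuperGrip 200": "超级抓力 200",    # assumed approved Chinese product name
}

def protect_terms(text, glossary):
    """Replace each glossary term found in the text with a placeholder token."""
    mapping = {}
    # Longest terms first, so that a short term never splits a longer one.
    for index, term in enumerate(sorted(glossary, key=len, reverse=True)):
        token = f"__TERM{index}__"
        pattern = re.compile(re.escape(term))
        if pattern.search(text):
            text = pattern.sub(token, text)
            mapping[token] = glossary[term]
    return text, mapping

def restore_terms(text, mapping):
    """Put the approved target-language names back in place of the placeholders."""
    for token, target in mapping.items():
        text = text.replace(token, target)
    return text

source = "Apply SuperGrip 200 after AquaShield."
protected, mapping = protect_terms(source, GLOSSARY)
# 'protected' is what would be sent to the machine translator; after translation,
# restore_terms() would be called on the translated text.
print(protected)
```

The point of the design is that the decision about each name stays with the company (for instance the global marketing director and the country manager mentioned in the interviews) and is never left to the machine translator.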

Another surprising fact was that no text was rated 1 or 2. This suggests that machine translation is
useful when you want to know what a text is about. However, no text was rated 5 out of 5.
Actually, we can ask ourselves whether 4 could be the highest grade we can give to a machine translator.
This could be the case because to this day, human post-editing is still necessary to obtain an error-free
and fluent-sounding translation. Because of this, a translation rated 4 would in practice be no
different from a translation rated 5. While the meaning will be clear, there will still be work to do on
the style, grammar or even expressions. Even if the text was faithfully translated by a machine
translator, it would still be far from perfect.

5.2 A perfect translation
Even if a translation is really faithful to the source text, it does not necessarily mean that it is perfect.
In part 4.2, we saw that acculturation is very important for engagement, that is, for giving people the
desire to look at the content we write. Researchers explain that literal translation was considered
an ideal at certain times (Lilova, 1987; Jin, 1997). However, this concept has changed a lot along
with the development of translation. Lilova states that “Translation is not just an activity of
reproduction but is one of creation”. The fact that the effect is more important than matching all the
words is well known but not necessarily applied. In the interview, the project manager mentioned
that some of the translations she received from an agency were too similar to the original text and
did not suit the target language. This result is comparable to what a machine translator could give.
The difference is that with a professional translator, we are sure that the meaning is right even if it is
not well adapted. An interesting non-scientific article was found about whether a perfect translation
can really exist. The author wonders if translations are a matter of taste and if they should be treated
subjectively (Rourke, 2015). A translation can be good for one native speaker and completely wrong
for another, which was a problem identified through the interviews. To solve it, we could either inquire
in detail about the target audience and adapt the translation perfectly, or choose a more general
translation that is understood by the majority. The first solution is definitely the best regarding quality
and potential engagement. However, when several different audiences need a translation into the same
language, this would be too difficult and expensive to do. For example, if we had to translate a
text into Chinese, we should take into account that the main spoken language is Mandarin. However,
if the Chinese headquarters are in Shanghai, translating the text into the Shanghainese dialect could
give better results because using local expressions would raise involvement. It is also possible that
people working in Shanghai come from many different places and do not necessarily speak
this dialect. Therefore, the only viable solution for businesses would be to follow the majority by
using the most widely used language. Being aware of language nuances is good but it raises several
challenges for machine translation.

Translation is more a problem of culture and phrasing than of vocabulary. Even if we want to
convey the same thing, we have to be careful about which expression or tone to use. In 4.2, we saw an
example in which the project manager communicated with the German and the French sales forces
with the same degree of familiarity. However, she received feedback that she was being too familiar
with the Germans. This is due to the fact that German culture is considered a low-context culture
(Meyer, 2014). This term was first introduced by Edward Hall (1976). According to him, low-context
cultures have a specific behavior. In order to communicate effectively, messages should be explicit.
Privacy and a certain distance between people are also essential. All of those cultural aspects
have to be taken into account when translating and localizing.

We saw that the effect is the most important part of a perfect translation. However, machine
translators are known for staying as close as possible to the source text. The French interviewee
mentioned that the free machine translator DeepL actually gives better translations because it
matches expressions instead of translating word-for-word. Several recent studies found that Google
Translate produced a literal translation while DeepL seemed more fluent and nuanced (Coldewey &
Lardinois, 2017; Isabelle & Kuhn, 2018). However, it is still not possible to test the English-Chinese
language pair as it is not available on the platform. In addition to this, we have to mind language
habits. As noted in 5.1, translating expressions or slang can be challenging. For example, the
Chinese expression 加油 (jiāyóu) literally means “fill a tank with petrol”. However, this
expression is used to encourage somebody and could be translated in many ways, such as “do your
best”, “hang in there” or “good luck”. Machine translators have difficulty translating those kinds of
expressions, as seen in the following picture.

                 Picture 1: Difficult Chinese-to-English translation of the word 加油

                                          Source: Google Translate

Another point to be noted is that when we translate into Chinese, we have to think about the future
document layout. Writing in characters is very different from writing in letters. When making an
e-learning module, creating the layout is a big part of the work. Therefore, we have to think beforehand
about the final result and whether it will be suitable for the target language. This is where machine
translation can be useful, as people can get an idea of what the text will look like even though the
content will not be right.
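
As a rough illustration of this layout check, the sketch below estimates how many columns a piece of text occupies when Chinese characters are counted as twice the width of Latin letters. It is a minimal sketch and not a tool used in this study: the function names `display_width` and `fits` are hypothetical, and the double-width assumption only holds for monospace-like rendering, so the real result still depends on the font and the e-learning authoring tool.

```python
import unicodedata

def display_width(text):
    """Approximate width in columns: wide or fullwidth characters count as 2."""
    return sum(
        2 if unicodedata.east_asian_width(char) in ("W", "F") else 1
        for char in text
    )

def fits(text, max_columns):
    """Tell whether the text fits in a layout slot of max_columns columns."""
    return display_width(text) <= max_columns

# The English button label needs 9 columns, the Chinese one only 4.
for label in ("Good luck", "加油"):
    print(label, display_width(label), fits(label, 8))
```

Running such a check on a machine-translated draft gives an early idea of where text boxes will be too large or too small, even when the translation itself still has to be replaced by a proofread version.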

Because of all of this, it is hard to say what would be a perfect translation. We can only conclude that
there would be as many perfect translations as people and situations.

5.3 The right situations to use machine translation
As said in 5.1, a translation rated 4 is actually good enough for a machine translator because
post-editing and proofreading phases are still necessary. The results are therefore similar to those of
Roig Allué (2017). She concluded that although the meaning of the translated texts is sometimes
understandable, misunderstandings can happen frequently, and this makes machine
translators unreliable tools. Even though we cannot entirely rely on machine translators, this does not
mean that they are not useful in a working environment.

Machine translators are useful if people can assess the translation quality. This means that people
need knowledge of the languages and their codes in order to see if the meaning is right or not. If
somebody does not master the language or does not have sufficient skills, it is recommended to use a
translation agency because it can deliver a reliable document. Indeed, the translation process in
agencies is well supervised and carried out by experts, and the translation is sent after one or several
rounds of proofreading. However, choosing the right translator is also important: the translator has to
be an expert in the field. Consider asking for a fluent-sounding translation that could help raise
engagement, as it is acceptable not to have a word-for-word result. You can also provide a glossary of
the company’s wording and a style guide with all the company’s conventions for documents in order
to obtain the best possible translation with the fewest subsequent modifications.

Machine translators can also be helpful for cross-referencing the source text and the translated text in
a document. Even if the person responsible for this task does not have any knowledge of the
languages used, machine translators make it possible to check whether the meaning is at least similar,
which is enough for this situation. However, this requires several mandatory rounds of proofreading
to be sure that the cross-referencing was done correctly.

In the situation where a collaborator cannot read English and a Chinese translation is not available,
machine translators can be used to get the main ideas of the text. However, it is important to be
careful with this method. As machine translators are considered unreliable, it is necessary to confirm
the right meaning with other people. If the user knows the source language, the best solution to
avoid as many mistakes as possible is to work on the source text. It has indeed been shown that using
controlled languages gives better translation results.

Using a machine translator in order to check single words or to translate general sentences is also
conceivable. However, people have to be careful to avoid elements likely to cause errors, like specific
wording, slang or expressions. It can also help with the document layout during the creation phase.
The author can see what the result would look like and adapt the document early to simplify the
text integration.

6. Conclusion
The purpose of this study was to evaluate whether machine translators could be used in a
work environment. It was also designed to see to what extent we can trust them for the English-Chinese
language pair. The main finding of this investigation was that Google Translate did a better job than
expected. Indeed, the quality of the translations regarding the meaning was higher than average for
half of the texts. This means that the main ideas and information were comprehensible. However, for
the other half of the texts, some important details were lost in translation, which compromised
the quality of Google Translate for this language pair. What came out of this was that machine
translators should be used only if the user’s knowledge of the source and target languages is high
enough to assess whether the meaning of the translation is correct. Overall, this study confirms that
post-editing and proofreading are always necessary. Having the opportunity to interview both a Chinese
product manager and a French project manager who deal with translations daily was also a good way to
identify situations where machine translators would help rather than be harmful. Those findings
support the idea that machine translators are suited for supporting human translation, translating
words or short sentences, inserting the translated text into the document without knowledge
of the language, previewing what a text could look like in a target language (mostly
for the possible layout result) or even helping somebody who does not understand the text to grasp
most of its general meaning. We still have to note that this last solution is far from ideal as
information can be lost.

Before this study, evidence of Google Translate’s accuracy for translating English to Chinese texts was
purely anecdotal. Several authors have compared the output quality of machine translators
or assessed whether machine translation was accurate, but never in a scientific article on the English
to Chinese pair (Fu, 2018; Isabelle & Kuhn, 2018). This study provides an actual analysis of a
translated corpus and concrete evidence of how reliable Google Translate is. The main weakness
of this study is that analyzing a larger amount of text would have given more specific results. Half of
the texts were rated 3 while the other half were rated 4 out of 5. Analyzing additional texts might
have tipped the balance and shown which rating is the most frequent. Moreover,
since my Chinese is not good enough to carry out this study by myself, I cannot be sure that
my Chinese colleague rated the texts in line with my own understanding of them. We discussed the
ratings together in order to be as accurate as possible, and the rating scale we used was as objective as
possible. More interviews would also have been a good way to collect various points of view and raise
new questions about machine translators.

Despite its exploratory nature, this study adds to
our understanding of how English to Chinese machine translation has developed over the years.
We can now see that the idea that machine translators, when translating English to Chinese in
particular, produce complete nonsense is a misconception. Machine translation can clearly be useful in
business and solve problems related to translation, provided that caution is always exercised when
dealing with these tools. This study should be repeated using a larger text corpus. This would give
more representative figures and show whether the most frequent rating is 3 or 4 out of 5. Further
research could also be conducted to determine which machine translator is the most accurate and
sounds the most fluent for this language pair. One point mentioned in this study was that DeepL
produced translations of better quality. Because studies of the languages it already supports have
reported good results (Coldewey & Lardinois, 2017; Isabelle & Kuhn, 2018), this machine translator is
worth watching. Once the Chinese language is implemented, it would be very interesting to
compare Google Translate and DeepL to see which gives the most accurate and fluent
translations.

7. References
Boitet, C. (2008). La Traduction Automatique : ça marche ou non ?.

Brazill, S. (2016). Chinese to English Translation: Identifying Problems and Providing Solutions.
Graduate Theses & Non-Theses, 71.

Incitez China. (2015). China Search Engine Market Overview in 2014. Retrieved September 10, 2018,
from https://www.chinainternetwatch.com/12678/search-engine-market-overview-2014/

Coldewey, D., & Lardinois, F. (2017). DeepL schools other online translators with clever machine
learning. Retrieved September 10, 2018, from
https://techcrunch.com/2017/08/29/deepl-schools-other-online-translators-with-clever-machine-learning/?guccounter=1

Crystal, D. (2008). Two thousand million?. English Today, 24(01), 3-6. doi:
10.1017/s0266078408000023

Doherty, S. (2016). The Impact of Translation Technologies on the Process and Product of
Translation. International Journal Of Communication, 10, 947-969.

Farret, J. (2015). Machine Translation: how to avoid errors?.

Fennig, C., & Simmons, G. (2018). Ethnologue: Languages of the World, Twenty-first edition. Dallas,
Texas: SIL International. Retrieved September 10, 2018, from http://www.ethnologue.com

Fu, Y. (2018). Who Offers the Best Chinese-English Machine Translation? A Comparison of Google,
Microsoft Bing, Baidu, Tencent, Sogou, and NetEase Youdao. Retrieved September 10,
2018, from https://yiqinfu.github.io/posts/machine-translation-chinese-english-june-2018/

Görög, A. (2014). Evaluating quality in translation [Ebook]. Retrieved September 10, 2018, from
https://www.multilingual.com/article/201412-22.pdf

Hall, E. (1976). Beyond culture.

Hutchins, W. (1995). Machine Translation. In S. Chan & D. Pollard, An Encyclopedia of Translation (pp.
591-602).

Isabelle, P., & Kuhn, R. (2018). A Challenge Set for French -> English Machine Translation. doi:
arXiv:1806.02725v2

Jin, D. (1997). What is a perfect translation?. Babel Revue Internationale De La Traduction /
International Journal Of Translation, 43(3), 267-272. doi: 10.1075/babel.43.3.06jin

Junczys-Dowmunt, M., Dwojak, T., & Hoang, H. (2016). Is Neural Machine Translation Ready for
Deployment? A Case Study on 30 Translation Directions. doi: arXiv:1610.01108

Koehn, P. (2014). Statistical machine translation (pp. 4-7). Cambridge: Cambridge University Press.

Li, A. (2015). Machines, Lost In Translation: The Dream Of Universal Understanding.

Lilova, A. (1987). The perfect translation - Ideal and reality. In M. Gaddis Rose, Translation Excellence:
Assessment, Achievement, Maintenance (pp. 9-18).

MacKenzie, E. (2014). The dangers of machine translation. Retrieved September 10, 2018, from
http://blog.webcertain.com/the-dangers-of-machine-translation/09/07/2014/

Meyer, E. (2014). The Culture Map: Breaking Through the Invisible Boundaries of Global Business.
New York, NY: PublicAffairs.

Rimalower, G. (2009). Tips for Writing a Document Destined for Translation. Intercom (pp. 21-22).

Roig Allué, B. (2017). The Reliability and Limitations of Google Translate: A Bilingual, Bidirectional and
Genre-Based Evaluation.

Rourke, J. (2015). Does the perfect translation exist?. Retrieved September 10, 2018, from
https://silvertonguetranslations.com/perfect-translation/

Turovsky, B. (2016). Found in translation: More accurate, fluent sentences in Google Translate.
Retrieved September 10, 2018, from
https://blog.google/products/translate/found-translation-more-accurate-fluent-sentences-google-translate/

Vélez, F. (2016). Antes de Babel (pp. 3-21). Granada: Comares.

Vilar, D., Xu, J., D'Haro, L., & Ney, H. (2006). Error analysis of statistical machine translation output.
Proceedings Of LREC, 697-702.

Vivien, J. (2013). A Loosely-Defined Controlled Language Template can help the English-to-French
Machine Translation of Non-Technical Texts.

8. Appendices
Appendix A: corpus of text used for the translation analysis
TOPIC      TEXT   URL                                                                          RATING (1 TO 5)
Recipes    1      https://www.allrecipes.com/recipe/260527/maple-bacon-crepe-stack/
           2      https://www.allrecipes.com/recipe/241528/fudgy-nutella-mug-cake/
           3      https://www.allrecipes.com/recipe/23600/worlds-best-lasagna/
           4      https://www.allrecipes.com/recipe/93234/honey-walnut-shrimp/
           5      https://www.allrecipes.com/recipe/14746/mushroom-pork-chops/
           6      https://www.allrecipes.com/recipe/21313/banana-pudding-iii/
News       1      https://www.bbc.com/news/world-europe-44531448
           2      https://www.bbc.com/news/uk-politics-44532500
           3      https://www.bbc.com/news/world-us-canada-44538110
Medicine   1      https://www.medicinesforchildren.org.uk/amoxicillin-bacterial-infections-0
           2      https://www.medicinesforchildren.org.uk/metformin-diabetes
           3      https://www.medicinesforchildren.org.uk/topiramate-preventing-seizures

Appendix B: Questions asked during the interviews

 1. While taking into account that only a few people in the company present in China can
    speak English, do you think it is important to translate contents intended for
    commercial delegates in their native language (in Chinese)? Why?

 2. By which party should this translation be done? France (with the source language),
    China (with target language), internally (by the services responsible for the
    documents…) or externally (translation agency…)?

 3. Generally, are translations done directly from French language to Chinese, or from
    French to English to Chinese?

 4. What do you think is the biggest challenge when you translate English to Chinese?

 5. Some words are quite difficult/impossible to translate (for ex: product name, brand…)
    In those cases, who decides about the final name/translation?

 6. To your knowledge, were there any cases of mistranslation? What were the impacts?

 7. Do you think that some translation mistakes due to the fact that the translator did not
    speak a perfect English or Chinese have a lot of importance?

 8. Do you think that if the person making the document (ex: e-learning) knew Chinese, it
    could make the translation process easier?

 9. Do you think that if the person doing the translated text integration in the document
    knew Chinese, it could make the translation process easier?

 10. What do you think about the use of machine translators in the company?
