Participant Tracking in Text Unfolding: Insights from Portuguese-Chinese Translation and Post-Editing Task Logs


AuTema-PostEd Group

Ana Luísa V. Leal (1), Márcia Schmaltz (1), Derek Wong (1), Lidia Chao (1),
James TL Wang (1), Adriana Pagano (2), Fabio Alves (2),
Igor da Silva (3), Paulo Quaresma (4)

(1) University of Macau (UM)
(2) Federal University of Minas Gerais (UFMG)
(3) Federal University of Uberlândia (UFU)
(4) University of Evora (UE)
Outline

• Research Context and Aims

• Research Assumptions and Questions

• Experimental Design

• Methodology of Analysis

• Preliminary Results

• Discussion

• Next Steps
Research Context and Aims

Joint project between the Department of Portuguese, University of
Macau, and LETRA (Laboratory for Experimentation in Translation),
Federal University of Minas Gerais, to investigate the process and
final output of human translation as compared with human
post-editing of texts machine-translated by the Portuguese-Chinese
Translation System (PCT).
Research Assumptions and Questions

• Identity cohesive chains responsible for co-reference and participant
 tracking are crucial to construing a coherent representation of a text.
 (Halliday & Hasan 1976, Halliday 1989, Halliday & Matthiessen 2004)

         Do translation process data show evidence of the role of
         identity chains as exerting cognitive demands upon task
         executors?

• Translating demands more cognitive effort than post-editing (Carl et al
 2011)

         Do data in our study confirm impact of task type on effort?
Translation Process Research

•   Eye-Mind Assumption (Just & Carpenter 1980)

•   User Activity Data (Carl & Jakobsen 2009, Carl 2012b)

•   (Un)challenging translation (Carl & Dragsted 2012)
    • More / less time-consuming
    • Looking fewer / more words ahead into the ST
    • More / less keyboard activity (insertions, deletions)

•   Long pauses, regressive saccades and refixation on words already
    read (Carl & Jakobsen 2009)
•   Integrating quantitative and qualitative analysis to have a full picture
    of the translation process (Alves & Gonçalves 2013, O’Brien 2006)
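
To make the notion of "long pauses" concrete, the sketch below splits a keystroke log into bursts of typing separated by pauses. It is only an illustration: the function name `split_on_pauses`, the toy timestamps, and the default 1000 ms threshold are assumptions made for this example, not the procedure used in the study.

```python
def split_on_pauses(timestamps_ms, max_gap_ms=1000):
    """Group keystroke timestamps into bursts separated by long pauses.

    A new burst starts whenever the gap to the previous keystroke
    exceeds max_gap_ms (an assumed threshold, not taken from the study).
    """
    bursts, current = [], []
    for t in timestamps_ms:
        if current and t - current[-1] > max_gap_ms:
            bursts.append(current)
            current = []
        current.append(t)
    if current:
        bursts.append(current)
    return bursts


# Toy keystroke times in milliseconds (invented for illustration).
keystrokes = [0, 200, 450, 2000, 2100, 5000]
bursts = split_on_pauses(keystrokes)
```

With these toy values the log is split into three bursts, because two of the gaps exceed the threshold.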
Experimental Design

• Materials
   • Translog-II (Carl 2012a)
   • Eye tracker Tobii T120
   • Tobii Studio 3.2.1

• Participants
   • 12 translators, L1 Chinese, 23-32 years old, BA Portuguese
      Studies, < 1 year experience, glasses or contact lenses

• Setting and conditions
   • Eye-tracking lab at University of Macau
   • No time pressure
Experimental Design

• Input
    • 4 texts: ca. 80 words/word-equivalents – news reports
    • 2 Chinese source texts
    • 2 Portuguese source texts [Focus on Text 2]
    • MT output provided by PCT (Portuguese-Chinese Translator)
      (Wong et al 2012)
    • Randomized for both subjects and tasks

• Tasks
    • 1 L1 translation
    • 1 L2 translation
    • 1 post-editing into L1
    • 1 post-editing into L2
    • Recall Protocols
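
One way to picture the randomization of texts and tasks is sketched below. The text labels (`PT1`, `ZH1`, ...), the participant IDs, and the pairing rule are hypothetical. The sketch assumes that, since participants are L1 Chinese, tasks into L1 start from a Portuguese source text and tasks into L2 start from a Chinese one. The slides do not describe the actual randomization procedure.

```python
import random

# Hypothetical labels for the 2 Portuguese and 2 Chinese source texts.
TEXTS = {"PT": ["PT1", "PT2"], "ZH": ["ZH1", "ZH2"]}

# Assumption: work into L1 (Chinese) uses a Portuguese source text,
# work into L2 (Portuguese) uses a Chinese source text.
TASKS = {
    "PT": ["translation into L1", "post-editing into L1"],
    "ZH": ["translation into L2", "post-editing into L2"],
}


def assign_session(rng):
    """Return a randomly ordered list of (task, text) pairs for one participant."""
    pairs = []
    for lang in ("PT", "ZH"):
        texts = TEXTS[lang][:]
        rng.shuffle(texts)              # which text is paired with which task
        pairs.extend(zip(TASKS[lang], texts))
    rng.shuffle(pairs)                  # order in which the four tasks are done
    return pairs


def build_schedule(n_participants=12, seed=0):
    """Build one randomized session per participant (IDs are made up)."""
    rng = random.Random(seed)
    return {f"P{i:02d}": assign_session(rng) for i in range(1, n_participants + 1)}


schedule = build_schedule()
```

Every participant gets each task and each text exactly once, while both the text-task pairing and the task order vary across participants.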
Input
                                  ST Identity Cohesive Chains
Os brasileiros estão em lua-de-mel com o mundo.
The Brazilians are in honeymoon with the world.

A Petrobras e sua presidente estão entre as maiores do mundo.
Petrobras and their/its president are among the greatest of the world.

Conforme uma revista americana, a presidente é uma das 100 maiores lideranças mundiais e a petrolífera
estatal é uma das 10 maiores companhias do planeta.
According to an American magazine, the president is one of the 100 greatest world leaders, and the state-
owned oil company is one of the 10 greatest companies in the planet.

A revista classifica as empresas com base em diversos indicadores além do lucro.
The magazine classifies the companies building on several indicators besides profit.

É por isso que a brasileira surge em posição de liderança na classificação.
That’s why the Brazilian [company/president] emerges in leadership position in the classification.

[Elipse] Está inclusive à frente de grandes como a Apple.
[The company/Brazil/The president] are even beating big [ones/companies/countries/presidents], such as
Apple.
Methodology of Analysis

• User Activity Data
   • FU ≤ 400ms, PU ≤ 1000ms
   • Source Tokens (ST)
   • Target Tokens (TT)
       Extracted: gaze time, fixation number, insertions, deletions,
       editions, first pass fixation time (FD), STid, TTid.

• Statistical studies
    • Comprehension of ST, TT and Production
    • Subset studies with the 6 identity references (Principal,
      Secondary, Others)
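
The gaze measures listed above can be derived from a chronological list of fixations mapped to token IDs. The sketch below uses a common definition of first pass fixation time: the sum of fixation durations from first entering a token until first leaving it. The input format and the function name `fixation_measures` are assumptions for illustration, and the study's own extraction from the Translog-II logs may differ.

```python
from collections import defaultdict


def fixation_measures(fixations):
    """Aggregate per-token gaze measures.

    fixations: list of (token_id, duration_ms) in chronological order
    (a hypothetical input format chosen for this sketch).
    """
    measures = defaultdict(
        lambda: {"total_fixation_time": 0, "fixation_number": 0, "first_pass_time": 0}
    )
    left = set()   # tokens whose first pass has already ended
    prev = None
    for token_id, duration in fixations:
        if prev is not None and prev != token_id:
            left.add(prev)                      # gaze moved away from prev
        m = measures[token_id]
        m["total_fixation_time"] += duration
        m["fixation_number"] += 1
        if token_id not in left:                # still within the first pass
            m["first_pass_time"] += duration
        prev = token_id
    return dict(measures)


# Toy data (invented): token 1 is refixated after the reader moves on to token 2.
toy = [(1, 200), (1, 150), (2, 300), (1, 100), (3, 250), (2, 50)]
result = fixation_measures(toy)
```

In the toy data, the later refixation on token 1 adds to its total fixation time and fixation number but not to its first pass time.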
Methodology of Analysis

• Quantitative
    • Linear Mixed-Effects Regression Model (LMER)
         lmerTest running on the statistical tool R (3.0.2) (Baayen
          2008, Balling 2008, 2013, Sjorup 2012)
    • p
Methodology of Analysis

Variables             Comprehension                          Production
Dependent             Total Fixation Time                    Production time
                      Total Fixation Number
                      First Pass Fixation Time

Independent / Random  AOIs, Participant                      AOIs, Participant

Independent / Fixed   Length                                 Character count
                      Position                               Position
                      Frequency (corpus: Banco de Português) Rendition (Yes or No)
                      Trigram Probability                    Frequency (corpus: CCL/UPK)
                      Task (Translation or Post-editing)     Trigram Probability
                      Type (Reference or Comparative)        Task (Translation or Post-editing)
                                                             Type (Reference or Comparative)
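
For orientation, the comprehension model implied by the variables above could be written roughly as follows. This is a sketch under assumptions: it uses random intercepts only for Participant and AOI, no interactions, and no transformation of the dependent variable. The slides do not state the exact model specification, and the symbols are introduced here for illustration.

```latex
% y_{pa}: a comprehension measure (e.g. total fixation time)
%         for participant p on area of interest (AOI) a
\begin{align*}
y_{pa} &= \beta_0
        + \beta_1\,\mathrm{Length}_a
        + \beta_2\,\mathrm{Position}_a
        + \beta_3\,\mathrm{Frequency}_a
        + \beta_4\,\mathrm{Trigram}_a \\
       &\quad + \beta_5\,\mathrm{Task}_{pa}
        + \beta_6\,\mathrm{Type}_a
        + u_p + v_a + \varepsilon_{pa}, \\
u_p &\sim \mathcal{N}(0, \sigma_u^2), \qquad
v_a \sim \mathcal{N}(0, \sigma_v^2), \qquad
\varepsilon_{pa} \sim \mathcal{N}(0, \sigma^2)
\end{align*}
```

Under the same assumptions, the production model would take Production time as the dependent variable, use Character count in place of Length, and add Rendition as a further fixed effect.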
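Preliminary Results
Source Text Comprehension (Gazing and Fixation)
[Table: regression results for Total Fixation Time, Total Fixation Number and First Pass Fixation Time; only the header row and the (Intercept) label survive]

Target Text Comprehension (Gazing and Fixation)
[Table: regression results for Total Fixation Time, Total Fixation Number and First Pass Fixation Time; only the header row and the (Intercept) label survive]

Target Text Production (Keylogging)
[Table: regression results for Production Time; only the header row and the (Intercept) label survive]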
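ST Identity Chains (Gazing and Fixation)
[Table: regression results for Total Fixation Time, Total Fixation Number and First Pass Fixation Time; only the header row and the (Intercept) label survive]

TT Identity Chains (Gazing and Fixation)
[Table: regression results for Total Fixation Time, Total Fixation Number and First Pass Fixation Time; only the header row and the (Intercept) label survive]

Production Identity Chains (Keylogging)
[Table: regression results for Production Time; only the header row and the (Intercept) label survive]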
Non-Parametric Tests

• ST Reading:
   • Significant differences between reference types for TNF (total
     number of fixations) (0.27) and TFT (total fixation time) (0.46),
     and between subjects for TNF, TFT and FIRST (first pass fixation time)
   • No significant differences between tasks

• TT Reading:
   • Significant differences between subjects for TNF, TFT and FIRST
   • Significant differences between tasks for TNF (0.13) and TFT
     (0.025)
   • No significant differences between reference types
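
The slides do not name the non-parametric tests that were run. As an illustration of the general idea only, the sketch below implements an exact two-sided permutation test on the difference of group means, which makes no distributional assumption. The function name, the choice of test, and the toy numbers are all assumptions. A within-subject design like this one might call for a paired test instead.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_p(a, b):
    """Exact two-sided permutation p-value for the difference of means.

    Enumerates every way of splitting the pooled values into groups of
    the original sizes, so it is only practical for small samples.
    """
    pooled = list(a) + list(b)
    observed = abs(mean(a) - mean(b))
    n, k = len(pooled), len(a)
    total = sum(pooled)
    extreme = count = 0
    for idx in combinations(range(n), k):
        s = sum(pooled[i] for i in idx)
        diff = abs(s / k - (total - s) / (n - k))
        count += 1
        if diff >= observed - 1e-12:    # tolerance for floating-point ties
            extreme += 1
    return extreme / count


# Invented toy values standing in for a measure under two tasks.
translation = [1, 2, 3]
post_editing = [4, 5, 6]
p_value = exact_permutation_p(translation, post_editing)
```

For these toy values only 2 of the 20 possible splits are as extreme as the observed one, so the p-value is 0.1.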
ProgGraph P23
• Some translators take a wrong turn, either by accepting the MT output
 or because of how they understood the ST.
[Figure: progression graph P23. Horizontal axis from 133000 to 143000; vertical axis from 10 to 50. Plotted target characters: 及, 以, 西, 巴, 裁, 统.]
ProgGraph P23
• Rereading and ... “click”!
[Figure: progression graph P23. Horizontal axis from 579500 to 594500; vertical axis from 10 to 80. Plotted target characters: 其, 裁, 总, 统, 总, 西, 巴.]
Discussion
Research question: Do translation process data show
evidence of the role of identity chains as exerting cognitive
demands upon task executors?

• They do with regard to the source text considering only the
 non-parametric tests. The multiple regression tests show no
 significant results considering the set of variables used in the
 experiment.

• Cohesive chains seem to have an impact on comprehension
 when reading the ST – reading for translation involves
 anticipating how chains will have to be dealt with in the TT
 (whether explicitation will be needed in the TT)
Discussion
Research question: Do data in our study confirm impact of task type
on effort?

• No. However, four subjects built identity chains different from those in
 the ST.
     Their inferential path differed from that of the other subjects

• The results of the non-parametric tests shed light on relevant aspects of
 the inferential processing in post-editing and translation
     Translation is TT-driven, but results from a comprehension process
     Tasks differ concerning target text production
     Post-editing demands less cognitive effort for lexical rendition
       (as confirmed by RVP, i.e. fewer character insertions), but
       requires more effort to reorganize structures (as shown in RVP)
Next Steps
• Fine-grained, qualitative analysis of user activity data (UAD) and
  translation progression graphs

• Analysis of the impact of the first task on the second task (Ferreira
  2010)

• Further studies collecting data from native speakers of Portuguese to
  contrast reading effort in cohesive chains.

• Analyses of results for all four texts used in the experiment will permit
  more robust results.
References
ALVES, F., GONÇALVES, J. 2013. “Investigating the Conceptual-procedural Distinction in the Translation Process”. Target 25:1. p. 107-124.
BALLING, L., CARL, M. (to appear). “Production Time Across Language and Tasks: A Large-scale Analysis Using the CRITT Translation
Process Database”. In: Schwieter, J., Ferreira, A. (eds.) The development of Translation Competence: Theories and Methodologies from
Psycholinguistics and Cognitive Science. Cambridge: Cambridge Scholar Publishing.
CARL, M. 2012. ‘Translog-II: a program for Recording User Activity Data for Empirical Reading and Writing Research’, in Proceedings of the
Eighth International Conference on Language Resource and Evaluation. Istanbul 21-27 May 2012, Istanbul: European Language Resources
Association, 4108-4112.
CARL, M., DRAGSTED, B., ELMING, J., HARDT, D. & JAKOBSEN, A. L. 2011. The Process of Post-Editing: a Pilot Study. In B. Sharp, M. Zock,
M. Carl, A.L. Jakobsen (orgs.). Proceedings of the 8th Natural Language Processing and Cognitive Science Workshop. Copenhagen Studies
in Language Series 41. p. 131-142.
HALLIDAY, M. A.K., HASAN, R. 1976. Cohesion in English. London: Longman.
HALLIDAY, M. A.K. 1989. Spoken and Written Language. Oxford: Oxford University Press.
HALLIDAY, M. A.K., MATTHIESSEN, C. 2004. An Introduction to Functional Grammar. London: Arnold.
O'BRIEN, S. 2002. Teaching post-editing: A proposal for course content. In: Teaching Machine Translation - the 6th International Workshop of
the European Association of Machine Translation, Centre for Computational Linguistics, UMIST: Manchester. p. 99-106.
O’BRIEN, S. 2004. Machine translatability and post-editing effort: how do they relate. In: Proceedings of translating and the computer 26.
London: Aslib.
O’BRIEN, S. 2006. Controlled Language and Post-Editing. The Guide From Multilingual. p. 17-19.
O’BRIEN, S. & ALMEIDA, G. 2010. Analysing post-editing performance: correlations with years of translation experience. In: Proceedings of
EAMT 2010: the European Association for Machine Translation, St Raphael, France.
PAVLOVIĆ, N. & JENSEN, K. T. H. 2009. Eye tracking translation directionality. In A. Pym and A. Perekrestenko (eds). Translation Research
Projects 2. Tarragona: Universitat Rovira i Virgili. p.101-119.
SJORUP, A. (2013). Cognitive Effort in Metaphor Translation. PhD Thesis. Copenhagen: Copenhagen Business School.
WONG, D., OLIVEIRA, F., LI, YP. 2012. Hybrid Machine Aided Translation System based on Constraint Synchronous Grammar and
Translation Corresponding Tree. Journal of Computers, 7(2): p. 309-316.
Thank you! Obrigada! 谢谢!