Going beyond Google Translate?
Francesca Chessa
DLS, University of Sassari, Sassari (SS), Italy
fch @ uniss.it

Gavin Brelstaff
CRS4, Loc. Piscina Manna, Ed. 1, 09010 Pula (CA), Italy
gjb @ crs4.it

ABSTRACT
We motivate and describe the design and implementation of a web-based system for the alignment of parallel texts. It builds on the interactive color-highlight interface now deployed at Google Translate. Through a series of simple point-and-click operations, translators can mark up equivalent text-ranges in their own translation and in the original. When successful, the visual cues created by this activity should benefit the understanding of readers with limited degrees of bilingualism, and may also capture aspects of semantic context not readily available to algorithmic SMT. We provide a working demonstration that treats poetic texts.

Categories and Subject Descriptors
H.1.2 [User/Machine Systems] Human factors. H.5.2 [User Interfaces] (D.2.2, H.1.2, I.3.6) Natural language. I.7.2 [Document Preparation] Markup languages. H.5.3 [Group and Organization Interfaces] Web-based interaction.

General Terms
Documentation, Design, Human Factors, Standardization.

Keywords
Multilingual web, Translation, Parallel texts, Semantic context, Intermediate representation, TEI markup, Poetry, Cross-browser.

1. INTRODUCTION
In terms of HCI, computer-assisted translation is still relatively unsophisticated. Typically, statistical machine translation (SMT) is first computed and then presented to the reader as a fait accompli, however inaccurate it might be. Recently, however, SMT web-services such as Google Translate [1] have adopted an interactive color-highlight system by which words or phrases in the source text that correspond to those in the translated text light up as the reader passes the cursor over them, thus benefiting from an existing metaphor refined over the years [2,3,4,5]. Although that service does let readers improve on the translated words via a popup text-box, any misalignment between compounds of words cannot be easily corrected, and can significantly impede semantic interpretation. Misalignment happens whenever SMT indicates a false equivalence between text-ranges in the original and the translation, and is usually a fair sign that SMT could not assimilate adequate context.

Here, we provide an interface that lets the human translator mark up what they consider a correct alignment between words, or groups of words, in the original and their own translation, with a view to articulating context that may not be readily available to SMT. We detail below how the interface runs off a web-page and allows the alignment of equivalent ranges in parallel texts via a simple point-and-click action. Alignments created by the user are instantaneously made visible using a variant of the interactive color-highlight system mentioned above. Key to reducing the complexity of the implementation of the interface is our systematic deployment of open-standard, non-proprietary web technologies. The same ideas might be integrated into a full text editor, but we prefer to deliver an alignment tool directly from the web-page in order to promote web-based collaborative interaction between translators. As such, we progress beyond Germann's Yawat demonstrator [6] to a cross-browser solution facilitated by the jQuery JavaScript library [7].

2. HCI TO COMPLEMENT SMT
Machine Intelligence typically progresses by the development of algorithms that emulate aspects of human perceptual and cognitive activity, where these algorithms process the same data available to humans (e.g. digital texts) and attempt to produce the same or better results. The tendency is towards objective algorithmic optimization that is to be achieved without explicit access to semantic context, in the hope that such elusive context might emerge implicitly as statistical correlations inherent in the computation of the algorithm. This is fine in scenarios where full automation can be achieved, but can be counterproductive otherwise, especially when human intervention is later required in order to first find and then correct a significant percentage of mistaken results, as can happen with SMT. Indeed, it may be preferable to design algorithms to complement acknowledged human skills rather than focus on optimizing existing algorithms to compete with and supersede them. A first step towards designing such algorithms, for SMT, is to establish an intermediate representation that might reasonably articulate semantic context so that it can be readily manipulated by both man and machine, by either cognition or computation (sketched in Figure 1).
Figure 1: A spatio-visual intermediate representation for semantic context

We side-step the innate complexities of a theoretical approach [8,9] and seek, instead, an intermediate representation amongst the elements presented as interactive color highlights in user interfaces like Google Translate. In the first instance, this representation might be pictured as simply drawing a set of labeled boxes around words and phrases in the original and translated texts and then joining lines between those boxes that carry equivalent meaning. Depending on the type of equivalence, the lines might be colored differently (e.g. green: literal; yellow: approximate; or red: paraphrase). This perspective makes it clear that we are dealing with a spatio-visual mapping between parallel texts. Although a diagram of such a mapping should be easy to digest visually, the task of constructing it, in a standard machine-readable format, may be considered beyond the competence of a typical translator, even one furnished with the latest computer graphics tools: the manipulation of intersecting lines quickly becomes overly complex for the non-technical user. Translators, by nature, are familiar with constructing texts rather than graphics. Thus we provide a pared-down graphical user interface (GUI) designed specifically to simplify the mapping task to a series of point-and-click operations occurring on top of the familiar interactive color-highlight system. The translator may therefore conceive of their task as a traditional markup task that operates upon text, not graphics, and which they can do in sequence (focusing on only one equivalence at a time) while periodically reviewing the overall results, either by tracing the cursor along the lines of text or by the other means described later. Beneath the GUI, the intermediate representation is maintained as digital text, not graphics, using the TEI markup language [10]: an open standard format defined in XML, and used extensively within academic text-annotation and archival communities.

3. DESIGN FACTORS
We were guided by both ergonomic and pragmatic factors in the design of the GUI:

1. Text selection is made by point-and-click, and not by the click-and-drag operation ubiquitous in text-editors. This mandates the automatic pre-segmentation (via jQuery) of each text into its constituent semantic atoms: e.g. words in most languages, and ideograms in Chinese. This focuses the user immediately on their task of aligning words between the texts.

2. Any translator is necessarily engaged in a language-based task, and thus we try to keep our interface non-verbal so as not to disturb their cognitive activity, by focusing as far as possible on spatio-visual cues.

3. The default cursor sprite displayed in any web browser is bad for pointing at words while they are being read: either a little white hand, or a small vertical dark bar, obscures letters in the word and disrupts the task. We swap in a little see-through cursor instead, if the browser permits (Opera alone does not).

4. Since the two parallel texts are displayed on screen side by side, the words to be aligned across the texts are almost always in view at the same time. Thus when the user scrolls down in one text it is useful to automatically scroll down the other synchronously, as we have programmed.

5. The fovea of the eye cannot resolve much more than a few words at a time, and reading fluently generally requires a delicate choreography of eye movements involving short-term visual memory. Thus, although our interface seems to pretend we can, we can never truly read two parallel texts simultaneously: it is simply impossible to resolve the letters in two well separated locations at once [11]. Instead, our interface is designed to facilitate a smooth switching of gaze between those two locations. We do this by reinforcing the visibility of the highlighted words, so that when they are not being resolved by the fovea our peripheral vision is better able to direct the next eye movement towards them. Such reinforcement is achieved by the use of an extended semantic highlighting scheme, detailed below.

4. DOMAIN: POETIC TEXTS
We focus on texts considered to be an extreme challenge for SMT at Google: poetry [12]. Our intention is to express elusive aspects of semantic communication in order to differentiate those that can be spatio-visually articulated from others that cannot.
Any translator committed to providing a definitive version of a poem eventually arrives at an irreversible order of words, and may actually wish to document their choices by justifying their correspondence to the original. They may deviate from literal correspondence for many good reasons, seldom due to a wish to mystify or add artifice. To convey the thought expressed in a source text while judiciously ignoring literalness, word order, or grammatical voice is to obtain what Nida terms dynamic equivalence [13]. Such deviation from literalness is also essential simply to reestablish, in the translation, an esteem equivalent to that attained by the original work in its original language. That SMT seldom achieves this often becomes pitifully apparent when [...]

In TEI [10] the <w> tag is provided to mark up word-like entities (not necessarily orthographic words) and we adopt it as the smallest semantic unit for alignment. It is also useful to align compounds of such semantic units to indicate the textual expression of a coherent idea; for this we enclose the units within a TEI <phr> (phrase) or <s> (sentence) tag. Here we intend a loose definition of a sentence that is again not tied to any particular typographic convention: poets often violate punctuation rules and omit full-stops and commas to gain their effect, yet they still insist on pedantic positioning of their line-breaks and stanzas. In the latter case, TEI offers the <l> and the <lg> tag to delimit the start and end of each line and stanza, respectively, as the following extract from T.S. Eliot's Ash Wednesday illustrates:

<lg>
  <l>Here are the years that walk between, bearing</l>
  <l>Away the fiddles and the flutes, restoring</l>
  <l>One who moves in the time between sleep and waking, wearing</l>
</lg>
<lg>
  <l>White light folded, sheathing about her, folded.</l>
  <l>The new years walk, restoring</l>
  ...
  <l>While jewelled unicorns draw by the gilded hearse.</l>
</lg>

Yet this encoding becomes less attractive when we wish to grant the freedom to segment over more than one line or stanza. For example, if we attempt to segment the single phrase “wearing White light”, which straddles the two stanzas above (so that the phrase would open before “wearing” in one line and close after “light” in the next), we run into the problem of overlapping mark-up [13]: the XML stops being valid when one range-based tag intersects another. Two hierarchies ({lg, l} and {s, phr, w}) are competing and one needs to be given priority. TEI provides a way to preserve <l> and <lg> tags by assigning an enjamb attribute to those on the newline. Since semantic segmentation is our priority, we instead transform <l> and <lg> tags into point-based mark-up and dedicate range-based delimitation to segmentation and alignment, thus:

... waking, wearing White light folded, sheathing about her, folded....

For clarity, above we show only one level of semantic segmentation, a phr element, which would normally be nested inside an s element and within which would be nested several w elements. Now convenient for segmentation mark-up, this format has the advantage that it remains valid TEI/XML, while it can easily be transformed back into canonical line/stanza form by applying an XSL stylesheet that implements a technique known as grouping [14].

We also rationalize alignment mark-up with respect to conventional TEI practice, by labeling each tag using an n attribute simply composed from the text in that element, first substituting underscore characters for any punctuation and intervals of white space. Thus the Latin text “in loca” is labeled n="in_loca", and the English translation aligned to it, “for parts”, is labeled n="la:in_loca", where the prefix la: indicates that the source language is Latin. To disambiguate any multiple occurrences of a given phrase in the source text an ordinal postfix is appended, e.g. n="la:in_loca.2". An additional type attribute is inserted whenever the translation is not to be considered literal, to indicate whether it is approximate or a paraphrase, e.g. type="approx" on the element containing “for parts”.

This direct labelling avoids the additional complexity that would be incurred by the conventional TEI practice of deploying link and linkGrp tags [15]. Finally, we provide for limited rich-text rendering by respecting two further TEI tags, one rendered as italics and one rendered in bold, with each treated the same as the <w> tag but with an additional rend attribute to distinguish them.

5. METHOD

5.1 Web delivery
Both source and translation text are delivered to the browser as XML in the TEI format detailed above, and loaded thus into individual frames of an HTML frameset, so that the two texts may be read side by side as is traditional for parallel texts. To this end the XML in each frame is immediately transformed using a simple XSL stylesheet so that each s, phr, or w element becomes an HTML span element of class s, phr, or w, respectively (denoted herein as span.s, span.phr and span.w). Thus the translated Latin phrase above, “for parts”, becomes a span.w element that carries the same n attribute.

For each span.w a hover event-handler is attached using jQuery, so that whenever the reader passes the cursor over that element it receives a color highlight. By default the highlight is green; it is yellow if type is “approx”, and red if it is a paraphrase. In our system the same highlighting effect is applied in the parallel frame at any span.w that shares the same n attribute value (modulo a language prefix). Finally, any span.phr and span.s enclosing a highlighted span.w are themselves highlighted, in white and grey respectively, in order to project semantic context to the reader.

5.2 Interactive alignment
Once color highlighting is activated, each word in the text is made available for interactive alignment via mouse clicks. This is achieved in seven stages, listed after Figures 2 to 4.
Figure 2: GUI: Selection by Click in the source text, the original Sardinian poem by Antonino Mura Ena. (top) Nothing is highlighted while the cursor is not over a word of the text. (middle) The cursor now hovers over the word peraula, which becomes highlighted in green, with the surrounding phrase in white and the idea/sentence in lighter grey. (lower) Following a mouse click, the word peraula gains a dashed black border, which indicates that it is selected. Selection might alternatively have been initiated in the text on the right.
Figure 3: GUI: Selection by Alignment of the translation (following on from Figure 2). (top) Moving the cursor to the right-hand text highlights a chosen word, as before. The previous selection continues to be indicated by the dashed border, though it loses its highlight. (middle) Following a mouse click, the word "word" also gains a border: it is now selected too. Further words could be selected in either frame before continuing, or the color could be changed to indicate approximation (yellow) or paraphrase (red). (lower) A further click with a meta-key held down invokes an alignment, with all currently selected words (here peraula and word) being marked up in literal correspondence (in green). Immediately, both borders are dissolved, indicating that the operation is complete, and the highlighting instantly switches on for both words to indicate the newly made alignment. From now on, whenever the cursor passes over either peraula or word, both will light up, unless a split operation is later applied to one of them.

Figure 4: Review. By clicking on the Notes icon (not shown), words from the source text are temporarily displayed in blue next to their aligned counterparts in the translation. Here we illustrate only the first two lines.
1. Atomic segmentation: Each separate word in the text is automatically enclosed in a span.w element, for which an n attribute is generated that duplicates its textual content. For languages like Chinese each ideogram is segmented individually, rather than using just white-space boundaries. Punctuation marks always cause boundaries between span.w segments.

2. Restoring state: Any text delivered within a <w> tag is given the n attribute of that tag when it becomes a span.w element; thus state gets restored on delivery. Generally such elements contain more than one word and are the product of a previous merge operation (see point 4 below).

3. Selecting by click: Each resultant span.w element is assigned an on-click event-handler so that once clicked it draws a thin black rectangular border around its text (or removes it upon a second click). The user can thus select a group of elements one by one in one or both of the parallel texts, with the borders providing the visual feedback.

4. Merge operation: When several elements are selected they can then be merged. This is achieved by clicking upon one of them while holding down a meta-key (e.g. Control, Alt, Command key) on the keyboard. The event-handler, this time, assigns the same n attribute value to all of the selected span.w elements. That value is the concatenation of the component words, in text order, omitting punctuation and separated by underscores.

5. Alignment: When elements from both frames are simultaneously selected for a merge it becomes an alignment operation, which is achieved by assigning a common label (the n attribute value) to each element selected. The label is computed as the concatenation of only those words from the source text frame. When it is assigned to the elements in the translation frame the language prefix is attached, as discussed earlier.

6. Split operation: When a merge operation is attempted on a previously merged group of elements, the event-handler simply splits the group back into its component elements by reassigning their previous n attribute values.

7. Review: At any time the Notes icon can be clicked, whereby words from the source text are temporarily displayed in blue next to their aligned counterparts in the translation (see Figure 4). Another way to review is simply to trace the cursor along the reading line word by word and observe the highlighting, a process that can be automatically emulated by clicking on the Walkthrough button (one word per second) without the need to move the mouse. In addition, once the Save button is pressed the resultant XML is displayed, and it may be reviewed or even edited before submitting it to an archive server.

6. DEMO
Our interactive color-highlight interface has been successfully used by translators to align multilingual parallel texts of poems involving the following languages: English, Russian, Chinese, Italian, Latin and Sardinian, as can be reviewed at: http://fch.uniss.it/_MLW

Figures 2 and 3 show screenshots of the GUI in action, aligning the parallel text of a Sardinian poem and its English translation.

7. ACKNOWLEDGMENTS
Our thanks go to Gianluigi Zanetti, CRS4, and Sergio Usai.

8. REFERENCES
[1] Google Translate, accessed May 2011. http://translate.google.com.
[2] Brelstaff, G.J., Chessa, F. 1998. "Sustaining the paper metaphor with Dynamic HTML", Conference Companion, HCI 98, Ed. Jon May et al., Sheffield UK, 16-17.
[3] Bouvin, N.O., Zellweger, P.T., Gronbaek, K., Mackinlay, J.D. 2002. Fluid Annotations Through Open Hypermedia: Using and Extending Emerging Web Standards, WWW Conf., (Honolulu, Hawaii, USA, May 2002), 160-171. DOI= http://doi.acm.org/10.1145/511446.511468.
[4] Multilingual markup demo, accessible since 2009. http://fch.uniss.it/MLM.
[5] Tiedemann, J. 2006. ISA & ICA - Two web interfaces for interactive alignment of bitexts, In Proceedings of LREC, Genova, Italy.
[6] Germann, U. 2008. Yawat: yet another word alignment tool, Proceedings of the ACL-08: HLT Demo Session, (Columbus, Ohio, USA, June 2008), 22-28.
[7] jQuery JavaScript Library v1.4.2, accessed May 2011. http://jquery.com/
[8] Civera, J., Juan, A. 2007. Domain Adaptation in Statistical Machine Translation with Mixture Modeling, Proceedings of the Second Workshop on Statistical Machine Translation, (Prague, June 2007), 177-180.
[9] Behrens, C., Kashyap, V. 2002. The Emergent Semantic Web: A Consensus Approach for Deriving Semantic Knowledge on The Web. In Real world semantic web applications, Eds: Kashyap & Shklar, IOS Press, 69-90.
[10] Text Encoding Initiative Consortium, 2011. TEI P5: Guidelines for Electronic Text Encoding and Interchange. XML Version 1.9.0, updated on Feb 21 2011. http://www.tei-c.org/release/doc/tei-p5-doc/en/html/
[11] Pelli, D. G., Tillman, K. A., Freeman, J., Su, M., Berger, T. D., & Majaj, N. J. 2007. Crowding and eccentricity determine reading rate. Journal of Vision, 7(2):20, 1-36, http://journalofvision.org/7/2/20/, doi:10.1167/7.2.20.
[12] Genzel, D., Uszkoreit, J., Och, F. 2010. "Poetic" Statistical Machine Translation: Rhyme and Meter, Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, 158-166.
[13] Marinelli, P., Vitali, F., Zacchiroli, S. 2008. Towards the unification of formats for overlapping markup. The New Review of Hypermedia and Multimedia, Vol. 1, No. 14, 57-94.
[14] Tennison, J. 2008. XSL Pages: Grouping, accessed May 2011. http://www.jenitennison.com/xslt/grouping/index.html
[15] Boot, P. 2009. Towards a TEI-based encoding scheme for the annotation of parallel texts, Literary and Linguistic Computing, Vol. 24, No. 3, 347-361.
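The labeling logic behind the atomic segmentation, merge, alignment and split stages of Section 5.2 can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: all function and field names (labelFromText, merge, align, split, position, previousN) are hypothetical, elements are plain objects standing in for span.w nodes, each element is assumed to remember a single previous label for the split, and the ordinal postfix used to disambiguate repeated phrases is left out.

```javascript
// Hypothetical sketch; names are illustrative, not taken from the paper.

// Build an n label from text: punctuation and runs of white space become
// underscores, and leading or trailing underscores are dropped.
function labelFromText(text) {
  return text
    .trim()
    .replace(/[\s\p{P}]+/gu, "_")
    .replace(/^_+|_+$/g, "");
}

// Merge: every selected element receives one shared label built from the
// selected words in text order. The previous label is kept for a later split.
function merge(selected) {
  const ordered = [...selected].sort((a, b) => a.position - b.position);
  const label = ordered
    .map((el) => labelFromText(el.text))
    .filter(Boolean)
    .join("_");
  for (const el of ordered) {
    el.previousN = el.n;
    el.n = label;
  }
  return label;
}

// Alignment: the label is computed from the source-frame words only; the
// translation-frame elements receive it with the language prefix attached.
function align(sourceSelected, translationSelected, sourceLang) {
  const label = merge(sourceSelected);
  for (const el of translationSelected) {
    el.previousN = el.n;
    el.n = `${sourceLang}:${label}`;
  }
  return label;
}

// Split: restore the label each element had before it was merged.
function split(elements) {
  for (const el of elements) {
    if (el.previousN !== undefined) {
      el.n = el.previousN;
      delete el.previousN;
    }
  }
}
```

For instance, aligning the source words "in" and "loca," with the translation words "for" and "parts" under the prefix "la" yields the label "in_loca" on the source elements and "la:in_loca" on the translation elements, matching the Latin example in the text. A subsequent split returns each element to its own single-word label.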