What's up, Switzerland? Final workshop - Elisabeth Stark, University of Zurich - What's up, Switzerland
←
→
Page content transcription
If your browser does not render page correctly, please read the page content below
What‘s up, Switzerland? Final workshop Elisabeth Stark, University of Zurich estark@rom.uzh.ch University of Zurich 03/11/2018
The project • Funded by the Swiss National Science Fundation • Project sum: CHF 1'597'904 • Project leader: Elisabeth Stark (estark@rom.uzh.ch) • Involved Universities: Zurich, Bern, Neuchâtel, Leipzig • Project duration: 36 months (1/1/2016 – 31/12/2018) plus extension 03/11/2018 2
Two Overall Research Questions • What do Swiss WhatsApp messages look like? What has changed overall between Swiss SMS and Swiss WhatsApp messages, and why (as regards linguistic structures, use of images in a broad sense, spelling, register- specific style, individualization vs. accommodation)? • What is said / done by the individual users and the media in/on WhatsApp messages and chats, in relation to the findings for question 1? 03/11/2018 3
Aims of this workshop 1. Present and discuss results of doctoral students and post-docs • …with the respective external experts, but also • the whole audience! 2. Get valuable guidance for the last part of their work: writing up! 03/11/2018 4
Other large-scale WhatsApp copora • Siebenhaar et al.: What‘s up, Deutschland? • Beißwenger et al.: MoCoDa (Mobile Communication Database): Aufbau einer Datenbank zur digitalen Kurznachrichtenkommunikation (WhatsApp, SMS & Co.) als Ressource für Forschung, Lehre und Unterricht. (ongoing) Non-public WhatsApp corpora • Hilte et al. (Antwerpen): Flemish Online Teenage Talk • Lieke Verheijen (Groningen): Dutch WhatsApp collection Lists of CMC corpora: • www.cmc-corpora.ch • www.clarin.eu/resource-families/cmc-corpora 03/11/2018 5
Subproject A: Language(s) of WhatsApp: Verbal Periphrases (VP) and Argument Drop (AD) • Research question: VP and AD as register-specific features (in the sense of Biber 1995) and/or mainly technologically provoked? • Lead: Elisabeth Stark (Zurich), Silvia Natale (Bern) • Doctoral students: Franziska Stuntebeck (Fr 14.45 – 15.30), Rossella Maraffino (Sa 9.45 – 10.30) • Collaboration partners: – Florence Lefeuvre (Université Sorbonne Nouvelle, Paris 3) – Liliane Haegeman (Ghent University), Fr 14.00 – 14.45: Aspects of subject omission in the diary register – Christiane von Stutterheim (University of Heidelberg), Sa 11.00 – 11.45: Event unit formation under a cross linguistic perspective – Monique Flecken (Max Planck Institute for Psycholinguistics, Nijmegen), Sa 09.00 – 09.45: Influences of aspect on event processing 6
Subproject B: Language Design in WhatsApp: Icono/Graphy • Research question: nature and function of graphic strategies, especially new sets of iconographic signs (emojis). • Lead: Christa Dürscheid (Zurich), Federica Diémoz (Neuchâtel) • Postdocs: Christina Siever (Thu 13.30 – 14.15), Etienne Morel (Fr 11.45 – 12.30) /Silvia Natale (Fr 11.00 – 11.45). • Collaboration partners: – Jürgen Spitzmüller (University of Vienna), Thu 14.15 – 15.00: Mediatised Lifeworlds - Young people's narrative constructions, connections and appropriations: Introducing a Research Platform – Marie-José Béguelin 03/11/2018 7
Subproject C: Individuals in WhatsApp • Research question: Individual vs. group variation, patterns of accommodation in interaction. • Lead: Beat Siebenhaar (Leipzig) • Doctoral student: Samuel Felder (Fr 09.45 – 10.30) • Collaboration partners - external expert: Michael Beißwenger and Steffen Pappert (University of Duisburg/ Essen), Thu 15.30 – 16.15: Use of emojis in a didactic peer-feedback setting: A pragmatic analysis and description framework – Jannis Androutsopoulos (Hamburg) – Peter Schlobinski (Hannover) 03/11/2018 8
Subproject D: The Cultural Discourses and Social Meanings of Mobile Communication • Research question: What does the public discourse on graphic mobile communication via WhatsApp (and SMS) look like? • Lead: Crispin Thurlow (Bern) • Doctoral student: Vanessa Jaroski (Thu 10.30 – 11.15) • Collaboration partners: – Lauren Squires (Ohio State University). – Ana Deumert (University of Cape Town) Thu 09.15 – 10.00: Sociolinguistics and Mobile Communication - Looking back and looking ahead, a view from the global south – Rodney Jones ( Reading University, England) Thu 11.15 – 12.00: GDPR, digital surveillance, and the semiotic coercion of consent 9
Where we stand • All doctoral students and the postdocs have finished their empirical analyses and are now starting to write up. • Students wrote 10 papers based on the corpus. • 111 presentations were given by the team. • 32 publications with a link to the project / the corpus were published. Many more are in preparation. • The corpus was used by 23 people outside the project for their research. • 123 articles about the project appeared in the printed press, radio or television until now. • The project was also presented to a general public, e.g. at the Scientifica 2017 (UZH, A. Göhring), at the Kinderuniversität 2017 (UZH, Ch. Siever) or at the Wissenschaftsfestival 2018 (UZH, Ch. Dürscheid). 03/11/2018 10
Some selected presentations: • 09/06/2017: Samuel Felder/Beat Siebenhaar: Individual, accommodation, synchronisation. The use of emojis in WhatsApp communication. International Conference on Language Variation in Europe, Malaga. • 19/07/2017: Samuel Felder: Stylistic Variation as a Means for Identity Construction in WhatsApp Interactions. 15th International Pragmatics Conference, Belfast. • 21/07/2017: Etienne Morel and Cécile Petitjean: "Hahaha": How and why to produce laughter in WhatsApp conversations. 15th International Pragmatics Conference, Belfast. • 27/07/2017: Christa Dürscheid/Christina Siever: On the Relation of Writing and Images in Digital Communication. AILA, Rio de Janeiro. • 27/06/2018: Adam Jaworski, Crispin Thurlow: Deconstructing the ideologies of universal visual language. 22nd Sociolinguistics Symposium, University of Auckland. • 13/09/2018: Liliane Haegeman: Subject ellipsis and the anaphorizing deficiency of impersonal pronouns. Linguistics Association of Great Britain, 11 Annual Meeting 2018, University of Sheffield.
Some selected publications: • Dürscheid, Christa/Siever, Christina M. (2017). Jenseits des Alphabets. Kommunikation mit Emojis. Zeitschrift für Germanistische Linguistik 45/2, 256–285. • Jucker, Andreas H., Heiko Hausendorf, Christa Dürscheid, Karina Frick, Christoph Hottiger, Wolfgang Kesselheim, Angelika Linke, Nathalie Meyer, and Antonia Steger (2018). Doing space in face-to-face interaction and on interactive multimodal platforms. Journal of Pragmatics 134, 85-101. • Lusetti, M., Ruzsics, T., Göhring, A., Samardžić, T., and Stark, E. (2018). Encoder-Decoder Methods for Text Normalization. In Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2018), 18–28. Santa Fe, New Mexico, USA. • Petitjean, Cécile/Morel, Etienne (2017). "Hahaha": Laughter as a Resource to Manage WhatsApp Conversations. Journal of Pragmatics 110, 1-19. • Siebenhaar, Beat (2018.: Funktionen von Emojis und Altersabhängigkeit ihres Gebrauchs in der WhatsApp-Kommunikation. In: a. Ziegler (ed.): Jugendsprachen. Aktuelle Perspektiven internationaler Forschung. Berlin: De Gruyter. • Thurlow, Crispin (in press): Mediatizing sex: Sexting and/as digital discourse. In K. Hall & R. Barrett (eds:. The Oxford Handbook of Language and Sexuality. New York: Oxford University Press. • Ueberwasser, Simone/Stark, Elisabeth (2017). What’s up, Switzerland? A corpus-based 12 research project in a multilingual country. Linguistik online 84/5, 105-126.
Press 03/11/2018 13
The corpus • Planned publication (open access): end of 2019 • 617 Chats • 763,650 messages with text • 5,543,692 tokens • Texts from 945 participants, 426 with demographics • Multilingual (German, French, Italian, Romansh) Language Chats Messages Tokens Non-dialectal German 93 81,456 625,419 Swiss German 275 506,984 3,611,033 French 141 197,255 1,397,375 Italian 87 42,559 293,567 Romansh 77 29,094 283,909 03/11/2018 14
The corpus: What has been done up to now • Masking of non-consented messages • Anonymization • Mapping of emoji codes to Unicode and emojiQdescription • Tokenization • Removal of duplicate chats • Language identification for chats and messages • PoS annotations and normalization for the French corpus • Manual normalization of parts of the Swiss German corpus 03/11/2018 15
The corpus: Planned processing steps • PoS and noralization for the Italian data • Documentation 03/11/2018 16
Extended NLP (Natural Language Processing) on Swiss German chats • Team (collaboration with URPP “Language and Space”): Anne Göhring, Tanja Samardžić, Massimo Lusetti, Tatyana Ruzsics • Aim: Automated normalization of Swiss German messages • Relevance beyond our project/added value: • Some of the problems: Dialect and CMC, clitic forms, spelling variants: 03/11/2018 17
Extended NLP on Swiss German chats • Character-level statistical machine translation (CSMT) as a baseline: – Ziit→Zeit (‘time’) – wiiter→weiter (‘further’) – Priis→Preis (‘price') • Phase I (concluded): Neural encoder-decoder (ED) models (Lusetti, M., Ruzsics, T., Göhring, A., Samardžić, T., and Stark, E. (2018). Encoder-Decoder Methods for Text Normalization. In Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2018), 18–28. Santa Fe, New Mexico, USA. ) • Phase II (ongoing): include Gold standard for PoS and look at the word's environment (Planned publication for Cambridge Journal of Natural Language Engineering (NLE).) • Accuracy scores: ~ 88% 03/11/2018 18
You can also read