FROM DATA TO KNOWLEDGE IN THE LANGUAGE SCIENCES - IRG 2020
FROM DATA TO KNOWLEDGE IN THE LANGUAGE SCIENCES Institut de plurilinguisme/Institut für Mehrsprachigkeit Fribourg/Freiburg (Switzerland) 6 to 8 February 2020 BOOK OF ABSTRACTS Symposium für junge Forschende | Symposium per giovani ricercatori Young researchers symposium | Symposium de jeunes chercheurs www.irg2020.ch Photo by Paul Murphy
IRG Symposium 2020: From Data to Knowledge in the Language Sciences
6-8 February 2020, Institute of Multilingualism, Fribourg, Switzerland
BOOK OF ABSTRACTS

Overview
Schedule
Keynote presentations
Workshops “Meet the Keynotes”
Poster presentations
Paper presentations
  Section 1: Types of data and data selection
    Session 1A
    Session 1B
  Section 2: Access to data and data collection
    Session 2A
    Session 2B
    Session 2C
  Section 3: Data management
    Session 3A
    Session 3B
  Section 4: Data interpretation
    Session 4A
    Session 4B
  Section 5: Challenging data: critique and validation
  Section 6: Data reporting

Important remark concerning the languages of the symposium: The main languages of the symposium are English, French, German and Italian. All presentations will be held in the language of the abstract printed in this document, unless otherwise stated. All presentations not in English should feature English slides to help with understanding. For discussions, groups should find a language concept comfortable to all, e.g. speak a common language or mix spoken and heard languages as preferred.

info@irg2020.ch v1.0 (24/01/20)
Schedule
For changes please refer to our website or the printouts on site.

Thursday, February 6
from 10 a.m.   Registration & Poster installation
11.30-12       Welcome, K0.02
12-14          Lunch [HEP I]
14-15          Keynote: Tineke Brunfaut, K0.02
15.15-17       Session 4A, L1.06; Session 1A, L1.08
17-17.30       Coffee break
17.30-19.15    Session 1B, L1.06; Session 2A, L1.08
19.15-20.15    Welcome Apéro

Friday, February 7
8.45-9.45      Keynote: Ingrid de Saint-Georges, K0.02
9.45-10.15     Coffee break
10.15-12.00    Session 3A, L1.06; Session 4B, L1.08
12.00-14.00    Lunch & Poster session, K1.03
14.00-15.45    Session 5, L1.08
15.45-16.15    Coffee break
16.15-18.00    Session 2, L1.06; Session 3B, L1.08
19-24          Conference dinner

Saturday, February 8
8.45-9.45      Keynote: Sarah Schimke, K0.02
9.45-10.15     Coffee break
10.15-11.30    Session 2, L1.06; Session 6, L1.08
11.30-13       Lunch [HEP I]
13-15.30       Meet the Keynotes, K1.03, L1.06, L1.08
14.30          Coffee served
15.30-16       Closing session, K0.02
Keynote presentations

Garbage in, garbage out – assessing data collection instruments
Tineke Brunfaut, Lancaster University
Thursday, 14h, K0.02

As researchers in the language sciences, we select, design or adapt instruments to collect data on our topic of interest. To ensure meaningful and useful interpretations of the data we gather, we need to establish that the research instruments themselves are valid for our research purposes and population. If the research instruments are flawed, the conclusions we draw – whether with respect to language theory, learning, teaching or assessment – will be misleading, lack sufficient grounding, and will not be credible. It follows that an assessment or evaluation of the data collection instruments used should be a key step in any research project. This process, termed validation, involves obtaining evidence for the quality of the instruments, and thus for the claims being made about them and about the data and findings resulting from them. It requires collecting evidence to justify the interpretations of participants’ scores or answers on the research instruments.

While the importance of validating research instruments is increasingly recognised among researchers in the language sciences, in practice, it is still not systematically implemented and validation efforts do not always meet the accepted standards for developing, using and evaluating research instruments. In this talk, I will draw on the field of language testing and assessment to explain current conceptualisations of validity and validation, and to describe validation frameworks that can be used in language sciences research. I will also show examples of how such frameworks can be used in practice to validate research instruments.
Conceptualiser la notion de « donnée qualitative » : contexte, tensions et possibilités
Conceptualising the notion of “qualitative data”: context, tensions and possibilities
Ingrid de Saint-Georges, Université de Luxembourg
Friday, 8h45, K0.02

In the public sphere as in academic research, data are now at the heart of many debates. These debates are complex and often contradictory. In this presentation, we offer three readings of them. First, through a historical reading, we show that the way data are conceived and approached in science depends in part on developments in the social, technological and cultural field. A brief overview of these developments will be given in order to move beyond certain views, still sometimes simplistic, of the relationship between qualitative and quantitative sciences. Second, we will undertake a critical reading of the contemporary conditions under which science is conducted. In particular, the current context of scientific production will be examined: the aim is to consider the possible repercussions of a culture of audit, quantification and evaluation on so-called qualitative research practices. Third, a pragmatic reading will reflect on the roles that sociolinguistics and ethnographic approaches can play in developing new agendas and new thinking about data and how they are conceived. Ultimately, then, it is the notion of data as ideology and as practice that will be addressed. Arguments will also be set out to defend, against the imposition of normative epistemologies, a plural way of approaching the analysis and use of data.
Die Interpretation von Online- und Offlinedaten in der Sprachwissenschaft
The interpretation of online and offline data in linguistics
Sarah Schimke, Technische Universität Dortmund
Saturday, 8h45, K0.02

Many linguistic studies collect both online and offline data in order to capture language users' knowledge of particular linguistic phenomena. Online data here means data that provide insight into language processing while that processing is taking place, for example reading times and eye movements during reading or during the processing of auditory stimuli. Offline data are data that reflect the outcome of language processing, for example grammaticality judgements. Very often, online and offline data can be collected for one and the same process, for example when reading times are measured while participants read a sentence and the participants then also give a grammaticality judgement on that sentence.

This talk sets out the relationships that can hold between the resulting data and how different relationships can each be explained and interpreted. In principle, online and offline data can show a very similar picture of linguistic knowledge, or they can diverge. In the latter case there are, on the one hand, findings in which knowledge becomes visible online that does not show up offline, or not as clearly. This pattern arises because many offline measures presuppose conscious access to linguistic knowledge, which cannot be taken for granted (see e.g. Höhle et al., 2016; Osterhout et al., 2006). On the other hand, there are studies in which offline data demonstrate that linguistic knowledge is present, while the corresponding online data show that the application of this knowledge during processing differs in speed and reliability across groups (see e.g. Pan et al., 2015; Roberts et al., 2008).

The talk discusses possible interpretations of such patterns. These take into account properties of the language users studied, in particular their age at the time of data collection and their age at onset of acquisition of the language under investigation, as well as properties of the specific experimental task and of the linguistic phenomenon. In sum, the complexity of the results underlines the value of using several methods.
Workshops “Meet the Keynotes”
Saturday, 13h, K1.03, L1.06, L1.08

Fitting the puzzle pieces together: the benefits and challenges of mixed-methods research
Tineke Brunfaut, Lancaster University
Saturday, 13h, K1.03

The use of mixed-methods approaches has considerably increased in language sciences research in recent years. Mixed-methods research involves the collection and analysis of both quantitative and qualitative data within the same study. It is justified by the idea that it combines the strengths of both qualitative and quantitative methods, that it allows a topic to be explored from different perspectives, and that it helps uncover relationships between various aspects and layers of the topic. An important characteristic of mixed-methods approaches is the purposeful integration or linking of the various data strands as part of the data interpretation process. As with any methodological approach, however, the suitability of mixed methods needs to be carefully considered against the research questions. The use of this methodology also presents its own challenges.

In this workshop, we will first look at examples of existing studies that have used mixed methods. We will consider what types of research methods were combined in these studies, and how methodological innovations in the language sciences might have enhanced mixed-methods research. We will explore what role the different datasets generated through the different methods played within each study, as well as their accompanying data analyses. We will also look into how the different pieces of information were tied together in each study and how they contributed to answering the study’s research questions. Second, we will explore the potential and suitability of mixed-methods approaches to workshop participants’ own research.
We will discuss challenges you may have experienced in conducting mixed-methods research, questions you may have concerning mixed-methods methodologies, and ideas or suggestions you may have around mixed-methods research.

Theorizing and generalizing in fieldwork-driven language research
Ingrid de Saint-Georges, University of Luxembourg
Saturday, 13h, L1.06

This workshop will be open to any questions participants have about their research. To launch the discussion, however, we will focus on two related aspects of the research process that are not often discussed in doctoral training in the language sciences: generalizing and theorizing.

• Generalizing: The idea that 'one cannot generalize' from case studies is regularly accepted as a fact in linguistic research. The first purpose of this workshop will be to question this assumption. Is it always true? And if one can never generalize from case studies, what might be the societal or intellectual impact of our qualitative research? We will examine different ways of understanding 'generalizing'. We will also discuss when and why we might want to adopt or avoid the discourse of generalizing altogether. The aim is not only to prepare oneself to answer a kind of criticism often addressed to qualitative research, but also to clarify the purpose of one's work.
• Theorizing: A major challenge for field researchers is to move from richly textured experiences which are diverse, subjective and piecemeal to the construction of a coherent and meaningful image of the field that matters to its actors, to decision-makers or to other researchers. In this part of the workshop, we will ask ourselves: When and how should we theorize? Should we wait until the observations are completed, or could it be interesting to theorize even prior to entering the field? Different strategies for theorizing will also be discussed, as will the reasons why it might be important to pay closer attention to our own theorizing processes.

The workshop will be interactive, focusing on the discussion of practical problems. Time permitting, we will also consider the role of writing in theorizing and generalizing. At the end of the workshop, a bibliography for further exploration of these questions will be made available.

How to research the same question in different types of language users – some methodological considerations
Sarah Schimke, TU Dortmund University
Saturday, 13h, L1.08

Linguistic research is often concerned with characterizing what language users know about a specific language. No single method, however, gives privileged access to this knowledge, as each method comes with limitations. While this in itself constitutes a methodological problem, the problem is amplified by the fact that the same method may play out differently in different types of language users. For instance, in experimental research, adults may not be challenged by stimuli suitable for children, and this may influence the way they use their linguistic knowledge. On the other hand, children’s cognitive resources may not allow them to process materials that were designed for adults. Besides age, other variables, such as educational background or motivation, may also strongly influence how language users respond to experimental situations.
Similar difficulties arise for non-experimental methods, such as corpus research or interviews. Given all this, researchers who want to measure the same construct in different populations are faced with the challenge of developing measures that are appropriate for each population and still yield results that can be compared to each other in a meaningful way. In this workshop, we will look at this problem from different perspectives and discuss possible strategies for different types of research questions, data and participating language users. We will discuss existing examples of work comparing different populations (e.g. Järvikivi et al., 2015; Schimke & Dimroth, 2018; Verhagen & Schimke, 2009). In addition, workshop participants are encouraged to bring their research questions, research methods, or existing data.
Poster presentations
Friday, 12h-14h (and anytime during the conference), K1.03

Collecting data in a comparative study on conversion and class-changing affixation in Present-Day English and French
Chloé Marie Debouzie, Université Lumière Lyon 2

Quantitative corpus-based studies require access to a wealth of data. My research focuses on analysing the competition between two morphological word-formation processes: lexical class-changing affixation and conversion in Present-Day English and French. I have collected a dataset of affixed and converted words (for example cheat (N) – cheater (N), nanny (V) – nannify (V), googler (V) – googliser (V)) to analyse the presence or absence of competing pairs.

First, I identified the data required for my study, using lists of prefixes and suffixes (there is no “set list” of affixes, so I compiled my own lists). Identifying conversion is more problematic, as by definition the input and the output are formally identical. This constitutes one of the main challenges in my data collection. To collect a manageable quantity of data, I restricted my scope to Present-Day English and French, and decided to study words coined since 1950.

I collected data using the online versions of the Oxford English Dictionary and Le Grand Robert de la langue française. Using dictionaries poses several methodological issues: the reliability of etymological data, the vagueness of the dates in French (some words being dated only as “20th century” or “middle of the 20th century”), and the arbitrariness of the selection of words listed in dictionaries. I then investigated corpora to collect further data (the Corpus of Contemporary American English and the Corpus de Référence du Français Contemporain). The search for affixed words is done using wildcards (selecting words beginning or ending with a specific affix), but looking for converted words is much more problematic.
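The wildcard search for affixed words mentioned above can be illustrated with a minimal sketch. It assumes a plain word list rather than a real corpus interface; the patterns, the word list and the helper `match_affix` are hypothetical and are not taken from the study.

```python
from fnmatch import fnmatchcase

# Hypothetical affix patterns and word list, for illustration only.
SUFFIX_PATTERNS = ["*ify", "*er"]
PREFIX_PATTERNS = ["un*"]


def match_affix(words, patterns):
    """Return the words matching at least one wildcard pattern, in input order."""
    return [w for w in words if any(fnmatchcase(w, p) for p in patterns)]


words = ["cheat", "cheater", "nanny", "nannify", "unfriend", "google"]

# Words ending with one of the suffix patterns.
suffixed = match_affix(words, SUFFIX_PATTERNS)
# Words beginning with one of the prefix patterns.
prefixed = match_affix(words, PREFIX_PATTERNS)

print(suffixed)
print(prefixed)
```

A wildcard only checks the shape of a word: a pattern such as `*er` would also select an unaffixed word like "water", so the candidates still need manual checking.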
In these corpora, words are tagged for their part of speech, but tagging errors exist. This poster will provide an opportunity to discuss the advantages and drawbacks of building a dataset of constructed words using dictionaries and existing corpora.

Students’ language choice in Swedish compulsory school - expectations, learning and assessment
Ingela Finndahl, University of Gothenburg

The aim of this poster is to receive feedback on the interpretation of data. The study, a PhD project, is concerned with young learners’ choice of a second foreign language in the Swedish school context. The main data collection will be carried out in school year 2019/2020 as an ethnographic case study. The project aims to investigate young language learners’ study of a second foreign language (SFL) in a Swedish elementary school, focusing on their expectations, perceived learning and achievements. A multi-methods design has been chosen, aiming to capture learners’ beliefs through questionnaires and interviews, and learning practices and assessment through observations and interviews. Crucial questions concern the analysis and interpretation of the data, given the ethnographic approach chosen. The statistical analysis of the questionnaires, the coding of the observations and the transcription of the interviews will be in progress at the time of the conference.
These are all aspects of my data that I wish to discuss from the point of view of analysis and possible interpretation, and I look forward to receiving feedback from peers and experienced researchers. The contextual background to my study is that Swedish pupils choose an SFL after English in year 5 and begin these studies in year 6, at the age of 12. A language choice is obligatory. About 80% of all pupils normally choose French, German or Spanish, but they can also decide on additional English or Swedish, mother tongue (if other than Swedish) or sign language. The focus of the study will be on the choice of a second foreign language (French, German or Spanish), but other options will also be taken into account.

Measuring and enhancing the migrants’ comprehension of Italian administrative texts
Giulia Lombardi, University of Genoa

The aim of the research was to investigate the readability of Italian administrative documents among foreigners and to offer clearer alternatives or set up clarity guidelines that might improve access to these documents. Too often, Italian administrative language is unnecessarily difficult for those who have only beginner-level skills in Italian as an L2 when they become resident in Italy and yet need to deal with various kinds of red tape and formalities. We decided to carry out a quantitative analysis. The starting point of the research, in 2017, was a computational-linguistic analysis of a synchronic mono-thematic corpus comprising the most important forms foreigners have to submit in Italy, in order to find which lexical, semantic and syntactic structures were too difficult for foreigners. Then, in 2018, 101 students of Italian L2 were tested on their comprehension of authentic Italian institutional texts. Many difficulties were singled out: the correlation between personal factors (such as age, schooling, mother tongue and motivation) and reading comprehension was analyzed by multiple regression analysis.
All the data were collected in order to design specific language policies and practices. At the beginning of 2019, 61 students were tested on simplified texts; among them, 32 attended a specific language course. Data were analyzed with t-tests and ANOVA: the improvement in comprehension of the simplified texts is statistically significant (df = 59, p-value = 1.066e-09), while manual reformulation seems to have a greater impact than automated lexical simplification. The language course is also a predictor of better comprehension (F value = 4.56, p-value = 0.037). The final results show what could be effective in enhancing migrants’ comprehension of the content of such important texts.

Building a specialised corpus – a case study of generics in Norwegian
Anna Kurek-Przybilski, Adam Mickiewicz University in Poznań

Existing research on genericity focuses mainly on sentence analysis. Sentences created for the sake of a given study do not contain a broader generic context, which makes sentence analyses somewhat incomplete. One solution is to conduct corpus research on a tailor-made corpus of generic texts, as the phenomenon is not tagged in any existing corpus. What is more, not every text genre contains generic expressions, so creating a database of many different text types may not give the desired results. In order to perform a study on genericity in Norwegian, 170 generic texts were retrieved from the online encyclopaedia ‘Store norske leksikon’ (creating a data set of over 180,000 words). Each of the texts consisted of at least one paragraph and belonged to one of 5 categories: 1) people, 2) animals, 3) plants, 4) tools, 5) other. The texts were tagged using R. Choosing an encyclopaedia as a source makes the data homogeneous. This has both advantages and disadvantages. On the one hand, it puts limitations on data analysis. On the
other hand, a homogeneous data set is easy to manage in terms of manual corpus tagging and text sorting, since all the samples contain the studied phenomenon. The type of data chosen for the study on generics in Norwegian proved crucial for successful analyses. Narrowing genres to encyclopaedic texts not only provides data on generics in context but also guarantees that each of the samples will include generic nouns and noun phrases. This approach to studying genericity in Norwegian is innovative and can lay the foundation for further research on the phenomenon.

A diachronic look at the English passive: Distributional semantics of be vs get
Axel Bohmann, Mirka Honkanen, Julia Müller & Miriam Neuhausen, Albert Ludwig University of Freiburg

In this poster, we discuss the method and first findings of a distributional semantic analysis of the passive construction in a large diachronic corpus of American English. We compare the distribution of the canonical be-passive and the more recent get-passive (Schwarz 2018) to see whether the alleged connotations of the latter (adversativity, agentivity/responsibility, etc.) (Huddleston & Pullum 2002) are empirically verifiable and historically stable. Distributional semantics (Erk 2012; Perek 2018) is a corpus-based method that makes it possible to investigate the types of lexical verbs that commonly occur with each of the passive auxiliaries and to visualize their semantic similarity or distance. It is based on the assumption that words that occur in similar contexts (i.e. have many of the same collocates) have similar meanings as well. This approach enables us to follow the individual development of each passive construction over time as well as compare them at different points in time. We apply this method to the Corpus of Historical American English (Davies 2010–), which consists of 400 million words of written American English from the 1810s–2000s.
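The distributional assumption described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the collocate counts are invented, and the use of raw frequencies with cosine similarity is an assumption, since the abstract does not say how the vectors are weighted or compared.

```python
import math

# Hypothetical collocate counts (verb -> {collocate: frequency}), invented for illustration.
collocates = {
    "arrested": {"police": 8, "suspect": 6, "night": 2},
    "caught": {"police": 5, "suspect": 4, "rain": 3},
    "promoted": {"manager": 7, "company": 5, "night": 1},
}


def cosine(a, b):
    """Cosine similarity between two sparse count vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Verbs sharing many collocates score higher than verbs sharing few.
sim_close = cosine(collocates["arrested"], collocates["caught"])
sim_far = cosine(collocates["arrested"], collocates["promoted"])
print(round(sim_close, 3), round(sim_far, 3))
```

In this toy example "arrested" and "caught" share two frequent collocates and therefore end up closer to each other than "arrested" and "promoted", which share only one rare collocate.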
First, all instances of the passive voice in the corpus were extracted with a Python script. The thousand most frequent verbs that occur with both auxiliaries were included in the analysis. We represent these verbs as vectors in semantic ‘space’ on the basis of their collocate frequencies. In this visualization, verbs that occur in similar contexts cluster together. Our analysis demonstrates an innovative method that relies on the availability of very large amounts of corpus data. Such data offer a new way of looking at the interface of semantics and structural change, based on a quantitative approach to semantics.

Sprachbiografien junger Erwachsener aus Romanisch- und Italienischbünden
Language biographies of young adults from the Romansh and Italian areas of Grisons
Flurina Kaufmann-Henkel, Sabrina Sala, Pädagogische Hochschule Graubünden

This project examines the language biographies of young adults from the Romansh- and Italian-speaking areas of Grisons. Both Romansh and Italian are considered minority languages in the canton of Grisons. The study is interested in how participants who grew up in mixed-language families and who have moved between language areas at least once experience, reflect on and comment on their multilingualism. These young adults, who speak another language in the family alongside Romansh or Italian respectively, see themselves as a minority within the minority, confronted with particular linguistic challenges. 19 young adults from the Italian-speaking area and 21 from the Romansh-speaking area of Grisons were interviewed. The data collection method consisted of a biographical-narrative interview followed by a guided interview. In each case, the stimulus for the spontaneous narrative was the creation of a language portrait.

At the time of the symposium, some transcripts are available, excerpts from which are shown on this poster. In addition, individual language portraits can be viewed together with the corresponding transcript excerpts. The research team presents a structuring content analysis of the data using deductive category application, but would also like to put other content-analytic procedures up for discussion.

Chroniques de terrains - L’ethnographie, une question de terrain
Tales from the Field – Formulating questions during fieldwork
Kevin Petit Cahill, ICAR, Université Lumière Lyon 2

In ethnographic sociolinguistics, research questions do not exist prior to fieldwork but are constructed and reformulated through fieldwork, in a constant back-and-forth between theory and participant observation. The aim of this poster is to illustrate this through my experience as a doctoral researcher. When I began my research, I was interested in the Irish-language revitalisation movement, and more specifically in a practice that has been popular for over a hundred years: spending the summer in holiday camps (or summer colleges) to learn the language. I initially placed myself within a tradition of research on endangered languages that aimed to “reverse language shift” (Fishman 1991), mainly by producing speakers. I therefore focused on the “technical” effects of the colleges on pupils, and more specifically on the factors influencing their motivation and hence their language proficiency. What is distinctive about these colleges is that they offer immersion teaching in officially Irish-speaking regions, the Gaeltacht.

Once in the field, however, I realised that the summer college experience does not consist in immersing oneself in a naturally occurring monolingual Irish environment. Since the Gaeltacht is in fact bilingual, the experience consists rather in producing this space, imagined as monolingual. Moreover, since the technical effects remain relatively limited, my questions shifted to the “properly magical efficacy of initiation and consecration” of pedagogic action (Bourdieu 1981). I am therefore now interested in the role of the summer college experience in the naturalisation (or contestation) of social categories such as the Gaeltacht. It was by adopting the emic and interpretivist perspective characteristic of ethnography (i.e. attending to how actors create meaning through their social practices) that I was able to reformulate my research questions in the course of fieldwork.

Evaluation de la prononciation en français L1/L2, entre données qualitatives et quantitatives
French L1/L2 pronunciation evaluation: between qualitative and quantitative data
Marion Didelot, Université de Genève

Our doctoral research concerns the reception and evaluation of native and non-native accented speech in French by different groups of listeners. We build on work in folk linguistics (Niedzielski & Preston 2003), which is interested in what non-specialist speakers think and say about language and which compares this with what they actually do. Our research thus has two strands. The first is a perception experiment in which listeners rate, on a Likert scale and “blind” (that is, on the basis of an audio excerpt alone), various excerpts produced by native and non-native speakers of French, answering sociolinguistic and linguistic questions. The second consists of semi-structured interviews that allow certain themes raised in the perception experiment to be explored in greater depth.

We thus obtain quantitative and qualitative data on accented speech. For our blind perception experiment, we chose two speakers per variety of French submitted for evaluation, and we seek, as far as possible, to form relatively homogeneous groups of listeners. Our paper will focus on the links between quantitative and qualitative data and on the limits of our results. Indeed, while the value of the chosen approach lies above all in the complementarity offered by studying representations/attitudes that are both conscious and less conscious, it also seems to us advantageous for the interpretation of the data collected. Analysis of the interviews could thus lead us to (re)interpret our quantitative data and, perhaps, to explain certain surprising results where they arise. Finally, the question of the generalisability of our results also arises, in particular because of the choice of speakers selected to represent a variety of French and the representativeness of the listeners in our study.
Paper presentations

Section 1: Types of data and data selection

Session 1A
Session chair: Katja Fiechter (University of Fribourg/Switzerland)
Thursday, 15h15-17h, L1.08

Benefits and limitations of a combinatorial approach to agentivity
Célia Hoffstetter, Grenoble Alpes University

Agentivity has often been conceptualized as a semantic feature of a category of verbs called “agentive verbs”, whose grammatical subject can only be animate and “thought of as the willful source or agent of the activity described in the sentence” (Gruber 1965). However, this approach does not satisfactorily account for the many cases where inanimate entities “do” something. For instance, the inanimate subject in “The stone flew across the window” can hardly be "willful" in the way the human subject is in “Charles Lindbergh flew across the Atlantic”, although it retains "some notion of agency" (Quirk et al. 1985) which needs to be further specified. Drawing on constructional approaches in cognitive linguistics (Goldberg 1995, Fillmore & Kay 1999), I argue that agentivity is not only a semantic property of verbs but also emerges from constructions, i.e. combinations of words. In this paper, I will explain the benefits of examining subject-verb combinations in a corpus, as well as the limitations that the wording of corpus searches necessarily involves. To that end, I will introduce Lexicoscope, a corpus analysis tool developed by Kraif and Diwersy (2016) and dedicated to the study of the combinatorial profiles of lexical entries, and show how it can be used on a large corpus (more than 30 million words) containing a wide range of inanimate referents that may be considered active in different respects. More specifically, I will compare the results produced by two types of searches, one focusing on the noun phrase (e.g. “the stone”) and the other on the verb phrase (e.g. “flew”), to show how this methodological choice affects data collection.
Combining conversation analysis and experimental methods in the study of comprehension of interaction by L2 learners
Simone Morehed, University of Fribourg/Switzerland

Comprehension is crucial in L2 interaction: without it, the learner cannot interact appropriately. However, comprehension in interaction is largely absent from research on interactional and pragmatic competences. Production is studied through conversation analyses of interactions between L2 speakers, where research shows that although L2 learners develop their interactional proficiency (Skogmyr et al. 2017), they often do not express themselves in the same way as L1 speakers and may encounter disruptions even at advanced levels. Comprehension is addressed in experimental studies of specific pragmatic markers. Even though their object of study is authentic oral interaction, these studies mostly use written or non-authentic oral material (Culpeper et al. 2018). Previous studies conclude that L2 learners often have various comprehension issues in interaction, but we do not yet know which aspects of interaction are the most crucial for the L2
learner’s comprehension. There is a clear need for research focusing on comprehension in interaction. In this presentation we will discuss the methodological potential and challenges of studying comprehension by combining conversation analysis with an experimental approach, using authentic corpora as material (Kendrick 2017). We will discuss the use of authentic material in an experimental study, more precisely the variation and representativeness of the material, the control of variables, the level of the conversation analyses (micro/macro), and the length of the interactions (role of the sequential context).

Die Gratwanderung zwischen freien und vorgegebenen Antworten in einer Online-Umfrage zu schweizerdeutschen Dialekten
The dilemma of analyzing open-ended and multiple choice questions in an online survey on Swiss-German dialects
Melanie Bösiger, Universität Freiburg/Schweiz

Swiss German is a diverse language, and speakers therefore sometimes have several options for expressing a given state of affairs. Certain formulations are preferred, while others are used rather rarely. Possessive constructions are one example: alongside dative constructions with von (‚de Teddy vo de Anna‘) or with a possessive pronoun (‚de Anna ihre Teddy‘), the genitive also occurs in Swiss German: ‚s Annas Teddy‘. The genitive is widely regarded as archaic and is used only rarely, but it does occur in formations with proper names and appellatives. It is of interest to dialect research because its rarity creates a certain uncertainty about how it is formed. Article usage in particular has changed and is still changing. But how does one elicit rarely occurring phenomena in dialectology? Two consecutive online surveys, conducted as part of two dissertation projects, attempted to track down the genitive. The difficulty lay precisely in the fact that the genitive is only one of several options that speakers can use to form possessive constructions. With pure translation questions („Wie sagen Sie 'Annas Teddybär' in Ihrem Dialekt?“, i.e. "How do you say 'Anna's teddy bear' in your dialect?"), much data is therefore lost, because respondents choose a different construction even though they could also form the genitive in their dialect. The question must therefore contain certain constraints that elicit the genitive without suggesting it. The talk will report on this balancing act: the results of the two online surveys are compared in order to show the advantages and disadvantages of the different approaches.

Session 1B
Session chair: tba
Thursday, 17h30-19h15, L1.06

Fieldwork, corpora, and tailored methods in dialect syntax
Cameron Morin, University of Paris, Jack Grieve, University of Birmingham

This paper focuses on some empirical problems and solutions in the study of rare language variation, through the case study of dialect syntax in English, and drawing on substantial fieldwork by the author.
Multiple modals are peripheral but noticeable constructions in several British and American basilects. The following examples come from Borders Scots:
(a) He’ll can help us tomorrow.
(b) They might could be working.
Investigating these features is an empirical and methodological challenge. Firstly, classic corpus-based enquiries (AMC Edinburgh) prove insufficient. This is presumably due to the marginality of the structures, even in the varieties where they have been suggested to occur. Alternative sources of data may prove more useful, such as field experiments that engage directly with the speech communities concerned. In January 2018, I conducted a field experiment in the town of Hawick (Borders), distributing a questionnaire to approximately 60 respondents from various age groups and occupations. The questionnaire was semi-structured and revolved around judgment elicitation and syntactic manipulation tasks, in order to obtain a quantitative and qualitative picture of double modals in this representative locus of Borders Scots that could never have been provided by a corpus. However, do intuition-based judgments unfailingly deserve our trust? There are serious empirical issues with these alternative methods, and close scrutiny must be given to ways of avoiding their biggest pitfalls. These new problems may be offset, however, by a new combinatorial and multidimensional approach to rare dialect syntax, which reappraises both corpus compilation and fieldwork and cross-examines specific quantitative and qualitative aspects of their individual components to establish the coherence of the resulting picture. This view is on the rise in studies of language variation and change, and it is one I am currently developing for my doctoral investigation of multiple modals in English.
Dall’idea ai fatti: i compromessi nella ricerca
From the idea to the facts: compromising in research
Dalila Dipino, Università di Zurigo

Sociolinguistic research in recent decades has prompted fruitful reflection on the work of collecting and constructing linguistic data (cf. D’Agostino, 2006; Calamai, 2004), showing that developing research methods suited to one's aims is an extremely complex undertaking. In our case too, the phase of designing and compiling the corpus proved arduous and delicate. The research project in question aims to study the realization of a suprasegmental phonetic feature, vowel length, in several northern Italo-Romance dialects belonging to three different subgroups of Ligurian (Forner, 1988). The original objectives were very ambitious: the plan was to collect spoken-language data from more than 25 informants for each of the three dialect groups, evenly distributed by sex and age, so as to obtain a robust, balanced and representative corpus. Equally ambitious was the idea of running tests to collect heterogeneous materials in a variety of prosodic and pragmatic contexts: from spontaneous speech to the Discourse Completion Task, from translation tasks to the Map Task (for an overview see Calamai, 2015). The initial objectives were progressively scaled back in the face of the difficulties of designing, on the researcher's part, and completing, on the participants' part, such a complex questionnaire. First, we experienced the difficulty of designing tests suitable for very different ages. Even the presentation of the stimuli proved a delicate step, since mediation through Italian is a dangerous source of pressure on dialect productions. Last but not least came the difficulty of recruiting informants, given the scarcity of dialect speakers, especially young ones, and the many refusals. Our contribution will illustrate the attempts to resolve the problems set out above (from reworking the research methods, to creating innovative tests and playful activities, to securing the support of local authorities) and the questions that remain unresolved.

Using corpus data for pragmatic analysis: Researching response tokens with the International Corpus of English (ICE)
Annika Blum, University of Bayreuth

Until recently, corpus linguistic studies only rarely considered pragmatic phenomena. The relationship between these two fields could have been summarized as “parallel but often mutually exclusive and excluding” (Romero-Trillo 2008: 2): while corpus linguists prefer to work quantitatively and read texts vertically, pragmaticists work mostly qualitatively and tend to read texts horizontally, including contextual information on the variable under investigation. The relatively new field of corpus pragmatics, however, combines corpus linguistics with pragmatics and promises new insights through this approach (Rühlemann & Aijmer 2015). The proposed paper is linked to a PhD project on response tokens (RTs) in the field of variational pragmatics. Studies of RTs long focused exclusively on RT use in the Inner Circle varieties of American, British and Irish English (McCarthy 2002, 2003; Murphy 2012; O’Keeffe & Adolphs 2008; Wong & Kruger 2018). This PhD project seeks to fill a research gap by adding variational pragmatic and corpus pragmatic research on RT use across different oral text types in Outer Circle Englishes, i.e. Nigerian and Philippine English. It will be shown how corpus pragmatics can contribute to the study of RTs in different Outer Circle varieties of English and text types.
Because of the variable under investigation, only dialogic exchanges, such as face-to-face conversations, phone calls, or broadcast interviews and discussions, will be considered. Consequently, written text types and spoken monologic text types will be excluded from the study. Using these spoken, dialogic sub-corpora of the International Corpus of English (ICE), the paper will elaborate on the choice of data sets and the choice of tools and methods for data analysis and interpretation. Additionally, it aims to reflect on the limits of working with secondary data by highlighting the challenges that researchers in pragmatics face when working with corpora that do not contain pragmatic annotation.
Section 2: Access to data and data collection

Session 2A
Session chair: Kevin Petit Cahill (ICAR, Université Lumière Lyon 2)
Thursday, 17h30-19h15, L1.08

Données quantitatives en territoire insulaire : quel(s) modèle(s) interprétatif(s) ?
Quantitative data in an insular territory: which interpretative model(s) can be used?
Cleudir Filipe da Luz Mota, Laboratoire DyLis, Université de Rouen Normandie

The Republic of Cape Verde is a small island country with an estimated population of about 538,000 (Instituto Nacional de Estatísticas, 2018). Its sociolinguistic situation is marked by the coexistence of the Cape Verdean language (a Portuguese-lexified creole formed during colonization; today the national language, spoken by almost the entire population in informal communicative situations) and Portuguese (the official language, used in formal contexts). Since independence in 1975, the archipelago's successive governments have tried to adopt a language policy that would lead to the officialization of Cape Verdean. This has given rise to a major social and political debate about the consequences of these "interventions" (Calvet, 2017). As part of our field study carried out on four islands of Cape Verde (Santo Antão, São Vicente, Santiago and Fogo), we gathered Cape Verdeans' opinions on the language policy measures adopted. Having adopted an approach that is both quantitative (Berthier, 2000) and qualitative, and having chosen to conduct structured surveys (Blanchet et Chardenet, 2011), we administered (in Cape Verdean) a total of 289 questionnaires in public spaces. Since respondents on each island use their own variety of Cape Verdean, their opinions vary widely. A new challenge then arose: which interpretative models should be adopted to analyze and synthesize data collected in an archipelagic field? Many variables must be taken into account (among others, level of schooling, languages spoken and island of origin) in a study carried out at the "macro" level. Our survey thus led us to consider numerous methodological issues (linked to the research tools and to the field) that should be accounted for when carrying out sociolinguistic surveys in an insular field where identity issues are strongly present.

Radicalité djihadiste et médias sociaux : enjeux, méthodes et défis liés à la sélection et à la récolte de données sensibles
Jihadist radicality and social media: Issues, methods and challenges related to the selection and collection of sensitive data
Laurène Renaut, Université de Cergy-Pontoise

This paper, situated at the intersection of several strands of the language sciences (applied linguistics and discourse analysis), sets out to examine, in the context of online jihadist radicalization, the methods for selecting and collecting sensitive data from social media. Our research draws on the public data of 100 radicalized Facebook profiles (60 men and 40 women, catalogued according to the terrorist organization they claim allegiance to and the level of activity of their accounts), and thus on a digital corpus anonymized for reasons of security and confidentiality. Compiling the corpus required a phase of observing the "jihadosphere", that is, a preparatory investigation based on surveying radicalized accounts in order to gain entry to the field under study. This work was followed by a phase of characterizing, or delimiting, the territory, aimed at establishing in advance a grid of criteria while taking care to define the category "jihadist" in comparison with other categories such as Salafist accounts. While our thesis examines how the semio-discursive strategies deployed to present oneself as "jihadist" on social networks evolved between 2015 and 2019, this paper will focus on the challenges encountered rather than on the results of the analyses. From this perspective, we will discuss the difficulties of accessing this corpus and the problems relating to the necessary anonymization of the researcher conducting the study, who uses Facebook avatars for safety. We will also address the obstacles overcome both in the choice of data and in their collection (censored accounts, changes to the GDPR, and the issue of web scraping).
Session 2B
Session chair: Philippe Humbert (University of Teacher Education Fribourg/Switzerland)
Friday, 16h15-18h, L1.06

Observer et interpréter « chez soi » : entre chercheuse et actrice sociale
Observing and interpreting as an “insider”: between researcher and social actor
Salomé Molina Torres, Université Sorbonne Nouvelle - Paris 3

As part of my doctoral research, I am conducting a multi-sited ethnography in which I examine the sociolinguistic stakes of the process of producing a Colombian imagined community (Anderson, 1983) in Paris. My methodology is one of observant participation (Rötterink, 2008) within Colombian networks and spaces in Paris. As a Colombian migrant myself, I arrived in the field before it had even become a space for sociolinguistic reflection. This makes it easier to establish contact with certain networks, but it also poses a difficulty for the way I view the socio-linguistic phenomena that interest me. The subjective nature of my research has from the outset been both a motivation for and a major question mark over my ethnographic approach. While my belonging to the Colombian migration contributes to my academic reflection, my personal involvement represents a possible bias of which I am aware. Given that my interpretation is influenced by the relationships I have built as a Colombian and as a researcher, I constantly question my role within this migration with respect to the categories attributed to me there (Colombian, student, researcher, friend, young…). Is it necessary, and possible, to draw a boundary between a migrant/Colombian self and a researcher self? Although ethnographers are no longer expected to efface themselves from the field in order to maintain a posture of neutrality (Volvey, 2014), debates still centre on their status as researchers. Yet ethnographers are not only researchers in their field; they invest themselves in it personally. How does their engagement influence their interpretation? To go beyond a dichotomous understanding (closeness-distancing; involvement-disinvolvement) of an "anthropology at home" (Ouattara, 2004),