CLARIN: One Infrastructure For Many Languages - CLARIN ERIC The 31st of May 2021
Organisers: This edition of the CLARIN Café is organised and hosted by Andreas Witt and Francesca Frontini (CLARIN ERIC BoD)
Plan
14:00 - 14:10 Opening and CLARIN 101 - Francesca Frontini (ILC-CNR and CLARIN ERIC)
14:10 - 14:20 CLARIN as a Multilingual Infrastructure - Andreas Witt (IDS and CLARIN ERIC)
14:20 - 14:35 K-centres Under The Angle Of Multilinguality - Steven Krauwer (CLARIN ERIC)
14:35 - 14:50 The CKLD K-centre - Felix Rau (University of Cologne)
14:50 - 15:05 The European Language Equality Project (ELE) - Georg Rehm (DFKI)
15:05 - 15:20 The European Federation of National Institutions for Language (EFNIL) - Sabine Kirchmeier (EFNIL, Deputy President)
15:20 - 15:30 Short coffee break
15:30 - 16:00 Panel discussion on "CLARIN, an infrastructure for a Multilingual Europe"
Recording The event is recorded for further dissemination purposes. Questions and comments? Put them in the chat box. 4
CLARIN ... ● is the Common Language Resources and Technology Infrastructure ● has had ERIC status since 2012 and has been an ESFRI Landmark since 2016 ● provides easy and sustainable access for scholars in the humanities and social sciences and beyond – to digital language data (in written, spoken, video or multimodal form) – and advanced tools to discover, explore, exploit, annotate, analyse or combine them, wherever they are located – through a single sign-on environment ● serves as an ecosystem for knowledge sharing ● is an integral part of the European Open Science Cloud – See clarin.eu/eosc
CLARIN today ● 68 centres ● 21 members: AT, BG, CY, CZ, DE, DK, EE, FI, GR, HR, HU, IS, IT, LT, LV, NL, NO, PL, PT, SE, SI ● 3 observers: FR, UK, ZA
The Knowledge Infrastructure https://www.clarin.eu/content/clarin-for-researchers https://www.clarin.eu/content/knowledge-sharing 9
The café #CLARINcafe 11
CLARIN as a multilingual infrastructure
Panel discussion Panelists: ● Sabine Kirchmeier (EFNIL, Deputy President) ● Georg Rehm (DFKI) ● Franciska de Jong (CLARIN ERIC) ● Jurgita Vaičenonienė (Vytautas Magnus University and CLARIN-LT)
Getting involved in CLARIN • Join our NewsFlash – https://www.clarin.eu/content/newsflash • Check out our events – https://www.clarin.eu/events • Open calls – https://www.clarin.eu/content/funding-opportunities • Follow us on Twitter @CLARINERIC • And stay tuned for the next cafés – https://www.clarin.eu/content/clarin-cafe – #clarincafe 14
See you at the next café: ParlaMint II Release Café, 28th June, 14:00-16:00 CEST. Comparable parliamentary corpora, now with 13 more languages!
CLARIN as a Multilingual Infrastructure Andreas Witt 31 May 2021 1
CLARIN - a language infrastructure • Common Language Resources and Technology Infrastructure • many object languages • BUT: most LRs in computational linguistics focus on English • one meta language: English • many CLARIN tools aim to be language-neutral • BUT: the output quality often differs depending on the input language 2
CLARIN - a pan-European infrastructure • members: consortia in 21 countries, many of which are EU member states • 24 official languages in the EU • in most EU countries more than one language is used • official language(s) + languages of immigration + sign language(s) • but countries outside the EU also take part in CLARIN (South Africa) • CLARIN aims at supporting all languages
Example: Virtual Language Observatory 4
Put our hands together for… • The European Federation of National Institutions for Language (EFNIL) • The European Language Equality Project (ELE) • CLARIN Knowledge Centre for linguistic diversity and language documentation (CKLD), University of Cologne
… and prevent this from happening: 6
K-centres Under The Angle Of Multilinguality Steven Krauwer ( steven@clarin.eu ) & Bente Maegaard ( bmaegaard@hum.ku.dk ) CLARIN Café - One Infrastructure For Many Languages May 31 2021
CLARIN and languages • The announcement says “CLARIN is language-neutral” … • … but personally I would rather say “All languages are equally dear to CLARIN” • We do not just tolerate all languages … • … but we love them all, and we love the diversity! • CLARIN has many centres that give access to language data, services and tools, but in this presentation we will focus on the centres that give access to knowledge and expertise, the so-called K(nowledge)-centres • At this moment we have 23 (with 2 more in the pipeline), and here we will show how they support the multilingual nature of CLARIN and the diversity of languages CLARIN 2
K-centres and their focus • K-centres come in types, and can focus on (possibly combinations of) e.g. - specific languages: Danish, Basque - modalities: written text, speech - linguistic topics: morphology, field linguistics - language processing topics: text mining, speech recognition - data types: tree banks, wordnets - language independent topics: IPR, data management - … and many others • For more information about K-centres: - See description of what they do and what they are on https://www.clarin.eu/content/knowledge-centres - See full list of K-centres on http://vonweber.elsnet.org/cgi/kcentres_page.cgi - Search K-centres for specific expertise on http://vonweber.elsnet.org/cgi/kcentres_search.cgi CLARIN 3
K-centres and languages: language portals
Some K-centres serve as portals for a language or group of languages and have broad knowledge about the language(s) and about the availability of resources and tools to work with them. At this moment the following have declared themselves a portal:
- CLASSLA: Slovene, Croatian, Bosnian, Serbian, Montenegrin, Macedonian, Bulgarian
- CORLI: French
- CorpLingCz: Czech
- DANSK: Danish
- K-BLP: Belarusian
- NLP:EL: Greek, Greek sign language
- PolLinguaTec: Polish
- PORTULAN: Portuguese
- Spanish-K-centre: Spanish, Basque, Catalan, Galician
- SWELANG: Swedish
• Note: 11 out of 24 official EU languages are covered by a K-centre language portal
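The "11 out of 24" figure in the note can be checked with a short sketch. It assumes the usual list of 24 official EU languages (spelled here to match the slide, e.g. "Slovene") and takes the portal assignments from the list above; the function name is hypothetical and not part of any CLARIN tool.

```python
# Assumption: the 24 official EU languages, spelled to match the slide.
OFFICIAL_EU_LANGUAGES = {
    "Bulgarian", "Croatian", "Czech", "Danish", "Dutch", "English",
    "Estonian", "Finnish", "French", "German", "Greek", "Hungarian",
    "Irish", "Italian", "Latvian", "Lithuanian", "Maltese", "Polish",
    "Portuguese", "Romanian", "Slovak", "Slovene", "Spanish", "Swedish",
}

# Declared language portals, copied from the slide (K-centre -> languages).
PORTALS = {
    "CLASSLA": ["Slovene", "Croatian", "Bosnian", "Serbian",
                "Montenegrin", "Macedonian", "Bulgarian"],
    "CORLI": ["French"],
    "CorpLingCz": ["Czech"],
    "DANSK": ["Danish"],
    "K-BLP": ["Belarusian"],
    "NLP:EL": ["Greek", "Greek sign language"],
    "PolLinguaTec": ["Polish"],
    "PORTULAN": ["Portuguese"],
    "Spanish-K-centre": ["Spanish", "Basque", "Catalan", "Galician"],
    "SWELANG": ["Swedish"],
}


def covered_official_languages(portals, official):
    """Return the official languages that appear in at least one portal."""
    covered = set()
    for languages in portals.values():
        covered.update(lang for lang in languages if lang in official)
    return covered


covered = covered_official_languages(PORTALS, OFFICIAL_EU_LANGUAGES)
missing = OFFICIAL_EU_LANGUAGES - covered

print(f"{len(covered)} of {len(OFFICIAL_EU_LANGUAGES)} official EU languages have a portal")
print("No declared portal:", ", ".join(sorted(missing)))
```

Languages such as Basque, Catalan or Belarusian have a portal but are not official EU languages, so they do not count towards the 11.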
K-centres and languages: other language expertise • Some K-centres don’t serve as a portal for a specific language but have working experience with it that they are happy to share, often from a specific perspective. Some examples: - CLARIN-SPEECH works on speech analysis for Swedish and English - IMPACT-CKC works on digitisation and OCR for a number of languages, such as Spanish, English, Polish, French, Dutch, German, Slovene, Czech, Latin, Bulgarian - PolLinguaTec serves as a portal for Polish, but also works on English, German, Russian, Ukrainian, Bulgarian, Lithuanian, French, Spanish, Hungarian, Hebrew • Note: the list of “other” languages covered by K-centres contains some 40 individual languages, language families and groups of languages; out of 24 official EU languages only Maltese and Slovak are completely missing
K-centres and languages: dealing with language barriers and diversity Some K-centres address topics that are multilingual by their very nature, such as • Second language learning and bilingual development: e.g. ACE, CLARIN-HUMLAB, CLARIN-Learn, CLARIN-SMS • Facilitating translation (manual or by machine): e.g. CLARIN-SMS, K-BLP, NLP:EL, PolLinguaTec, PORTULAN, TRTC • Studying or dealing with language diversity: e.g. CKLD (see next presentation in the programme), CLARIN-HUMLAB, CLARIN-Learn, CLARIN-SMS, PhA-OeAW CLARIN 6
Concluding remarks
From the perspective of multilinguality K-centres offer a lot of expertise to users of the CLARIN infrastructure:
- Jointly the CLARIN K-centres provide expertise on nearly all official EU languages, and on a rich variety of other languages Europe- and worldwide.
- Out of 24 official EU languages 11 have K-centre portals that can provide broad expertise on matters related to the language and its processing
- Multilinguality and diversity as such are addressed as well
But we are not there yet:
- For the remaining 13 languages coverage via K-centres is quite uneven. For some languages there is a wealth of language and language processing expertise available, even if there is no dedicated declared K-centre portal for it, whereas for others the available expertise offered by K-centres is focused on specialized topics rather than the language at large.
Finally:
- Please note that K-centres are not the only places in CLARIN where one can find relevant expertise, but they are the ones that have agreed to make it more visible and more easily accessible!
CLARIN Knowledge-Centre for linguistic diversity and language documentation Felix Rau Data Center for the Humanities / University of Cologne
● distributed K-Centre ● since 2018 ● 7 partners ● from 2 countries (Germany/UK)
Linguistic Diversity and Language Documentation
● Linguistic Diversity
○ Global linguistic diversity
○ Language typology
○ Language comparison
○ Typological databases
● Language Documentation
○ Audio-visual data
○ Language archiving
○ Linguistic fieldwork
● Minority languages
● Endangered languages
● Under-resourced & under-researched languages
● Language community involvement
● Participatory research
● Multilingualism
Partners
● ELAR: Endangered Languages Archive, SOAS University of London
● SWLI: SOAS World Languages Institute, SOAS University of London
● DCH: Data Centre for the Humanities, University of Cologne
● IfL: Department of Linguistics, University of Cologne
● HZSK: Hamburg Centre for Language Corpora, University of Hamburg
● INEL: Grammatical Descriptions, Corpora and Language Technology for Indigenous Northern Eurasian Languages, Academy of Sciences and Humanities in Hamburg
● ZAS: Leibniz-Zentrum Allgemeine Sprachwissenschaft
Activities ● Consultations ○ CKLD Helpdesk (via CLARIN-D Helpdesk infrastructure) ○ Via the individual partner institutions (e.g. DCH) ● Trainings ○ Field methods training (e.g. IfL, ELAR) ○ In-country language documentation training (e.g. ELAR) ○ EXMARaLDA training (HZSK) ● Research ○ QUEST project: quality standards for audio-visual language data (INEL, DCH, HZSK,IfL, ZAS) ○ Joint publications
Audience ● Linguists (and other researchers interested in endangered and minority languages) ● Language communities and speakers ● Educators (working with endangered and minority languages)
Thank you! https://ckld.uni-koeln.de/
European Language Equality: An Overview Georg Rehm (Co-Coordinator ELE, DFKI) 31-05-2021 CLARIN Café http://www.european-language-equality.eu
European Language Equality (ELE) – Summary
• Objective: development of a strategic research, innovation and deployment agenda to achieve digital language equality in Europe by 2030
• Consortium: 52 partners from all over Europe
• Coordinator: ADAPT Centre (Dublin City University)
• Co-coordinator: DFKI
• Runtime: 18 months, start on 1 January 2021 – ELE and ELG will both end in June 2022
Context: “Language Equality” EP Resolution
European Parliament resolution of 11 September 2018 on language equality in the digital age (2018/2028(INI)), P8_TA(2018)0332 – partially based on the STOA study
Voting (11 Sept. 2018): 592 yes – 45 no
Selected recommendations addressed by ELE:
25. Establish a large-scale, long-term coordinated funding programme for research, development and innovation in the field of language technologies, at European, national and regional levels, tailored specifically to Europe’s needs and demands
27. Europe has to secure its leadership in language-centric AI
29. Create a European LT platform for sharing of services
European Language Equality: Consortium • 5 core partners: Adapt Centre, DFKI, Charles University, ILSP, University of the Basque Country • 9 networks, associations, initiatives: LT Innovate (via Crosslang), EFNIL, ELEN, ECSPM, CLARIN, CLAIRE (via University of Leiden), NEM (via Eurescom), LIBER, Wikimedia • 9 companies: Tilde, ELDA, Expert System, Sail Labs, Kantan MT (via Xcelerator MT), Pangeanic, Semantic Web Company, Ontotext (via Sirma AI), SAP • 29 research organisations: University of Vienna, University of Antwerp, Institute for Bulgarian Language, University of Zagreb, University of Copenhagen, University of Tartu, University of Helsinki, CNRS, Research Institute for Linguistics, Institute for Icelandic Studies, FBK, University of Latvia, Institute of the Lithuanian Language, Luxembourg Institute of Technology, University of Malta, University of Utrecht, Language Council of Norway, Polish Academy of Sciences, University of Lisbon, Romanian Academy, University of Cyprus, Slovak Academy of Sciences, Jozef Stefan Institute, Barcelona Supercomputing Center, Royal Institute of Technology, University of Zurich, University of Sheffield, University of Vigo, Bangor University European Language Equality 4
European Language Equality: Approach • Main result: Strategic Agenda and Roadmap – all deliverables provide input for this main report • Detailed description of the European Language Equality Programme (cf. EP resolution) • Research partners prepare updates of META-NET White Papers (one deliverable each). • Networks and initiatives to produce reports (one deliverable each) in which they collect, consolidate and present their own position, needs, wishes, demands, visions etc. • Companies to produce various technical deep dives for the different technology areas. • Several additional reports to be produced, primarily by the core partners. • Reports to be prepared in a more or less autonomous way based on templates • Reports to be used as input for the strategic agenda and roadmap – the main project result. • Total number of languages taken into account in ELE: approx. 75. • Close collaboration with ELG European Language Equality 5
Work Packages
• WP1 (lead: R.C. “Athena”, ILSP): European Language Equality: Status Quo in 2020/2021
• WP2 (lead: Charles University): European Language Equality: The Future Situation in 2030
• WP3 (lead: Univ. of the Basque Country): Development of the Strategic Agenda and Roadmap
• WP4 (lead: DFKI): Communication – Dissemination – Exploitation – Sustainability
• WP5 (lead: ADAPT Centre): Project Management
Work Packages and main Deliverables
WP1 European Language Equality: Status Quo in 2020/2021
• Task 1.1: Defining Digital Language Equality
• Task 1.2: Language Technologies and Language-centric AI – State of the Art
• Task 1.3: Language Technology Support of Europe’s Languages in 2020/2021
• Main deliverables: Digital Language Equality: Definition of the concept; Language Technology and language-centric AI: State of the Art; 32 reports on the technology support of 32 European languages (META-NET White Paper update)
WP2 European Language Equality: The Future Situation in 2030
• Task 2.1: The perspective of European LT developers (industry and research)
• Task 2.2: The perspective of European LT users and consumers
• Task 2.3: Science – Technology – Society: Language Technology in 2030
• Main deliverables: Reports from networks, initiatives and associations; Deep dives (MT, speech, text analytics, data); Report on external consultations and surveys; Forecast: Language Technology in 2030
WP3 Development of the Strategic Agenda and Roadmap
• Task 3.1: Desk research – landscaping
• Task 3.2: Consolidation and aggregation of all input received
• Task 3.3: Final round of feedback collection
• Main deliverables: Existing strategic documents and projects in LT/AI; Strategic agenda and roadmap: initial version; Final round of feedback collection; Strategic agenda and roadmap: final version
WP4 Communication – Dissemination – Exploitation – Sustainability
• Task 4.1: Overall project communication and dissemination
• Task 4.2: Liaise with EP/EC – organisation of a targeted workshop
• Task 4.3: Organisation of final ELE conference
• Task 4.4: Production of PR materials and sustainable results
• Main deliverables: EP/EC Workshop; ELE Conference; ELE Strategic Agenda and Roadmap (print version, interactive version); Final ELE Book Publication
WP5 Project Management
• Task 5.1: Overall project management including Project Management Office
• Task 5.2: Digital collaboration and document management infrastructure
Timeline
M1–M2: Start of the ELE project; ELE kick-off meeting
M3: Digital Language Equality – preliminary definition (D1.1); Promotional materials and PR package (D4.1); Digital collaboration and document management infrastructure (D5.1)
M4: Specification of the consultation process including templates, surveys, events etc. (D2.1)
M6: Communication and dissemination plan (D4.2)
M9: Report on the state of the art in Language Technology and Language-centric AI (D1.2); Project mgmt. report (D5.2)
M13: Digital Language Equality – full specification of the concept (D1.3)
M14: Reports on 32 European languages (D1.4-D1.35); Reports from relevant European initiatives (D2.2-D2.12); technology deep dives (D2.13-D2.16)
M15: Strategic agenda including roadmap – initial version (D3.2); Report on all external consultations and surveys (D2.17)
M16: Database and dashboard with the empirical data collected in D1.4-D1.35 (and others) (D1.36); Report on the state of Language Technology in 2030 (D2.18)
M17: Report on the final round of feedback collection (D3.3)
M18: Project mgmt. report (D5.3); Strategic agenda including roadmap – final version (D3.4); ELE EP/EC workshop (D4.3); ELE conference (D4.4); ELE book publication (D4.6); End of the ELE project
Throughout the project:
• Continuous inclusion of the community through various means (esp. meetings, website, email, discussion groups etc.)
• External consultation and brainstorming meetings (both face-to-face and virtual)
• Feedback loops to include input and comments from the Language Technology community
Networks and initiatives
[…] HPC initiative (High Performance Computing) and RDA (Research Data Alliance), among others. DFKI Berlin hosts the German/Austrian Chapter of W3C and has a good working relationship to DIN (Deutsches Institut für Normung).
Network – Initiative – Association | Represented by ELE consortium partner(s)
Association of European Research Libraries (LIBER) | LIBER
Big Data Value Association | SAP
Confederation of Laboratories for AI Research in Europe (CLAIRE) | ULEID
Cracking the Language Barrier | DFKI and various others
European Civil Society Platform for Multilingualism (ECSPM) | ECSPM
European Federation of National Institutions for Language (EFNIL) | EFNIL
European Language Equality Network (ELEN) | ELEN
European Language Grid (ELG) | DFKI, ILSP, CUNI, ELDA and others
European Lexicographic Infrastructure (ELEXIS) | JSI
European Research Infrastructure for LRs and Technology (CLARIN) | CLARIN ERIC (CUNI and ILSP are members)
LT-Innovate – Europe's LT Business Association | CRSLNG (SAIL, EXPSYS, TILDE are members)
META-NET | DFKI, CUNI, ILSP, TILDE, ELDA and others
New European Media (NEM) | ERSCM
Public-Private Partnership on AI (AI PPP) | SAP, TILDE
Wikipedia, Wikidata, Abstract Wikipedia | WMD
External networks, initiatives and associations that ELE will consult with through established connections: AI4EU (European AI on Demand Platform), DBpedia, Europeana, European Association for Machine Translation (EAMT), European Commission (DG Translate, DG Interpretation/SCIC), European Parliament (DG Translation, CULT Committee, ITRE Committee), Global WordNet Association (GWA), HumanE-AI-Net (and other ICT-48-2020 projects), Network to Promote Linguistic Diversity (NPLD), World Wide Web Consortium (W3C) and various others
Table 5: Networks, initiatives and associations – either represented by ELE partners or external ones
ELE will communicate with all of these initiatives in terms of getting input and feedback for the strategic agenda and roadmap with a special emphasis on the consortium partners and relevant networks and initiatives (Table 5).
Deliverables
[…] partners are already members of the LTC). The fully populated LTC is meant to be a representative, balanced and inclusive body that includes representatives from all relevant stakeholders and from all European countries.
Table 7: List of deliverables
No. | Deliverable name | WP | Short name | Type | Diss. level | Date (project month)
D1.1 | Digital Language Equality – preliminary definition | 1 | DCU | R | Public | 3
D1.2 | Report on the state of the art in Language Technology and Language-centric AI | 1 | EHU | R | Public | 9
D1.3 | Digital Language Equality – full specification of the concept | 1 | DCU | R | Public | 13
D1.4-D1.35 | Language reports (one per language, partner short name in brackets) | 1 | see list | R+OTH | Public | 14
  D1.4 Basque (EHU), D1.5 Bulgarian (IBL), D1.6 Catalan (BSC), D1.7 Croatian (FFZG), D1.8 Czech (CUNI), D1.9 Danish (UCPH), D1.10 Dutch (UU), D1.11 English (USFD), D1.12 Estonian (UTART), D1.13 Finnish (UHEL), D1.14 French (CNRS), D1.15 Galician (UVIGO), D1.16 German (DFKI), D1.17 Greek (ILSP), D1.18 Hungarian (NYTI), D1.19 Icelandic (SAM), D1.20 Irish (DCU), D1.21 Italian (FBK), D1.22 Latvian (IMCS), D1.23 Lithuanian (LKI), D1.24 Luxembourgish (LIST), D1.25 Maltese (UOM), D1.26 Norwegian (LCNOR), D1.27 Polish (PAS), D1.28 Portuguese (ULIS), D1.29 Romanian (ICIA), D1.30 Serbian (FILFUB), D1.31 Slovak (JULS), D1.32 Slovenian (JSI), D1.33 Spanish (BSC), D1.34 Swedish (KTH), D1.35 Welsh (BNGR)
D1.36 | Database and dashboard with the empirical data collected in D1.4-D1.35 (and others) | 1 | ILSP | R+OTH | Public | 16
D2.1 | Specification of the consultation process including templates, surveys, events etc. | 2 | CUNI | R | Public | 4
D2.2-D2.12 | Reports from networks and initiatives (partner short name in brackets) | 2 | see list | R+OTH | Public | 14
  D2.2 CLAIRE (ULEID), D2.3 CLARIN (CLARIN), D2.4 LT Innovate (CRSLNG), D2.5 META-NET (CUNI), D2.6 ELG (DFKI), D2.7 ECSPM (ECSPM), D2.8 EFNIL (EFNIL), D2.9 ELEN (ELEN), D2.10 LIBER (LIBER), D2.11 NEM (ERSCM), D2.12 Wikipedia (WMD)
D2.13 | Technology deep dive Machine Translation | 2 | TILDE | R | Public | 14
D2.14 | Technology deep dive Speech Technologies | 2 | SAIL | R | Public | 14
D2.15 | Technology deep dive Text Analytics and Natural Language Understanding | 2 | EXPSYS | R | Public | 14
D2.16 | Technology deep dive Data | 2 | SWC | R | Public | 14
D2.17 | Report on all external consultations and surveys | 2 | CUNI | R | Public | 15
D2.18 | Report on the state of Language Technology in 2030 | 2 | CUNI | R | Public | 16
D3.1 | Report on existing strategic documents and projects in LT/AI | 3 | EHU | R | Public | 3
D3.2 | Strategic agenda including roadmap – initial version | 3 | DFKI | R | Public | 15
D3.3 | Report on the final round of feedback collection | 3 | EHU | R | Public | 17
D3.4 | Strategic agenda including roadmap – final version | 3 | DFKI | R | Public | 18
D4.1 | Promotional materials and PR package | 4 | DFKI | R+OTH | Public | 3
D4.2 | Communication and dissemination plan | 4 | DFKI | R | Public | 6
D4.3 | Report on EP/EC workshop | 4 | DFKI | R | Public | 18
D4.4 | Report on ELE conference | 4 | DFKI | R | Public | 18
D4.5 | Strategic agenda and roadmap (print version, online version) | 4 | DFKI | R | Public | 18
D4.6 | ELE book publication | 4 | DFKI | R | Public | 18
D5.1 | Digital collaboration and document management infrastructure | 5 | DCU | R+OTH | Confidential | 3
D5.2 | Project management report (interim report) | 5 | DCU | R | Confidential | 9
D5.3 | Project management report (final report) | 5 | DCU | R | Confidential | 18
Summary • ELE is a new EU project that started in January 2021 and ends in June 2022 • Its goal is the development of the Strategic Research, Innovation and Implementation Agenda and a Roadmap for achieving full Digital Language Equality in Europe by 2030 • Once-in-a-decade opportunity • Consortium with 52 partners covering all European countries and all major initiatives • Many consultation events, roundtables, stakeholder meetings planned • Close collaboration with its sister project European Language Grid (ELG) • Intended overlap between ELE and ELG consortium, ELG NCCs and META-NET members • Firmly establish LT and language-centric AI in Horizon Europe and Digital Europe Programme • Please participate in our stakeholder events, surveys and questionnaires! • ELE and ELG will finish up with a joint META-FORUM 2022 in June 2022. European Language Equality 13
You’re invited to stay in touch via https://european-language-equality.eu European Language Equality 14
European Language Equality: Thank you! Georg Rehm (Co-Coordinator ELE, DFKI) 31-05-2021 CLARIN Café http://www.european-language-equality.eu
The European Language Equality project has received funding from the European Union under grant agreement № LC-01641480 – 101018166 (ELE).
Presentation of EFNIL Sabine Kirchmeier Vice president European Federation of National Institutions for Language (EFNIL)
Outline 1. About EFNIL 2. EFNIL Projects 3. EFNIL institutions and CLARIN 2
European Federation of National Institutions for Language - EFNIL: 40 member organizations from 29 countries 3
EFNIL activities • Projects aiming at the description and analysis of the current linguistic situation in Europe – LLE : Articles describing language legislation in Europe – ELM: European Language Monitor – ELIPS: European Languages and their Intelligibility in the Public Space. • Scientifically based analysis of cross-state language problems and questions of language policy in annual conferences and publications • Consultation services in the field of language policy for political decision makers of the EU institutions and member states • Propagation of the cultural and practical benefits of European linguistic diversity and plurilingualism through relevant actions and publications. – EFNIL Master’s Thesis Award (MTA). www.efnil.org 5
European Language Monitor • It is a scientific review of the language situation in European countries, repeated at intervals of 4 years. • The information is comparable over time. • The information is comparable across countries. • It provides exact references to the actual legislation in each country. • It provides background knowledge about the status of the languages of Europe.
European Language Monitor 1. Country situation. Official, regional, indigenous, immigrant languages spoken within and outside the country, legal status, accordance with conventions 2. Legal situation. Language law, constitutional status, other regulations, language demands for citizenship 3. Primary and secondary education. Languages of instruction, languages offered 4. Tertiary education. Languages of instruction, languages used in publications and dissertations 5. Media. Papers, TV, film, music 6. Business. Regulations, company languages, annual reports, websites 7. Dissemination of languages. Official languages taught abroad 8. Language organizations. Official, non-governmental but publicly funded, private 9. Language technology. 7
European Language Monitor Visualisation online
What are language provisions about? 9
Official language plan or strategy? Yes: 10 (45%); No: 12 (55%)
Funding programme for language technology? Yes: 12 (54%); No: 7 (32%); No answer: 3 (14%)
ELIPS ELIPS investigates the following topics: • Plain language policies and actions • Easy-to-read language policies and actions • Terminology policies and actions • Policies and actions on the use of other languages, gender, cultural and sexual diversity • Training of information providers in public institutions • Collaboration between translation services of the EU institutions and the experts in member states
1.6. Materials, instructions, services and tools
Country | Web service(s) | Guidelines (online, pdf or printed) | Models or templates | Tools
Austria | No answer | No answer | No answer | No answer
Belgium (Flemish Community) | No | Yes | Yes | Yes
Bulgaria | Yes | No | No | No
Denmark | Yes | Yes | Yes | Yes
Estonia | No | Yes | No | No
Finland (Swedish) | Yes | Yes | Yes | No
Finland (Finnish) | Yes | Yes | Yes | Yes
Germany | Yes | Yes | No | Yes
Grand Duchy of Luxembourg | Yes | Yes | Yes | No
Greece | Yes | Yes | Yes | Yes
Hungary | No | Yes | No | No
Iceland | Yes | Yes | No | No
Ireland (excl. Northern Ireland) | Yes | Yes | No | Yes
Italy | Yes | Yes | No | No
Latvia | Yes | No | No | No
Lithuania | Yes | Yes | No | Yes
Malta | No answer | No answer | No answer | No answer
Netherlands | Yes | Yes | No | Yes
Norway | Yes | Yes | Yes | Yes
Portugal | No answer | No answer | No answer | No answer
Slovak Republic | Yes | Yes | No | No
Slovenia | Yes | No | No | Yes
Sweden | Yes | Yes | Yes | Yes
Switzerland | No | Yes | No | No
UK (England) | Yes | Yes | Yes | No
UK (Wales) | No | Yes | Yes | No
UK (Northern Ireland) | Unknown | Unknown | Unknown | Unknown
UK (Scotland) | No | No | No | No
Table 1: Which materials, instructions, services and tools are available in your country in order to help public administration comply with the principles of plain language?
EFNIL conferences Thessaloniki 2010 “Language, languages and new technologies: ICT in the service of languages” London 2011 “The Role of Language Education in Creating a Multilingual Europe” Budapest 2012 “Lexical Challenges in Multilingual Europe” Vilnius 2013 “Translation and Interpretation in Europe” Florence 2014 “Language use in university teaching and research past, present, future” Helsinki 2015 “Language use in public administration – theory and practice in the European states” Warsaw 2016 “Stereotypes and linguistic prejudices in Europe” Mannheim 2017 “National language institutions and national languages” Amsterdam 2018 “Language variation: a factor of increasing language complexity and a challenge for language policy within Europe” Tallinn 2019 “Language and Economy: Language Industries in a Multilingual Europe“ Webinar 2020 “Language in the Corona Crisis” Cavtat 2021: “The role of national language institutions in the digital age” Conference publications available online on www.efnil.org
EFNIL Master’s Thesis Award The EFNIL Master’s Thesis Award is an annual competition to find the best master’s theses in Europe within the area of language use, language policy and multilingualism. EFNIL wishes to inspire and motivate young researchers to engage in scientific projects on language use, language policy and multilingualism, and to disseminate new ground-breaking research on language use, language policy and multilingualism to a wider audience. The students that submit the best three theses will each be awarded: • 1. the EFNIL Master's Thesis Award (1500 Euro) • 2. an invitation to present their thesis at the annual international EFNIL conference (all expenses paid) • 3. the opportunity to publish an article based on their thesis in EFNIL’s annual conference proceedings • 4. the opportunity to publish the full thesis on EFNIL’s website www.efnil.org. Next submission deadline is 31 December 2021.
EFNIL institutions and Digital Linguistics • 5 EFNIL member organizations are CLARIN centres • Some EFNIL institutions collect their own resources or participate in national resource-building projects, such as dictionaries and corpora. • EFNIL as an organisation and 5 EFNIL member organizations are part of the ELE consortium.
EFNIL and CLARIN 17
How do EFNIL members use the CLARIN infrastructure? 18
What future improvements of CLARIN would you like to see? • More tools • More processed resources • More training possibilities • Legal workshops • Lower fees for small institutions • Clearer differentiation between CLARIN and other infrastructure projects/communities providing access to LT resources. If I have problem X, which infrastructure should I use? 19
Are EFNIL members satisfied with CLARIN?
Possible connections between EFNIL and CLARIN • More formalized exchange of information (promotion of news and conferences) • Involving more EFNIL members in CLARIN • Tapping into EFNIL projects (plain language corpora) • Joint efforts to cover more languages (minority languages) • Cooperating on policy and legal issues • Questions about language resources and infrastructures in ELM • Joint projects : – multilingual language data – written and spoken – open online language teaching environments – grammar and dictionary collections.
Thank you for listening