TSS 2020 ONLINE - TermNet

Page created by Deborah Keller
 
TSS 2020
   ONLINE
Terminology and
Knowledge Organization Systems (KOS)
      Michael Wetzel, Coreon GmbH, Berlin
Agenda

         What are KOS

           Examples

            Software

           Multilingual KOS

         Value of MKS
Types of Knowledge Organization Systems

   Classification Systems
    • Hierarchical concept systems, usually domain-specific, sometimes universal in scope

   Taxonomies, Nomenclatures
    • Often in natural sciences: systematic arrangements of terms seen as “scientific names”

   Thesauri
    • In information science: controlled vocabularies for indexing and information retrieval

   Ontologies
    • In IT: formal conceptual shared specifications, domain ontologies often created by formalizing the previously listed types of KOS
An Example to Start with – the GEMET Thesaurus

                https://www.eionet.europa.eu/gemet/
Knowledge Organization …
… is definitely more than MS Sharepoint
   … is a part of
       information and library science,
       philosophy of science and of epistemology,
       also of knowledge management and knowledge engineering, and
       terminology science
   … investigates and represents structures of knowledge
   … has epistemological and cognitive science aspects
   … has linguistic and socio-cultural aspects (e.g. folk taxonomies)
   … has historical aspects (e.g. Leibniz, encyclopedism, administrative categorizations in ancient societies, history of science, etc.)
   … has a practical side, namely creating and using knowledge organization systems
   … is a crucial process in linguistic action (text organization)
Why? Values of Knowledge Organization Systems

       Master
        • To structure and archive the content of large-scale collections
       Design
        • Model structural components of information systems and products
       Find
        • Support targeted retrieval of information based on conceptual search criteria
       Enjoy
        • Search aids, visual navigation, query languages
       Communicate
        • Support tools (cross-lingual, cross-disciplinary, cross-cultural)
       Share
        • Instruments of corporate knowledge management
       Learn
        • Teaching support, orientation support, didactic tools
Key Characteristics of KOS

        Model conceptual structures (hierarchical and non-hierarchical structures)

           Conceptual links and definitions made explicit (mono- or multilingual)

              Terminological and linguistic standardization

              Increasingly formalized and digital (in particular as “ontologies”)

              Different scales (small KOS to large ones, 200K+ concepts)

           Increasingly with visualized structures, interactive user interfaces

        Static or dynamic (e.g. ontologies for modelling business processes in companies)
We Find many Domain-Specific KOS

        Medicine, health, bio- and life sciences
        Business, trade
        Industry, engineering
        Natural sciences
        Administration, government
        Culture
        Pedagogy
        Linguistics
        …
Ontologies: Most Formal, most Expressive

          From
           • Traditional field of philosophy (theory of being, of objects)
          To
           • Formal, digitally represented concept systems

          Concepts are explicitly defined, terms are assigned
          Relations between concepts are made explicit
          Relations work on classes and instances
          Logical application rules and constraints are
           specified
Semantic Web: www.w3.org/standards/semanticweb

       From web of documents to the “Web of data”
        • a computer program can learn enough about what the data means to
          process it.
       Common framework
        • that allows data to be shared and reused across application, enterprise,
          and community boundaries.
       Led by World Wide Web Consortium (W3C)
        • A collaborative effort with participation from a large number of researchers
          and industrial partners.
       Based on RDF, the Resource Description Framework
        • integrates a variety of applications using URIs for naming.
Vocabularies Make the Semantic Web Work
   Define the concepts and relationships
       To describe and represent an area of concern
       Classify terms
       Define constraints on using terms
 Can be very complex or very simple
 No clear distinction between vocabularies and ontologies
       Ontology for more complex and quite formal collections
   Basic building blocks for inference techniques
RDF Plays a Key Role in Representing Vocabularies
   Resource Description Framework: a family of W3C specifications
   Knowledge represented in machine-readable way
   Statements about resources in form of triples: subject-predicate-object expressions
   Several serialization formats

   Subject: “Mark Twain”     Predicate: “is-author-of”     Object: “Huckleberry Finn”

   [Code figure: the statement in N3 syntax and in Turtle syntax, using the predicates dc:relation "author" and dc:book "HuckleBerry_Finn"]
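
The subject-predicate-object idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ex:` prefix, its URI, and the names `ex:Mark_Twain`, `ex:isAuthorOf` and `ex:Huckleberry_Finn` are made up for this example and do not come from a published vocabulary, and the serializer covers only the simplest Turtle form (prefix declarations plus one statement per line).

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical example namespace; not taken from the slides.
PREFIXES = {"ex": "http://example.org/"}

# One statement: "Mark Twain" "is-author-of" "Huckleberry Finn".
triples = [
    Triple("ex:Mark_Twain", "ex:isAuthorOf", "ex:Huckleberry_Finn"),
]


def to_turtle(prefixes, statements):
    """Serialize triples as minimal Turtle: prefix lines, then one statement per line."""
    lines = [f"@prefix {p}: <{uri}> ." for p, uri in prefixes.items()]
    lines += [f"{t.subject} {t.predicate} {t.obj} ." for t in statements]
    return "\n".join(lines)


print(to_turtle(PREFIXES, triples))
```

In practice an RDF library would handle parsing and serialization; the point here is only that every statement reduces to three identifiers.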
OWL: Web Ontology Language

         Family of knowledge representation languages for authoring ontologies
         Languages are characterized by formal semantics and RDF/XML-based serializations for the Semantic Web.
         Based on the RDF specification
SKOS: Simple Knowledge Organization System

         W3C standard, based on RDF
         Enables a migration towards OWL ontologies (“missing link”).
         Often required by Web services.
         Not a formal ontology (no axioms, etc.)
         For modeling controlled vocabularies such as thesauri or
          classifications which are of a different nature than formal ontologies.
         Ideas or meanings described by thesauri or other kinds of terminology
          are referred to as “concepts”
         -> “skosification” of controlled vocabularies (thesauri, etc.) and other
          terminologies!
How SKOS and RDF are Supposed to Work

   [Figure: RDF/XML listing with the following annotations]

   Resource #4539
      “I herewith inform about #4539”
      “Look into #218000, there you will find one of my preferred labels”
      “I have a related concept – look into #5347”
      Value: 4539
      “I am a Concept, nothing else”

   Resource #218000
      “I herewith inform about #218000”
      Values: n/a, land transport
      “I am a PreferredTerm, nothing else”
      “My value is land transport, for English”
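
The linking described in the annotations (a concept points to a separate label resource, which carries the value and its language) can be sketched as follows. The identifiers 4539, 218000 and 5347 and the value "land transport" come from the slide; the dictionary layout and the key names `preferred_labels`, `related`, `value` and `language` are assumptions chosen for illustration, not the property names of the original listing.

```python
# Each resource is keyed by its identifier; key names are illustrative only.
resources = {
    "4539": {
        "type": "Concept",
        "preferred_labels": ["218000"],  # references to label resources
        "related": ["5347"],             # reference to a related concept
    },
    "218000": {
        "type": "PreferredTerm",
        "value": "land transport",
        "language": "en",
    },
}


def preferred_label(store, concept_id, language):
    """Follow the concept's label references and return the value for one language."""
    concept = store[concept_id]
    for label_id in concept.get("preferred_labels", []):
        label = store.get(label_id)
        if label and label["language"] == language:
            return label["value"]
    return None


print(preferred_label(resources, "4539", "en"))  # land transport
```

The concept itself stays "a Concept, nothing else"; everything a reader sees is reached by following references.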
Real Applications: SKOS is often too “Simple”

      Portfolio of elements often too limited: skos:broader, but not skos:broader-partitive
      Extend with own namespace
      No longer easily exchangeable

      [Diagram: Standard SKOS elements, Extensions from other standards, Custom extensions]
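
Why custom extensions hurt exchangeability can be shown with a small sketch: a consumer that only knows the standard SKOS namespace will understand some statements and skip the rest. The SKOS namespace URI below is the W3C one; the `EXT` namespace, the `broaderPartitive` property in it, and the `ex:` resources are hypothetical and exist only for this example.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
EXT = "http://example.org/kos-ext#"  # hypothetical custom namespace

triples = [
    ("ex:wheel", SKOS + "broader", "ex:vehicle_part"),
    ("ex:wheel", EXT + "broaderPartitive", "ex:car"),  # hypothetical custom property
    ("ex:wheel", SKOS + "prefLabel", "wheel"),
]


def split_by_namespace(statements, standard_ns=SKOS):
    """Separate statements that use standard SKOS properties from custom extensions."""
    standard, custom = [], []
    for s, p, o in statements:
        (standard if p.startswith(standard_ns) else custom).append((s, p, o))
    return standard, custom


standard, custom = split_by_namespace(triples)
print(len(standard), "standard,", len(custom), "custom")
```

Everything that lands in `custom` is what a partner system would have to be told about separately.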
Thesaurus Example: Eurovoc
     Maintained by: Publications Office of the European Union
     V4.5 (June 2016): 6643 concepts, 21 domains, 127 microthesauri, 23 languages
     Multidisciplinary
         Parliamentary activities
         European Union, EU legislation, EU activities, EU policies, EU
          institutions, EU regions
     Exact equivalence between concepts
         No coverage of regional or national concepts

                                 Terminological standardization of indexing vocabularies for accurate
                                 documentary searches

                                 Documents indexed in the documentalist’s language, searches made in the
                                 user’s language
Eurovoc Distribution: Available to anyone for Use

   Online: http://eurovoc.europa.eu
   Download: SKOS / RDF or XML
   Free to use, re-use, link and
    redistribute for commercial or
    non-commercial purposes!
Other Important Thesauri
          GEMET – General European Multilingual Environmental Thesaurus
          UNESCO Thesaurus
          CEDEFOP Thesaurus (vocational training)
          AAT – Art and Architecture Thesaurus
          AGROVOC Thesaurus (FAO)

          General Trends:
              Preparing for Semantic Web applications
              RDF, SKOS, Linked Data, ontologies
              Networking, mapping, interoperability

                                           Find more

                         Purchase pre-built ones
Wrapping Up: Resources are Different by Degree of Expressiveness

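   [Diagram: staircase of increasing expressiveness, lowest step first; the label “Terminology” appears above the lower steps]

   Lists
    • Ambiguity
   Synonym Rings
    • Synonymy
   Taxonomy
    • Hierarchical relations
   Thesaurus
    • Hierarchical relations
    • Associative relations
   Ontology
    • Several variants of typed relations
    • Entity classes
    • …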
Taxonomies Inherently Pretty Established
                                 We use them …
                                     … in file folder structures
                                     … when tagging topics in Content Management
                                      Systems
                                     … when navigating the hierarchy of web pages
                                     …

       From: www.drupal.org
An Editor for Ontologies: Protégé
KOS Tools: Good, but not Appropriate for Language Requirements

   Not for Language
    • For trained experts only (!)
    • Lexically organised
      • Weak in managing synonyms
      • Weak in multilingualism
    • Not for describing terminology data

   Though, very helpful
    • Classifications, nomenclatures
    • Tagging in CMS
    • Semantic Search
    • Standards: SKOS, OWL
    • Conferences: semantics, KMWorld, SemTechBiz, Wissensmanagementtage
Solution: Both Worlds Unified
a Multidimensional Repository for Knowledge and Language
Systematic development of terminologies:
(Re-)Gain Control through Structure

   Yesterday: List of terms

           ...                    ...
           Grundschule            elementary school
           Gymnasium              grammar school
           …                      …
           Oberschule             secondary school
           …                      …
           Realschule             intermediate school
           …                      …
           Schule                 school
           ...                    ...

   MKS: Visualised Multilingual Concept Map

   [Concept map nodes: Schule / school; Oberschule / secondary school; Grundschule / elementary school; Realschule / intermediate school; Gymnasium / grammar school]
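
The step from an alphabetical term list to a concept map can be sketched in Python. The German and English terms are the ones on the slide; the concept ids and all `broader` links are assumptions read off the slide layout (in particular, placing Realschule and Gymnasium under Oberschule is a guess), so treat the hierarchy as illustrative.

```python
# One record per concept, with terms per language and a link to the broader concept.
concepts = {
    "school": {"de": "Schule", "en": "school", "broader": None},
    "elementary": {"de": "Grundschule", "en": "elementary school", "broader": "school"},
    "secondary": {"de": "Oberschule", "en": "secondary school", "broader": "school"},
    # Assumed placement: the slide does not state these two links explicitly.
    "intermediate": {"de": "Realschule", "en": "intermediate school", "broader": "secondary"},
    "grammar": {"de": "Gymnasium", "en": "grammar school", "broader": "secondary"},
}


def narrower(store, concept_id):
    """Return ids of the direct children of a concept, sorted for stable output."""
    return sorted(cid for cid, c in store.items() if c["broader"] == concept_id)


def render(store, concept_id, language, depth=0):
    """Return the subtree below a concept as indented lines in one language."""
    lines = ["  " * depth + store[concept_id][language]]
    for child in narrower(store, concept_id):
        lines += render(store, child, language, depth + 1)
    return lines


print("\n".join(render(concepts, "school", "de")))
```

The same structure renders in English by passing `"en"`, which is the practical difference from a flat bilingual list: the structure is stored once and the languages hang off it.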
Fusion of KOS with Terminology:
Multilingual Knowledge System (MKS) Coreon
Multilingual Knowledge Systems are the Intelligent Drivers
behind Various Business Applications

                                       Documentation and
                                       Globalisation

                                           AI, ML, and MT

                                              Enterprise Search

                                              Auto-Classification

                                           Interoperability

                                       Staff Training
Contextual Terminology Work for Writing and Translating

   Hook into authoring and CAT software
   Work most visually in context

   [Screenshot: Looking up Coreon from within SDL Trados Studio]
Boost Artificial Intelligence, Machine Learning and MT

   … since we craft new words every day …

   ML / neural-net-based systems in particular struggle with rare words
Knowledge-based MT Training
Knowledge-based MT Workflow
Boost Artificial Intelligence, Machine Learning and MT

   Revisor
   Multilingual Knowledge Manager
   Linguistic Assets Curator
   Translation Workflow Engineer
Whether you Shop Online or Look for a File on the Intranet –
Search for Things instead of Strings!

   Find items …
      … with any keyword
      … in any spelling
      … in any language
   Find even related items through concept map (“Did you mean …?”)

   Have more first-time purchasers
   Increase international business
   Increase e-sales
   Happy audience
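
A minimal sketch of "things instead of strings": queries in any listed language or spelling resolve to a concept, and the concept's relations supply the "Did you mean …?" suggestions. The concept ids, the term lists and the `related` links below are hypothetical sample data, and accent stripping is just one possible normalization.

```python
import unicodedata

# Hypothetical concept store: terms per language plus related concepts.
CONCEPTS = {
    "c1": {"terms": {"en": ["projector"], "de": ["Beamer"], "fr": ["projecteur"]},
           "related": ["c2"]},
    "c2": {"terms": {"en": ["remote control"], "de": ["Fernbedienung"], "fr": ["télécommande"]},
           "related": ["c1"]},
}


def normalize(text):
    """Case-fold and strip accents so spelling variants map to the same key."""
    decomposed = unicodedata.normalize("NFKD", text.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_index(store):
    """Map every normalized term in every language to its concept id."""
    index = {}
    for cid, concept in store.items():
        for terms in concept["terms"].values():
            for term in terms:
                index[normalize(term)] = cid
    return index


def search(store, index, query):
    """Return (matching concept id, related concept ids), or (None, []) if unknown."""
    cid = index.get(normalize(query))
    if cid is None:
        return None, []
    return cid, list(store[cid]["related"])


INDEX = build_index(CONCEPTS)
print(search(CONCEPTS, INDEX, "TELECOMMANDE"))
```

A string search for "TELECOMMANDE" would miss an item labelled "Fernbedienung"; the concept lookup finds it because both terms hang off the same concept.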
Analyze and Auto-Cluster Vast Amounts of Unstructured Texts

   Scan texts
   Tune classifier with domain knowledge
   Abstract, cluster, compare, and file
MKS Facilitates Interoperability of Organizations

   Remote controls for diapositive projectors
   Controls for projectors
   Fernbedienungen für Beamer
   Télécommandes pour projecteurs
MKS Makes the Internet of Things Work –
by Mapping Values that the Devices Send to each other

Stop Searching – Start Exploring

   “Styria” in French? “Styrie”!

   Regions: How many? Their names? In any language.
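
The questions on this slide (a translation, a count, the names in a chosen language) are simple queries once labels are attached to concepts. The sketch below uses a tiny illustrative excerpt: only the pair “Styria” / “Styrie” comes from the slide, while the other regions, the German labels and the function names are additions for this example, not a complete or authoritative list.

```python
# Illustrative excerpt only; not a complete list of regions.
REGIONS = {
    "styria": {"en": "Styria", "fr": "Styrie", "de": "Steiermark"},
    "tyrol": {"en": "Tyrol", "fr": "Tyrol", "de": "Tirol"},
    "carinthia": {"en": "Carinthia", "fr": "Carinthie", "de": "Kärnten"},
}


def translate(store, term, source_lang, target_lang):
    """Find the concept carrying `term` in one language and return its label in another."""
    for labels in store.values():
        if labels.get(source_lang) == term:
            return labels.get(target_lang)
    return None


def names(store, language):
    """Return all labels available in the given language, sorted alphabetically."""
    return sorted(labels[language] for labels in store.values() if language in labels)


print(translate(REGIONS, "Styria", "en", "fr"))  # Styrie
print(len(REGIONS), names(REGIONS, "fr"))
```

The count and the name list come from the same concept records, so switching the language changes only the labels, never the answer to "how many".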
Terminology as the Connecting Link of Digitalization

                                      © 2017 dieEinheit
Terminology and KOS

           KOS – systematic organization and sharing of knowledge in
           machine- and human-readable ways

              Coreon, an MKS – A fusion of terminology with taxonomy /
              thesaurus, to capture language with knowledge in a holistic way

           MKS boost various business solutions and offer a promising future and
           career paths for terminologists and linguists
What Users Say
   “Like no other thesaurus, taxonomy or terminology tool – Coreon turned out to be the only comprehensive solution to implement a multilingual knowledge system.”
   L. Auer, Liebherr Rostock

   “… thus Coreon combines terminology with the advantages of a knowledge structure ...”
   F. Deubzer, SDI Munich

Michael Wetzel, Coreon
               michael@coreon.com
               Thank you very much for your attention

                                     www.coreon.com
                          www.twitter.com/coreonapp
                       www.multilingualknowledge.com

TSS 2020 ONLINE, 1-4 July 2020