ISA-17 17th Joint ACL - ISO Workshop on Interoperable Semantic Annotation Workshop Proceedings

ISA-17

17th Joint ACL - ISO Workshop on Interoperable Semantic
                      Annotation

                Workshop Proceedings
                     Harry Bunt, editor

                   June 16 - 17, 2021
©2021 The Association for Computational Linguistics

Order copies of this and other ACL proceedings from:

             Association for Computational Linguistics (ACL)
             209 N. Eighth Street
             Stroudsburg, PA 18360
             USA
             Tel: +1-570-476-8006
             Fax: +1-570-476-0860
             acl@aclweb.org

ISBN 978-1-954085-20-6

Message from the General Chair

Welcome to the proceedings of the online ISA workshop at IWCS 2021!

Last year, the ISA-16 workshop at LREC 2020 had to be canceled altogether, but following the decision
of the LREC organisers we did publish the accepted submissions to the workshop in the proceedings that
can be found online at the ISA-16 website and in the ACL anthology.

This year we are in slightly better shape, since the IWCS 2021 conference that hosts the ISA-17
workshop was planned from the beginning to be held online. While the presentation of papers
and discussion tend to suffer from the online format, this does feel like a step forward compared to last
year. In particular, in 2020 we had planned to organise two exciting shared tasks, one on the annotation
of quantification phenomena and one on the representation of visual information, which were postponed
to this year and will go ahead this time. The discussion notes and annotations that were submitted for
these shared tasks have not been included in these proceedings, but are available at the ISA-17 website
(https://sigsem.uvt.nl/isa17).

We thank the members of the ISA-17 program committee for reviewing the submitted papers in a timely and
thorough manner, and we thank the authors of accepted papers for revising their contributions according to the
original time schedule, taking the review comments into account. We thank the participants in the two
shared tasks for their contributions, which promise to be most valuable for the further development of
adequate semantic annotation and representation schemes. Thank you!

Harry Bunt

Organizing Committee
          Harry Bunt, Tilburg University (Netherlands)
      Nancy Ide, Vassar College, Poughkeepsie, NY (USA)
       Kiyong Lee, Korea University, Seoul (South Korea)
 Volha Petukhova, Saarland University, Saarbrücken (Germany)
 James Pustejovsky, Brandeis University, Waltham, MA (USA)
Laurent Romary, INRIA/Humboldt University, Berlin (Germany)
   Ielka van der Sluis, University of Groningen (Netherlands)

                Program Committee
                     Jan Alexandersson
                          Johan Bos
                         Harry Bunt
                     Nicoletta Calzolari
                      Jae-Woong Choe
                        Robin Cooper
                       Ludivine Crible
                       David DeVault
                       Simon Dobnik
                         Jens Edlund
                          Alex Fang
                     Robert Gaizauskas
                         Koiti Hasida
                          Nancy Ide
                       Elisabetta Jezek
                    Nikhil Krishnaswamy
                         Kiyong Lee
                       Paul McKevitt
                        Adam Meyers
                       Philippe Muller
                       Rainer Osswald
                      Volha Petukhova
                      Massimo Poesio
                    Andrei Popescu-Belis
                       Laurent Prévot
                     James Pustejovsky
                        Livio Robaldo
                      Laurent Romary
                     Ielka van der Sluis
                       Manfred Stede
                       Matthew Stone
                      Thorsten Trippel
                          Carl Vogel
                     Menno van Zaanen
                         Annie Zaenen
                     Heike Zinsmeister

Table of Contents

Developing a multilayer semantic annotation scheme based on ISO standards for the visualization of a
newswire corpus
     Purificação Silvano, António Leal, Fátima Silva, Inês Cantante, Fatima Oliveira and Alípio Mario Jorge

Towards the ISO 24617-2-compliant Typology of Metacognitive Events
     Volha Petukhova and Hafiza Erum Manzoor

Annotating Quantified Phenomena in Complex Sentence Structures Using the Example of Generalising
Statements in Literary Texts
     Tillmann Dönicke, Luisa Gödeke and Hanna Varachkina

The ISA-17 Quantification Challenge: Background and introduction
     Harry Bunt

Discourse-based Argument Segmentation and Annotation
     Ekaterina Saveleva, Volha Petukhova, Marius Mosbach and Dietrich Klakow

Converting Multilayer Glosses into Semantic and Pragmatic forms with GENLIS
     Rodolfo Delmonte, Serena Trolvi and Francesco Stiffoni

Unleashing annotations with TextAnnotator: Multimedia, multi-perspective document views for
ubiquitous annotation
     Giuseppe Abrami, Alexander Henlein, Andy Lücking, Attila Kett, Pascal Adeberg and Alexander Mehler

Workshop Program

Developing a multilayer semantic annotation scheme based on ISO standards for
the visualization of a newswire corpus
Purificação Silvano, António Leal, Fátima Silva, Inês Cantante, Fatima Oliveira and
Alípio Mario Jorge

Towards the ISO 24617-2-compliant Typology of Metacognitive Events
Volha Petukhova and Hafiza Erum Manzoor

Annotating Quantified Phenomena in Complex Sentence Structures Using the
Example of Generalising Statements in Literary Texts
Tillmann Dönicke, Luisa Gödeke and Hanna Varachkina

The ISA-17 Quantification Challenge: Background and introduction
Harry Bunt

Discourse-based Argument Segmentation and Annotation
Ekaterina Saveleva, Volha Petukhova, Marius Mosbach and Dietrich Klakow

Converting Multilayer Glosses into Semantic and Pragmatic forms with GENLIS
Rodolfo Delmonte, Serena Trolvi and Francesco Stiffoni

Unleashing annotations with TextAnnotator: Multimedia, multi-perspective
document views for ubiquitous annotation
Giuseppe Abrami, Alexander Henlein, Andy Lücking, Attila Kett, Pascal Adeberg
and Alexander Mehler

Developing a multilayer semantic annotation scheme
       based on ISO standards for the visualization of a newswire corpus

Purificação Silvano1, António Leal2, Fátima Silva3, Inês Cantante4, Fátima Oliveira5 & Alípio Mário Jorge6
1,2,3,4,5 University of Porto/Centre of Linguistics, 6 University of Porto/INESC
1 msilvano@letras.up.pt, 2 jleal@letras.up.pt, 3 mhenri@letras.up.pt,
4 cantante.ines@gmail.com, 5 moliv@letras.up.pt, 6 amjorge@fc.up.pt

Proceedings of the 17th Joint ACL - ISO Workshop on Interoperable Semantic Annotation, pages 1–13
June 16–17, 2021. ©2021 Association for Computational Linguistics

Abstract

In this paper, we describe the process of developing a multilayer semantic annotation scheme designed
for extracting information from a European Portuguese corpus of news articles at three levels:
temporal, referential and semantic role labelling. The novelty of this scheme is the harmonization of
parts 1, 4 and 9 of the ISO 24617 Language resource management - Semantic annotation framework. This
annotation framework includes a set of entity structures (participants, events, times) and a set of
links (temporal, aspectual, subordination, objectal and semantic roles) with several tags and
attribute values that ensure adequate semantic and visual representations of news stories.

1 Introduction

The development of an annotation framework can be an overwhelming task, even more so when its purpose
is to account for different linguistic phenomena. However, as challenging as it may be, designing an
annotation scheme is an indispensable step to generate language resources that can be the starting
point of fundamental corpus-based linguistic research.

When deciding on an annotation framework, one has to take into consideration several factors
(Pustejovsky et al., 2017), such as the main objectives of the annotation, the linguistic phenomena
under analysis, the corpus genre, and the nature of the annotation, and weigh the advantages and
disadvantages of adapting or adopting an existing model, or of creating one. Ideally, the model is
custom designed to deal with all the specificities of a particular project, but also broad enough so
that it can be applied to other datasets. In fact, with the growth of the Semantic Web and Linguistic
Linked Data (Chiarcos et al., 2020), interoperability is key to reading and interpreting linguistic
resources (Ide and Pustejovsky, 2010).

With all the above-mentioned provisos in mind, we developed a multilayer semantic annotation scheme by
combining three standards from the Language resource management - Semantic annotation framework:
Part 1 - Time and events (ISO-24617-1), Part 4 - Semantic roles (ISO-24617-4) and Part 9 - Referential
annotation framework (ISO-24617-9). In addition to promoting interoperability, our model has proven
able to support the manual markup of the relevant features of the news genre in order to generate
visual representations of news narratives. Moreover, our proposal operationalizes the integration of
three different standards in the same framework, which is, to the best of our knowledge, a novelty.

This multilayer semantic annotation scheme was designed to annotate a European Portuguese corpus of
news articles at three different but complementary levels (temporal, referential and thematic) within
the Text2Story project (https://text2story.inesctec.pt/), which aims to extract narratives from news,
represent them in intermediate data structures, and make these available to subsequent media
production processes, i.e., visualizations such as message sequence charts (MSC) and knowledge graphs
(KG). In this paper, we document the decision-making process about which annotation format to adopt,
what adjustments to make, and how to harmonize the three layers into an integrated and wide-ranging
model.

2 Background and Motivation

News may frequently assume the format of a story that reports on current events involving one or more
entities in a given time and place. In addition to the main event, however, news stories typically
present contextual content that allows connecting it to others, explaining the circumstances and
consequences of its occurrence. It may also include other complementary information that frames,
comments, clarifies, or evaluates the reported events (Caswell and Dörr, 2019; Choubey et al., 2020;
cf. also van Dijk, 1985; Bell, 1991). A complete story usually answers six questions: what, who,
where, when, why, and how, that is, 5W1H (among others, Bonet-Jover et al., 2021), following a
top-down organization, corresponding to an inverted pyramid discourse structure (cf. Rabe 2008), in
which information flows in decreasing order of importance. A news organization structure usually
features a title, a lead, and the body. In many cases, the lead or introductory paragraph condenses
the answers to the above six questions and is followed by complementary information (among others,
Thomson et al., 2008; Norambuena et al., 2020). Sometimes, the answer to some of the questions is
distributed throughout the text (Bonet-Jover et al., 2021). Because of this organization, events
frequently follow a non-chronological order, presenting a complex time structure compared with other
kinds of narratives (Zahid et al., 2019). Besides, the narrative may return to previous data, as well
as add information (among others, van Dijk, 1985; Thomson et al., 2008; Choubey et al., 2020).

Establishing the temporal sequencing of events, their participants, and interrelations is crucial to
understanding the news story, and ultimately to extracting the narratives to be represented
graphically by means of MSC (Harel and Thiagarajan, 2003) or KG (Ehrlinger and Wöß, 2016), which is
our project’s main objective. By portraying the narratives more schematically, these visualizations
can be of great interest to news agencies, for example. The more overarching and rigorous the
annotation, the more informative the visualization, and, in the case of news articles, this requires
featuring participants, events and times, as well as the relationships between them. For these
reasons, the annotation scheme that we designed encompasses three intertwined semantic layers:
temporal, referential and thematic. Since our aim was to adopt a coherent and interoperable
annotation scheme with these three layers, and because none of the existing proposals satisfied these
requisites, we designed an annotation scheme which harmonizes three ISO standards.

3 Related work

Over the last few years, there has been a proliferation of multilayer corpora, that is, corpora that
“contain mutually independent forms of information, which cannot be derived from one another
reliably” (Zeldes, 2019: 4). These layers can be defined in an independent way and they “are
explicitly analyzed using multiple, independent annotation schemes” (Zeldes, 2019: 7), or resorting
to one unique scheme that integrates all the layers. In fact, an in-depth analysis of the relevant
literature reveals that there are many different types of multilayer annotation schemes. In the
remainder of this section, we will only present a brief overview of some of those proposals.

One of the most accomplished and far-reaching multilayer annotation schemes is the one developed
within the Groningen Meaning Bank (GMB) (Basile et al., 2012; Bos et al., 2017). Besides
morphological and syntactic annotation, it comprises different semantic annotation levels, such as
named entity recognition, temporal features, and thematic roles. The adopted semantic formalism is an
extension of Discourse Representation Theory (Kamp and Reyle, 1993), which renders a semantic
representation (discourse representation structures) that unifies the various layers. Another
important feature of this scheme is that it was designed to analyze linguistic phenomena in texts,
instead of only sentences, and it has been used quite successfully in 10,000 texts from different
genres, namely news and fables. Its implementation requires a human-aided machine annotation insofar
as it employs NLP software such as an automatic tagger for named entity recognition, VerbNet
(Schuler, 2005) for semantic role labelling, a semantic analyzer for coreference, and then a module,
Boxer (Bos, 2005, 2008; Curran
et al., 2007), responsible for the overall semantic analysis, but also relies on the input of experts
and the general public. Although, in terms of semantic annotation, it is one of the most complete,
this scheme lacks information about more referential relations. Moreover, since the temporal
annotation is based on DRT-language, it does not integrate tags about lexical and contextual meaning
with bearing on temporal interpretation, namely a more diversified class of events, and other link
types between events.

Other multilayer annotation schemes have been developed for the Manually Annotated Sub-Corpus (MASC)
(Ide et al., 2008), the Georgetown University Multilayer Corpus (GUM) (Zeldes and Simonson, 2016;
Zeldes, 2017), OntoNotes (Hovy et al., 2006), AMALGUM (Gessler et al., 2020), or SenSem (Fernández
and Vázquez, 2014), just to name a few, but none of those provides a comprehensive and harmonized
semantic framework suitable to handle the linguistic phenomena that we need to address.

For European Portuguese (EP), one can point out the scheme used in CINTIL DeepBank (Branco et al.,
2010), which is a corpus of Portuguese news and novels that is annotated with several types of
grammatical information (morphological, syntactic, and semantic) for each sentence. Currently, there
are 32497 sentences, mainly from news, which were semi-automatically annotated with Treebank,
DependencyBank, Propbank, and LogicalFormBank (with formal representations of the sentences’ meanings
using Minimal Recursion Semantics). However, the CINTIL DeepBank’s scheme does not include a level
for referential annotation, nor for temporal annotation. The fact that only the sentences that the
grammar can parse are included in the corpus is a downside. Additionally, though each level of
annotation can be accessed separately, a unifying formalism that combines all the layers is missing.

Regarding schemes aimed exclusively at semantic annotation, some are intended to handle a specific
phenomenon, resort to non-standardized markup language, and are not widely known (for an overview,
cf. Gries and Berez, 2017). Moreover, the majority deals with lexical problems, such as word sense
disambiguation, and less with compositional semantics. The scarcity of proposals within this branch
of semantics can be explained by the complexity underlying the process of semantically annotating
data at a sentential and textual level. This task requires not only a great amount of time, but also
a wide variety and substantial number of resources. Nonetheless, semantic schemes to represent the
meaning of texts are of utmost relevance to the development of different applications.

4 The Annotation scheme

4.1 The process

Building a bootstrapping annotation scheme is a very complex and time-consuming endeavor involving
different phases. After the literature review, we started by defining the tags and their attributes
first for the temporal layer, then for the referential level, and finally for the semantic role
labelling. To create a model, we followed the MATTER (Pustejovsky and Stubbs, 2012) sub-cycle, MAMA,
with four steps: (1) model, (2) annotate, (3) evaluate and (4) revise. This process allowed us to
identify and resolve the scheme’s inconsistencies, gaps and incompatibilities, to gradually improve
it so that it could properly account for the linguistic data, and to deliver the necessary input for
the visualization task. This cycle was repeated several times until we were satisfied with the model.
The annotation tool that we used, BRAT (brat rapid annotation tool) (Stenetorp et al., 2012), enabled
the updates of the annotation scheme without having to rebuild the whole scheme.

4.1.1 Temporal Layer

Temporal interpretation plays a crucial part in understanding how the events are organized in natural
language texts. For this reason, extraction of temporal information has been receiving a lot of
attention within NLP during the past few years. One approach to extracting temporal features, and
eventually to rebuilding chronological sequences of events, is designing a suitable annotation
scheme. In this field, research started with the extraction of time expressions in message
understanding conferences (MUCs) and progressed to relating events to times (e.g. Filatova and Hovy,
2001; Katz and Arosio, 2001; Song et al., 2016). From the growing investment in temporal extraction,
on the one hand, and from its usefulness, on the other hand, ensued not only a significant number of
corpora annotated according to different schemes,
but also annotation standards. One of these standards is TimeML (Pustejovsky et al., 2003a, 2003b),
based on the work of Setzer (2001), Setzer & Gaizauskas (2000a, 2000b, 2001) and Ferro et al. (2003),
from which ISO-TimeML (ISO 24617-1) stemmed.

ISO-TimeML, a model grounded on linguistic approaches (e.g. Reichenbach, 1947; Comrie, 1985), defines
a full-fledged markup language that permits a fine-grained annotation of time expressions, events,
and temporal relations between events and between events and time expressions. Its efficacy and
productivity in capturing the text’s temporal structure are evidenced by corpora such as the TIDES
Temporal Corpus (Gerber et al., 2002), TimeBank (Pustejovsky et al., 2003b), composed of news
articles, or the corpus of clinical narratives of Sun et al. (2013). Costa (2012) and Costa and
Branco (2010, 2012) use TimeML to annotate, for the first time, an EP corpus with temporal
information, TimeBankPT. This corpus, nonetheless, only comprises the translations of texts from the
original TimeBank, as well as the same annotations with some adaptations required by language
specificities.

Compared to the scheme employed by TimeBankPT, the temporal tagset and linkset that we adopt follow
ISO-24617-1 more closely. As expected, bearing in mind the project’s main aim, that is, visualization
of news narratives, and the necessity of not overloading the scheme with unnecessary information,
some tags and links were excluded. Thus, for the temporal layer, our scheme incorporates two tags,
event and times, and three links: temporal link (TLink), aspectual link (ALink) and subordination
link (SLink).

The tag event marks eventualities (Bach, 1985), represented by tensed or untensed verbs,
nominalizations, adjectives, predicative constructions or prepositional complements. The combination
of all the required attributes (class, part of speech, tense, aspect, verb form, mood, modality and
polarity) provides the necessary information about temporal, aspectual and modal features of events.
With respect to the values for each attribute, we maintained the ones established by ISO-24617-1,
namely for Italian, but added to the attribute mood the value future, to account for its modal uses,
and the modality values dever (‘must’), poder (‘can’), ter de (‘have to’) and ser capaz de (‘be able
to’).

Regarding the tag times, we adopted a very simple scheme, which meets the needs of our project. The
attributes incorporated in our annotation scheme are the required ones, according to ISO-24617-1,
that is, type (date, time, duration and set) and value (the specific value of the type). We have also
integrated two optional attributes: temporal function, with the value publication time, and anchor
time, which are pertinent to processing time expressions common in news articles, like hoje ‘today’
or na sexta-feira ‘on Friday’.

The sequencing of the events, that is, their ordering, is essential to depict the way the narrative
evolves in time. ISO-24617-1 specifies the adequate manner to establish the events’ timeline, as well
as the relations between events and time expressions, by postulating TLinks, which we integrated in
our scheme. In turn, the ALink, by specifying the relation between aspectual verbs and their event
arguments, gives crucial input to create the visualizations of the events. The relevance of the SLink
derives from the fact that news articles frequently include contexts of subordinating relationships
between events. We omitted the measuring link (MLink) because the information it conveys is already
captured to a certain extent by the value duration for the attribute type of the tag times. The
values for the three links of our model are the ones proposed by ISO-24617-1.

4.1.2 Referential Layer

Pointing out the referring expressions in a text, identifying the discourse entities denoted by those
expressions, and establishing the links between them are key tasks in reference annotation, and
underlie referential phenomena in discourse, such as anaphora.

In our corpus, those referring expressions correspond to named entities, or participants that play an
important role in the story. Therefore, we needed a framework to deal with the recognition of named
entities and their relations throughout the news texts. ISO-24617-9 met these needs, as it is a
meta-model of referential annotation that articulates the discourse domain with the linguistic
domain, contributing to a comprehensive representation of the discourse entities, the referring
expressions that denote them, and their relations.

Despite following the standard in its overall guidelines, we did not annotate all its categories,
and both discourse entity structures and referential
expression structures were kept as simple as                       nominal anaphora’s mechanisms. Unlike many
possible, to avoid overloading the process of                      studies that focus on anaphora resolution and
annotation: the former include only information                    depict only coreferential mechanisms, leaving out
concerning the lexical head (noun, pronoun),                       other types of relations, the adopted framework
whereas the latter include information concerning                  allows for the marking of different types of
domain (individuation and types) and involvement.                  anaphoric linkage between entities, namely direct
The individuation attribute, with the values set,                  and indirect anaphora.2
individual and mass, follows ISO-24617-9
definitions, while for involvement we defined the                  4.1.3 Semantic Role Layer
values: 0 (the empty set); 1 (a set with only one                  The task of semantic role labelling for English texts
entity); >1 (a set with more than one entity, but less             usually uses one of the following frameworks (see
than the totality of entities in the domain); all (the             also ISO 24617-4, Annex B): FrameNet (Baker et
totality of entities in the domain = universal                     al., 1998), VerbNet (Schuler, 2005), PropBank
quantification); undef (undefined involvement).                    (Palmer et al., 2005), EngVallex (Cinková, 2006),
   As for types, since ISO 24617-9 does not                        and LIRICS (Petukhova and Bunt, 2008).
provide a typology of named entities, we selected,                    As for EP data, there are some proposals that
considering our corpus text genre and the purpose                  approach the issue of semantic role labelling,
of the project, a tagset of six named entities: PER,               typically using the methodology of PropBank and
ORG, LOC, OBJ, NAT, OTHER. In fact, the                            VerbNet. However, these proposals have a very
definition of named entities is neither easy nor                   narrow scope, working with small datasets and
consensual, and there are several typologies for                   small lists of (typically) verbs. Some examples of
their classification, with the number and types of                 these works are PropBankPT (Branco et al., 2012),
entities influenced by factors such as the domain                  a corpus of 3406 sentences translated from the Wall
from which they are extracted or the purpose of their              Street Journal, and annotated with information
classification (for a survey on this topic, see, among others,     concerning constituency structure (phrase
Nouvel et al., 2016; Goyal et al., 2018). This tagset              constituency and grammatical relations) and
is an adaptation of general categories depicted in                 semantic roles; and CINTIL-PropBank (Branco et
the named entity classification typologies used in                 al., 2012), a corpus of 10039 sentences extracted
many other corpora, including multilayer ones. The                 from news and novels, and annotated with
first three named entities are common to all the                   information concerning constituency structure and
annotated corpora while the others may vary.                       semantic roles. There is also ViPer (Talhadas et al.,
   As regards the relations included in ISO-                       2013), a verbal lexical database with manually
24617-9, we did not include in our specifications                  annotated information about the semantic roles of
the lexical relational links between entity structures             the verb’s arguments (using the PropBank approach).
and referring expressions (e.g. synonym, antonym,                  However, some aspects of the semantic role list
hyponym, meronym), the referential status of                       used in these resources can be problematic for our
referring expressions (old/new), and the properties                project (for instance, event-denoting nouns are
of discourse entities (abstractness, animacy,                      treated as arguments of the “occurrence” type,
alienability, natural gender and cardinality),                     instead of being treated as events, like in ISO-
because they were not necessary for the visual                     24617-1).
representations of news. As a matter of fact, it is                   So, the semantic role labelling task in our project
more useful for visualization to mark two linguistic               could not be based on previous work done for EP,
expressions as referring to the same participant.                  and it had to be done from scratch. The easiest way
Thus, our analysis only considers the proposed                     to do so was to use some established framework
objectal links (objectalIdentity, partOf, subset,                  and adapt it to EP, but the methodology typically
memberOf and referentialDisjunction) between                       used in frameworks designed for English (e.g.
discourse entities, which allows us to represent                   FrameNet) requires that, for each verb, a frame be

2 The Universal Anaphora initiative (https://universalanaphora.github.io/UniversalAnaphora/)
has been working towards a proposed markup scheme that is compatible with Universal
Dependencies and that codifies different aspects of anaphoric phenomena.

constructed, and the construction of each frame entails many examples with the same verb and
their analysis (to identify all the meanings the verb can have and all the constructions in
which it can occur), to determine its semantic selection. This work would be colossal, and
impracticable given the time frame and objectives of the project. Therefore, we needed a
framework that would allow semantic annotation to be limited to the analysis of concrete
examples of the news to be annotated. We started working with the framework provided by
LIRICS, which was the most appropriate for the task. Furthermore, as LIRICS was the basis for
the construction of the ISO standard for thematic annotation, there would be fewer potential
problems when integrating semantic role annotation with referential and temporal annotation.

   Consequently, in our project, we annotate semantic roles following the ISO-24617-4
specifications. We construct neither entity structures nor event structures at this level of
annotation. Instead, we use the entity structures constructed in the referential annotation
to deal with non-event discourse entities, and the entity structures constructed in the
temporal annotation to deal with event discourse entities. The semantic role annotation
consists in establishing the thematic relation between predicates and their arguments and
modifiers.

4.2   Harmonizing Different Layers

The foregoing describes how the markup language used in each layer of our annotation scheme
was extracted from three different standards. Although they comply with the principles for
semantic annotation (ISO-24617-6), they were elaborated separately and asynchronously, and
they lack information about how to combine them with each other. ISO-24617-6, in addition to
defining some overall guidelines for the semantic annotation framework (SemAF), attempts to
tackle some overlaps and inconsistencies between the different parts of the SemAF, but its
coverage is limited. This means that, when combining different parts of the SemAF, as we did,
it is to be expected not only that some incompatibilities may arise, but also that some loose
ends and gaps may be left unsolved. Proposals such as Bunt (2019) compensate for the absence
of some information in one particular part of SemAF by resorting to some notations from other
parts of the ISO. Gaizauskas and Alrashid (2019), for instance, put forward a scheme with some
annotations from ISO-24617-1/7, but do not refer to issues related to incompatibilities.
Therefore, in the process of constructing our model, we had to overcome these difficulties in
order to obtain a fully integrated scheme.

   We began by modelling the types of structures as entity structures and link structures, and
defined subtypes for each type, as described in Figure 1.

Multilayer semantic annotation framework
    Entity structures: events, times, participants
    Link structures: temporal link, aspectual link, subordination link, objectal link,
                     semantic role link

Figure 1: Text2story semantic annotation framework

   This annotation structure is the first step to guarantee that all the layers are combined
into a coherent annotation scheme. The entity structures, regardless of the layer to which
they are associated, can be related to one another by different types of link structures.
Such a unifying approach facilitates a uniform semantic representation in discourse
representation structures (DRS).

   The next step was to decide on the attributes and their respective values, so that the
information they codified would be compatible and not repetitive, as explained in the
previous sections. The final annotation scheme is presented in Table 1.

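The organisation in Figure 1 can be pictured as a small data model. The following Python sketch is only an illustration and is not part of the Text2Story tooling: the class names (EntityKind, LinkKind, Entity, Link, Annotation) are hypothetical, and the three entities are shortened from Example (1) in Section 5. It shows that a link may join entity structures from any layer, provided both ends have been declared.

```python
from dataclasses import dataclass, field
from enum import Enum


class EntityKind(Enum):
    EVENT = "event"
    TIME = "time"
    PARTICIPANT = "participant"


class LinkKind(Enum):
    TEMPORAL = "temporal"
    ASPECTUAL = "aspectual"
    SUBORDINATION = "subordination"
    OBJECTAL = "objectal"
    SEMANTIC_ROLE = "semantic_role"


@dataclass(frozen=True)
class Entity:
    id: str
    kind: EntityKind
    text: str


@dataclass(frozen=True)
class Link:
    kind: LinkKind
    relation: str
    source: str  # id of an entity structure
    target: str  # id of an entity structure


@dataclass
class Annotation:
    entities: dict = field(default_factory=dict)
    links: list = field(default_factory=list)

    def add_entity(self, entity: Entity) -> None:
        self.entities[entity.id] = entity

    def add_link(self, link: Link) -> None:
        # Links are layer-agnostic, but both ends must be declared entity structures.
        for end in (link.source, link.target):
            if end not in self.entities:
                raise KeyError(f"unknown entity structure: {end}")
        self.links.append(link)


doc = Annotation()
doc.add_entity(Entity("e3", EntityKind.EVENT, "decidiram"))
doc.add_entity(Entity("t1", EntityKind.TIME, "20/03/2021"))
doc.add_entity(Entity("p1", EntityKind.PARTICIPANT, "cientistas"))  # shortened markable
doc.add_link(Link(LinkKind.TEMPORAL, "is_included", "e3", "t1"))
doc.add_link(Link(LinkKind.SEMANTIC_ROLE, "agent", "p1", "e3"))
```

Keeping a single pool of entity structures, rather than one pool per layer, is the design choice that lets a temporal link and a semantic role link both point at the same event e3.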
                               ENTITY STRUCTURES
EVENTS         class               occurrence, state, reporting, perception, aspectual,
                                   I-action, I-state
               type                state, process, transition
               pos                 verb, noun, adjective, preposition
               tense               present, past, future, imperfect, none
               aspect              progressive, perfective, imperfective,
                                   imperfective-progressive, perfective-progressive, none
               vform               none, gerundive, infinitive, participle
               mood                none, subjunctive, conditional, future, imperative
               modality            dever, poder, ter de, ser capaz de
               polarity            negative, positive
TIME           type                date, time, duration, set
               value               specific value
               temporal function   publication_time
               anchortime          time ID (select relevant time)
PARTICIPANTS   lexical head        noun, pronoun
               domain              individuation: set, individual, mass
                                   types: per, org, loc, obj, nat, other
               involvement         0, 1, >1, all, undefined

                                LINK STRUCTURES
Temporal links                     before, after, includes, is_included, during,
                                   simultaneous, identity, begins, ends, begun_by, ended_by
Aspectual links                    initiates, culminates, terminates, continues, reinitiates
Subordination links                intensional, evidential, neg_evidential, factive,
                                   counter_factive, conditional
Objectal links                     objectalIdentity, partOf, subset, memberOf,
                                   referentialDisjunction
Semantic role links                agent, source, location, path, goal, time, theme,
                                   instrument, partner, patient, pivot, cause, beneficiary,
                                   result, reason, purpose, manner, medium, means, setting,
                                   initialLocation, finalLocation, distance, amount, attribute

Table 1: Text2story annotation scheme

The harmonization of the different annotation layers using ISO standards presented us with
some mismatches between the three ISOs, which had to be addressed and solved. As an
illustration, we present two of those issues.

   Concerning markables, while the thematic annotation specifications in ISO 24617-4 foresee
that a clause may receive a semantic role, the referential ISO does not stipulate any entity
structure for clauses. Our solution to this problem was to mark the event structure
corresponding to the verbal predicate of the subordinate clause so that the semantic role
link can be set up. Accordingly, in a sentence like John said that Mary went to Porto the
chunk that is linked to “said” by the semantic role theme is not the whole clause, but only
the verb “went”, because, unlike the clause, it has already been associated with an entity
structure, more precisely with an event structure, in the temporal layer. This solution
adopts a Neo-Davidsonian perspective of the relation between events and their arguments and
considers that all entities with an event structure annotated in the temporal level
correspond to an event argument of a predicate. So, in a Neo-Davidsonian version, the
sentence above would have the following logical form:

   ∃e1 [SAY (e1) & AGENT (e1, John) & ∃e2 [GO (e2) & AGENT (e2, Mary) & TO (e2, Porto) & THEME (e1, e2)]].

   However, some problems are harder to resolve. ISO-24617-4 envisages that some adverbial
phrases may be attributed the semantic role of manner, like “tightly” in the sentence The
tiny stick was fastened tightly to his wrist (ISO-24617: 23). Nonetheless, “tightly” in our
framework (and in the relevant ISO standards, for that matter) cannot be marked as any kind
of entity structure. We could simply disregard it because it is a modifier, but in some cases
manner adverbial phrases are complements (The child behaved badly), conveying pertinent
information to the story, and, hence, they should be annotated. At this moment, we still have
no means to come to grips with this conundrum.

   Despite the above-mentioned hurdles, we have been able to reconcile three ISO standards and
produce a consistent and complete multilayer semantic annotation scheme, which not only
adequately serves the purpose of our project, but may also contribute to other annotation
schemes.
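To make the Neo-Davidsonian treatment of John said that Mary went to Porto concrete, the following Python sketch assembles a logical form of that shape from semantic role links, writing the bound variables as plain e1 and e2. It is an illustration under stated assumptions, not the project's Brat2DRS module: the RoleLink class and the logical_form function are hypothetical names, the role label TO is taken from the formula in the text rather than from Table 1, and the links are assumed to contain no cycles.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoleLink:
    role: str    # e.g. "AGENT", "THEME"
    event: str   # id of the event that assigns the role
    filler: str  # a participant name, or the id of another event


def logical_form(predicates, links, event_id, outer=None):
    """Render the event `event_id` as a Neo-Davidsonian formula string."""
    conjuncts = [f"{predicates[event_id]} ({event_id})"]
    embedded = []
    for link in links:
        if link.event != event_id:
            continue
        if link.filler in predicates:
            embedded.append(link)  # the filler is itself an event
        else:
            conjuncts.append(f"{link.role} ({event_id}, {link.filler})")
    for link in embedded:
        # The role atom relating both events is placed inside the scope of the
        # embedded event's quantifier, as in the formula given in the text.
        role_atom = f"{link.role} ({event_id}, {link.filler})"
        conjuncts.append(logical_form(predicates, links, link.filler, role_atom))
    if outer is not None:
        conjuncts.append(outer)
    return f"∃{event_id} [" + " & ".join(conjuncts) + "]"


predicates = {"e1": "SAY", "e2": "GO"}
links = [
    RoleLink("AGENT", "e1", "John"),
    RoleLink("THEME", "e1", "e2"),  # the theme of "said" is the event "went"
    RoleLink("AGENT", "e2", "Mary"),
    RoleLink("TO", "e2", "Porto"),
]
formula = logical_form(predicates, links, "e1")
print(formula)
```

Because the theme link points at the event e2 rather than at the clause, no clause-level entity structure is needed to build the formula.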

5 An Annotated Example

In our model, the annotation procedure consists of three stages, which Example (1)
illustrates.

(1) 20/03/2021
Cientistas que estudavam a erupção de um vulcão da Islândia decidiram esta sexta-feira usar
a lava expelida da cratera para assar salsichas.
Scientists that were studying the eruption of a volcano of Iceland decided this Friday to
use the lava expelled from the crater to roast sausages.

   In the first stage, the annotator marks the entity structures of events and times, and
then the temporal, aspectual and subordination links are established.

EVENTS
e1=estudavam class=occurrence type=process pos=verb tense=past aspect=imperfective polarity=pos vform=none mood=none
e2=erupção class=occurrence type=process pos=noun tense=none aspect=none polarity=pos vform=none mood=none
e3=decidiram class=occurrence type=transition pos=verb tense=past aspect=perfective polarity=pos vform=none mood=none
e4=usar class=occurrence type=process pos=verb tense=none aspect=none polarity=pos vform=infinitive mood=none
e5=expelida class=occurrence type=transition pos=verb tense=past aspect=perfective polarity=pos vform=participle mood=none
e6=assar class=occurrence type=process pos=verb tense=none aspect=none polarity=pos vform=infinitive mood=none

TIME EXPRESSIONS
t1=20/03/2021 type=date value=20-03-2021 FunctionInDocument=publication time
t2=esta sexta-feira type=date value=19-03-2021 AnchorTimeID=t1

TLINK
e2 before e1
e3 is_included e1
e3 is_included t1
e4 after e3
e5 before e3
e6 simultaneous e4

SLINK
e4 intensional e3
e6 intensional e4

   In the second stage, the participants are identified, and they are related to each other
by objectal links.

PARTICIPANTS
p1=cientistas que estudavam a erupção de um vulcão da Islândia lexical head=noun individuation=individual type=per involvement=>1
p2=que head=pronoun individuation=individual type=per involvement=>1
p3=um vulcão da Islândia head=noun individuation=individual type=per involvement=1
p4=a lava expelida da cratera head=noun individuation=mass type=nat involvement=1
p5=a lava head=noun individuation=mass type=nat involvement=undef
p6=a cratera head=noun individuation=individual type=nat involvement=1
p7=salsichas head=noun individuation=individual type=obj involvement=>1

OBJECTAL LINKS
p2 ObjIdentity p1
p5 partOf p3
p6 partOf p3

   In the third stage, the annotator connects participants to events by semantic role links.

SEM_ROLE_LINK
p1=agent (e3)
p2=agent (e1)
p3=patient (e2)
p4=instrument (e4)
p5=theme (e5)
p6=initial location (e5)
p7=patient (e6)
e6=purpose (e4)
p1=agent (e4)
p1=agent (e6)
e2=theme (e1)
e4=theme (e3)
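The semantic role links of Example (1) are easier to inspect once they are grouped by event. The Python sketch below is a hedged illustration of that regrouping: it parses the `filler=role (event)` notation used in the listing, which is the paper's display notation and not necessarily the BRAT file format, and the names LINK_PATTERN and frames_by_event are hypothetical. It assumes that each event assigns a given role at most once, which holds for this example.

```python
import re
from collections import defaultdict

# Semantic role links copied from Example (1): filler=role (event).
SEM_ROLE_LINKS = """\
p1=agent (e3)
p2=agent (e1)
p3=patient (e2)
p4=instrument (e4)
p5=theme (e5)
p6=initial location (e5)
p7=patient (e6)
e6=purpose (e4)
p1=agent (e4)
p1=agent (e6)
e2=theme (e1)
e4=theme (e3)
"""

# The role name may contain a space ("initial location"), hence the lazy [\w ]+? group.
LINK_PATTERN = re.compile(r"^(\w+)=([\w ]+?) \((\w+)\)$")


def frames_by_event(text):
    """Map each event id to a {role: filler} dictionary."""
    frames = defaultdict(dict)
    for line in text.splitlines():
        match = LINK_PATTERN.match(line.strip())
        if match is None:
            raise ValueError(f"unrecognised link line: {line!r}")
        filler, role, event = match.groups()
        frames[event][role] = filler
    return dict(frames)


frames = frames_by_event(SEM_ROLE_LINKS)
for event in sorted(frames):
    print(event, frames[event])
```

Grouped this way, e4 (usar) can be seen to take the participant p1 as agent, p4 as instrument and the event e6 as purpose, which shows events and participants filling roles side by side.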
e5 before e3

After carrying out this manual annotation in the annotation tool BRAT,3 our project’s
pipeline includes two more modules: Brat2DRS, which takes the annotation file generated by
BRAT, parses it, and creates a DRS representation; and Brat2Viz, which takes as input the DRS
representation, and deploys a web application that produces the visualizations in the form
of MSC or KG (Amorim et al., 2021).

3 https://nabu.dcc.fc.up.pt/brat/#/examples_demos/paper_ISA-17

6    Conclusion

In this paper, we present an annotation framework for news articles in EP that aims to
provide the input for visualization processes. First, we determined what type of information
was necessary to account for events and participants in the narratives, and decided that
three annotation layers - temporal, referential and thematic - were required. The next step
was to decide which tags and links should be used in each layer to fulfill the annotation
purposes. Since interoperability is crucial for semantic resources, three standards, ISO
24617-1/4/9, were used to create a multilayer semantic annotation scheme. Although these
standards are themselves three parts of the same standard, some inconsistencies arise when
they are combined. So, we had to harmonize the three layers to attain a cohesive annotation
framework. Additionally, we sought to balance the amount of information needed to capture the
news stories and the load of the annotation process.

   Although this model was built to capture the structure of stories in news in EP, its scope
is not limited to news nor to EP, as it can be extended to other narrative texts and other
languages with some adaptations to deal with genre and language specificities. Moreover, the
integration of three different layers in a single annotation framework enables formal
semantic representation with DRS, which acts as an intermediate language to generate
visualizations in the form of knowledge graphs, for instance.

   In the future, we intend to endow our annotation scheme with more granularity. To this
end, the ISO standard for spatial information (ISO 24617-7) will be added to our framework.
For now, spatial annotation has relied on the tags, attributes and links available in the
referential and thematic layers. Likewise, more detailed information regarding
quantification of participants and of events is a component to be improved in the future. At
this moment, this kind of information has a very simplified representation solely in the
referential layer, which does not fully represent the different possibilities of
quantification over entities.

Acknowledgments

The authors wish to thank the reviewers for their constructive comments. This research is
financed by the ERDF – European Regional Development Fund through the North Portugal Regional
Operational Programme (NORTE 2020), under the PORTUGAL 2020 and by National Funds through the
Portuguese funding agency, FCT - Fundação para a Ciência e a Tecnologia within project
PTDC/CCI-COM/31857/2017 (NORTE-01-0145-FEDER-03185). The usual disclaimers apply.

References

Amorim, Evelin; Ribeiro, Alexandre; Cantante, Inês; Jorge, Alípio; Santana, Brenda; Nunes,
  Sérgio; Silvano, Purificação; Leal, António; & Campos, Ricardo (2021). Brat2Viz: a Tool and
  Pipeline for Visualizing Narratives from Annotated Texts. In Text2Story 2021. Fourth
  International Workshop on Narrative Extraction from Texts. (pp. 49-56). Lucca, Italy: CEUR
  Workshop Proceedings, CEUR-WS.org.

Bach, Emmon (1985). The algebra of events. Linguistics and Philosophy, 9, 5–16.

Baker, Collin; Fillmore, Charles; & Lowe, John (1998). The Berkeley FrameNet project. In
  Proceedings of the Conference on 36th Annual Meeting of the Association for Computational
  Linguistics and 17th International Conference on Computational Linguistics. (pp. 86–90).
  Montréal, Quebec: Université de Montréal. Retrieved from
  https://www.aclweb.org/anthology/P98-1013/

Basile, Valerio; Bos, Johan; Evang, Kilian; & Venhuizen, Noortje J. (2012). Developing a
  large semantically annotated corpus. In Proceedings of the Eighth International Conference
  on Language Resources and Evaluation. (pp. 3196–3200). Istanbul, Turkey: ELRA. Retrieved
  from https://www.aclweb.org/anthology/L12-1299/

Branco, António; Costa, Francisco; Silva, João; Silveira, Sara; Castro, Sérgio; Avelãs,
  Mariana; Pinto, Clara; & Graça, João (2010). Developing a Deep Linguistic Databank
  Supporting a Collection of Treebanks: the CINTIL DeepGramBank. In Proceedings of the
  Seventh International Conference on Language Resources and Evaluation (pp. 1810–1815).
  Valletta, Malta: ELRA. Retrieved from
  http://www.di.fc.ul.pt/~ahb/pubs/2010BrancoCostaSilvaEtAl.pdf

Branco, António; Carvalheiro, Catarina; Pereira, Sílvia; Avelãs, Mariana; Pinto, Clara;
  Silveira, Sara; Costa, Francisco; Silva, João; Castro, Sérgio; & Graça, João (2012). A
  PropBank for Portuguese: The CINTIL-PropBank. In Proceedings of the Eighth International
  Conference on Language Resources and Evaluation (pp. 1516–1521). Istanbul, Turkey: ELRA.
  Retrieved from http://www.lrec-conf.org/proceedings/lrec2012/summaries/373.html

Bell, Allan (1991). The Language of News Media. Oxford: Blackwell.

Bonet-Jover, Alba; Piad-Morffis, Alejandro; Saquete, Estela; Martínez-Barco, Patricio; &
  García-Cumbreras, Miguel Ángel (2021). Exploiting discourse structure of traditional
  digital media to enhance automatic fake news detection. Expert Systems with Applications,
  169, 1–19. doi: 10.1016/j.eswa.2020.114340

Bos, Johan (2005). Towards wide-coverage semantic interpretation. In Proceedings of IWCS-6.
  (pp. 42–53). Tilburg, The Netherlands. Retrieved from
  https://www.let.rug.nl/bos/pubs/Bos2005IWCS.pdf

Bos, Johan (2008). Wide-Coverage Semantic Analysis with Boxer. In Johan Bos & Rodolfo
  Delmonte (eds.). Semantics in Text Processing. STEP 2008 Conference Proceedings, volume 1
  of Research in Computational Semantics. (pp. 277–286). College Publications.

Bos, Johan; Basile, Valerio; Evang, Kilian; Venhuizen, Noortje J.; & Bjerva, Johannes (2017).
  The Groningen Meaning Bank. In Nancy Ide & James Pustejovsky (eds.), Handbook of Linguistic
  Annotation (pp. 463–496). USA: Springer. ISBN 978-94-024-0879-9.
  doi.org/10.1007/978-94-024-0881-2_18

Bunt, Harry (2019). Plug-ins for content annotation of dialogue acts. In Proceedings of the
  15th Joint ACL - ISO Workshop on Interoperable Semantic Annotation (ISA-15) (pp. 33–45).
  Gothenburg, Sweden. Retrieved from https://sigsem.uvt.nl/isa15/ISA-15_proceedings.pdf

Caswell, David; & Dörr, Konstantin (2019). Automating Complex News Stories by Capturing News
  Events as Data. Journalism Practice, 13(8), 951–955.
  doi.org/10.1080/17512786.2019.1643251

Chiarcos, Christian; Klimek, Bettina; Fäth, Christian; Declerck, Thierry; & McCrae, John P.
  (2020). On the Linguistic Linked Open Data Infrastructure. In Proceedings of the 1st
  International Workshop on Language Technology Platforms (IWLTP 2020) (pp. 8–15). Language
  Resources and Evaluation Conference (LREC). Marseille, France. Retrieved from
  https://www.aclweb.org/anthology/2020.iwltp-1.2/

Choubey, Prafulla Kumar; Lee, Aron; Huang, Ruihong; & Wang, Lu (2020). Discourse as a
  Function of Event: Profiling Discourse Structure in News. In Proceedings of the 58th Annual
  Meeting of the Association for Computational Linguistics. (pp. 5374–5386). Association for
  Computational Linguistics. Retrieved from
  https://www.aclweb.org/anthology/2020.acl-main.478/

Cinková, Silvie (2006). From PropBank to EngValLex: Adapting the PropBank-Lexicon to the
  Valency Theory of the Functional Generative Description. In Proceedings of the Fifth
  International Conference on Language Resources and Evaluation (LREC’06) (pp. 2170–2175).
  Genova, Italy: European Language Resources Association (ELRA).
  https://www.aclweb.org/anthology/L06-1058/

Comrie, Bernard (1985). Tense. Cambridge: Cambridge University Press.

Costa, Francisco (2012). Processing Temporal Information in Unstructured Documents. (Doctoral
  dissertation, Universidade de Lisboa). Retrieved from
  https://repositorio.ul.pt/handle/10451/8639

Costa, Francisco; & Branco, António (2010). Temporal information processing of a new
  language: Fast porting with minimal resources. In ACL2010 - Proceedings of the 48th Annual
  Meeting of the Association for Computational Linguistics (pp. 671–677). Uppsala, Sweden.
  Retrieved from https://www.aclweb.org/anthology/P10-1069/

Costa, Francisco; & Branco, António (2012). Extracting temporal information from Portuguese
  texts. In Helena Caseli; Aline Villavicencio; António Teixeira; & Fernando Perdigão (Eds).
  Computational Processing of the Portuguese Language. PROPOR 2012. Lecture Notes in Computer
  Science, vol 7243 (pp. 99–105). Springer, Berlin, Heidelberg.
  doi.org/10.1007/978-3-642-28885-2_11

Curran, James; Clark, Stephen; & Bos, Johan (2007). Linguistically Motivated Large-Scale NLP with
  C&C and Boxer. In Proceedings of the 45th Annual Meeting of the Association for Computational
  Linguistics Companion Volume Proceedings of the Demo and Poster Sessions (pp. 33–36). Prague,
  Czech Republic. Retrieved from https://www.aclweb.org/anthology/P07-2009/
Ehrlinger, Lisa; & Wöß, Wolfram (2016). Towards a definition of knowledge graphs. In SEMANTiCS
  (Posters, Demos, SuCCESS), 48, 1–4. http://ceur-ws.org/Vol-1695/paper4.pdf
Fernández-Montraveta, Ana; & Vázquez, Gloria (2014). The SenSem Corpus: an annotated corpus for
  Spanish and Catalan with information about aspectuality, modality, polarity and factuality.
  Corpus Linguistics and Linguistic Theory, 10 (2), 273–288. doi.org/10.1515/cllt-2013-0026
Ferro, Lisa; Gerber, Laurie; Mani, Inderjeet; & Wilson, George (2003). TIDES 2003 standard for the
  annotation of temporal expressions (technical report). The MITRE Corporation. Retrieved from
  https://www.mitre.org/sites/default/files/pdf/ferro_tides.pdf
Filatova, Elena; & Hovy, Eduard (2001). Assigning Time-Stamps to Event-Clauses. In Proceedings of
  the ACL-EACL 2001 Workshop on Temporal and Spatial Information Processing (pp. 88–95).
  Toulouse: Association for Computational Linguistics. Retrieved from
  https://www.aclweb.org/anthology/W01-1313/
Gaizauskas, Robert; & Alrashid, Tarfah (2019). SceneML: A Proposal for Annotating Scenes in
  Narrative Text. In Proceedings of the 15th Joint ACL - ISO Workshop on Interoperable Semantic
  Annotation (ISA-15) (pp. 13–21). Gothenburg, Sweden. Retrieved from
  https://sigsem.uvt.nl/isa15/ISA-15_proceedings.pdf
Gerber, Laurie; Ferro, Lisa; Mani, Inderjeet; Sundheim, Beth; Wilson, George; & Kozierok, Robyn
  (2002). Annotating Temporal Information: From Theory to Practice. In Proceedings of the 2nd
  International Conference on Human Language Technology Research (pp. 226–230). San Francisco,
  CA: Morgan Kaufmann Publishers. Retrieved from https://dl.acm.org/doi/10.5555/1289189.1289202
Gessler, Luke; Peng, Siyao Logan; Liu, Yang; Zhu, Yilun; Behzad, Shabnam; & Zeldes, Amir (2020).
  AMALGUM - A free, balanced, multilayer English web corpus. In Proceedings of the 12th
  Conference on Language Resources and Evaluation (LREC 2020). (pp. 5267–5275). Marseille:
  European Language Resources Association (ELRA). Retrieved from
  https://www.aclweb.org/anthology/2020.lrec-1.648/
Goyal, Archana; Gupta, Vishal; & Kumar, Manish (2018). Recent Named Entity Recognition and
  Classification techniques: A systematic review. Computer Science Review, 29, 21–43.
  doi.org/10.1016/j.cosrev.2018.06.001
Gries, Stefan Th.; & Berez, Andrea L. (2017). Linguistic Annotation in/for Corpus Linguistics. In
  Nancy Ide & James Pustejovsky (Eds.). Handbook of Linguistic Annotation (pp. 379–410). USA:
  Springer. ISBN 978-94-024-0879-9.
Harel, David; & Thiagarajan, P.S. (2003). Message Sequence Charts. In Luciano Lavagno; Grant
  Martin; & Bran Selic (Eds.). UML for Real: Design of Embedded Real-Time Systems (pp. 77–105).
  USA: Springer. ISBN 978-0-306-48738-5. https://doi.org/10.1007/0-306-48738-1_4
Hovy, Eduard; Marcus, Mitchell; Palmer, Martha; Ramshaw, Lance; & Weischedel, Ralph (2006).
  OntoNotes: the 90% solution. In Proceedings of the Human Language Technology Conference of the
  NAACL, Companion Volume: Short Papers. (pp. 57–60). Stroudsburg, PA: Association for
  Computational Linguistics. Retrieved from https://www.aclweb.org/anthology/N06-2015/
Ide, Nancy; Baker, Collin; Fellbaum, Christiane; Fillmore, Charles; & Passonneau, Rebecca (2008).
  MASC: The manually annotated Sub-Corpus of American English. In Proceedings of the 6th
  International Conference on Language Resources and Evaluation, LREC 2008 (pp. 2455–2460).
  European Language Resources Association (ELRA). Retrieved from
  http://www.lrec-conf.org/proceedings/lrec2008/pdf/617_paper.pdf
Ide, Nancy; & Pustejovsky, James (2010). What does interoperability mean, anyway? Toward an
  operational definition of interoperability. In Proc. of the 2nd International Conference on
  Global Interoperability for Language Resources (ICGL). Hong Kong, China. Retrieved from
  https://www.cs.vassar.edu/~ide/papers/ICGL10.pdf
ISO 24617-1:2012, Language resource management - Semantic annotation framework (SemAF) - Part 1:
  Time and events (SemAF-Time, ISO-TimeML)
ISO 24617-4:2014, Language resource management - Semantic annotation framework (SemAF) - Part 4:
  Semantic roles (SemAF-SR)
ISO 24617-6:2016, Language resource management - Semantic annotation framework (SemAF) - Part 6:

  Principles of semantic annotation (SemAF Principles)
ISO 24617-7:2019, Language resource management - Semantic annotation framework (SemAF) - Part 7:
  Spatial information (ISO-Space)
ISO 24617-9:2019, Language resource management - Semantic annotation framework (SemAF) - Part 9:
  Reference annotation framework (RAF)
Kamp, Hans; & Reyle, Uwe (1993). From Discourse to Logic: Introduction to Modeltheoretic
  Semantics of Natural Language, Formal Logic and Discourse Representation Theory. Dordrecht:
  Kluwer Academic Publishers.
Katz, Graham; & Arosio, Fabrizio (2001). The Annotation of Temporal Information in Natural
  Language Sentences. In Proceedings of ACL-EACL 2001, Workshop for Temporal and Spatial
  Information Processing (pp. 104–111). Toulouse: Association for Computational Linguistics.
  Retrieved from https://www.aclweb.org/anthology/W01-1315/
Norambuena, Brian Keith; Horning, Michael; & Mitra, Tanushree (2020). Evaluating the Inverted
  Pyramid Structure through Automatic 5W1H Extraction and Summarization. Computation Journalism
  Symposium, 1–7. Retrieved from https://par.nsf.gov/biblio/10168974
Nouvel, Damien; Ehrmann, Maud; & Rosset, Sophie (2016). Named Entities for Computational
  Linguistics. ISTE/Wiley, UK/USA.
Palmer, Martha; Gildea, Daniel; & Kingsbury, Paul (2005). The Proposition Bank: An Annotated
  Corpus of Semantic Roles. Computational Linguistics, 31 (1), 71–106. Retrieved from
  https://www.aclweb.org/anthology/J05-1004/
Petukhova, Volha; & Bunt, Harry (2008). LIRICS semantic role annotation: design and evaluation of
  a set of data categories. In Proceedings of the Sixth International Conference on Language
  Resources and Evaluation (LREC'08) (pp. 39–45). Marrakech, Morocco: European Language
  Resources Association (ELRA). Retrieved from https://www.aclweb.org/anthology/L08-1428/
Pustejovsky, James; Bunt, Harry; & Zaenen, Annie (2017). Designing Annotation Schemes: From
  Theory to Model. In Nancy Ide & James Pustejovsky (Eds.). Handbook of Linguistic Annotation
  (pp. 21–72). USA: Springer. ISBN 978-94-024-0879-9.
Pustejovsky, James; & Stubbs, Amber (2012). Natural Language Annotation for Machine Learning.
  O’Reilly Media, Inc., USA.
Pustejovsky, James; Castaño, José; Ingria, Robert; Saurí, Roser; Gaizauskas, Robert; Setzer,
  Andrea; & Katz, Graham (2003a). TimeML: robust specification of event and temporal expressions
  in text. In IWCS-5, Fifth International Workshop on Computational Semantics (pp. 28–34).
  Retrieved from https://www.aaai.org/Papers/Symposia/Spring/2003/SS-03-07/SS03-07-005.pdf
Pustejovsky, James; Hanks, Patrick; Saurí, Roser; See, Andrew; Gaizauskas, Robert; Setzer,
  Andrea; Radev, Dragomir; Sundheim, Beth; Day, David; Ferro, Lisa; & Lazo, Marcia (2003b). The
  TIMEBANK Corpus. In Proceedings of Corpus Linguistics, Lancaster (pp. 647–656). Retrieved from
  https://www.researchgate.net/publication/228559081_The_TimeBank_corpus
Rabe, Robert (2008). Inverted Pyramid. In Stephen L. Vaughn (Ed.). Encyclopedia of American
  Journalism. (pp. 223–225). New York: Routledge.
Reichenbach, Hans (1947). Elements of Symbolic Logic. New York: Macmillan.
Schuler, Karin (2005). VerbNet: A Broad-Coverage, Comprehensive Verb Lexicon (Doctoral
  dissertation, University of Pennsylvania). Retrieved from
  https://verbs.colorado.edu/~kipper/Papers/dissertation.pdf
Setzer, Andrea (2001). Temporal Information in Newswire Articles: an Annotation Scheme and
  Corpus Study (Doctoral dissertation, University of Sheffield). Retrieved from
  http://etheses.whiterose.ac.uk/14436/
Setzer, Andrea; & Gaizauskas, Robert (2000a). Annotating events and temporal information in
  newswire text. In Proceedings of the Second International Conference on Language Resources and
  Evaluation (LREC 2000) (pp. 1287–1293). Athens, Greece: European Language Resources
  Association (ELRA). Retrieved from https://www.aclweb.org/anthology/L00-1241/
Setzer, Andrea; & Gaizauskas, Robert (2000b). Building a temporally annotated corpus for
  information extraction. In Proceedings of the Information Extraction Meets Corpus Linguistics
  Workshop at the 2nd International Conference on Language Resources and Evaluation (LREC 2000)
  (pp. 9–14). Athens, Greece: European Language Resources Association (ELRA). Retrieved from
  http://staffwww.dcs.shef.ac.uk/people/R.Gaizauskas/research/papers/lrec00-ie-meets-cl-ter.pdf
