Inside ASCENT: Exploring a Deep Commonsense Knowledge Base and its Usage in Question Answering
Tuan-Phong Nguyen, Simon Razniewski, Gerhard Weikum
Max Planck Institute for Informatics, Saarbrücken, Germany
{tuanphong,srazniew,weikum}@mpi-inf.mpg.de

Abstract

ASCENT is a fully automated methodology for extracting and consolidating commonsense assertions from web contents (Nguyen et al., 2021). It advances traditional triple-based commonsense knowledge representation by capturing semantic facets like locations and purposes, and composite concepts, i.e., subgroups and related aspects of subjects. In this demo, we present a web portal that allows users to understand its construction process, explore its content, and observe its impact in the use case of question answering. The demo website (https://ascent.mpi-inf.mpg.de) and an introductory video (https://youtu.be/qMkJXqu_Yd4) are both available online.

1 Introduction

Commonsense knowledge (CSK) is an enduring theme of AI (McCarthy, 1960) that has recently been revived for the goal of building more robust and reliable applications (Monroe, 2020). Recent years have witnessed the emergence of large pre-trained language models (LMs), notably BERT (Devlin et al., 2018), GPT (Brown et al., 2020) and their variants, which significantly boosted the performance of tasks requiring natural language understanding, such as question answering and dialogue systems (Clark et al., 2020). Although it has been shown that such LMs implicitly store some commonsense knowledge (Talmor et al., 2019), this comes with various caveats, for example regarding degree of truth or negation, and their commercial development is inherently hampered by their low interpretability and explainability.

Structured knowledge bases (KBs), in contrast, make it possible to explain and interpret the outputs of systems that leverage them. There have been great efforts towards building large-scale commonsense knowledge bases (CSKBs), including expert-annotated KBs (e.g., Cyc (Lenat, 1995)), crowdsourced KBs (e.g., ConceptNet (Speer and Havasi, 2012) and Atomic (Sap et al., 2019)), and KBs built by automatic acquisition methods such as WebChild (Tandon et al., 2014, 2017), TupleKB (Mishra et al., 2017), Quasimodo (Romero et al., 2019) and CSKG (Ilievski et al., 2020). Human-created KBs, although possessing high precision, usually suffer from low coverage. Automatically acquired KBs, on the other hand, typically have better coverage but also contain more noise. Moreover, despite their different construction methods, these KBs are all based on a simple subject-predicate-object model, which has major limitations in validity and expressiveness.

We recently presented ASCENT (Nguyen et al., 2021), a methodology for automatically collecting and consolidating commonsense assertions from the general web. To overcome the limitations of prior works, ASCENT refines subjects with subgroups (e.g., circus elephant and domesticated elephant) and aspects (e.g., elephant tusk and elephant habitat), and captures semantic facets of assertions (e.g., ⟨lawyer, represents, clients, LOCATION: in courts⟩ or ⟨elephant, uses, its trunk, PURPOSE: to suck up water⟩); a minimal sketch of this representation follows at the end of this section.

For a given concept, ASCENT searches the web with pattern-based search queries disambiguated using WordNet (Miller, 1995) hypernymy. Then, irrelevant documents are filtered out based on similarity comparison against the corresponding Wikipedia articles. We then use a series of judicious dependency-parse-based rules to collect faceted assertions from the retained texts. The semantic facets, which come from prepositional phrases and supporting adverbs, are then labeled by a supervised classifier. Finally, assertions are clustered using similarity scores from word2vec (Mikolov et al., 2013) and a fine-tuned RoBERTa (Liu et al., 2019) model.
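To make the faceted representation concrete, the following minimal sketch models such an assertion in Python; the class and field names are our own illustration, not ASCENT's actual schema or API.

```python
from dataclasses import dataclass, field

@dataclass
class FacetedAssertion:
    """Illustrative container for a faceted commonsense assertion.

    Field names are our own; they do not reflect ASCENT's internal schema.
    """
    subject: str    # possibly a subgroup ("circus elephant") or aspect ("elephant trunk")
    predicate: str
    obj: str
    facets: dict[str, str] = field(default_factory=dict)  # facet type -> facet value

# The two examples from the text:
lawyer = FacetedAssertion("lawyer", "represents", "clients",
                          facets={"LOCATION": "in courts"})
elephant = FacetedAssertion("elephant", "uses", "its trunk",
                            facets={"PURPOSE": "to suck up water"})
```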
We executed the ASCENT pipeline for 10,000 prominent concepts (selected based on their respective numbers of assertions in ConceptNet) as primary subjects. In (Nguyen et al., 2021), we showed that the content of the resulting CSKB (hereinafter referred to as the ASCENT KB) is a milestone in both salience and recall. As extrinsic evaluation, we conducted a comprehensive evaluation of the contribution of CSK to zero-shot question answering (QA) with pre-trained language models (Petroni et al., 2020; Guu et al., 2020).

This paper presents a companion web portal of the ASCENT KB, which enables the following interactions:

1. Exploration of the construction process of ASCENT, by inspecting word sense and Wikipedia disambiguation, web search queries, clustered statements, and source sentences and documents.

2. Inspection of the resulting KB, starting from subjects, predicates, or objects, or examining specific subgroups or aspects.

3. Observation of the impact of structured knowledge on question answering with pretrained language models, comparing generated answers across various CSKBs and QA settings.

The web portal is available at https://ascent.mpi-inf.mpg.de, and a screencast demonstrating the system can be found at https://youtu.be/qMkJXqu_Yd4.

2 ASCENT

Two major contributions of ASCENT are its expressive knowledge model and its state-of-the-art extraction methodology. Details are in the technical paper (Nguyen et al., 2021). In this section, we revisit the most important points.

2.1 Knowledge model

ASCENT extends the traditional triple-based data model of existing CSKBs in two ways.

Expressive subjects. Subjects in existing CSKBs are usually single nouns, which implies two shortcomings: (i) different meanings of the same word are conflated, and (ii) refinements and variants of word senses are missed out. ASCENT addresses this problem with the following means:

1. When searching for source texts, ASCENT combines the target subject with an informative hypernym from WordNet to distinguish different senses of the word (e.g., "bus public transport" and "bus network topology" for the subject bus; see the sketch after this list).

2. ASCENT refines subjects with multi-word phrases into subgroups and aspects. For example, subgroups for the subject bus would be tourist bus and school bus, while one of its aspects would be bus driver.
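To illustrate how such hypernym-refined queries could be generated, here is a sketch using NLTK's WordNet interface; the template table is a toy stand-in for the 35 hand-crafted templates described in Section 2.2, and the function is our own illustration rather than ASCENT's actual code.

```python
from nltk.corpus import wordnet as wn  # requires nltk.download("wordnet") once

# Toy stand-ins for two of the 35 hand-crafted templates (cf. Section 2.2).
QUERY_TEMPLATES = {
    "animal.n.01": "{} animal facts",
    "professional.n.01": "{} job descriptions",
}

def search_queries(subject: str):
    """Yield one disambiguating web query per noun sense of the subject."""
    for synset in wn.synsets(subject, pos=wn.NOUN):
        for hypernym in synset.hypernyms():
            template = QUERY_TEMPLATES.get(hypernym.name())
            if template:
                yield synset.name(), template.format(subject)
            else:
                # Fall back to appending the hypernym's lemma, yielding queries
                # like "bus public transport" vs. "bus network topology".
                lemma = hypernym.lemmas()[0].name().replace("_", " ")
                yield synset.name(), f"{subject} {lemma}"

for sense, query in search_queries("bus"):
    print(sense, "->", query)
```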
Semantic facets. The validity of commonsense assertions is usually non-binary (Zhang et al., 2017; Chalier et al., 2020) and depends on specific temporal and spatial circumstances (e.g., lions live for 10-14 years in the wild but for more than 15 years in captivity). Moreover, CSK triples often benefit from further context regarding causes/effects and instruments (e.g., elephants communicate with each other by creating sounds; beer is served in bars). In ASCENT's knowledge model, such information is added to SPO triples via semantic facets. ASCENT distinguishes 8 types of facets: cause, manner, purpose, transitive-object, degree, location, temporal and other-quality.

2.2 Extraction pipeline

ASCENT is a pipeline operating in three phases: source discovery, knowledge extraction and knowledge consolidation. Fig. 1 illustrates the architecture of the pipeline.

[Figure 1: Architecture of the ASCENT extraction pipeline (Nguyen et al., 2021): (1) Retrieval (web search, filtering); (2) Extraction (coreference resolution, OpenIE, noun chunking, subgroup/aspect extraction); (3) Consolidation (supervised facet labeling, assertion clustering, facet clustering).]

Source discovery. We utilize the Bing Web Search API to obtain documents specific to each subject, with search queries refined by the subject's hypernyms in WordNet. We manually designed query templates for 35 prominent hypernyms (e.g., if subject s0 has the hypernym animal.n.01, we produce the search query "s0 animal facts"; similarly, for the hypernym professional.n.01, the search query will be "s0 job descriptions"). We then compute the cosine similarity between the bag-of-words representations of each obtained document and the respective Wikipedia article to determine the relevance of the documents. Low-ranked documents are omitted in further steps (a sketch of this filtering step follows below).
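The following sketch shows one way to implement this bag-of-words relevance test with scikit-learn; the vectorization details and cut-off are our assumptions, not ASCENT's exact configuration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def rank_by_relevance(wiki_article: str, documents: list[str], keep_top: int = 100):
    """Rank retrieved documents by bag-of-words cosine similarity to the
    subject's Wikipedia article; only the best-matching ones are kept."""
    vectorizer = CountVectorizer(stop_words="english")
    matrix = vectorizer.fit_transform([wiki_article] + documents)
    # Similarity of each document (rows 1..n) to the Wikipedia article (row 0).
    scores = cosine_similarity(matrix[1:], matrix[0]).ravel()
    ranked = sorted(zip(documents, scores), key=lambda pair: -pair[1])
    return ranked[:keep_top]
```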
Knowledge extraction. The extractors take in the relevant documents, and their outputs include open information extraction (OIE) tuples, a list of subgroups, and a list of aspects. To obtain OIE tuples, we extend the StuffIE approach (Prasojo et al., 2018) with a list of carefully crafted dependency-parse-based rules to pull out faceted assertions from the texts. Then we classify each facet into one of the eight semantic labels using a fine-tuned RoBERTa model. For subgroups, noun phrases whose head word is the target subject are collected as candidates and then clustered using the hierarchical agglomerative clustering (HAC) algorithm on averaged word2vec representations. Finally, we collect aspects from possessive noun chunks and from SPO triples where P is either "have", "contain", "be assembled of" or "be composed of".

Knowledge consolidation. We perform clustering on SPO triples and facet values. For SPO triples, we first filter triple-pair candidates with fast word2vec similarity. After that, an advanced similarity of triple pairs, computed by another fine-tuned RoBERTa model, is fed to the HAC algorithm to group the triples into semantically similar clusters. For facet values, we group phrases with the same head words together (e.g., "during evening" and "in the evening").

2.3 Web portal

The web portal (https://ascent.mpi-inf.mpg.de) is implemented in Python using Django and hosted on an Nginx web server. The underlying structured CSK is stored in a PostgreSQL database, while for the QA part, the statements of all CSKBs are indexed and queried via Apache Solr for fast text-based querying. All components are deployed on a virtual machine with access to 4 virtual CPUs and 8 GB of RAM.

In the demonstration session, we show how users can interact with the portal for exploring the KB (Section 4.1), understanding the KB construction (Section 4.2), and observing its utility for question answering (Section 4.3).

3 Commonsense QA setups

One common extrinsic use case of KBs is question answering. Recently, it was observed that priming language models (LMs) with relevant context can considerably benefit their performance in QA-like tasks (Petroni et al., 2020; Guu et al., 2020). In (Nguyen et al., 2021), to evaluate the contribution of structured CSK to QA, we conducted a comprehensive evaluation consisting of four different setups, all based on this idea:

1. In masked prediction (MP), LMs are asked to predict single masked tokens in generic sentences.

2. In free generation (FG), LMs arbitrarily generate answer sentences to given questions.

3. Guided generation (GG) extends free generation by answer prefixes that prevent the LMs from evading answering.

4. Span prediction (SP) is the task of locating the answer to a question in provided context.

Examples of the QA setups can be seen in Table 1. Generally, given a question, our system retrieves from CSKBs assertions relevant to it, and then uses these assertions as additional context to guide the LMs (a minimal sketch of this priming idea follows below). In the ASCENT demonstrator, we provide a web interface for experimenting with all of those QA setups, with context retrieved from several popular CSKBs.
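As an illustration of the priming idea behind these setups, the sketch below runs masked prediction with and without retrieved context using the HuggingFace transformers pipeline; the model choice and prompt format are our assumptions, not the demo's exact configuration.

```python
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="roberta-base")  # roberta-base expects <mask>

question = "Bartenders work in <mask>."
context = "Bartenders work in bar. Bartenders work in restaurant."  # retrieved assertions

# Compare predictions without and with CSKB context prepended.
for prompt in (question, f"{context} {question}"):
    top = fill_mask(prompt, top_k=3)
    print([f"{p['token_str'].strip()} ({p['score']:.1%})" for p in top])
```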
4 Demonstration experience

In the demonstration session, attendees will experience three main functionalities of our demonstration system.

Setup | Input | Sample output
MP | Elephants eat [MASK]. [SEP] Elephants eat roots, grasses, fruit, and bark, and they eat a lot of these things. | everything (15.52%), trees (15.32%), plants (11.26%)
FG | C: Elephants eat roots, grasses, fruit, and bark, and they eat... Q: What do elephants eat? A: | They eat a lot of grasses, fruits, and...
GG | C: Elephants eat roots, grasses, fruit, and bark, and they eat... Q: What do elephants eat? A: Elephants eat | Elephants eat a lot of things.
SP | question="What do elephants eat?" context="Elephants eat roots, grasses, fruit, and bark, and they eat..." | start=14, end=46, answer="roots, grasses, fruit, and bark"

Table 1: Examples of QA setups (Nguyen et al., 2021).

4.1 Exploring the ASCENT KB

Concept page. Suppose a user wants to know which knowledge ASCENT stores for elephants. They can enter the concept into the search field at the top right of the start page and select the first result from the autocompletion list, or press enter, to arrive at the intended concept. The resulting page (see Fig. 2) is divided into three main areas. At the top left, users can inspect an image from https://pixabay.com, the WordNet synset used for disambiguation, the Wikipedia page used for result filtering, and a list of alternative lemmas, if existing.

At the top right, users can see subgroups and related aspects, which, in our knowledge representation model, can carry their own statements. This way, they can learn that the most salient aspects of elephants are their trunks, tusks and ears, or that elephant trunks have more than 40,000 muscles.

The body of the page presents the assertions, organized into groups of same-predicate assertions. In each group, assertions are sorted by their frequency, displayed beside their objects. For example, the most commonly mentioned foods of elephants are grasses, fruits, and plants. Many assertions come with a red asterisk. This indicates that the assertion comes with semantic facets. When clicking on an assertion, a small box shows an SVG-based visualisation of the assertion in which we illustrate all of its elements: subject, predicate, object, facet labels and values, the frequency of the assertion, as well as the frequency of each facet. For example, one can see that the purpose of elephants using their trunks is to suck up water.

Searching and downloading assertions. Alternatively to exploring statements starting from a subject, users can start from a search functionality under the Browse menu. This way, they can search, for instance, for all concepts that eat grass (capybara, zebra, kangaroo, ...).

The website also provides a JSON-formatted data dump (678 MB) of all 8.9 million assertions extracted by the pipeline and their corresponding source sentences and documents. This dataset is also accessible via the HuggingFace Datasets package (https://huggingface.co/datasets/ascent_kb).
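For programmatic access, the dump can be loaded roughly as follows; the dataset identifier is taken from the URL above, and the record fields are whatever the published dataset defines (depending on the installed datasets version, loading a script-based dataset may additionally require trust_remote_code=True).

```python
from datasets import load_dataset

# Stream the dump instead of downloading all 678 MB at once.
ascent_kb = load_dataset("ascent_kb", split="train", streaming=True)

for assertion in ascent_kb.take(3):
    print(assertion)  # one faceted assertion with its provenance per record
```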
4.2 Inspecting the construction of assertions

For many downstream use cases, it is important to know about the provenance of information. Users can inspect general properties of the construction process by observing the WordNet lemma and the Wikipedia page used for filtering, as well as specific statistics about the number of retained websites, sentences, and assertions, in a panel at the bottom of subject pages (e.g., 435 websites were retained for elephant, from which 50k OpenIE assertions could be extracted).

Furthermore, users can look deeply into the construction process of each assertion on its own dedicated page, which displays the following:

1. Clustered triples: These are the triples that were grouped together in the knowledge consolidation phase (cf. Section 2.2), where the most frequent triple was selected as the cluster representative. For example, for the assertion ⟨lion, eat, zebra, DEGREE: mostly⟩ (14), the cluster contains: ⟨lion, eat, zebra⟩ (9), ⟨lion, prey on, zebra⟩ (2), ⟨lion, feed on, zebra⟩ (1), ⟨lion, feed upon, zebra⟩ (1), ⟨lion, prey upon, zebra⟩ (1). The numbers in parentheses indicate the corresponding frequencies (see the sketch after this list).

2. Facets: The assertion's facets are presented in a table whose columns are facet value, facet type and clustered facets. The frequency of each clustered facet is also indicated.

3. Source sentences and documents: Finally, we exhibit the sentences from which the assertions were extracted and their parent documents (in the form of URLs). Furthermore, in the extraction phase, we also recorded the positions of the assertion elements (i.e., subject, predicate, object, facet) in the source sentences. We show that information to users by highlighting each kind of element with a different color in the source sentences.
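The numbers in the lion/zebra example fit a simple aggregation scheme: the representative is the most frequent member triple, and the cluster frequency is the sum of the member frequencies (9+2+1+1+1 = 14). A minimal sketch under that assumption:

```python
from collections import Counter

# Member triples of one consolidated cluster with their observed frequencies.
cluster = Counter({
    ("lion", "eat", "zebra"): 9,
    ("lion", "prey on", "zebra"): 2,
    ("lion", "feed on", "zebra"): 1,
    ("lion", "feed upon", "zebra"): 1,
    ("lion", "prey upon", "zebra"): 1,
})

representative, _ = cluster.most_common(1)[0]  # most frequent member triple
cluster_frequency = sum(cluster.values())      # 9 + 2 + 1 + 1 + 1 = 14

print(representative, cluster_frequency)       # ('lion', 'eat', 'zebra') 14
```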
[Figure 2: Example of ASCENT's page for the concept elephant.]

4.3 Experimenting with commonsense QA

The third functionality experienced in the demo session is the utilization of commonsense knowledge for question answering (QA).

Input. There are four main parts in the input interface for the QA experiment:

1. QA setup: The user chooses the QA setup they want to experiment with. Available are Masked Prediction, Span Prediction and Free/Guided Generation. If Masked Prediction is selected, the user can choose how many answers the LM should produce. For the Generation settings, users can provide an answer prefix to avoid overly evasive answers.

2. Input query: The user enters the text question as input. The question can be in the form of a masked sentence (in the case of Masked Prediction) or a standard natural-language question (in the other setups).

3. Retrieval options: The user can select one of the supported retrieval methods and the number of assertions to be retrieved per CSKB for each question.

4. Context sources: The user selects the sources of context (i.e., "no context", CSKBs, or "custom context"). If a CSKB is selected, the system will retrieve from that KB assertions relevant to the given input question. If "custom context" is selected, the user must enter their own content. The "no context" option is available for all setups but Span Prediction.

Output. The QA system presents its output in the form of a table with three columns: Source, Answer(s) and Context. For Masked Prediction and Span Prediction, answers are printed with their respective confidence scores, whereas for Free/Guided Generation, only the answers are printed. For Span Prediction, in which answers come directly from the given contexts, we also highlight the answers in the contexts.

An example of the QA demo's output for the question "What do rabbits eat?" under the Free Generation setting can be seen in Fig. 3 (a minimal sketch of this setting follows below).
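The sketch below reproduces the Free Generation setting with GPT-2 in simplified form; the prompt layout and the hard-coded context stand in for the demo's actual retrieval and prompting.

```python
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# Stand-in for assertions retrieved from a CSKB.
context = "Rabbits eat grass. Rabbits eat vegetables. Rabbits eat hay."
question = "What do rabbits eat?"

# Free Generation: the model continues after "A:"; for Guided Generation,
# an answer prefix such as "A: Rabbits eat" would be used instead.
prompt = f"{context}\nQ: {question}\nA:"
output = generator(prompt, max_new_tokens=20, do_sample=False)
print(output[0]["generated_text"][len(prompt):].strip())
```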
One can observe that the language models' predictions are heavily influenced by the given contexts. Without context, GPT-2 is only able to generate an evasive answer. When given context, it tends to re-generate the first sentence of the context first (e.g., see the answers aligned with ASCENT, TupleKB and ConceptNet in Fig. 3). For the context retrieved from Quasimodo, GPT-2 is able to overlook the erroneous first sentence; however, its generated answer is rather elusive, despite the fact that the subsequent statements in the context all contain direct answers to the question.

The question "Bartenders work in [MASK]." under the Masked Prediction setting is another example of the influence of context on the LMs' output. Since bartender is a subject well covered by the ASCENT KB, the assertions pulled out are all relevant (i.e., Bartenders work in bar. Bartenders work in restaurant. ...), which helps guide the LM to a good answer (bar). Meanwhile, because this subject is not present in TupleKB, its retrieved statements are rather unrelated (Work capitals have firm. Work experiences include statement. ...). Given that, the top-1 prediction for this KB was tandem, which is obviously an evasive answer.

5 Related work

CSKB construction. Cyc (Lenat, 1995) was the first attempt to build a large-scale commonsense knowledge base. Since then, there have been a number of other CSKB construction projects, notably ConceptNet (Speer and Havasi, 2012), WebChild (Tandon et al., 2014, 2017), TupleKB (Mishra et al., 2017), and more recently Quasimodo (Romero et al., 2019), Dice (Chalier et al., 2020), Atomic (Sap et al., 2019), and CSKG (Ilievski et al., 2020). The early approach to building a CSKB was based on human annotation (e.g., Cyc with expert annotation and ConceptNet with crowdsourced annotation). Later projects tend to use automated methods based on open information extraction to collect CSK from texts (e.g., WebChild, TupleKB and Quasimodo). Lately, CSKG is an attempt to combine various commonsense knowledge resources into a single KB. The common thread of these CSKBs is that they are all based on SPO triples as their knowledge representation, which has shortcomings (Nguyen et al., 2021). ASCENT is the first attempt to build a large-scale CSKB with assertions equipped with semantic facets, built upon the ideas of semantic role labeling (Palmer et al., 2010).

KB visualization. Most CSKBs share their content via CSV files. Some, like ConceptNet (http://conceptnet.io), WebChild (https://gate.d5.mpi-inf.mpg.de/webchild), Atomic (https://mosaickg.apps.allenai.org/kg-atomic2020) and Quasimodo (https://quasimodo.r2.enst.fr), have a web portal to visualise their assertions. The most common way of CSKB visualisation is to use a single page for each subject and to group assertions by predicate (e.g., in ConceptNet and WebChild). Quasimodo, on the other hand, implements a simple search interface to filter assertions and presents them in a tabular way (Romero and Razniewski, 2020). The ASCENT demo has both functionalities: exhibiting the assertions of each concept on a separate page, and supporting assertion filtering. Our demo also uses an SVG-based visualisation of assertions with semantic facets, which are a distinctive feature of the ASCENT knowledge model.

Context in LM-based question answering. Priming large pretrained LMs with context in QA-like tasks is a relatively new line of research (Petroni et al., 2020; Guu et al., 2020). In our original paper, we made the first attempt to evaluate the contribution of CSKB assertions to QA via four different setups based on this idea. While others use commonsense knowledge for (re-)training language models (Hwang et al., 2021; Ilievski et al., 2021; Ma et al., 2021; Mitra et al., 2020), to the best of our knowledge, our demo system is the first to visualize the effect of priming vanilla language models, i.e., without task-specific retraining.

6 Conclusion

We presented a web portal for a state-of-the-art commonsense knowledge base, the ASCENT KB. It allows users to fully explore and search the CSKB, inspect the construction process of each assertion, and observe the impact of structured CSKBs on different QA tasks. We hope that the portal enables interesting interactions with the ASCENT methodology, and that the QA demo allows researchers to explore the potential of combining structured data with pre-trained language models.
[Figure 3: Free Generation output for the question "What do rabbits eat?".]

References

Tom B. Brown et al. 2020. Language models are few-shot learners. In NeurIPS.

Yohan Chalier, Simon Razniewski, and Gerhard Weikum. 2020. Joint reasoning for multi-faceted commonsense knowledge. In AKBC.

Peter Clark et al. 2020. From 'F' to 'A' on the NY Regents science exams: An overview of the Aristo project. AI Magazine.

Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: Pre-training of deep bidirectional transformers for language understanding. In NAACL.

Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, and Ming-Wei Chang. 2020. REALM: Retrieval-augmented language model pre-training. In ICML.

Jena D. Hwang, Chandra Bhagavatula, Ronan Le Bras, Jeff Da, Keisuke Sakaguchi, Antoine Bosselut, and Yejin Choi. 2021. COMET-ATOMIC 2020: On symbolic and neural commonsense knowledge graphs. In AAAI.

Filip Ilievski, Alessandro Oltramari, Kaixin Ma, Bin Zhang, Deborah L. McGuinness, and Pedro Szekely. 2021. Dimensions of commonsense knowledge. arXiv preprint arXiv:2101.04640.

Filip Ilievski, Pedro Szekely, and Bin Zhang. 2020. CSKG: The commonsense knowledge graph. arXiv preprint arXiv:2012.11490.

Douglas B. Lenat. 1995. Cyc: A large-scale investment in knowledge infrastructure. Communications of the ACM.

Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. 2019. RoBERTa: A robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692.

Kaixin Ma, Filip Ilievski, Jonathan Francis, Yonatan Bisk, Eric Nyberg, and Alessandro Oltramari. 2021. Knowledge-driven data construction for zero-shot evaluation in commonsense question answering. In AAAI.

John McCarthy. 1960. Programs with common sense. RLE and MIT Computation Center.

Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. In ICLR.

George A. Miller. 1995. WordNet: A lexical database for English. Communications of the ACM.

Bhavana Dalvi Mishra, Niket Tandon, and Peter Clark. 2017. Domain-targeted, high precision knowledge extraction. TACL.

Arindam Mitra, Pratyay Banerjee, Kuntal Kumar Pal, Swaroop Mishra, and Chitta Baral. 2020. How additional knowledge can improve natural language commonsense question answering? arXiv preprint arXiv:1909.08855.

Don Monroe. 2020. Seeking artificial common sense. Communications of the ACM.

Tuan-Phong Nguyen, Simon Razniewski, and Gerhard Weikum. 2021. Advanced semantics for commonsense knowledge extraction. In WWW.

Martha Palmer, Daniel Gildea, and Nianwen Xue. 2010. Semantic Role Labeling. Synthesis Lectures on Human Language Technologies.

Fabio Petroni, Patrick Lewis, Aleksandra Piktus, Tim Rocktäschel, Yuxiang Wu, Alexander H. Miller, and Sebastian Riedel. 2020. How context affects language models' factual predictions. In AKBC.

Radityo Eko Prasojo, Mouna Kacimi, and Werner Nutt. 2018. StuffIE: Semantic tagging of unlabeled facets using fine-grained information extraction. In CIKM.

Julien Romero and Simon Razniewski. 2020. Inside Quasimodo: Exploring construction and usage of commonsense knowledge. In CIKM.

Julien Romero, Simon Razniewski, Koninika Pal, Jeff Z. Pan, Archit Sakhadeo, and Gerhard Weikum. 2019. Commonsense properties from query logs and question answering forums. In CIKM.

Maarten Sap et al. 2019. ATOMIC: An atlas of machine commonsense for if-then reasoning. In AAAI.

Robyn Speer and Catherine Havasi. 2012. ConceptNet 5: A large semantic network for relational knowledge. In Theory and Applications of Natural Language Processing.

Alon Talmor, Yanai Elazar, Yoav Goldberg, and Jonathan Berant. 2019. oLMpics - on what language model pre-training captures. TACL.

Niket Tandon, Gerard de Melo, Fabian M. Suchanek, and Gerhard Weikum. 2014. WebChild: Harvesting and organizing commonsense knowledge from the web. In WSDM.

Niket Tandon, Gerard de Melo, and Gerhard Weikum. 2017. WebChild 2.0: Fine-grained commonsense knowledge distillation. In ACL.

Sheng Zhang, Rachel Rudinger, Kevin Duh, and Benjamin Van Durme. 2017. Ordinal common-sense inference. TACL.