A Survey on Sentiment and Emotion Analysis for Computational Literary Studies

Evgeny Kim and Roman Klinger
Institut für Maschinelle Sprachverarbeitung (IMS)
University of Stuttgart
70569 Stuttgart, Germany
{firstname.lastname}@ims.uni-stuttgart.de

arXiv:1808.03137v1 [cs.CL] 9 Aug 2018

Abstract

Emotions have often been a crucial part of compelling narratives: literature tells about people with goals, desires, passions, and intentions. In the past, classical literary studies usually scrutinized the affective dimension of literature within the framework of hermeneutics. However, with the emergence of the research field known as Digital Humanities (DH), some studies of emotions in a literary context have taken a computational turn. Given the fact that DH is still being formed as a science, this direction of research can be considered relatively new. At the same time, research in sentiment analysis started in computational linguistics almost two decades ago and is nowadays an established field that has dedicated workshops and tracks in the main computational linguistics conferences. This leads us to the question of what the commonalities and discrepancies are between sentiment analysis research in computational linguistics and digital humanities. In this survey, we offer an overview of the existing body of research on sentiment and emotion analysis as applied to literature. We precede the main part of the survey with a short introduction to natural language processing and machine learning and to psychological models of emotions, and provide an overview of existing approaches to sentiment and emotion analysis in computational linguistics. The papers presented in this survey either come directly from DH or computational linguistics venues and are limited to sentiment and emotion analysis as applied to literary text.

1 Introduction and Motivation

1.1 On the Importance of Emotions

Human mental experiences consist of various phenomena that are not directly grounded in the objective perception of the world. A large portion of our daily decisions and interactions with others is driven by subconscious processes, including emotions and affect. Emotions play a crucial role when it comes to the arts (Johnson-Laird and Oatley, 2016). Unintentionally or not, when creating a piece of art, an artist introduces this emotional component into her work, which in turn makes us experience different emotions (Anderson, 2004; Ingermanson and Economy, 2009). When perceiving the arts, for example while reading a novel, people can feel emotions, because they are drawn into the stories that depict characters who act and feel, have desires and fears, reach success or fail (Djikic et al., 2009). Readers of fiction have richer emotional experiences and better abilities of empathy and understanding of others’ lives than people who do not consume literature (Mar et al., 2009; Kidd and Castano, 2013).

This observation has two major implications for the connection between literature and human emotions. First, literature requires that we use our emotions in order to understand it (Robinson, 2005), or better, we have to use our knowledge about human emotions to understand the feelings and moods of the fictional characters. Second, the emotional experiences we draw from literature are of the same sort we have in real life, which makes literature a valid source of the depiction of human emotions (Hogan, 2010, 2015).

All of the above means that emotions are tightly interwoven with the content of artistic work, and thus need to be studied in this context not only by humanities scholars but by psychologists as well, because research in this direction can benefit the
understanding of both the arts and emotions.

The link between emotions and arts in general is a matter of debates that date back to the ancient period, particularly to Plato, who viewed passions and desires as the lowest kind of knowledge and treated poets as undesirable members in his ideal society (Plato, 1969). In contrast, Aristotle’s view on the emotive component of poetry, expressed in his Poetics (Aristotle, 1996), differed from Plato’s in that emotions do have great importance, particularly in the moral life of a person (de Sousa, 2017). For a long period of time, no single word or term existed in the English language to describe “the emotions” as a category of feeling (Downes and McNamara, 2016). However, in the late 19th century, the emotion theory of the arts stepped into the spotlight of philosophers. One of the first accounts of the topic was given by Leo Tolstoy in 1898 in his essay What is Art? (Tolstoy, 1962). Tolstoy argues that art can express emotions experienced in a fictitious context, and the degree to which the audience is convinced of them defines the success of the artistic work (cf. Anderson and McMaster (1986), Hogan (2010), and Piper and Jean So (2015)). But why do imaginary contexts make people experience emotions? This paradox, which later received the name “paradox of fiction”, was first pinpointed by the English philosopher Colin Radford (Radford and Weston, 1975). The paradox is formulated as follows:

1. We experience emotions towards fictitious characters, objects, or events.

2. In order to experience emotions, we must believe that these characters, objects, or events are real.

3. We do not believe that these characters or situations are real.

This paradox and its possible solutions are discussed (e.g., Walton (1978), Lamarque (1981), and Neill (1991)) and disputed (e.g., Tullmann and Buckwalter (2014)) by others, but we leave it to the reader to explore this philosophical problem. What we would like to highlight, though, in relation to this paradox is that Radford’s statements contributed to the popularity of the research on emotions and arts in many fields, from literary studies to psychology.

But what exactly can we learn from this interplay of emotion and literature? Emotional intelligence is a prerequisite to understanding literary fiction, but reading literature in turn enhances our emotional intelligence (Bal and Veltkamp, 2013; Djikic et al., 2013; Johnson, 2012; Samur et al., 2018; Djikic et al., 2009). Moreover, there is a growing body of literature that recognizes the deliberate choices people make with regard to their emotional states when seeking narrative enjoyment, for example from a book or a film (Zillmann et al., 1980; Ross, 1999; Bryant and Zillmann, 1984; Oliver, 2008; Mar et al., 2011). The influence of mood on these choices has been studied by Zillmann (1988). His mood-management theory proposes that readers and viewers, when seeking entertainment, make choices that will promote or maintain positive moods or reduce negative ones. Common objections to the mood-management theory point to the fact that people still enjoy tragedies or horror stories (Oliver, 1993; Oliver et al., 2000; Oliver, 2008), even though these genres provoke negative emotions in them, such as sadness, fear, anxiety, and anger. A possible solution was proposed by Vorderer et al. (2004): Enjoyment is explained by the notion of “meta-emotions”, i.e., emotions we experience towards our emotions directed at some object, which are deemed appropriate in a particular situation. Recent research in cognitive psychology suggests possible explanations of why such experiences are perceived as positive in the first place (Tamborini et al., 2010).

New methods of quantitative research emerged in humanities scholarship, bringing forth the so-called “digital revolution” (Lanham, 1989) and the transformation of the field into what we know as digital humanities (Berry, 2012; Schreibman et al., 2015). The adoption of computational methods of text analysis and data mining from the then fast-growing areas of computational linguistics and artificial intelligence provided humanities scholars with new tools of text analytics and data-driven approaches to theory formulation (Vanhoutte, 2013; Jockers and Underwood, 2016).

Although one of the first works on the computational treatment of subjective phenomena originated from the area of artificial intelligence (AI) (Carbonell, 1979; cited by Pang et al. (2008)), it was only a few years later that the first work on the computer-assisted modeling of emotions in literature was published (Anderson and McMaster, 1982). Challenged by the question of why some texts are more interesting than others, Anderson and McMaster concluded in their paper that the “emotional tone” of a story can be responsible for
the reader’s interest. The results of their study suggest that a large-scale analysis of the “emotional tone” of a collection of texts is possible with the help of a computer program. There are three implications of this finding. First, they suggest that by identifying emotional tones of text passages one can model affective patterns of a given text or a collection of texts, which in turn can be used to challenge or test existing literary theories. Second, their approach to affect modeling demonstrates that the stylistic properties of texts can be defined on the basis of their emotional interest and not only linguistic characteristics. And finally, they suggest that functional texts (speeches, memos, advertisements) can be run through an emotion analysis program to test whether they will have the intended impact. With regard to these implications, the work by Anderson and McMaster (1982) is an important early piece as it laid out the “roadmap” for some of the basic applications of sentiment and emotion analysis of texts, namely sentiment and emotion pattern recognition from text and computational text characterization based on sentiment and emotion.

1.2 Scope and Structure of the Survey

The goal of this survey is to provide a comprehensive overview of the methods of emotion and sentiment analysis as applied to text. The survey is prepared with a digital humanities scholar in mind who is looking for an introduction to the existing research in the field of sentiment and emotion analysis from (primarily literary) text. All the studies presented in this article either come directly from digital humanities venues or deal with sentiments and emotions in the context of literary text. A substantial number of the works of the latter category originate from the computational linguistics community. Their primary goal is often a methodological one rather than an interpretative one. However, these works are still included in the survey, as we believe, and argue in the discussion, that interpretation and methodology should come hand in hand.

The survey does not cover applications of emotion and sentiment analysis in the areas of digital humanities that are not focused on text, e.g., sentiment analysis of visual art and design, movies, or music. It also does not provide an in-depth overview of all possible applications of emotion analysis in the computational context outside of the DH line of research. However, to make the reader aware of these applications, we briefly mention examples of them in Section 1.3.

The survey is structured as follows: Section 1 is an introduction. Section 2 introduces the reader to the field of natural language processing (NLP) and to the standard pipeline used in many NLP projects. Section 3 introduces the most common emotion theories used for the development of methods of computational emotion analysis, and provides an important background to emotion analysis of literary texts from a classic and a computational perspective. Section 4 is the core of the survey and is an overview of different applications of sentiment and emotion analysis to literary text. Section 5 concludes the paper.

1.3 Other Applications of Emotion and Sentiment Analysis

The survey does not cover every possible work on emotion analysis that exists, even in the digital humanities context. The understanding and automatic analysis of emotions, sentiments, and affects have played an important role in computer science and artificial intelligence in recent decades. They are applied in a variety of studies, from which we discuss a selection in the following.

Robotics and Artificial Intelligence (AI)

While there is a large overlap between robotics and AI, the former is mostly an engineering field that deals with the design and use of robots, while the latter is more concerned with their actual operation, including but not limited to decision making, problem solving, and reasoning (Brady, 1985). This also includes emotional intelligence, as more and more robots that are developed today serve not only pragmatic goals (e.g., cleaning, warehouse operation) but social ones as well (Breazeal, 2003). The motivation for affective computing in robotics and AI, therefore, is to build robots and virtual agents that are more human-like in terms of communication and reasoning.

Robots and virtual agents that are able to recognize and express emotions have been one of the foci in the fields of robotics and artificial intelligence for decades, both at the conceptual (Sloman and Croucher, 1981; Dorner and Hille, 1995; Wright, 1997; Coeckelbergh, 2012) and implementational (case-study) levels (Velásquez, 1998; Leite et al., 2008; Beck et al., 2010; Klein and Cook, 2012). Some works focus on theoretical implications of
emotional robots (Sloman and Croucher, 1981; Frijda and Swagerman, 1987; Evans, 2004; Arbib and Fellous, 2004), engaging in a fundamental discussion of such a possibility. A closely related body of research touches upon moral and ethical implications that arise when we talk about autonomous self-aware robots, which may make decisions that go against human moral judgements (Kahn Jr et al., 2012; Arkin et al., 2012; Malle et al., 2015).

Another thriving line of research related to AI and emotions deals with computational modeling of emotions in robotic and virtual agent applications. For example, Gratch and Marsella (2004) propose a new methodology of emotion modeling based on comparing the behavior of the computational model against human behavior and on the use of standard clinical instruments for assessing emotions. Pereira et al. (2005) outline the belief-desire-intention architecture of emotions based on four modules, namely the Emotional State Manager, the Sensing and Perception Module, the Capabilities Module, and the Resources Module, where each module is responsible for separate processes within the emotion concept. Jiang et al. (2007) put forth an extended belief-desire-intention model introducing primary and secondary emotions into the architecture.

Human-computer interaction (HCI) can be considered a subfield of artificial intelligence. It has also shown an increased interest in emotions. For instance, Cowie et al. (2001) examine basic issues related to the extraction of emotions from the user, consolidating psychological and linguistic analyses of emotions. Pantic and Rothkrantz (2003) argue that next-generation HCI designs will need to include the ability to recognize the user’s affective states in order to become more effective and more human-like. Both Beale and Creed (2008) and Beale and Creed (2009) provide an overview of the role of emotions in HCI, highlighting important lessons drawn from different research and providing guidelines for future research.

Computer Games and Virtual Reality

As video games become more complex and engaging, research in the field of game AI is gaining popularity. The foci of the research vary (cf. Yannakakis (2012)), but the ones relevant to our discussion are mining of player data and enhancing non-player character behavior. The main motivation for researchers from this field is to study what makes players enjoy or detest a game and consequently make the gaming experience even more enjoyable.

On the one hand, recognition and elicitation of the user’s emotions through mining of player data (e.g., recognition of facial expressions and keystroke patterns, chat message analysis) has several applications in the field of game development. For example, by recognizing the player’s emotional state promptly and accurately, the system can adaptively respond to it by changing the game environment (changing the pace or color scheme, or even suggesting that the player take a break). It can also play a role in educational games by customizing the learning process (Zhou, 2003; Conati, 2002; Conati et al., 2003). On the other hand, games that are able to cause an emotional response in players, such as fear or happiness, are more immersive (Sweetser and Johnson, 2004) and are thought to facilitate flow (Johnson and Wiles, 2003), which is a state of profound enjoyment and total immersion in an activity (Csikszentmihalyi and Csikszentmihalyi, 1992). Therefore, for the gaming process to be captivating and realistic, it is important that the player interacts with realistic non-player characters that express emotions in an intelligent way and react to the player’s emotions appropriately (Chaplin and Rhalibi, 2004; Hudlicka and Broekens, 2009; Bosser et al., 2007; Ochs et al., 2008, 2009; Li and Campbell, 2015; Popescu et al., 2014, i.a.).

Emotion Detection from Voice, Face, Body, and Physiology

In contrast to robotics and gaming, the recognition of emotions from bodily reactions focuses on humans: the goal is to identify patterns in acoustic speech signals, facial expressions, body postures, and physiology, and to classify them into different emotions, often with machine learning techniques. Calvo and D’Mello (2010) provide an in-depth survey to which we refer the reader for a comprehensive overview. We will, however, add that since the publication of Calvo and D’Mello’s survey, the methodology has changed in terms of the equipment used. Earlier researchers had to rely on laboratory equipment. Nowadays more and more studies are done with the help of non-invasive wearable devices (wrist bands and smartphone applications) that monitor the subjects’ emotional state (cf. Dupré et al. (2018), Ghandeharioun et al. (2016)). This turn seems warranted as it provides the researchers and developers with a more natural and close monitoring of the subjects, and hence, a
larger amount of research data.

Sentiment Analysis and Opinion Mining

Applications of sentiment analysis and opinion mining outside of a humanities context are not covered by this survey. However, in Section 3.3 we will give an overview of the existing methods used in sentiment analysis, as some of them are relevant in the context of the reviewed papers. In this section, we give a short overview of other application areas of sentiment analysis and opinion mining.

Opinion mining deals with tracking and automatic classification of opinions expressed by people (Liu, 2012). An opinion in a narrow sense is understood as an evaluation of or attitude towards some object (Liu, 2015). Although opinion mining and sentiment analysis are often used interchangeably in the literature, opinions are not sentiments (Munezero et al., 2014): While sentiment is prompted by emotions, opinions are judgements based on objective or subjective interpretations of the topic that are not necessarily related to emotions.

Since sentiment analysis and opinion mining are concerned with human attitudes and feelings towards anything, they find applications in many areas, for instance in business and sociology. A popular application of opinion mining is automatic review analysis, often performed on a large collection of reviews that can originate from any domain, for example movie reviews (Amolik et al., 2016; Parkhe and Biswas, 2016; Tang et al., 2018), product reviews (books, electronics, DVDs, etc.) (Fang and Zhan, 2015; Xia et al., 2015), and restaurant and tourism product reviews (Kiritchenko et al., 2014; Gan et al., 2017; Marrese-Taylor et al., 2014). The goal of opinion mining in this context is to classify reviews into positive or negative with various levels of classification granularity.

Opinion mining is not limited to reviews. Computational social sciences have also witnessed an increased interest in automatic sentiment analysis, for example, in the political domain (Maragoudakis et al., 2011; Ceron et al., 2014; Rill et al., 2014; Liu and Lei, 2018). A goal of these studies is not only to understand the electoral preferences of the population, but also to gain insight into how these preferences are formed and propagated via social media (Yaqub et al., 2017).

A significant amount of research is concerned with automatic analysis of social media posts (most commonly, Twitter), for example Khan et al. (2015), Rosenthal et al. (2017), Asghar et al. (2018). The goal here is to enable the automatic analysis of users’ posts with respect to sentiments and opinions. This can be useful for automatically monitoring social media for emergency reports, violent language, and the mood of certain user groups.

2 A Very Short Introduction to Natural Language Processing and Machine Learning

In the introduction, we discussed the importance of emotion analysis for literary studies. In Section 3.3, we will provide an overview of research in sentiment and emotion analysis as applied to text. As that section relies on concepts from both natural language processing (NLP) and machine learning, we provide a short introduction to both disciplines in the following.

A comprehensive overview of NLP is beyond the scope of this survey paper. Therefore, we present NLP tasks as steps of a single pipeline (see Figure 1), which is common for many NLP projects. Readers who are familiar with NLP may skip this section without any hesitation. Readers who feel that they need an in-depth textbook-style introduction to the field are referred to Jurafsky and James (2000), which we follow in this section to describe some important concepts of NLP.

According to the Encyclopedia of Cognitive Science (Allen, 2006), NLP is a field that explores computational methods for interpreting and processing natural language, in either textual or spoken form. NLP addresses a variety of tasks related to language use and text analysis, from machine translation to code switching to named entity recognition to semantic role labeling. Regardless of the task, any NLP project includes several preliminary steps of speech or text processing that are necessary for these and other downstream tasks. We now proceed to the description of these fundamental steps in an NLP pipeline.

2.1 Typical NLP Pipeline

Modern NLP pipelines may include a variety of processes and heavy feature engineering combining multiple features. Figure 1 shows the basic NLP pipeline that is most commonly used across various projects.

An NLP pipeline usually starts with speech recognition (if the input is speech) and then continues as if the input were text from the start. The next step is tokenization and segmentation followed by

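[Figure 1 shows five pipeline stages with a running example. Speech recognition: “he hates me”. Tokenization and segmentation: he, hates, me. Morphological analysis: he, hate, I. Syntactic analysis: parse tree S -> NP VP, with NP -> he and VP -> V NP (V -> hate, NP -> I). Semantic analysis: he = person, hate = emotion, I = person.]

Figure 1: Typical NLP pipeline.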
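
The stages shown in Figure 1 can be illustrated with a minimal Python sketch that carries the figure's example sentence through tokenization, morphological analysis, syntactic analysis, and semantic analysis. Everything in it is an illustrative assumption rather than a method from the surveyed work: the function names, the lookup tables LEMMAS and SEMANTIC_TYPES, and the fixed subject-verb-object tree are hypothetical stand-ins for the trained components a real pipeline would use, and speech recognition is skipped by starting from text.

```python
import re

# Hypothetical toy resources covering only the figure's example sentence;
# a real pipeline would use a trained morphological analyzer instead.
LEMMAS = {"hates": "hate", "me": "I"}
SEMANTIC_TYPES = {"he": "person", "I": "person", "hate": "emotion"}


def tokenize(text):
    """Tokenization: split into word tokens and separate punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


def lemmatize(tokens):
    """Morphological analysis (simplified): map each token to its lemma."""
    return [LEMMAS.get(token, token) for token in tokens]


def parse(lemmas):
    """Syntactic analysis (stand-in): build the S -> NP VP tree of Figure 1.

    Assumes exactly three lemmas in subject-verb-object order.
    """
    subject, verb, obj = lemmas
    return ("S", ("NP", subject), ("VP", ("V", verb), ("NP", obj)))


def label_semantics(lemmas):
    """Semantic analysis (stand-in): attach a coarse semantic type per lemma."""
    return [(lemma, SEMANTIC_TYPES.get(lemma, "unknown")) for lemma in lemmas]


def run_pipeline(text):
    """Chain the stages so that each consumes the previous stage's output."""
    tokens = tokenize(text)
    lemmas = lemmatize(tokens)
    return {
        "tokens": tokens,
        "lemmas": lemmas,
        "tree": parse(lemmas),
        "semantics": label_semantics(lemmas),
    }


result = run_pipeline("he hates me")
for stage, output in result.items():
    print(f"{stage}: {output}")
```

Because each stage consumes the output of the previous one, a mistake made early (for example, a wrong lemma) is passed on to every later stage, which is one reason the preliminary steps matter for downstream tasks.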

morphological analysis, syntactic analysis, and semantic analysis (Uszkoreit, 2001).

2.1.1 Speech Recognition

So that a speech signal can be analyzed syntactically and semantically, it is typically first converted to text, which allows the same methods to be applied as for text input. First, the analogue speech signals are sampled in time, filtered, and decomposed into the frequency domain. The frequency components are then analyzed for features (the most common of which are mel frequency cepstral coefficients (Imai, 1983)) and converted into specific acoustic feature vectors. Then, a language model and vocabulary are used to calculate the phonetic likelihood of each speech sample. The decoded speech signal is then available as hypotheses of textual representations.

2.1.2 Tokenization and Segmentation

Text, whether converted from a speech signal or provided directly as input, is passed to the tokenization and segmentation part of the pipeline, which outputs an array of tokens, i.e., the input text in which units, often words, are separated from each other. This processing is required before a morphological analysis can be applied.

In many languages, words are separated by whitespace, and splitting the text on it will, in most cases, produce a meaningful output. However, some words contain whitespace (e.g., San Francisco, rock and roll) and, depending on the application, some tokenizers may also tokenize multi-word expressions. Some tokenizers can also expand clitic contractions that are marked by apostrophes, for example don’t is converted to do not and I’m to I am.

In some cases, the text should be segmented into sentences first and only then into words. Essentially, the task of sentence segmentation is to separate sentences from each other. The most common cues for segmenting a text into sentences are punctuation marks. Though some symbols, like the question mark or exclamation point, are unambiguous markers of a sentence boundary, periods are more ambiguous, as they also mark abbreviations (e.g., Mr., Mrs., Inc.). Therefore, it is often more appropriate to address word tokenization and sentence segmentation jointly.

2.1.3 Morphological Analysis

After the text is available in a segmented form, each word in the text can be analyzed for its morphological properties, e.g., inflection and case markers. For each token, a morphological parser outputs its lemma (a dictionary form) and a part-of-speech category with the morphosyntactic information. In addition, words can be stemmed, that is, reduced to their root by removing affixes. Morphological analysis is an important prerequisite for syntactic analysis.

Lemmatization is the process of reducing a word to its base form. It is often required to reduce the variability of surface realizations of words sharing the same root. While lemmatization involves a complex morphological analysis of a word (the algorithm should learn, for example, that the words sang, sung, sings share the same lemma form sing), stemming takes a simpler approach. In some applications, it is only important to map the word to its root, without full parsing of a word. Stemming does exactly that by chopping off the affixes of the words. For example, in web search, one may want to map foxes to fox, but might not need to know that foxes is plural (Jurafsky and James, 2000, p. 46). One popular algorithm for stemming is the Porter stemmer (Porter, 1980).

Morphological parsing is important not only for lemmatization and stemming, but for part-of-speech (POS) tagging as well. In fact, POS tagging is often based on the analysis of word affixes (e.g., adjectives in English are recognized by -able, -ful, -ish among other suffixes, while verbs by -ate and -en). The number of POS tags used
varies, from seventeen, as in the Universal POS tagset (Petrov et al., 2012), to forty-five in the Penn Treebank (Marcus et al., 1993), to sixty-one used by the Lancaster UCREL project’s CLAWS (the Constituent Likelihood Automatic Wordtagging System) (Rayson and Garside, 1998).

2.1.4 Syntactic Analysis

Based on the morphological information obtained in the previous step, the words in the sentence are analyzed for their grammatical function (e.g., whether a word is a subject, object, or modifier). This process is called parsing, and it is important for analyzing the relationship between words, including disambiguating their meaning. The output of this step of the pipeline is a text represented by its syntactic or dependency tree.

There are two main types of parsing: constituency parsing based on Chomsky’s generative grammar (Chomsky, 1993), and dependency parsing based on dependency rules (Kübler et al., 2009). The main difference between the two types of parsing is that constituency parsing operates on a phrase level, where each type of phrase (e.g., noun phrase, verb phrase) is allowed to be composed of phrases of certain types, while dependency parsing operates on a word level and takes into account dependency rules between words.

2.1.5 Semantic Analysis

Finally, the sentences, phrases, or words of the text are analyzed for their meaning based on the information obtained in the preceding parts of the pipeline.

Semantic analysis is needed to disambiguate polysemous words, which is especially difficult given the wide range of meanings a single word can take. The most straightforward approach to word sense disambiguation is through the use of lexical resources, such as WordNet (Fellbaum, 1998). WordNet provides a set of lemmas for nouns, verbs, adjectives, and adverbs, where each lemma is annotated with a set of senses.

Many of the disambiguation algorithms, however, rely on contextual similarity when choosing the proper sense. There are different approaches to computing the word context. One of the most popular of them is a distributional semantics ap-

have similar meaning. Generally, these words are represented as vectors or arrays of numbers that are, in some way, related to word counts. These relationships are captured in a term-context matrix that represents how well each word fits with other words (context) in the corpus. Such a matrix is of dimensionality |V| × |V|, where each cell contains the number of times the row word (target) and the column word (context) co-occur in some context in some corpus. The matrix can then be used to calculate the similarity of the words, with the cosine measure being used most commonly.

In contrast to count-based sparse vector representations, most approaches nowadays rely on dense representations, obtained either by dimensionality reduction or by predicting the target word or its context. Examples of this group of vector representations include word embeddings (Mikolov et al., 2013; Pennington et al., 2014).

2.2 Machine Learning

Although some of the previously described tasks, such as POS tagging or syntactic parsing, can be performed using rule-based approaches, most modern NLP pipelines make use of machine learning methods. The advantage of machine learning over rule-based systems becomes especially clear in the context of large data that needs to be processed. Writing down rules that capture all the minuscule differences and variety of language in some corpus is a tedious and by and large an impossible task. That is where machine learning techniques come in handy. Machine learning is a subfield of artificial intelligence widely applied to many other disciplines, including natural language processing and data science. In the remainder of this section, we introduce three main paradigms of machine learning and briefly describe how they are used for solving NLP tasks.

2.2.1 Learning Paradigms

Machine learning is about using the right features to build the right models that achieve the right task (Flach, 2012, p. 13). A machine learning model learns to associate characteristics of each instance with a class to be predicted. These characteristics are commonly referred to as features. Following this definition, one must acknowledge that there is no
proach. Distributional semantics deals with seman- single machine learning framework (cf. “no silver
tic properties of words derived from their distribu- bullet” argument by Brooks (1987)) that applies to
tion across texts. The intuition behind its use is all possible scenarios. Generally, machine learning
that words that occur in the same context tend to settings bifurcate into supervised and unsupervised

learning paradigms, with each of the paradigms encompassing a wide range of models.

Supervised machine learning refers to the methods of labeling unseen data by learning a function from labeled training instances. A classifier is a function $\hat{c} : X \rightarrow Y$, where $X$ is the space of data instances and $Y = \{c_1, c_2, \ldots, c_k\}$ is a finite set of class labels. Labels can be numerical, ordinal, nominal, or structured, and often denote the class membership of each data instance. For example, in the task of POS tagging, labels are the actual POS tags assigned to the words in the training set. During the training phase, a classification algorithm learns a mapping from instances to labels, and later, during the prediction phase, assigns class labels to new, unseen instances. Examples of supervised machine learning methods are the naïve Bayes classifier, support vector machines, decision trees, and supervised deep learning algorithms.

However, labeled training data is not always available or is prohibitively expensive. Moreover, sometimes researchers do not know the actual labels of the data they have. In this case, another family of machine learning algorithms, referred to as unsupervised machine learning, comes to the rescue. Clustering is one of the most popular unsupervised methods; it works by assessing the similarity between instances and arranging them in such a way that similar instances are put in the same cluster while dissimilar instances are put in different clusters. The output of an unsupervised machine learning algorithm can be used to better understand the nature and variance of the data, or as a prerequisite step to developing a supervised learning task with a set of defined labels.

2.2.2 Feature-based Learning

A common first approach when developing a machine-learning-based model is to map each instance into a representation of its characteristics, its features. Features are functions mapping instances to a set of values, for instance real numbers, Boolean values (e.g., “is this word an adjective/noun/verb?”, “is this word a proper noun?”, “is the previous word ‘the’?”, etc.), and integers (when the feature is a count of something). In text classification, a common approach is the so-called bag-of-words, in which each word is represented by its count in an instance (for example, a document or a sentence).

Features do not come ready-made with the data, and the process of model building and feature engineering is often iterative: features are added, removed, normalized, and fine-tuned until the model achieves the results one expects from it. Traditionally, feature engineering has received great attention in machine learning. However, recent successes in the family of machine learning methods known as deep learning have rendered features a less necessary ingredient than the model architecture.

2.2.3 Deep Learning

During the past decade, neural networks have regained their once-lost popularity, which vanished in the late 1990s due to the computational cost associated with them and the rise of other successful methods, for instance support vector machines. One of the factors to which the re-emergence of neural networks can be attributed is the availability of moderately expensive hardware and software capable of processing big data. What was not possible back in the 1990s has become possible now: neural networks can be trained on large amounts of data, with comparably large sets of parameters and “deep” architectures.

The general idea behind deep learning is to build models in which, specifically in NLP, words are represented in a continuous space (following the ideas of distributional semantics). Neural networks usually have several layers, which are trained jointly to fulfill the specific task at hand. Each of these layers can be interpreted as being responsible for different subtasks on the route to the common goal. The layers of the network extract and transform features sequentially. The layers that are close to the data input extract simple features, while higher layers learn more complex features derived from the lower-layer features (Zhang et al., 2018, p. 2). Precisely because of this multi-layered structure of deep neural networks, manually designed features are of lower importance, as every layer extracts them from the input on its own.

Common network substructures for NLP include the following components: embedding layers, convolutional networks or a long short-term memory network, and a dense layer, which we discuss as examples in the following. Word embeddings transform words in a vocabulary into vectors of continuous real numbers that represent words as a function of their context and encode linguistic patterns. Word2Vec (Mikolov et al., 2013) is one popular word embedding approach that includes models for predicting a target word from its context.
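
As a minimal sketch of what predicting a target word from its context means in practice, the training examples for such a model can be enumerated from a tokenized sentence. The function name `context_target_pairs` and the `window` parameter are illustrative assumptions, not part of any actual Word2Vec implementation, and no model is trained here.

```python
def context_target_pairs(tokens, window=2):
    """Return (context_words, target_word) pairs for a list of tokens.

    `window` is a hypothetical parameter: the number of words taken
    on each side of the target as its context.
    """
    pairs = []
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]      # up to `window` words before
        right = tokens[i + 1:i + 1 + window]     # up to `window` words after
        context = left + right
        if context:                              # skip one-word inputs
            pairs.append((context, target))
    return pairs


sentence = "the cat sat on the mat".split()
pairs = context_target_pairs(sentence, window=1)
for context, target in pairs:
    print(context, "->", target)
```

In an actual embedding model, each word would be mapped to a dense vector, and pairs like these would serve as the examples from which the vectors are learned.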

Conversely, Word2Vec also includes models for predicting the context words given the target word. Dense layers combine the information received from the preceding components and often perform the final classification. Convolutional neural networks (CNN) (LeCun et al., 1989, 1998) are a special kind of neural network originally used in computer vision, inspired by the human visual cortex. Similar to the visual system, CNNs are able to detect relevant features in the input, which is processed in an “n-gram” fashion. This is achieved by using filters that detect relevant features from the input, and max pooling, an operation that extracts the most representative numeric values from the filtered features. CNNs’ ability to capture the spatial correlation of features has proved useful in the NLP context, as features important for text classification may be located in different places of the input.

Long short-term memory (LSTM) networks (Hochreiter and Schmidhuber, 1997) have a recurrent structure, interpret the input as a time series, and are capable of learning distant dependencies. In contrast to CNNs, which are limited by their filter sizes, LSTMs have a memory of more distant information. This comes at the cost of computational complexity. The trade-off between efficiency and complexity is realized in the mechanism of a “forget gate”. The gate discards irrelevant information (features) from the previously read input. This makes LSTMs efficient in learning sequential data, as discarding irrelevant features improves the prediction, which is then not biased by unimportant details.

These and other components make deep learning an efficient tool for solving many problems. However, deep learning has its limitations. First, it often requires large amounts of data to recognize helpful characteristics in the data, and such amounts are not always available, especially in certain domains. Second, deep learning algorithms are not always easily interpretable, which often makes it difficult to understand which meta-parameters of the network should be optimized for a better result.

2.3 Applications

Machine learning finds extensive application in natural language processing. The advancements in machine learning have contributed to the development of the field in recent years, both in terms of methodology and the efficiency of performing certain tasks. Most of the steps of a typical NLP pipeline we describe in Section 2.1 are performed today with the help of machine learning. The presented pipeline is rather fundamental and is often used as a part of other, larger pipelines designed for specific applications. These applications include dialogue systems, discourse analysis, document classification, text generation, text mining, machine translation, question answering, text summarization, and, finally, sentiment and emotion analysis.

With this necessary introduction to natural language processing and machine learning, we may now proceed to an overview of what sentiment and emotion analysis is and how it is performed computationally. But before that, we first need to provide a background in the emotion theories that exist in psychology and introduce the role they play in computational emotion analysis.

3 Background on Sentiment Analysis and Emotion Analysis

3.1 Affect and Emotion in Psychology

Emotion research has a long and rich tradition that followed Darwin’s 1872 publication of The Expression of the Emotions in Man and Animals (Darwin, 1872). The subject of emotion theories is so vast and diverse that it is not possible to even briefly mention all of the theories or name the prominent psychologists who contributed to emotion research throughout the nineteenth and twentieth centuries (see Gendron and Feldman Barrett (2009) for a brief history of ideas about emotion in psychology). Most emotion theories that appeared in the last century, however, fall into one of three traditions, namely basic, appraisal, and constructionist. In the pages that follow, we briefly discuss models of emotions as they are introduced in psychology. We limit these descriptions to those theories which have been used to formalize computational methods for automatic analysis in the digital humanities and natural language processing. Namely, we will introduce two theories from the basic tradition, and one theory from both the appraisal and constructionist ones.

3.1.1 Ekman’s Theory of Basic Emotions

The basic emotion theory was first articulated by Silvan Tomkins in the early 1960s (Tomkins, 1962). Inspired by Darwin’s view of emotions as mental states that cause stereotypic bodily expressions (Gendron and Feldman Barrett, 2009), Tomkins postulated that certain emotions are automatically triggered by objects or events in the world. Importantly, each episode of a certain emotion (or “instance”), Tomkins argues, is biologically similar to other instances of the same emotion or shares a common trigger. Tomkins’ own work in turn inspired one of his mentees, Paul Ekman, to formulate a new theory of emotions. Ekman called into question the existing emotion theories that postulated that facial displays of emotion are socially learned and therefore vary from culture to culture. Together with Sorenson and Friesen, Ekman (Ekman et al., 1969) embarked on a field trip to New Guinea, Borneo, the United States, Brazil, and Japan to challenge this view. The outcome of their large-scale study led to a conclusion that would revolutionize the field of psychology for many years: facial displays of fundamental emotions are not learned but innate, and are therefore universal across nationalities. However, there are culture-specific prescriptions about how and in which situations emotions are displayed.

To arrive at this definition of basic emotions, Ekman et al. select 30 photographs of adult males and females, children, professional actors, and mental patients. The photographs are selected in such a way that the portrayed faces express one of the six basic affects from Tomkins’ list of affects, excluding interest and shame, namely anger, fear, disgust, surprise, sadness, and happiness. The selection of affects is based on previous research (Ekman et al., 1971)¹ that finds that facial expressions pertaining to these emotions are clearly identifiable and can be scored by observers. The selected pictures are then shown to the participants of the study along with the list of six basic affects. The observers’ task is to categorize each picture into one of the six categories.

¹ In press at the time of publication of the Ekman et al. (1969) study.

Ekman’s research boosted interest in emotion and brought forth new challenges and questions about the nature of the emotions. In his subsequent studies, Ekman showed that both nature and nurture must be considered in the study of emotions (Ekman, 1971, 1992) and that facial expressions of emotions, even when produced voluntarily, generate the physiology and some of the subjective feelings pertaining to the true emotional experience (Ekman et al., 1983). The latter findings gave way to a new line of research in the biology of emotions studying the emotion-specific changes in physiology.

Based on the observation of facial behavior in early development or social interaction, Ekman’s theory also postulates that emotions should be considered discrete categories (Ekman, 1993), rather than dimensional. Though this view allows for conceiving of emotions as having different intensities (for example, anger can range in intensity from resentment to rage), it does not allow emotions to blend and leaves no room for more complex affective states in which individuals report the co-occurrence of like-valenced discrete emotions (Barrett, 1998). This and other postulates of the theory were widely criticized and disputed in the literature (cf. Russell (1994), Russell et al. (2003), Gendron et al. (2014), Barrett (2017)).

Regardless of the criticism that Ekman’s theory of basic emotions has undergone in recent years, the theory itself, as well as its methodology, was revolutionary at the time of its appearance and continued to shape emotion research in the late twentieth century. Ekman’s categories of basic emotions are frequently used in research on computational facial emotion recognition (e.g., Essa and Pentland (1997), Pantic and Rothkrantz (2000), Bartlett et al. (2005)) as well as in emotion recognition from text.

3.1.2 Plutchik’s Wheel of Emotions

Robert Plutchik was an American psychologist and a professor of psychiatry at the Albert Einstein College of Medicine, who contributed to the study of emotions, violence, and suicide². In the early 1980s he formulated his psychoevolutionary theory of emotions (Plutchik, 1991, cited from the revised version) together with the postulates that shape it, some of which overlap with the assumptions of Ekman’s theory (that there is a small number of basic emotions, which differ from each other both in physiology and behavior, and which can exist in varying degrees of intensity). However, there are important differences from Ekman’s study of emotions.

² Based on the information from https://www.the-emotions.com/robert-plutchik.html

First and foremost, Plutchik stated that, apart from a small set of basic emotions, all other emotions are mixed and derived from various combinations of basic ones. He further categorized these other emotions into primary dyads (very likely to co-occur), secondary dyads (less likely to co-occur), and tertiary dyads (seldom co-occur) (Plutchik, 1991, p. 117). Love, for instance, is a primary-dyad emotion derived from both joy and
trust (the same applies to friendship). Delight is an example of a secondary-dyad emotion, which takes a little bit from both joy and surprise. Finally, guilt is a tertiary-dyad emotion, a mixture of fear and joy. Some other examples of blended emotions are optimism (anticipation + joy), aggression (anticipation + anger), shame (fear + disgust), and envy (sadness + anger). Plutchik argues that most of our daily emotions are mixed, while primary emotions almost never exist in their pure form. More importantly, to Plutchik, mixed emotions are the actual personality traits. He writes that “Emotions like pride, aggression, submission, and optimism are usually long-lasting, and in fact are often called personality traits.” (Plutchik, 1991, p. 120) and later concludes that “persisting situations which produce mixed emotions produce personality traits” (Plutchik, 1991, p. 121). In other words, a conflict between two or more emotions produces a new, unique personality trait or attitude, which persists over time.

The second radical difference between Plutchik’s emotion theory and the basic emotion theory of Ekman is that emotion is not reduced to physiology only. Plutchik believes that humans recognize and express emotions not with any one particular physiological signal, but in terms of overall behavior. Hence, he claims, we should study emotions through behavior and not by using bodily measurements. Plutchik writes: “Emotion is not a thing in the sense as table or chair is” (Plutchik, 1991, p. 50). For Plutchik, emotion is “a patterned bodily reaction of destruction, reproduction, [. . . ] brought about by a stimulus.” (Plutchik, 1991, p. 151), and its properties can only be inferred, but not measured. Like Ekman, Plutchik considers emotions to be innate, but this innateness has nothing to do with certain body parts or neural structures. Emotions are mere adaptive devices inherited by an individual from the process of evolution and the struggle for survival. In this sense, adaptive behavior comes first, and emotion follows. Evolution taught us to explore, protect, reproduce, reject, and destroy, and emotions are evolutionary devices that have relevance to basic biological adaptive processes.

In order to represent the organization and properties of the emotions as they were defined by his psychoevolutionary theory, Plutchik proposed a structural model of emotions, which he called a multidimensional model of emotions and which is better known today as Plutchik’s wheel of emotions (Plutchik, 1991). The wheel (Figure 2) is constructed in the fashion of a color wheel, with similar emotions placed closer together and opposite emotions 180 degrees apart. The wheel is designed as a cone, where the vertical dimension indicates the intensity, ranging from maximum intensity at the top to a state of deep sleep at the bottom. Such a shape implies that emotions become less distinguishable at lower levels of intensity. Essentially, the wheel is constructed from eight basic bipolar emotions: joy versus sorrow, anger versus fear, trust versus disgust, and surprise versus anticipation. The blank spaces between the leaves are so-called primary dyads, emotions that are mixtures of two of the primary emotions.

Figure 2: Plutchik’s wheel of emotions

Just as Ekman’s theory of basic emotions influenced the research in facial emotion recognition, the wheel model of emotions proposed by Plutchik has also had a great impact on the field of affective computing. However, in contrast to Ekman’s model, Plutchik’s wheel of emotions is primarily used in emotion recognition from text as a basis for emotion categorization (some examples are Cambria et al. (2012), Kim et al. (2012), Suttles and Ide (2013), Borth et al. (2013), Mohammad and Turney (2013), Abdul-Mageed and Ungar (2017)).

3.1.3 Russell’s Circumplex Model

Despite wide popularity and influence, the theory of basic emotions elaborated in detail by Ekman is challenged by some theoretical and empirical difficulties associated with it. The main objection raised to
the theory of basic emotions is that there are no reliable neural, physiological, and facial correlates of specific basic emotions (Posner et al., 2005), which essentially challenges the idea of innate, and hence “universal”, emotions. At the same time, investigations into the subjective experience of emotions suggest that they arise from cognitive interpretations of physiological experiences (Cacioppo et al., 2000). Attempts to overcome the shortcomings of basic emotion theory and its unfitness for clinical studies led researchers to suggest various dimensional models, the most prominent of which is the circumplex model of affect proposed by James Russell (Russell, 1980). The word “circumplex” in the name of the model refers to the fact that emotional episodes do not cluster at the axes but at the periphery of a circle (Figure 3).

Figure 3: Circumplex model of affect. The horizontal axis represents the valence dimension; the vertical axis represents the arousal dimension.

At the core of the circumplex model is the notion of two dimensions plotted on a circle along horizontal and vertical axes. These dimensions are valence (how pleasant or unpleasant one feels) and arousal (the degree of calmness or excitement). The number of dimensions is not strictly fixed, and there are adaptations of the model that incorporate more dimensions, such as the Valence-Arousal-Dominance model, which adds a dimension of dominance, the degree of control one feels over the situation that causes an emotion (Bradley and Lang, 1994).

Essentially, by moving from discrete categories to a dimensional representation, researchers are able to account for subjective experiences that do not fit neatly into isolated, non-overlapping categories. Accordingly, each affective experience can be depicted as a point in a circumplex that is described by only two parameters, valence and arousal, without the need for labeling or reference to folk emotional concepts (Russell, 2003). However, the strengths of the model turned out to be its weaknesses: for example, it is not clear whether there are basic dimensions in the model (Larsen and Diener, 1992) or what to do with qualitatively different events of fear, anger, embarrassment, and disgust that fall in identical places in the circumplex structure (Russell and Barrett, 1999). Despite these shortcomings, the circumplex model of affect is widely used in psychological and psycholinguistic studies. In computational linguistics, the circumplex model is applied when the interest is in continuous measurements of valence and arousal rather than in specific discrete emotional categories.

3.2 Emotion Analysis in Classical Literary Studies

Until the end of the twentieth century, literary and art theories often disregarded the importance of the aesthetic and affective dimension of literature, which in part stemmed from the rejection of old-fashioned literary history that had explained the meaning of art works by the biography of the author (Sætre et al., 2014a). However, the affective turn taken by a wide range of disciplines in the past two decades (from political and sociological sciences to neurosciences to media studies) has refueled the interest of literary critics in human affects and sentiments.

We already mentioned several works that explore the link between the arts and emotions in the Introduction. In this section, we will talk about several other studies that focus on the emotions expressed in literary art form, to lay the ground for further discussion of differences between classical and computational approaches to theorizing about emotions.

We said earlier that there seems to be a consensus among literary critics that literary art and emotions go hand in hand. However, one might be challenged to define the specific way in which emotions come into play in the text. An exploration of this problem is presented by van Meel (1995). Underscoring the centrality of human destiny, hopes, and feelings in the themes of many artworks, from painting to literature, van Meel explores how
emotions are involved in the production of arts. Pointing to the big differences between the two media in their capacity to depict human emotions (paintings convey nonverbal behavior directly, but lack the temporal dimension that novels have and use to describe emotions), van Meel provides an analysis of the nonverbal descriptions used by writers to convey the emotional behavior of the characters. Description of visual characteristics, van Meel speculates, responds to a fundamental need of a reader to build an image of a person and her behavior. Moreover, nonverbal descriptions add important information, which can in some cases play a crucial hermeneutical role, as in Kafka’s Der Prozess, where the fatal decisions for K. are made clear by gestures rather than words. However, gestures are not the only nonverbal channels that are used to convey emotions in literature. Van Meel defines eight channels (bodily characteristics, clothing, facial expressions, looking behavior, hand gestures, movements of the body, voice, and spatial relations) and offers a small-scale quantitative systematic analysis of their use in literature (on a sample of six twentieth-century “classics”). The analysis shows that the voice category was the most frequently used, followed by facial expressions and hand gestures. The results, van Meel suggests, show that such types of analysis could contribute to unraveling the hidden presuppositions about inner life and its outer appearance, and can help in reconstructing the emotional universe of individual writers and historical periods.

A hermeneutic approach through the lens of emotions is presented by Kuivalainen (2009), who provides a detailed analysis of linguistic features that contribute to characters’ emotional involvement in Mansfield’s prose. The study shows how, through the extensive use of adjectives, adverbs, deictic markers, and orthography, Mansfield steers the reader towards the protagonist’s climax. Subtly shifting between psycho-narration and free indirect discourse, Mansfield makes use of evaluative and emotive descriptors in psycho-narrative sections, often marking the internal discourse with dashes, exclamation marks, intensifiers, and repetition, which triggers an emotional climax. Various deictic features introduced in the text are used to pinpoint the source of emotions in the text, which helps in creating a picture of the characters’ emotional world. Verbs (especially in the present tense), adjectives, and adverbs serve the same goal in Mansfield’s prose, to describe the emotional world of characters. Going back and forth from psycho-narration to free indirect discourse provides Mansfield with a tool to point out the significant moments in the protagonists’ lives and draw a separation between characters and narration.

Both van Meel’s and Kuivalainen’s works, separated from each other by more than a decade, underscore the importance of emotional language in the interpretation of characters’ traits, hopes, and tragedy, and this view in fact finds empirical support, for example in Barton (1996) and Van Horn (1997). Of course, the power of linguistic tools in conveying emotions should not be underestimated. But at the same time their role in the creation and depiction of emotion should not be overestimated. That is, saying that someone looked angry or fearful or sad, as well as directly expressing characters’ emotions, are not the only ways authors resort to when building a believable fictional space filled with characters, action, and emotions. In fact, many novelists strived to express emotions indirectly by way of figures of speech or catachresis (Hillis Miller, 2014), first of all, because emotional language can be ambiguous and vague, and, second, to avoid any allusions to Victorian emotionalism and pathos.

How can an author convey emotions indirectly? A book chapter by Hillis Miller (2014) in Exploring Text and Emotions (Sætre et al., 2014b) seeks the answer to exactly this question. Using the opening scenes of Conrad’s Nostromo as material, Hillis Miller shows how Conrad’s descriptions of an imaginary space generate emotions in readers without direct communication of emotions.

The opening chapter of Conrad’s Nostromo is an objective description of Sulaco, an imaginary land. The description is mainly topographical and includes occasional architectural metaphors, but it combines wide expanse with hermetically sealed enclosure, which generates “depthless emotional detachment” (Hillis Miller, 2014, p. 93). Through the use of the present tense, Conrad suggests to the readers that the whole scene is timeless and does not change. The topographical descriptions are given in a purely materialist way: there is nothing behind clouds, mountains, rocks, and sea that would matter to humankind, not a single feature of the landscape is personified, not a single topographical shape is symbolic. Knowingly or unknowingly, the author argues, but by telling the reader what she should see – with no deviations from truth – Conrad em-
