Dialogsysteme mit Gefühlen: Wie, warum und wohin? - Dr. Felix Burkhardt Research director, audEERING GmbH - ITG-Workshop Sprachassistenten

Page created by Warren Patel
 
CONTINUE READING
Dialogsysteme mit Gefühlen: Wie, warum und wohin? - Dr. Felix Burkhardt Research director, audEERING GmbH - ITG-Workshop Sprachassistenten
Dialogsysteme mit
Gefühlen: Wie, warum
    und wohin?
           Dr. Felix Burkhardt
   Research director, audEERING GmbH
Dialogsysteme mit Gefühlen: Wie, warum und wohin? - Dr. Felix Burkhardt Research director, audEERING GmbH - ITG-Workshop Sprachassistenten
Outlook

• General motivation for emotional HMI

• How to model emotions related states

• Recognition

• Simulation

• Applications

• Ethical considerations

• Market

   3.3.2020, ITG Workshop Magdeburg   Dialogsysteme mit Gefühlen: Wie, Warum und Wohin?
Dialogsysteme mit Gefühlen: Wie, warum und wohin? - Dr. Felix Burkhardt Research director, audEERING GmbH - ITG-Workshop Sprachassistenten
Trends

• Ubiquitous computing accessible via
 a) Smart mobile devices: phones, glasses, watches,
 t-shirts, implants, etc.
 b) Home automation: central intelligence controlling media,
 communication, environment
 c) Aging society gets supported by technological interfaces
 d) Big data, faster hardware and new algorithms: DNNs

• Uses natural interface: voice, gestures, wearables, …
• Gets much nearer to user, unobtrusive
• Will be emotional because it‘s easier: emotion expression is a channel of
  communication, e.g. urgency, irony, ...

    3.3.2020, ITG Workshop Magdeburg   Dialogsysteme mit Gefühlen: Wie, Warum und Wohin?
Dialogsysteme mit Gefühlen: Wie, warum und wohin? - Dr. Felix Burkhardt Research director, audEERING GmbH - ITG-Workshop Sprachassistenten
Emotions and intelligence

• Antonio Damasio demonstrated that emotions
  are central to the life-regulating processes of
  almost all living creatures.
• E.g. brain injuries specific to emotional
  processing robbed people of their capacity to
  make decisions
• Emotions help to react fast, be social, be
  motivated etc.
• In opposition to Descartes, body and mind are
  not separated
„The question is not whether intelligent
machines can have any emotions, but whether
machines can be intelligent without any
emotions,“ - Marvin Minsky

    3.3.2020, ITG Workshop Magdeburg   Dialogsysteme mit Gefühlen: Wie, Warum und Wohin?
Dialogsysteme mit Gefühlen: Wie, warum und wohin? - Dr. Felix Burkhardt Research director, audEERING GmbH - ITG-Workshop Sprachassistenten
How to model emotions: categories

• …everyone except a psychologist knows
  what an emotion is (Young 1973)
• Charles Darwin: The Expression of the
  Emotions in Man and Animals
• The big four:
 •   Anger
 •   Sadness
 •   Joy
 •   Fear                                                              Emotions as characters in Pixar‘s „Inside
• Needed to survive and „culturally universal“                         Out“ (anger, fear, joy, disgust, sadness)
• Many more catgorical models exist, e.g.
  Ekman‘s six or Plutchik‘s emotion weel

      3.3.2020, ITG Workshop Magdeburg   Dialogsysteme mit Gefühlen: Wie, Warum und Wohin?
Dialogsysteme mit Gefühlen: Wie, warum und wohin? - Dr. Felix Burkhardt Research director, audEERING GmbH - ITG-Workshop Sprachassistenten
Dimensional models

• Dimensions consider an emotion as a
  point in an n-dimensional emotion
  space.
• One of the most well-known spaces is
  the PAD-space:
   • Pleasure (valence)
   • Arousal (activation)
   • Dominance
• Specific dimensions are better
  recognized by different modalities, e.g.
  activation in the speech but valence in
  the mimics

    3.3.2020, ITG Workshop Magdeburg   Dialogsysteme mit Gefühlen: Wie, Warum und Wohin?
Dialogsysteme mit Gefühlen: Wie, warum und wohin? - Dr. Felix Burkhardt Research director, audEERING GmbH - ITG-Workshop Sprachassistenten
Panksepp‘s seven primal Emotions

• Jaak Panksepp was a neuro-scientist who suggested
   seven emotion categories in men and animals that
   can be localized in the brain.
• Search (anticipation, desire)
• Rage ((frustration, body surface irritation, restraint,
  indignation)
• Fear (pain, threat, foreboding)
• Panic/Loss ((separation distress, social loss, grief,
  loneliness)
• Play ((rough-and tumble carefree play, joy)
• Lust (copulation, mating)
• Care ((maternal nurturance)

    3.3.2020, ITG Workshop Magdeburg   Dialogsysteme mit Gefühlen: Wie, Warum und Wohin?
Dialogsysteme mit Gefühlen: Wie, warum und wohin? - Dr. Felix Burkhardt Research director, audEERING GmbH - ITG-Workshop Sprachassistenten
Appraisal theory

• Appraisal theory means that emotions
  are extracted from our evaluations
  (appraisals or estimates) of events
  that cause specific reactions in
  different people.
• E.g. Scherer's multi-level sequential
  check model
• Three levels of processing are: innate
  (sensory-motor), learned (schema-
  based), and deliberate (conceptual)                                          Source: https://en.wikipedia.org/wiki/Appraisal_theory

  3.3.2020, ITG Workshop Magdeburg   Dialogsysteme mit Gefühlen: Wie, Warum und Wohin?
Dialogsysteme mit Gefühlen: Wie, warum und wohin? - Dr. Felix Burkhardt Research director, audEERING GmbH - ITG-Workshop Sprachassistenten
How are emotions expressed: modalities

• User introspection: e.g. Emoticon, press button etc
• Text: sentiment analysis
• Audio: speech, extralinguistics
• Video: facial expression, gestures, posture
• Physiology: respiration rate, blood pressure, skin conductivity,
 neuronal activity, speech (held vowels)
• Behaviour, e.g. switched room often, typing speed
• Context: localization, weather, time of day, other people‘s
 moods etc.
   3.3.2020, ITG Workshop Magdeburg   Dialogsysteme mit Gefühlen: Wie, Warum und Wohin?
Dialogsysteme mit Gefühlen: Wie, warum und wohin? - Dr. Felix Burkhardt Research director, audEERING GmbH - ITG-Workshop Sprachassistenten
Training and evaluation data

• Ideally from the application
• From an application similar to the
  target
• From Wizard of Oz scenario
• From field recordings (e.g. VAM)
• From induced emotions („Lost
  luggage“, „Aibo“)
• From actors
                                                                       Felix Burkhardt, Astrid Paeschke, Miriam Rolfes, Walther F.
                                                                       Sendlmeier and Benjamin Weiss: A Database of German
                                                                       Emotional Speech, Proc. Interspeech 2005

   3.3.2020, ITG Workshop Magdeburg   Dialogsysteme mit Gefühlen: Wie, Warum und Wohin?
Ground truth and gold standards

• Five human labelers annotated the emotional
  content of textual data using four categories.                        labeler A     labeler B     labeler C     labeler D     labeler E     majority     machine

• A machine algorithm did the same                          labeler A          1,00          0,20          0,19          0,10          0,24         0,27        0,15

  classification.                                           labeler B                        1,00          0,79          0,46          0,15         0,81        0,15
• “majority” means the majority voting of the               labeler C                                      1,00          0,47          0,19         0,83        0,14
  human labelers.
                                                            labeler D                                                    1,00          0,09         0,52        0,07
• The chart shows the Cohen’s kappa values
  for the so-called “inter rater agreement”, i.e.           labeler E                                                                  1,00         0,29        0,10
  how much each rater agrees with all other                 majority                                                                                1,00        0,17
  raters.
                                                            machine                                                                                             1,00
• EWE (evaluator weighted estimator) is a
  possibility to weight labelers according to
  their inter rater agreement

    3.3.2020, ITG Workshop Magdeburg    Dialogsysteme mit Gefühlen: Wie, Warum und Wohin?
Recognition by statistical classification

• Basic approach:                   Early Fusion
  • extract features,
  • select best ones,
  • classify features,
  • fuse classifier outputs (unimodal/multimodal)
• Classifiers: Gaussian Mixture Models: model training
  data as Gaussian densities, Artificial Neural Networks                                  (Late)
  (ANN), e.g. Multi Layer Perceptron, Support Vector
  Machines (SVM): use „kernel functions“ to separate
  non-linear decision boundaries, Classification and
  Regression Trees (CART), Hidden Markov Models
  (HMMs) used to model temporal structure

   3.3.2020, ITG Workshop Magdeburg   Dialogsysteme mit Gefühlen: Wie, Warum und Wohin?
Recognition by Deep Neural Networks


  Tremendous success during the last                           a)
  decade

  Three reasons: more data , faster
  hardware, new algorithms

  Can work end-to-end, no feature
  engineering

  Can be used for analysis and synthesis

  Can learn from unlabeled data                                  b)

  BUT:
  
    needs lots of data (does it?)
  
    Is lazy → Explainabilty

    3.3.2020, ITG Workshop Magdeburg   Dialogsysteme mit Gefühlen: Wie, Warum und Wohin?
Deep Learning II

• Data Augmentation
• Transfer Learning
• Synthetic Training
 
     GANs
   
     Autoencoder
• Unsupervised
  learning
• Reinforcement                                                                         Images:
                                                                                        https://towardsdatascience.com

  learning                           source:
                                     medium.com
  3.3.2020, ITG Workshop Magdeburg                Dialogsysteme mit Gefühlen: Wie, Warum und Wohin?
Speech Synthesis

 • With deep learning using
      
           Embeddings
      
           Style tokens                                                                                         Source: https://towardsdatascience.com/neural-network-embeddings-explained-
                                                                                                                4d028e6f0526

Source: https://ai.googleblog.com/2017/12/tacotron-2-generating-human-like-
speech.html

                                                                              Source: https://www.researchgate.net/publication/335601425_Comic-
                                                                              Guided_Speech_Synthesis
Emofilt

• Emofilt is a Java tool to transform the
  prosody of a given utterance in order to
  simulate emotional expression
• It is based on Mbrola for speech
  generation and an arbitrary phonemization
  generator like MARY or Txt2Pho
• Mbrola is a diphone synthesizer from the
  University of Mons with databases for 34
  languages
The storyteller

• Used emofilt to “emotionalize” a fairytale
• Asked 12 schoolkids for both versions
         How they like the story
         How they like the speaker
         How many facts they remember

• F. Burkhardt: “An Affective Spoken Story
  Teller”, Interspeech, 2011

   3.3.2020, ITG Workshop Magdeburg   Dialogsysteme mit Gefühlen: Wie, Warum und Wohin?
W3C recommendation for emotion annotation
• As the Web is becoming ubiquitous, interactive, and
  multimodal, technology needs to deal increasingly with
  human factors, including emotions.
• The specification of Emotion Markup Language 1.0
  aims to strike a balance between practical applicability
  and scientific well-foundedness.
• The language is conceived as a "plug-in" language
  suitable for use in three different areas:
• manual annotation of data
• automatic recognition of emotion-related states from
  user behavior
• generation of emotion-related system behavior

                                                                                https://www.w3.org/TR/emotionml/

     3.3.2020, ITG Workshop Magdeburg          Dialogsysteme mit Gefühlen: Wie, Warum und Wohin?
Five types of applications

a) Mediated emotion
b) Affect recognition
c) Affect simulation
d) Modeling emotional intelligence
e) Modeling human emotional
  behavior

         A. Batliner, F. Burkhardt, M. van Ballegooy, E. Nöth: A Taxonomy of Applications that Utilize Emotional Awareness, Proc. IS-LTC
         2006
   3.3.2020, ITG Workshop Magdeburg              Dialogsysteme mit Gefühlen: Wie, Warum und Wohin?
Conciliation strategies

• The price question in HMI dialogs is how to
  react on sensed emotion expression
• Remember emotion implies intelligence
• Human agents are expensive
• Current lack of world knowledge prevents
  many of human strategies

   Literature: F. Burkhardt, K.P. Engelbrecht, M. van Ballegooy, T. Polzehl and J. Stegmann: Emotion Detection in Dialog Systems - Usecases, Strategies and Challenges. ACII
   2009

   3.3.2020, ITG Workshop Magdeburg                           Dialogsysteme mit Gefühlen: Wie, Warum und Wohin?
Applications irrespective of dialog strategies

• Irony / Sarcasm detection: To analyze user opinion this is still a big
  problem.
• Call center support: distribute aggressive callers, support training.
• Automated dialog support: Anger detection can be used for churn
  prevention or for automatic quality monitoring.
• Emotional Chat: Facilitate emotional computer mediated communication.
• Emotion-aware Surrounding: Computer controlled environment that
  adapts automatically on the user’s mood.
• Search–by-emotion, e.g. Entertain product.
• Believable Agent: The naturalness of an artificial ‘being’ and the
  appearance of intelligence is highly altered by emotional expressions.
• Artificial intelligence models, use emotions for motivation modeling

   3.3.2020, ITG Workshop Magdeburg   Dialogsysteme mit Gefühlen: Wie, Warum und Wohin?
The anger monitor

• Recorded and annotated anger data in office
  and at home
• Machine chunks and classifies continuously
  and warns when anger is detected
• Household with two teenagers
• Cross database training not successful

• F. Burkhardt: "You Seem Aggressive! -
  Monitoring Anger in a Practical Application”,
  LREC, 2012

   3.3.2020, ITG Workshop Magdeburg   Dialogsysteme mit Gefühlen: Wie, Warum und Wohin?
Ethical implications

• Emotional expression can be
           a) voluntary (communicative) or
           b) involuntary
• In a way, emotional expression of type b) reveals our
  “inner thoughts”
• There are cases were people decide voluntary to detect
  their “subconscious” emotions (e.g. therapy)
• It is not in any case of the interest of humans for these
  to be revealed, but what if an ethically cgorrect
  procedure is facilitated by AI, e.g. fraud detection?
• General: keep the user in control
Emerging standard: https://sagroups.ieee.org/7014/
“This standard defines "empathic technologies" as affect-sensitive technologies
employed to algorithmically infer, model and simulate understanding of emotions,
feelings, moods, perspective, attention and intention. Data insights and actions taken in
response to these automated inferences typically, but not always, inform future
interactions between a person or group and system (or between systems).”
   3.3.2020, ITG Workshop Magdeburg                   Dialogsysteme mit Gefühlen: Wie, Warum und Wohin?
Market overview Q3 2017                              Audio recognition                          Emotions from face

             Sentiment from
             text

                                                                                       Emotional
3.3.2020, ITG Workshop Magdeburg       Emotional
                                   Dialogsysteme      systems
                                                 mit Gefühlen: Wie, Warum und Wohin?
                                                                                       biosignals
Wrap up

• Emotional processing comes with pervasive
  computing
• It can be used with intuitive interfaces, more natural
  mediated communication and sophisticated AI
  models
• It is ethically very sensitive
• Emotional categories contrast with more complex
  models, but the nature of emotion is dictated by
  application
• The market is growing, all the big players are already
  there
• Deep Learning is a big driver

   3.3.2020, ITG Workshop Magdeburg   Dialogsysteme mit Gefühlen: Wie, Warum und Wohin?
You can also read