IAAA / PSTALN Dialog systems - Benoit Favre, last generated on January 20, 2020 - lab session page
IAAA / PSTALN Dialog systems Benoit Favre Aix-Marseille Université, LIS/CNRS last generated on January 20, 2020 Benoit Favre (AMU) PSTALN: Dialog January 20, 2020 1 / 35
What is a dialog system?
Definition:
▶ Input/output with natural language
▶ Free use of language
▶ Reproduces human agent behavior
▶ Reply (or not) in natural language
▶ Uni/multimodal
Spoken Dialog System (SDS)
▶ Interactive system with spoken language
▶ Required when using an acoustic communication channel only (phone)
▶ Can free other modalities (hands free)
Difficulty
▶ No control of inputs
▶ Contextualize information
▶ Automatic speech recognition → transcript errors
Examples
Online discussion
▶ Eliza, virtual therapy
  ⋆ https://www.eclecticenergies.com/ego/eliza
▶ Mitsuku (best chatbot at the Loebner Prize 2013)
  ⋆ http://www.mitsuku.com/
Automated voice services
▶ “To erase a message, say erase..."
Customer care
▶ 1013 (in France): describe your problem freely
▶ Air Travel Information System (ATIS)
Assistants
▶ Clippy
▶ Siri
Learning with an intelligent agent
▶ Replaces teacher in MOOC
Interaction with a robot
▶ Nao (http://www.aldebaran-robotics.com)
▶ Caroline Lyon, Chrystopher L. Nehaniv, Joe Saunders, Interactive Language Learning by Robots: The Transition from Babbling to Word Forms, PLoS One, 2012
Problems
What’s a dialog?
▶ Study human behavior
How to understand a sentence?
▶ Rule-based and template-based system
▶ Robust concept detection
What strategies make a successful dialog?
▶ Enforce local coherency
▶ Finite state machine
▶ Explore possible futures
How to formulate an answer?
▶ Language and speech synthesis
Corpus-based study
Human-human dialogues
▶ Language in the wild
▶ Want to build a program that replicates human behavior
▶ Difficult to ignore non-verbal interactions
▶ Humans behave differently in front of a system
Wizard of Oz (WoZ)
▶ Replace the system by a human operator
▶ Make the user believe that it is a real system (simulate errors / wait time...)
▶ Collect users' dialog strategies
Human-machine dialog
▶ Collect interactions with an existing system
▶ Use these data to evaluate / improve the system
Simulation
▶ From a user model, simulate human input
▶ Detect infinite loops
▶ Estimate resolution time
▶ Train system parameters
Human dialogs
Sequence of spoken turns
▶ Between two or more people
Who speaks after whom?
▶ Interruptions
▶ Finishing someone else’s sentence
▶ Overlap
Establishing common ground
▶ Use of a common vocabulary
▶ Acquiescence, reformulation, convergence
Speech acts
Theory (Austin & Searle)
▶ Meaning can be expressed in terms of actions instead of concepts (declarative logical form)
Examples:
▶ “I apologize"
▶ “Can you do this?"
Types of dialog acts
▶ verdictives or judicial acts (acquit, condemn, decree)
▶ exercitives (demote, command, order, pardon, bequeath)
▶ commissives (promise, vow, guarantee, bet, swear to)
▶ behabitives (apologize, thank, deplore, criticize)
▶ expositives (affirm, deny, postulate, remark)
Cooperation principle (Grice)
For a conversation to be successful, speakers must cooperate:
▶ Quantity: give the right amount of information
▶ Quality: tell the truth
▶ Relevance: say important things
▶ Manner: be clear, brief and structured
Example (Mitkov - Computational Linguistics)
▶ Quantity: Marie ate some chocolate → Marie did not eat all the chocolate
▶ Quality: (about an invoice) It costs an arm and a leg → It costs a lot
▶ Relevance: A: Can I watch TV? B: It’s bath time.
▶ Manner: Are you ready? vs Are you ready or are you not ready?
Integration in a dialogue system:
▶ Users follow them if they have something to gain
▶ Should the system follow these principles?
Dialog acts (DA)
Dialog acts: a specialization of speech acts
Subdivision of a speaking turn into “intentions"
▶ Question
  ⋆ Closed / open questions
▶ Declaration
  ⋆ Short answers
▶ Disfluencies
  ⋆ Repetitions
  ⋆ Wrong pronunciation
▶ Interruption
▶ Filled pauses
▶ MRDA / DAMSL: specification of more than 160 types
Task-oriented dialogue act categories
▶ Greetings
▶ Opening
▶ Negotiation
▶ Closing
▶ Good-bye
Anatomy of a dialog system
[Figure: processing pipeline. Question → automatic transcription (words) → syntactic analysis (syntactic tree) → semantic analysis (concepts, relations) → dialog manager (logical representation) → syntactic generation (primitive syntax) → lexical generation (words, prosody) → speech synthesis → answer]
Comprehension
▶ Dialogue acts
▶ Grammar
▶ Concepts
Dialog management
▶ Matching
▶ Finite state machines
▶ Exploration
Generation
▶ Templates
▶ Statistical generation
DA classification
Model domain dialog acts (politeness, commands, information, ...)
Corpus: (15651 instances, 66 classes)
▶ “je n’ai plus de tonalité sur ma ligne" (I no longer have a dial tone on my line) → interruption_ligne
▶ “j’ai un problème suite euh déménagement" (I have a problem following, uh, moving house) → mise_en_service
▶ “euh téléphone illimité en panne" (uh, unlimited phone plan out of order) → internet_voip
▶ “le clignotant reste toujours allumé" (the blinking light always stays on) → messagerie_vocale
Classifier
1 Extract word n-grams
  ⋆ je je_n'ai je_n'ai_plus n'ai_plus_de plus_de_tonalité...
2 Use them as features of a classifier (results from mlcomp.org)
  ⋆ 0.207 SMO_weka_nominal
  ⋆ 0.216 boostexter
  ⋆ 0.232 sgd-logistic-stepsize0.3-iter5
  ⋆ 0.272 liblinear-s6-B1
3 Most relevant features
  ⋆ “plus de tonalité"
  ⋆ “sur ma ligne"
  ⋆ “une autre demande"
  ⋆ “ai un problème"
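The n-gram extraction step can be sketched in a few lines of Python. This is only an illustration of the feature format shown on the slide (words joined with underscores); the function names `word_ngrams` and `featurize` and the choice of `max_n=3` are assumptions, and the classifier itself is left to whichever toolkit is used.

```python
from collections import Counter

def word_ngrams(utterance, max_n=3):
    """Return all word n-grams (n = 1..max_n), joined with underscores."""
    words = utterance.split()
    grams = []
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            grams.append("_".join(words[i:i + n]))
    return grams

def featurize(utterance, max_n=3):
    """Bag of n-grams: maps each n-gram to its count in the utterance."""
    return Counter(word_ngrams(utterance, max_n))

# Example utterance taken from the slide.
features = featurize("je n'ai plus de tonalité sur ma ligne")
```

The resulting counts can be fed to any linear classifier as sparse features.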
Grammars
Extension of classic syntagmatic grammars
▶ Integrate domain semantics into grammar
[Figure: parse tree. Request → Action (“Je voudrais") + Travel; Travel → Mean (“un train"), Departure (“de Paris"), Arrival (“à Marseille"), Time (“dans l'après-midi")]
Grammar example:
  Request -> $Action $Travel
  Travel -> $Mean $Departure $Arrival $Time
  Action -> I would like | I would have to | have you got
  Mean -> a train | a flight | a plane
  Departure -> from $City | starting from $City
  Arrival -> to $City | arriving at $City
  Time -> in the afternoon | today | tomorrow | at $Hour | the $Date
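To make the grammar concrete, here is a minimal recognizer sketch in Python. It encodes the rules of the slide as a dictionary and checks whether a sentence can be derived from `Request`. Assumptions: the slide does not define `$City`, `$Hour` or `$Date`, so `City` is given two illustrative values and the `at $Hour` / `the $Date` alternatives are left out; `Time` is used as the name of the last rule.

```python
GRAMMAR = {
    "Request": [["$Action", "$Travel"]],
    "Travel": [["$Mean", "$Departure", "$Arrival", "$Time"]],
    "Action": [["I", "would", "like"], ["I", "would", "have", "to"],
               ["have", "you", "got"]],
    "Mean": [["a", "train"], ["a", "flight"], ["a", "plane"]],
    "Departure": [["from", "$City"], ["starting", "from", "$City"]],
    "Arrival": [["to", "$City"], ["arriving", "at", "$City"]],
    "Time": [["in", "the", "afternoon"], ["today"], ["tomorrow"]],
    # Assumption: not defined on the slide, two cities for illustration only.
    "City": [["Paris"], ["Marseille"]],
}

def match(symbol, words, start):
    """Return the set of positions where `symbol` can end if it starts at `start`."""
    ends = set()
    for alternative in GRAMMAR[symbol]:
        positions = {start}
        for token in alternative:
            next_positions = set()
            for pos in positions:
                if token.startswith("$"):
                    # Non-terminal: expand it recursively from this position.
                    next_positions |= match(token[1:], words, pos)
                elif pos < len(words) and words[pos] == token:
                    # Terminal: must equal the current word.
                    next_positions.add(pos + 1)
            positions = next_positions
        ends |= positions
    return ends

def accepts(sentence):
    """True if the whole sentence can be derived from Request."""
    words = sentence.split()
    return len(words) in match("Request", words, 0)

ok = accepts("I would like a train from Paris to Marseille in the afternoon")
incomplete = accepts("I would like a train from Paris")
```

Because the non-terminals carry domain meaning (Departure, Arrival, Time), a successful match directly yields the semantic roles of each phrase, which is the point of integrating semantics into the grammar.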
Concept detection
Example: from Media corpus (WoZ, hotel reservation)
▶ Detect triplets (modality, type, value) linked to a task
▶ heu [+:reponse:oui oui] [+:connect:opposition mais] j’aimerai d’abord savoir si le [+:hotel:Ibis ibis] il y a un [?:service:jacuzzi jacuzzi] et une [?:service:piscine piscine] si ils acceptent [?:service:animaux les chiens]
Formulate as sequence prediction
▶ BIO formalism (begin-inside-outside)
▶ Model type HMM/RNN/CRF
▶ Features on words, pos-tags...
  ...          ...
  occasion     O
  de           O
  la           B-evenement
  fête         I-evenement
  du           I-evenement
  cheval       I-evenement
  donc         B-connect
  euh          O
  à            B-loc-relative
  proximité    I-loc-relative
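Once a sequence model has predicted BIO tags, the concepts are recovered by grouping each B- tag with the I- tags that follow it. The sketch below shows that decoding step on the example from the slide; the function name `bio_to_spans` is an assumption, and an I- tag that does not continue the current concept is treated as the start of a new one.

```python
def bio_to_spans(words, tags):
    """Group BIO-tagged words into (label, text) concept spans."""
    spans = []
    label, chunk = None, []

    def close():
        # Store the concept being built, if any.
        if label is not None:
            spans.append((label, " ".join(chunk)))

    for word, tag in zip(words, tags):
        if tag.startswith("I-") and tag[2:] == label:
            chunk.append(word)            # continue the current concept
        elif tag.startswith(("B-", "I-")):
            close()                       # a new concept begins here
            label, chunk = tag[2:], [word]
        else:
            close()                       # "O": outside any concept
            label, chunk = None, []
    close()
    return spans

words = "occasion de la fête du cheval donc euh à proximité".split()
tags = ["O", "O", "B-evenement", "I-evenement", "I-evenement", "I-evenement",
        "B-connect", "O", "B-loc-relative", "I-loc-relative"]
spans = bio_to_spans(words, tags)
```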
Dialogue management
[Figure: the dialogue manager takes inputs (semantic analysis) and produces outputs (responses, commands), relying on a user model, a task model, a discourse model and a world model]
▶ Task model: expected requests, possible commands
▶ User model: do not tell users what they already know, predict the following sentence (objectives, knowledge, interests)
▶ Discourse model: conversation history (pronoun resolution), dialogue states, what to do at a given time
▶ World model: general knowledge for understanding
ELIZA: the psychoanalyst
▶ Emacs: Alt-x doctor
ELIZA: How does it work?
Key words → responses
▶ “HELLO" → “How are you today... What would you like to discuss?"
▶ “CAN YOU" → “You don't believe that I am capable of
Dialog management: AIML (ALICE)
AIML (Artificial Intelligence Markup Language)
▶ Example category: pattern “WHO IS YOUR DADDY?" → template “Steve"
▶ Recursive rules
▶ Memorize previous answer and topic (broad category)
▶ Synonyms
▶ Random
▶ Learn new responses from user input
ALICE: how to make a bot?
Available source code
▶ Program AB: https://code.google.com/p/program-ab/
▶ Knowledge base: https://code.google.com/p/aiml-en-us-foundation-alice/
Question/answers
▶ Yes ("Is violet a color?")
▶ No ("Are fish mammals?")
▶ Sometimes ("Is the sky blue?")
Reductions
▶ Synonyms: “Hello", “Hi there", “Howdy" → “Hi".
▶ Simplification: “I am feeling very happy right now" → “I am happy".
▶ Input segmentation: “Yes my name is Jim" → “Yes" + “My name is Jim".
Personality
  age = 15
  baseballteam = Red Sox
  birthday = Nov. 23, 1995
  birthplace = Bethlehem, Pennsylvania
  boyfriend = I am single
  celebrities = Oprah, Steve Carell, John Stewart, Lady Gaga
  ...
VoiceXML
▶ W3C standard
▶ XML representation of a dialogue
▶ Example: prompt “Please choose airline, hotel, or rental car." with grammar [airline hotel "rental car"], followed by “You have chosen ."
▶ Can specify a grammar for the fields to be filled and the order in which to fill them
▶ Code evaluation ()
Information retrieval-based system
Find the most relevant answer for a question
▶ Example: Stack Overflow
Multi-user: ask a question to another user (chatbots)
Automaton
Finite state machine to represent dialogue states
▶ Don’t list flights until all the fields are filled
Entry into a state is conditioned by an interpretation (of concepts, recognized DA...) and dialogue history
A state corresponds to a command and/or a response from the system
Limit: locked in the structure of the dialogue
▶ Given a state and an interpretation of the user, we always go to another state
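The flight example can be sketched as a small form-filling automaton. This is a minimal illustration, not a full dialogue manager: the slot names in `FIELDS` and the state names (`ask_departure`, `list_flights`, ...) are hypothetical, and the interpretation is assumed to arrive as a dictionary of detected concepts.

```python
# Hypothetical slots that must all be filled before flights are listed.
FIELDS = ["departure", "arrival", "date"]

def next_state(filled):
    """Pick the state from the dialogue history (the slots filled so far)."""
    for field in FIELDS:
        if field not in filled:
            return "ask_" + field     # system response: ask for the missing field
    return "list_flights"             # system command: query and list flights

def step(filled, interpretation):
    """Merge newly detected concepts into the history, then move to the next state."""
    filled = {**filled, **interpretation}
    return filled, next_state(filled)

state0 = next_state({})
filled, state1 = step({}, {"departure": "Marseille", "arrival": "Barcelona"})
filled, state2 = step(filled, {"date": "tomorrow"})
```

The limit mentioned on the slide is visible here: the order of questions is fixed by `FIELDS`, and every (state, interpretation) pair leads to exactly one next state.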
MDP
Problem: Which trajectory to follow in the dialog automaton to minimize task completion time?
▶ We don’t know the future, we have to make assumptions
Markov Decision Process
▶ At each time t, the process (the user) is in a state s (it has an intention)
▶ The system chooses to perform an action a (for example it asks a question)
▶ This action randomly places the process (the user) in state s', according to a probability $P_a(s, s')$
▶ This change of state gives the system a gain $R_a(s, s')$
▶ The choices of the process depend only on a and s and not on the rest of the history (Markov property)
How to maximize the cumulative gain over the entire dialogue?
▶ We call policy a series of actions, noted $\pi$
▶ Use dynamic programming to find the policy that maximizes $\sum_{t=0}^{\infty} \gamma^t R_a(s_t, s_{t+1})$, where $\gamma$ is a discount factor
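One standard dynamic-programming method for this problem is value iteration, sketched below on a toy dialogue. Everything about the toy problem is an assumption made for illustration: the three states, the two actions, the transition probabilities and the rewards (-1 per turn, +10 for completing the task). The policy is represented here as a mapping from state to action.

```python
STATES = ["need_info", "have_info", "done"]
ACTIONS = ["ask", "answer"]

# P[a][s] maps each next state s2 to the probability P_a(s, s2).
P = {
    "ask": {"need_info": {"have_info": 0.8, "need_info": 0.2},
            "have_info": {"have_info": 1.0},
            "done": {"done": 1.0}},
    "answer": {"need_info": {"need_info": 1.0},
               "have_info": {"done": 1.0},
               "done": {"done": 1.0}},
}

def reward(a, s, s2):
    """R_a(s, s2): each turn costs 1, completing the task earns 10."""
    if s == "done":
        return 0.0
    return 10.0 if s2 == "done" else -1.0

def q_value(s, a, V, gamma):
    """Expected gain of doing a in s, then following the values in V."""
    return sum(p * (reward(a, s, s2) + gamma * V[s2]) for s2, p in P[a][s].items())

def value_iteration(gamma=0.9, tol=1e-8):
    V = {s: 0.0 for s in STATES}
    while True:
        delta = 0.0
        for s in STATES:
            best = max(q_value(s, a, V, gamma) for a in ACTIONS)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    # Greedy policy with respect to the converged values.
    policy = {s: max(ACTIONS, key=lambda a: q_value(s, a, V, gamma)) for s in STATES}
    return V, policy

V, policy = value_iteration()
```

On this toy problem the resulting policy asks while information is missing and answers as soon as it is available, which is the shortest expected path to task completion.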
POMDP
Partially Observable Markov Decision Process (POMDP)
▶ Extension of MDP where the state (user intent) isn’t directly observed
▶ Distribution over states that the user can be in given what she says
▶ Observations are user sentences
[Figure: MDP vs POMDP, transition from state s to s' with gain R(s,s'), shown along a past/future axis]
Requires approximate inference
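The distribution over states (the belief) is updated after each user sentence: predict the next state with the transition model, then reweight by how likely the observation is in each state. The sketch below implements this exact update on a two-intent toy problem; the intents, the action name `ask_mean` and all probabilities are assumptions for illustration. With large state spaces this exact computation becomes too costly, hence the need for approximate inference.

```python
def belief_update(belief, action, observation, P, O):
    """b'(s2) is proportional to O(o | s2) * sum_s P_a(s, s2) * b(s)."""
    new_belief = {}
    for s2 in belief:
        predicted = sum(P[action][s].get(s2, 0.0) * b for s, b in belief.items())
        new_belief[s2] = O[s2].get(observation, 0.0) * predicted
    total = sum(new_belief.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return {s: v / total for s, v in new_belief.items()}

# Toy model: asking about the means of transport does not change the intent.
P = {"ask_mean": {"book_train": {"book_train": 1.0},
                  "book_flight": {"book_flight": 1.0}}}
# Observation model: speech recognition sometimes returns the wrong word.
O = {"book_train": {"train": 0.8, "plane": 0.2},
     "book_flight": {"train": 0.1, "plane": 0.9}}

prior = {"book_train": 0.5, "book_flight": 0.5}
posterior = belief_update(prior, "ask_mean", "train", P, O)
```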
Model comparison
[Figure: graphical models of a Markov chain, a Markov decision process, a hidden Markov chain and a partially observable MDP]
Advanced elements
Initiative
▶ System initiative (e.g. a Freebox support technician following a script)
▶ User initiative (Google Now, voice commands)
▶ Mixed initiative:
  ⋆ “SIRI, can you change brightness?" (user initiative)
  ⋆ “Yes, how bright?" (system initiative)
Confirmation
▶ Explicit
  ⋆ “I would like to go from Marseille to Barcelona"
  ⋆ “Is your departure location Marseille?"
  ⋆ “Yes"
  ⋆ “Is your arrival location Barcelona?"
▶ Implicit
  ⋆ “I would like to go from Marseille to Barcelona"
  ⋆ “When would you like to go from Marseille to Barcelona?"
Who is the user talking to?
▶ Kinect / Nao
Chatbots with deep learning
Replace classic elements
1 Intent classification (what do you want?)
2 Slot filling (what are the parameters of your query?)
3 Next action prediction
Typical frameworks (e.g. Rasa)
▶ Provide pretrained intents / slots
▶ Active learning for annotating data
▶ Dialog flow is pre-scripted (best control)
Data-driven approach to conversation modeling
▶ Given a conversation up to a point, can we predict what will happen next?
▶ No need for linguistic analysis, but no linguistic prior
▶ Examples:
  1 Alternating language models
  2 Turn retrieval
  3 Machine translation (history → next turn)
Alternating language model
A simplified version of the encoder-decoder (or seq2seq) framework
▶ Trained the same way as a regular word-based language model
▶ At prediction time, alternate between user input and generation
  ⋆ Training data needs to be in the same form
Implementations
▶ Word-by-word prediction
▶ Any language model (GPT-2...)
▶ Attention mechanisms
[Figure: a single language model reading alternating turns word by word (w1 w2 w3 ...), with turn markers M and H]
Representation learning
Create an information retrieval system
▶ Which can retrieve the next turn given a history
▶ Encode history with a first recurrent model
▶ Encode next turn with a second recurrent model
▶ Compute a similarity between those representations (dot product)
Training objective: triplet ranking
▶ Make sure the correct association has a higher score than a randomly selected pair
Problem: the cost of retrieving a turn
▶ Everything can be precomputed, only the dot product remains
▶ Many approaches for finding approximate nearest neighbors in a high-dimensional space (e.g. locality-preserving hashing)
[Figure: bi-encoder. The history and the response are encoded separately (words w1 w2 ..., turn markers M and H) and compared with a cosine similarity]
Bi-encoder training
Maximize margin between the result of $h_i \cdot r_i$ and $n_i \cdot r_i$
▶ $h_i$ is the history
▶ $n_i$ is a random history
▶ $r_i$ is the response
$\mathrm{Loss} = \frac{1}{n} \sum_i \max(0,\, 1 - h_i \cdot r_i + n_i \cdot r_i)$
Keras model
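The arithmetic of this loss can be checked with plain Python on already-encoded vectors. This is not the Keras model from the slide, only a sketch of the margin computation; the function names and the toy vectors are assumptions.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def triplet_ranking_loss(histories, negatives, responses, margin=1.0):
    """Mean over i of max(0, margin - h_i . r_i + n_i . r_i)."""
    losses = [
        max(0.0, margin - dot(h, r) + dot(n, r))
        for h, n, r in zip(histories, negatives, responses)
    ]
    return sum(losses) / len(losses)

# Toy representations: h = true history, n = random history, r = response.
h = [[1.0, 0.0], [0.0, 1.0]]
n = [[0.0, 1.0], [1.0, 0.0]]
r = [[2.0, 0.0], [0.0, 0.5]]
loss = triplet_ranking_loss(h, n, r)
```

The first triplet already satisfies the margin (score 2 against 0) and contributes nothing; the second one (score 0.5 against 0) is inside the margin and contributes 0.5, so only such triplets produce a training signal.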
Experiments on Datcha corpus
Corpus: Orange ATH TV
  Stat            Train       Valid     Test
  Conversations   16,140      698       606
  Turns           465,693     20,090    18,392
  Words           7,744,262   327,979   299,340
Preprocessing
▶ Tokenization (based on Penn tokenizer)
▶ A few rules to strip additional URLs, phone numbers, etc.
▶ Lower case
▶ Concatenate turns of the same participant with
▶ Separate conversations by
▶ Replace all TC[1-9] by a generic TC
Datcha results
Evaluation metrics
▶ Perplexity (PPL): $-\frac{1}{n} \sum \log P(\mathrm{turn} \mid \mathrm{history})$
▶ Better-than-random (BTR): $\frac{1}{n} \left| \{ P(\mathrm{turn} \mid \mathrm{history}) > P(\mathrm{turn} \mid \mathrm{noise}) \} \right|$
Results on the ATH TV test set (3 last files):
  Method                  PPL     BTR
  Language model          17.52   69.39%
  Information retrieval   11.85   93.91%
Parameters
▶ LM: vocab=30k, layers=2, hidden=650, sample=1024, maxlen=35, batch=20, optim=sgd, epochs=8
▶ Bi-encoder: vocab=30k, embeddings=128 (init=w2v), hidden=256, maxlen=64, repr=128, batch=256, optim=Nadam, epochs=100
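Both metrics are simple averages over test turns and can be sketched as follows. The function names and the toy probabilities are assumptions; the first function computes the quantity exactly as written on the slide (an average negative log-probability), whereas perplexity is more commonly reported as the exponential of that average.

```python
import math

def avg_neg_log_prob(probs):
    """-(1/n) * sum of log P(turn | history), as written on the slide."""
    return -sum(math.log(p) for p in probs) / len(probs)

def better_than_random(true_scores, noise_scores):
    """Fraction of turns scored higher with their true history than with noise."""
    wins = sum(1 for t, z in zip(true_scores, noise_scores) if t > z)
    return wins / len(true_scores)

# Toy scores for three test turns (illustrative values only).
p_true = [0.5, 0.2, 0.9]
p_noise = [0.1, 0.3, 0.4]
btr = better_than_random(p_true, p_noise)
nll = avg_neg_log_prob(p_true)
```

BTR only compares scores pairwise, so it applies equally to language-model probabilities and to retrieval similarity scores.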
t-SNE Analysis
[Figure: t-SNE projections of turn representations]
Conclusion
Dialogue systems
▶ Understanding
▶ Planning
▶ Data driven
Open issues
▶ Noisy input
  ⋆ Speech recognition / spelling errors
▶ Unrestricted domain
  ⋆ Chit-chat
  ⋆ Rapid development of new domains
▶ Grounding
  ⋆ Tackle real-world objects
▶ End-to-end training
  ⋆ Speech recognition → dialog management → speech synthesis
▶ Training without experiencing
  ⋆ Simulation
Current state of the art
▶ https://nlpprogress.com/english/dialogue.html