Dialogue Systems
Speech and Language Pathology Programme, Spring Term 2005 (VT05)
Staffan Larsson, sl@ling.gu.se
Department of Linguistics, GU

Overview
  Dialogue systems as interfaces
  Dialogue modelling
  Agents and dialogue
  TrindiKit and information states
  An example: BeadieEye
  Issue-based dialogue management in GoDiS
  Dialogue systems and AAC

Dialogue systems as interfaces

What is an interface?
Technologies that form the contact surface between human and machine
Allow communication between human and machine
Examples:
  Screen
  Keyboard
  Mouse
  Spoken dialogue?
The interface determines how we relate to machines in everyday life

Language as an interface
Can language (human, natural, spoken) be used to bridge the boundary between human and machine?
How? By getting machines (computers) to understand human language and to conduct dialogues: this is one of the goals of language technology
Why? Spoken dialogue is, for humans, the most natural way to interact

Dialogue and dialogue systems
Dialogue: interactive exchange of information using natural language
Dialogue systems: technologies that enable interaction between human and machine by means of (spoken) dialogue

Practical applications (a selection)
Democratisation of technology
  Dialogue as an interface to technology that would otherwise be too complex
  Advantage: requires no technical aptitude or prior knowledge (beyond being able to speak)
Communication support
  For deaf people
  For blind people
  Other disabilities
Enabling communication
  Speech-to-speech translation

Why aren't today's dialogue systems better?
Examples: SJ's timetable information service ( ), Tidpunkten ( )
The ability of these systems to conduct a dialogue is light-years away from a human's.
Why? To a large extent they start from the machines' capabilities, not from theories of the human capacity for dialogue!

How can we build good dialogue systems?
First: we must understand how people use language!
  Wittgenstein ( ): linguistic activity is always anchored in some specific human activity!
  Every activity has its own language game; a language such as Swedish is in fact a collection of language games
Second: we must formulate our theories in a form that a computer can process
  Formalisation and simulation

Formalisation
Translation into a formal language
Examples of formal languages:
  Mathematics (used to formalise physics, among other things)
  Logic
  Programming languages, e.g. Java
Formal languages are generally extremely simple and regular
This is what allows computers to "understand" them

Simulation
Suppose we have a theory of (part of) the human language faculty
How do we find out whether it is true?
Empirical studies (linguistics)
  Study collected language data
  Run experiments
Simulation (language technology / computational linguistics)
  Translate the theory into a computer program (formalise)
  Run the program and see whether its behaviour is as expected
  "Reverse engineering"

Dialogue systems as simulations
Dialogue systems attempt to simulate the whole of the human language faculty
  Perceiving, interpreting and understanding spoken utterances
  Reasoning about what to say and when to say it
  Formulating and pronouncing utterances

Two kinds of language technology?
"Practical" language technology
  Improve interfaces
  In principle irrelevant whether one tries to imitate a human linguistic ability
Simulating language technology / computational linguistics
  A research methodology for exploring the human language faculty
  In principle irrelevant whether the result is of practical use
The two coincide to the extent that what is human-like is also what is practically useful
There is reason to believe that the possibilities of simulating the whole of the human language faculty are limited

Man a machine?
Descartes (1596-1650)
  Animals are to be regarded as machines
  Humans have a soul, and therefore differ fundamentally from animals
de la Mettrie (1709-1751)
  "Man a Machine" ( )
  The difference between human and machine is a quantitative difference in complexity
  No fundamental difference!

The Turing test
Can a machine be intelligent? Is "artificial intelligence" (AI) possible?
Turing ( ): the "Turing test"
  Test subject A conducts a dialogue (via text terminals) with B.
  A's goal is to determine whether B is a human or a computer
  (NB: this is a simplified version of the Turing test!)

The Turing test and dialogue systems
According to the Turing test, what is fundamentally human?
  The ability to conduct a dialogue in natural language!
Why would this be fundamental?
  Assumption: all other human abilities show themselves (directly or indirectly) in speech

A black box
Turing largely treats the mind as a black box
If we succeed in simulating human behaviour, does that mean our theory is correct?
  Turing: Yes; any remaining differences are irrelevant!
  Phenomenological critique (Dreyfus, Heidegger): No; even if we have captured the central observable aspects of the human mind in the computer, it IS not a human. We cannot know that the computer works like a human on the inside!

Heidegger's project in Being and Time
Describe the foundations of human being: what it is like to be human
According to Heidegger this can only be done "from the inside"
  H.'s account is not intended to be intelligible to anyone who is not human
  Such an explanation is not possible, H. claims; the human cannot be explained "from scratch"
Yet this is exactly the kind of explanation that AI research strives for

Arguments in principle against "artificial intelligence"
(here: AI = computers being able to understand language in the same way that humans do)
"Religious" arguments
  Humans have an immaterial soul, which machines can never have
Material arguments
  Beings of silicon and metal can never be truly human; that requires flesh and blood (and neurons)

More arguments in principle
Freudian arguments
  Humans and machines differ radically in constitution and origin; this origin (e.g. the child's developmental stages) is fundamental to how we understand the world and language
  Machines are not born of two other machines; they are built
Phenomenological arguments
  Infants are, strictly speaking, not humans; they must first be socialised into the world
  Machines are never socialised into the world; they are programmed
  AI requires formalising the interpretive background: the everyday, unconscious, practical ability to understand the world
  But this "knowledge" is not a collection of facts
  Humans have a biological body that underlies their way of understanding the world
  AI presupposes that intelligence can be abstracted from the body

(How human-like should machines become?)
Do we really want to imitate all human characteristics?
  Freudian slips?
  Poor memory?
  Panic attacks?
  Burnout?

The question of principle: in principle uninteresting?
Is it actually interesting whether humans in principle are / are not machines?
"The question of whether a computer can think is no more interesting than the question of whether a submarine can swim." (E. W. Dijkstra, )

The wrong question?
Instead of "Can machines be intelligent?":
  In what way can machines be intelligent?
  How do submarines swim?

Dialogue in limited, regular domains
Even if we cannot hope to simulate the human language faculty...
  ...dialogue systems can still be useful for improving interfaces to computers and other technology
  This can be done for relatively simple and regular domains (travel agency, programming the VCR, word processing...) (Wittgenstein's language games)
But even if we do not try to simulate human dialogue ability in the computer...
  ...it must still be able to take part in a dialogue with a human
  This requires knowledge of human communication and dialogue

Dialogue modelling

Dialogue modelling
Theoretical motivations
  find structure of dialogue
  explain structure
  relate dialogue structure to informational and intentional structure
Practical motivation
  build dialogue systems to enable natural human-computer interaction (what is natural?)

Informal approaches to dialogue modelling
speech act theory (Austin, Searle, ...)
  utterances are actions
  illocutionary acts: ask, assert, instruct etc.
discourse analysis (Schegloff, Sacks, ...)
  turn-taking, pre-sequences etc.
dialogue games (Sinclair & Coulthard, ...)
  structure of dialogue segments (rather than separate utterances)
  can e.g. be encoded as regular expressions or finite automata
  qna-game -> question qna-game* answer
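The dialogue-game rule `qna-game -> question qna-game* answer` can be made concrete with a small recogniser. The following is only an illustrative sketch in Python: the function names `parse_qna` and `is_qna_game` and the encoding of moves as the strings "question" and "answer" are assumptions made here, not part of any of the cited theories. Because the rule as written allows games to nest inside each other, the sketch uses recursion rather than a plain finite automaton.

```python
def parse_qna(moves, i=0):
    """Try to read one qna-game starting at index i.

    Returns the index just after the game, or None if no game starts at i.
    Implements: qna-game -> question qna-game* answer
    """
    if i >= len(moves) or moves[i] != "question":
        return None
    i += 1
    # Zero or more embedded qna-games (e.g. clarification subdialogues).
    while True:
        j = parse_qna(moves, i)
        if j is None:
            break
        i = j
    # The game must be closed by an answer.
    if i < len(moves) and moves[i] == "answer":
        return i + 1
    return None


def is_qna_game(moves):
    """True if the whole move sequence is exactly one qna-game."""
    return parse_qna(moves) == len(moves)
```

For instance, `is_qna_game(["question", "question", "answer", "answer"])` accepts a question that is answered only after an embedded question-answer pair, while a lone `["question"]` is rejected.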

Computational approaches
implemented in systems and toolkits
  finite state automata (CSLU toolkit, Nuance)
  frame-based (Philips, SpeechWorks)
  plan-based (TRAINS, Allen, Cohen, Grosz, Sidner, ...)
  general reasoning (Sadek, ...)
  information states (TRINDI: Traum, Bos, ...)

Why build dialogue systems?
theoretical: test theories (of human-computer dialogue)
  e.g. what kind of information does an artificial dialogue agent need to keep track of?
  problem: complex system with many components
practical: natural language interfaces
  databases (train timetables etc.)
  electronic devices (mobile phones, ...)
  instructional/helpdesk systems
  booking flights etc.
  tutorial systems

What does a system need to be able to do?
speech recognition
parsing, syntactic and semantic interpretation
  resolve ambiguities
  anaphora and ellipsis resolution, etc.
dialogue management
  how does an utterance change the state of the dialogue?
  given the current state of the dialogue, what should the system do?
natural language generation
speech synthesis

Why spoken dialogue?
Spoken dialogue is the natural way for people to communicate
as far as possible, computers should adapt to humans rather than the other way around (but humans will also need to adapt their conversational style)
important to enable system and user to communicate in a natural (human-like) way
  mixed initiative
  turn-taking, feedback, barge-in
  handle embedded subdialogues
  ...

What's happening with dialogue systems
Beginning to be used commercially
Limited domains
  need to encode domain-specific knowledge; a general system would require general world knowledge
  speech recognition is harder with a large lexicon
Simple dialogue types
  mostly information-seeking
Need to bridge the gap between dialogue theory and working systems

Agents (and dialogue)

What is an (artificial) agent?
behaviour-based definition
  autonomy: agents act without direct intervention by humans or others, and have control over their own actions and their own internal state
  social ability: agents interact with other agents (including humans), among other things by means of language
  reactivity: agents perceive their environment (the physical world, a graphical user interface, the internet...) and react to changes in it
  proactivity: agents do not merely react to the environment; they are also capable of goal-directed behaviour and can take the initiative

Two main types of framework for artificial agents
"Deliberative"
  the agent has an explicitly represented symbolic model of the world
  decisions are made by logical inference (pattern matching, symbol manipulation)
  theory-based
  Example: General Problem Solver (Newell & Simon)
Reactive
  no symbolic model
  no complex symbol processing
  Example: situated finite automata (Rosenschein & Kaelbling)
  tend to be ad hoc
there are also hybrid theories
  one reactive and one deliberative layer
Are humans reactive or deliberative? Or perhaps hybrids...

Attitudes for deliberative agents
Private / Social
Information attitude: knowledge / belief
Pro-attitude: action, goal

Reactivity
Perception
  agents perceive the world through their sense organs, which gives rise to knowledge / beliefs about the world
Private information attitudes
  beliefs (B)
  knowledge (justified true belief)
Reaction
  requires the ability to act

Proactivity
Initiative
  Agents have needs, desires and intentions, and often try to change the world on the basis of these
Requires
  the ability to plan
  the ability to make up one's mind
Private pro-attitude: intention

Autonomy
agents act without direct intervention by humans or others, and have control over their own actions and their own internal state
Private attitudes (information attitudes and pro-attitudes):
  beliefs (B)
  desires (D)
  intentions (I)

Social ability
Humans are also social beings; they stand in social relations to each other and act on the basis of these
Social information attitudes:
  shared belief / shared knowledge
Social pro-attitudes:
  obligations
  commitments
  rights (?)

(Agents and) dialogue
  Knowledge for dialogue agents
  Informal approaches
  Formal frameworks

Types of knowledge needed to take part in a dialogue
social information attitudes (shared knowledge)
static
  general world knowledge for interpreting utterances
  activity-specific world knowledge
  linguistic knowledge; the ability to interpret and construct utterances, including knowledge of speech acts and dialogue games
dynamic
  private and social attitudes
  dialogue model; "dialogue protocol": keeps track of shared assumptions, current questions, obligations, referents etc.

How should knowledge be represented?
Knowledge representation languages, e.g. FOL, semantic networks, frames...
Knowledge base = a set of sentences + inference rules
Ontologies / type hierarchies (for conceptual knowledge)

How much knowledge, and what kind, is needed depends on the dialogue type
simple -> complex
  call routing
  timetable information
  database search
  programming a VCR
  instructional dialogue (e.g. giving directions)
  negotiation
  planning a future activity
  everyday small talk (?)

Frameworks for dialogue agents
Finite automata
  strict "flow chart" from start state to end state
  in each state a limited number of actions are possible
Logic-based
  rationality axioms + inference
  axiomatised speech act theory (in modal logic)
  problems with complexity and decidability
Plan-based
  planning and plan recognition
  speech acts as plans
  problems with complexity
Information states
  dialogue moves, dialogue games, update rules
  variable complexity
  deliberative <-> reactive
These can be combined!

An (artificial) dialogue agent can
interact and communicate with other agents in a coherent way
take part in dialogues (i.e. communicative exchanges with a longer sequence of utterances) on a given topic, with the intention of reaching a shared overall goal
Utterances are actions that change
  mental states
  the context and the dialogue state

Interaction on several levels
Idea: model dialogue as actions on several levels, not only the sentence level (speech acts)
4 speech act levels (Traum & Hinkelman 1992)
  turn-taking
  "grounding": confirming that the participants understand each other
  "core speech acts" (traditional illocutionary acts)
    Examples: Inform, YNQ, Check, Eval, ReqRepair, RecAck
    a CSA involves several agents, since it must be acknowledged
  argumentation acts (rhetorical acts)
    Examples: Elaborate, Summarize, Clarify, Q&A, Convince, Find-Plan

TrindiKit

What is TrindiKit?
a toolkit for building and experimenting with dialogue move engines and systems, based on the information state approach
not a dialogue system in itself

Total Information State
[Figure: TrindiKit architecture. A control module and modules 1..n (including the DME) access the Total Information State (TIS), which consists of the Information State proper (IS), the Module Interface Variables and the Resource Interface Variables; resources 1..m are hooked up to the TIS.]

Information State (IS)
an abstract data structure (record, DRS, set, stack etc.)
accessed by modules using conditions and operations
the Total Information State (TIS) includes
  Information State proper (IS)
  Module Interface variables
  Resource Interface variables

Dialogue Move Engine (DME)
module or group of modules responsible for
  updating the IS based on observed moves
  selecting moves to be performed
dialogue moves are associated with IS updates using IS update rules
  there are also update rules not directly associated with any move (e.g. for reasoning and planning)
update rules: rules for updating the TIS
  rule name and class
  precondition list: conditions on TIS
  effect list: operations on TIS
update rules are coordinated by update algorithms

Modules and resources
Modules (dialogue move engine, input, interpretation, generation, output etc.)
  access the information state
  no direct communication between modules
    only via module interface variables in TIS
    modules don't have to know anything about other modules
    increases modularity, reusability, reconfigurability
  may interact with user or external processes
Resources (device interface, lexicons, domain knowledge etc.)
  hooked up to the information state (TIS)
  accessed by modules
  defined as object of some type (e.g. "lexicon")

How to use TrindiKit
We start from TrindiKit
  Implements the information state approach
  Takes care of low-level programming: dataflow, data structures etc.
[Diagram, bottom to top: information state approach; TrindiKit]

How to build a basic system
Formulate a basic dialogue theory
  Information state
  Dialogue moves
  Update rules
Add appropriate modules (speech recognition etc.)
[Diagram, bottom to top: information state approach; TrindiKit; basic dialogue theory / basic system]

How to build a genre-specific system
Add genre-dependent IS components, moves and rules
[Diagram, bottom to top: information state approach; TrindiKit; basic dialogue theory / basic system; genre-specific theory additions / genre-specific system]

How to build an application
Add application-specific resources
[Diagram, bottom to top: information state approach; TrindiKit; basic dialogue theory / basic system; genre-specific theory additions / genre-specific system; domain & language resources / application]

Building a domain-independent Dialogue Move Engine
Come up with a nice theory of dialogue
Formalise the theory, i.e. decide on
  Type of information state (DRS, record, set of propositions, frame, ...)
  A set of dialogue moves
  Information state update rules, including rules for integrating and selecting moves
  DME module algorithm(s) and basic control algorithm
  any extra datatypes (e.g. for semantics: proposition, question, etc.)

Specifying Infostate type
the Total Information State contains a number of Information State Variables
  IS, the Information State "proper"
  Interface Variables: used for communication between modules
  Resource Variables: used for hooking up resources to the TIS, thus making them accessible to modules
use prespecified or new datatypes

example: BeadieEye information state type
IS : [ BEL  : Set(Prop)
       DES  : Set(Prop)
       INT  : Set(Action)
       MBEL : Set(Prop) ]
LM : Set(Move)

Specifying a set of moves
amounts to specifying objects of type move (a reserved type)
there may be type constraints on the arguments of moves
Example: GoDiS dialogue moves
  Ask(Q), Q is a question
  Answer(A), A is an answer (proposition or fragment)
  Request(α), α is an action
  Confirm(α)
  Greet
  Quit

Writing rules
rule = conditions + updates
if the rule is applied to the IS and its conditions are true, the operations will be applied to the IS
conditions may bind variables with scope over the rule (Prolog variables, with unification and backtracking)

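Example: BeadieEye moves and a rule
moves: assert(P), askif(P)

  % If the latest moves contain assert(P), add P to the mutual beliefs
  % and to the system's own beliefs, then remove the move from LM.
  rule( integrate_assert,
        [ in( $lm, assert(P) ) ],
        [ add( is/mbel, P ),
          add( is/bel, P ),
          del( lm, assert(P) ) ] ).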

Example BeadieEye rule application
Before:
  IS = [ BEL  = { happy(sys) }
         DES  = { knowif( happy(usr) ) }
         INT  = { }
         MBEL = { } ]
  LM = { assert( happy(usr) ) }
Rule application: integrate_assert
  > add( IS/MBEL, happy(usr) )
  > add( IS/BEL, happy(usr) )
  > del( LM, assert( happy(usr) ) )
After:
  IS = [ BEL  = { happy(sys), happy(usr) }
         DES  = { knowif( happy(usr) ) }
         INT  = { }
         MBEL = { happy(usr) } ]
  LM = { }
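The BeadieEye rule application can also be replayed outside TrindiKit. The sketch below is a rough Python analogue and not TrindiKit code: the dictionary layout of `tis`, the tuple encoding of propositions and moves, and the function name `integrate_assert` as a Python function are all assumptions made for illustration. TrindiKit itself uses Prolog unification and backtracking to bind P, which the loop below only approximates.

```python
# Total information state for the BeadieEye example.
# Propositions and moves are encoded as nested tuples (an assumption of this sketch).
tis = {
    "is": {
        "bel": {("happy", "sys")},
        "des": {("knowif", ("happy", "usr"))},
        "int": set(),
        "mbel": set(),
    },
    "lm": {("assert", ("happy", "usr"))},
}


def integrate_assert(tis):
    """Apply the rule once. Returns True if its condition held and it fired."""
    for move in list(tis["lm"]):
        # Condition: in( $lm, assert(P) ), binding P.
        if move[0] == "assert":
            p = move[1]
            # Effects, in the order given by the rule.
            tis["is"]["mbel"].add(p)
            tis["is"]["bel"].add(p)
            tis["lm"].remove(move)
            return True
    return False


fired = integrate_assert(tis)
```

After the call, `tis["is"]["bel"]` contains both `happy(sys)` and `happy(usr)`, `tis["is"]["mbel"]` contains `happy(usr)`, and `tis["lm"]` is empty, matching the state shown on the slide. A second call does not fire, because no assert move remains.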

Example: a rule from GoDiS

  rule( integrateUsrAnswer,
        [ $/shared/lu/speaker = usr,
          assoc( $/shared/lu/moves, answer(R), false ),
          fst( $/shared/qud, Q ),
          $domain : relevant_answer( Q, R ),
          $domain : reduce( Q, R, P ) ],
        [ set_assoc( /shared/lu/moves, answer(R), true ),
          /shared/qud := $$pop( $/shared/qud ),
          add( /shared/com, P ) ] ).

Building modules
Algorithm
  For DME modules: coordinate update rules
  For control modules: coordinate other modules
TrindiKit includes a language for writing algorithms
  For DME modules: basic imperative programming constructs
  For control modules: basic imperative constructs plus asynchronous triggers

From DME to dialogue system
Build or select from existing components:
Modules, e.g.
  input
  interpretation
  generation
  output
Still domain independent
the choice of modules determines e.g. the format of the grammar and lexicon

Domain-specific system
Build or select from existing components:
Resources, e.g.
  domain (device/database) interface
  dialogue-related domain knowledge, e.g. plan libraries etc.
  grammars, lexicons
Example resources: GoDiS VCR control
  VCR interface
  Domain knowledge
  Lexicon

TrindiKit features
explicit information state data structure
  makes systems more transparent
  enables e.g. context-sensitive interpretation, distributed decision making, asynchronous interaction
update rules
  provide an intuitive way of formalising theories in a way which can be used by a system
  represent domain-independent dialogue management strategies

TrindiKit features cont'd
resources
  represent domain-specific knowledge
  can be switched dynamically, e.g. switching language on-line in GoDiS
modular architecture promotes reuse
  basic system -> genre-specific systems
  genre-specific system -> applications

Theoretical advantages of TrindiKit
theory-independent
  allows implementation and comparison of competing theories
  promotes exploration of middle ground between simplistic and very complex theories of dialogue
intuitive formalisation and implementation of dialogue theories
  the implementation is close to the theory

Practical advantages of TrindiKit
promotes reuse and reconfigurability on multiple levels
general solutions to general phenomena enable rapid prototyping of applications
allows dealing with more complex dialogue phenomena not handled by current commercial systems

availability
TrindiKit website
SourceForge project
  development versions available
  developer community?
licensed under GPL
more info in
  Larsson & Traum: NLE Special Issue on Best Practice in Dialogue Systems Design, 2000
  TrindiKit manual (available from website)

Issue-based Dialogue Management in GoDiS

Overview of contents
  Introduction
  Basic issue-based dialogue management
  Grounding and feedback
  Addressing Unraised Issues
  Action-oriented Dialogue
  Multilinguality
  Conclusions

Introduction, goals
explore and implement issue-based dialogue management
  starting from Ginzburg's theory of dialogue semantics, based on the notion of QUD (Questions Under Discussion)
  adapt to dialogue system (GoDiS) and implement
  extend theory coverage, taking in relevant theories
general theory of dialogue
  minimize effort for adapting dialogue system to new domains
incrementally extending system to handle increasingly complex types of dialogue
  clarifies relation between dialogue genres
  promotes reuse of update rules
Larsson (2002): Issue-based Dialogue Management (PhD Thesis)

GoDiS: an issue-based dialogue system
Built using TrindiKit
  Toolkit for implementing and experimenting with dialogue systems based on the information state approach
Explores and implements issue-based dialogue management
Extends theory to more flexible dialogue
  Multiple tasks, information sharing between tasks
  Feedback and grounding
  Accommodation, re-raising, clarification
  Menu-based action-oriented dialogue
  Multilinguality & multiple domains

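[Figure: GoDiS architecture. A control module and the modules input, interpret, update, select, generate and output (with the DME label over update and select) access the TIS; the resources database, lexicon and domain knowledge are hooked up via the DATABASE, LEXICON and DOMAIN interface variables.]

[Figure: layered architecture. Application-specific: Travel Agency, Autoroute, Xerox manual, VCR manager, home device manager. Genre-specific: GoDiS-I, GoDiS-A. Below these: IBDM, GoDiS, TrindiKit, IS approach.]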

Issue-based dialogue management
enquiry-oriented dialogue (database search)
basis: Ginzburg's Dialogue Gameboard (DGB) and related DGB update protocols
dialogue moves: ask, answer, greet, quit
raising and addressing issues
  incl. short answers, e.g. "yes", "no", "paris", "in april"
dialogue plans
sample domain: travel agency
extension: reraising issues, handling multiple issues

Semantics
simple First Order Logic without quantifiers, but with questions
questions
  Y/N-questions: ?P, P is a proposition
  wh-questions: ?x.p(x) (p is a predicate); ? works much like λ
  alt-questions: {?P1, …, ?Pn}
Content of short answers
  individual markers: paris, april, …
  yes, no

Semantics, cont’d Q-A relations (adapted from Ginzburg)
resolves(A,Q): A resolves Q dest-city(paris) resolves ?x.dest-city(x) relevant(A,Q): A is relevant to Q (about Q) not(dest-city(paris)) is relevant to ?x.dest-city(x), but does not resolve it
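The difference between `resolves` and `relevant` can be illustrated with a deliberately small model. The following Python sketch is an assumption-laden simplification and not Ginzburg's or GoDiS's actual definitions: it covers only wh-questions, encodes `?x.dest-city(x)` as `("wh", "dest-city")`, propositions as `(predicate, argument)` and negation as `("not", proposition)`.

```python
def relevant(answer, question):
    """A is about Q: the (possibly negated) proposition uses Q's predicate."""
    kind, predicate = question
    # Strip one level of negation, if present.
    prop = answer[1] if answer[0] == "not" else answer
    return kind == "wh" and prop[0] == predicate


def resolves(answer, question):
    """A resolves Q: A is relevant to Q and is not a negated proposition."""
    return answer[0] != "not" and relevant(answer, question)


dest_question = ("wh", "dest-city")        # ?x.dest-city(x)
positive = ("dest-city", "paris")          # dest-city(paris)
negative = ("not", positive)               # not(dest-city(paris))
```

With these definitions `positive` both is relevant to and resolves `dest_question`, whereas `negative` is relevant but does not resolve it, as in the two examples on the slide.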

basic GoDiS information state record type
PRIVATE : [ AGENDA : OpenQueue( Action )
            PLAN   : stack( Action )
            BEL    : set( Prop ) ]
SHARED  : [ COM : set( Prop )
            QUD : stack( Question )
            LU  : [ SPEAKER : Speaker
                    MOVES   : OQueue( Move ) ] ]

sample dialogue plan
< findout(?x.transport(x)),
  findout(?x.dest-city(x)),
  findout(?x.depart-city(x)),
  findout(?x.dept-month(x)),
  findout(?x.dept-day(x)),
  raise({?class(economy), ?class(business)}),
  consultDB(?x.price(x)) >

Answer integration
Before an answer can be integrated by the system, it must be matched to a question on QUD
integrateAnswer update rule
  pre: in( $/SHARED/LU/MOVES, answer(A) )
       fst( $/SHARED/QUD, Q )
       $DOMAIN : relevant(A, Q)
  eff: ! $DOMAIN : combine(Q, A, P)
       add( /SHARED/COM, P )

basic dialogue with updates
U: "price information please"; raises price issue
  if user asks Q, push respond(Q) on AGENDA
  if respond(Q) on AGENDA and PLAN empty, find plan for Q and load to PLAN
  if findout(Q) first on PLAN, ask Q
S: "where do you want to go?"
U: "Paris"
  if LM=answer(A) and A relevant to Q, add P=Q[A] to SHARED.COM
  if P in SHARED.COM and Q topmost on QUD and P resolves Q, pop QUD
  if P in SHARED.COM and P fulfils goal of findout(Q) and findout(Q) on PLAN, pop PLAN

basics cont'd
…
S: "Do you want economy class or business class?"
U: "economy class"
  if consultDB(Q) on PLAN, consult database for answer to Q; store result in PRIVATE.BEL
  if Q on QUD and P in PRIVATE.BEL s.t. P resolves Q, answer(P)
S: "The price is £123"

Information sharing across plans
GoDiS does not keep track of when propositions were added, or which plan was being executed
so information sharing is determined by question sharing across plans
plan for VISA question:
  findout(?x.dest-city(x))
  findout(?x.citizenship(x))
shares a question with the plan for ?x.price(x)
so if the visa issue is raised after the price issue, there is no need to ask for the destination again
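Question sharing across plans amounts to filtering a plan against what is already in SHARED.COM. The Python sketch below illustrates this under simplifying assumptions that are not taken from GoDiS: plans are flattened to lists of question names (the `raise` and `consultDB` steps are left out), and SHARED.COM is modelled as a dictionary from question name to answer. The names `PRICE_PLAN`, `VISA_PLAN` and `open_questions` are hypothetical.

```python
# Plans reduced to the questions they need answered (an assumption of this sketch).
PRICE_PLAN = ["transport", "dest-city", "depart-city", "dept-month", "dept-day", "class"]
VISA_PLAN = ["dest-city", "citizenship"]


def open_questions(plan, com):
    """Return the plan's questions that SHARED.COM does not already resolve."""
    return [q for q in plan if q not in com]


# Shared commitments after the user has answered two questions of the price plan.
com = {"transport": "flight", "dest-city": "paris"}

remaining_visa = open_questions(VISA_PLAN, com)
remaining_price = open_questions(PRICE_PLAN, com)
```

Here `remaining_visa` is `["citizenship"]`: the destination was established while dealing with the price issue, so the visa plan does not ask for it again. No record of which plan produced the answer is needed.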

dealing with multiple open issues
if user asks Q, push Q on QUD and load plan for dealing with Q
if user asks Q' while system is dealing with Q, throw out plan for Q, but Q remains on QUD; load plan for Q'
when Q' is resolved, Q topmost on QUD will trigger reloading the plan for dealing with Q
general rule: if SHARED.COM contains info resolving Q, don't ask Q
  so any resolved questions in the plan will be thrown out
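The interplay between the QUD stack and plan reloading can be traced with a toy model. This Python sketch is illustrative only: the class `IssueStack`, its method names and the shortened `PLANS` table are assumptions introduced here, and resolution of the top issue is triggered by hand rather than by update rules.

```python
# Shortened plans: issue name -> questions to find out (hypothetical data).
PLANS = {
    "price": ["transport", "dest-city", "depart-city"],
    "visa": ["dest-city", "citizenship"],
}


class IssueStack:
    def __init__(self):
        self.qud = []    # stack of open issues; the last element is topmost
        self.plan = []   # questions still to ask for the topmost issue
        self.com = {}    # shared commitments: question -> answer

    def _load_plan(self, issue):
        # General rule: questions already resolved in COM are thrown out.
        self.plan = [q for q in PLANS[issue] if q not in self.com]

    def ask(self, issue):
        """User raises an issue: push it and replace the current plan."""
        self.qud.append(issue)
        self._load_plan(issue)

    def answer(self, question, value):
        """User answers one of the system's questions."""
        self.com[question] = value
        self.plan = [q for q in self.plan if q != question]

    def resolve_top(self):
        """Topmost issue is resolved: pop it and reload the plan below it."""
        self.qud.pop()
        if self.qud:
            self._load_plan(self.qud[-1])
        else:
            self.plan = []


d = IssueStack()
d.ask("price")
d.answer("transport", "flight")
d.answer("dest-city", "paris")
d.ask("visa")                      # price plan is discarded, price stays on QUD
plan_during_visa = list(d.plan)
d.answer("citizenship", "sweden")
d.resolve_top()                    # back to price; plan reloaded without resolved questions
plan_after_return = list(d.plan)
```

In this trace `plan_during_visa` is `["citizenship"]` and `plan_after_return` is `["depart-city"]`, which mirrors a system that returns to the price issue and continues with the departure city.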

Sample dialogue: multiple tasks & info sharing
S> Welcome to the travel agency!
U> price information
S> (…) Lets see. How do you want to travel?
U> by flight
S> (…) What city do you want to go to?
U> paris
S> (…) What city do you want to go from?
U> do I need a visa ?
S> (…) Lets see. What country are you from?
U> sweden
S> Okay. Yes, you need a Visa.
S> Returning to the issue of price. Lets see. What city do you want to go from?

Grounding and feedback
Grounding
  making sure that the participants are perceiving, understanding, and accepting each other's utterances
  dealing with problematic situations where e.g. an utterance is not perceived
Feedback moves
  (short) utterances which signal grounding status of previous utterance
Sequencing moves
  utterances which signal switching task ("returning to…"), reraising questions ("so…") etc.

Grounding moves
We want a typology of grounding moves
  Both feedback and sequencing
Some parameters:
  Action levels in dialogue (Allwood, Clark)
    contact
    perception
    understanding
      word meanings
      contextual meaning
    acceptance
  Polarity
    Positive
    Negative
    Interrogative (asking for confirmation)

Some grounding moves in GoDiS
Formal representation
  icm:Level/Type{*Polarity}{:Content}
Feedback moves
  icm:und*neg – "I don't understand"
  icm:und*pos:P – "To Paris."
  icm:acc*neg:Q – "Sorry, I can't …"
  icm:acc*pos – "Okay"
Feedback type selected depending on
  Quality of recognised speech
  Whether system can find a (relevant) interpretation
  Whether system can accept what's been said
Sequencing moves
  icm:reraise:Q – "Returning to the issue of Q"
  icm:loadplan – "Let's see…"
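The notation `icm:Level/Type{*Polarity}{:Content}` has two optional parts, which a few lines of string handling make explicit. The Python sketch below reads the textual form used on the slide; the function name `parse_icm` and the dictionary keys are assumptions for illustration, and GoDiS itself represents these moves as Prolog terms rather than strings.

```python
def parse_icm(move):
    """Split an ICM move string into level/type, polarity and content.

    Polarity and content are optional and returned as None when absent.
    """
    prefix = "icm:"
    if not move.startswith(prefix):
        raise ValueError("not an ICM move: " + move)
    body = move[len(prefix):]
    # Content, if any, follows the first colon after the prefix.
    head, _, content = body.partition(":")
    # Polarity, if any, follows the asterisk.
    level_or_type, _, polarity = head.partition("*")
    return {
        "type": level_or_type,
        "polarity": polarity or None,
        "content": content or None,
    }
```

For example, `parse_icm("icm:und*pos:P")` yields type `und`, polarity `pos` and content `P`, while `parse_icm("icm:loadplan")` yields type `loadplan` with no polarity and no content.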

Grounding on multiple levels
S> Lets see. What can I do for you?
U> (mumble mumble)
S> Sorry, I didnt hear what you said. What can I do for you?
     [negative perception feedback]
U> siridus
S> I heard you say siridus . Sorry, I dont understand. What can I do for you?
     [positive perception; negative understanding]
U> channel one
S> channel one. I dont quite understand. Do you want to add a program or change channel?
     [positive understanding (word meaning); negative understanding (relevance); task clarification]
(dialog3a_edit)

Rejection (negative acceptance feedback)
U> turn on the t v
S> turn on the TV… Sorry, I cannot control the TV set. Do you want to add a program or change channel?
     [positive understanding; rejection (negative acceptance feedback)]
U> pardon
     [user negative perception]
S> Sorry, I cannot control the TV set. Do you want to add a program or change channel?
     [system repetition]
U> change channel
S> Okay.
     [positive acceptance]
(dialog3b_edit)

Correcting the system
S> Lets see. What can I do for you?
U> add program
S> Okay. Add a program…
S> Lets see. What channel do you want?
U> channel six
S> OK, channel five.
     [interrogative feedback]
U> no
     [negative answer to feedback]
S> not channel five. So, What channel do you want?
     ["So," signals reraising (sequencing ICM)]
S> Okay. channel six.
(dialog4_edit)

Addressing Unraised Issues
QUD and ISSUES
General strategy: question accommodation
  issue accommodation
  QUD accommodation
  dependent issue accommodation
  issue reaccommodation

A problem with QUD
If QUD=<q1,q2> and q1 is resolved, q2 is available for resolution of short answers
  takes no account of how many turns have passed since q2 was raised
  but short answers a long distance away from the question are not as easily processed as an adjacent answer

ISSUES and QUD
We extend Ginzburg's DGB by adding ISSUES of type Stack(Question)
ISSUES contains all raised but unresolved questions
  ISSUES determines relevance of user answers
QUD used for resolving short answers
  questions drop off QUD after N turns
  a short answer to a question that's on ISSUES but not QUD requires adjusting QUD by copying a question on ISSUES
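The division of labour between ISSUES and QUD can be sketched with an explicit turn counter. The Python model below is an illustration under stated assumptions, not the GoDiS implementation: the class `Gameboard`, the parameter `max_age` (standing in for N) and the representation of QUD as question/age pairs are all introduced here.

```python
class Gameboard:
    def __init__(self, max_age=2):
        self.issues = []         # all raised but unresolved questions (top is last)
        self.qud = []            # [question, age] pairs available for short answers
        self.max_age = max_age   # N: number of turns a question stays on QUD

    def raise_question(self, q):
        """A question is raised: it goes on both ISSUES and QUD."""
        self.issues.append(q)
        self.qud.append([q, 0])

    def end_turn(self):
        """Age every question on QUD and drop those that reached max_age."""
        for entry in self.qud:
            entry[1] += 1
        self.qud = [e for e in self.qud if e[1] < self.max_age]

    def on_qud(self, q):
        return any(entry[0] == q for entry in self.qud)

    def accommodate_to_qud(self, q):
        """Copy a question from ISSUES back to QUD so a short answer can be resolved."""
        if q in self.issues and not self.on_qud(q):
            self.qud.append([q, 0])
            return True
        return False
```

With `max_age=2`, a question raised now is still on QUD after one turn and gone after two, yet it remains on ISSUES, so `accommodate_to_qud` can restore it when a late short answer arrives.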

Typical human-human dialogue
S(alesman), C(ustomer)
S: hi
C: flights to paris
S: when do you want to travel?
C: april, as cheap as possible
...

Accommodation
Lewis (1979): If someone says something at t which requires X to be in the conversational scoreboard, and X is not in the scoreboard at t, then (under certain conditions) X will become part of the scoreboard at t
Has been applied to referents and propositions, as parts of the conversational scoreboard / information state

Question accommodation
If questions are part of the information state, they too can be accommodated
If the latest move was an answer, and there is an action in the plan to ask a matching question, then put that question on ISSUES (and on QUD if it is a short answer)
Requires that the number of possible matching questions is not too large (or can be narrowed down by asking a clarification question)

issue accommodation: PLAN → ISSUES
If
  LM=answer(A)
  no Q in ISSUES s.t. about(A,Q)
then
  find findout(Q) in PLAN s.t. about(A,Q)
  push Q on ISSUES
used when a previously unraised question (available in the plan) is answered using a short or full answer

QUD accommodation: ISSUES → QUD
If
  LM=answer(A)
  no Q in QUD s.t. about(A,Q)
then
  find Q in ISSUES s.t. about(A,Q)
  push Q on QUD
  raise Q in ISSUES (make Q topmost)
used when
  a previously raised question has dropped off QUD, but is answered using a short answer
  a previously unraised question is answered using a short answer [needs PLAN → ISSUES accommodation]

dependent issue accommodation: DOMAIN → ISSUES (+PLAN)
If
  LM=answer(A)
  no Q in ISSUES s.t. about(A,Q)
  no findout(Q) in PLAN s.t. about(A,Q)
then
  find Plan for some Q' in DOMAIN s.t. findout(Q) or raise(Q) in Plan and about(A,Q)
  push Q' on ISSUES
  set PLAN to Plan
used when a previously unraised question, unavailable in PLAN, is answered using a full or short answer (AKA "task accommodation")
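Issue accommodation and dependent issue accommodation can be combined into a short cascade. The Python sketch below is a simplified illustration rather than the GoDiS rules themselves: answers are `(predicate, value)` pairs, `about(A,Q)` is reduced to comparing predicate names, the `DOMAIN` table is hypothetical, and an answer matching several plans is simply left unaccommodated, where a real system would ask a clarification question.

```python
# Domain knowledge: task question -> questions its plan needs (hypothetical data).
DOMAIN = {
    "price": ["dest-city", "depart-city", "dept-month"],
    "visa": ["dest-city", "citizenship"],
}


def issue_accommodation(state, answer):
    """PLAN -> ISSUES: the answered question is in the current plan."""
    pred = answer[0]
    if pred in state["issues"]:
        return False
    if pred in state["plan"]:
        state["issues"].append(pred)
        return True
    return False


def dependent_issue_accommodation(state, answer):
    """DOMAIN -> ISSUES (+PLAN): find a plan that contains the answered question."""
    pred = answer[0]
    if pred in state["issues"] or pred in state["plan"]:
        return False
    candidates = [task for task, plan in DOMAIN.items() if pred in plan]
    if len(candidates) != 1:
        return False  # no match, or ambiguous: clarification would be needed
    task = candidates[0]
    state["issues"].append(task)
    state["plan"] = list(DOMAIN[task])
    return True


state = {"issues": [], "plan": []}
answer = ("depart-city", "london")
loaded_task = dependent_issue_accommodation(state, answer)
pushed_question = issue_accommodation(state, answer)
```

Starting from an empty state, the answer first causes the price plan to be loaded and `price` to be pushed on ISSUES, after which ordinary issue accommodation pushes `depart-city`, so the answer can be integrated.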

Sample dialogue: accommodation
S: Welcome to the travel agency.
U: From London to Paris in April
  not relevant to any question that has been raised, or to any current task
  look in domain knowledge for a plan (for dealing with some question Q) with matching questions
  load this plan, push Q on ISSUES
  find in the plan the question(s) matching the user’s answer
  integrate answer (requires matching question on ISSUES)
S: Alright, you want to know about price. (…)
  proceed to next plan item
S: How do you want to travel?
  ISSUES = <?x.how(x), ?x.price(x)>

Task clarification (VCR application)
S> What can I do for you?
U> channel five
S> channel five. I dont quite understand. Do you want to add a program or change channel?
U> change channel
S> Okay. change channel.
S> Lets see. channel five?
U> yes
S> The channel was changed

Issue reraising SHARED.COMISSUES
If LM=ask(Q) there is a P in SHARED.COM s.t. relevant(P,Q) then push Q on ISSUES remove P from SHARED.COM used when previously resolved question is asked again reraising should be inicated; ”so,…”; reformulation may be needed
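
A minimal sketch of the reraising rule, assuming that propositions in SHARED.COM are `(predicate, value)` pairs, that questions are named by their predicate, and that `relevant` simply compares predicates. These encodings are illustrative assumptions.

```python
def relevant(prop, question):
    """Assumed relevance test: the proposition resolves the named question."""
    return prop[0] == question


def reraise_issue(latest_move, issues, com):
    """SHARED.COM -> ISSUES reraising. Returns True if Q was reraised."""
    kind, question = latest_move
    if kind != "ask":
        return False
    for prop in list(com):  # iterate over a copy, since com is modified
        if relevant(prop, question):
            issues.append(question)  # push Q on ISSUES
            com.remove(prop)         # retract the old resolving proposition
            return True
    return False


com = {("channel", "five")}
issues = []
reraised = reraise_issue(("ask", "channel"), issues, com)
print(reraised, issues, com)
```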

Issue reaccommodation SHARED.COMISSUES
If LM=answer(A) no Q in ISSUES s.t. about(A,Q) P in SHARED.COM s.t. there is a Q s.t. about(A,Q) and about(P,Q) then push Q on ISSUES remove P from SHARED.COM Used when previously resolved question is answered again
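
Reaccommodation can be sketched in the same way. The encoding is again an assumption for illustration: answers and propositions are `(predicate, value)` pairs and a question is named by its predicate, so the only candidate Q for an answer is its own predicate.

```python
def about(content, question):
    """Assumed relevance test: the content's predicate names the question."""
    return content[0] == question


def reaccommodate_issue(latest_move, issues, com):
    """SHARED.COM -> ISSUES reaccommodation. Returns True if Q was reopened."""
    kind, content = latest_move
    if kind != "answer":
        return False
    if any(about(content, q) for q in issues):
        return False
    question = content[0]  # in this encoding, the only Q with about(A, Q)
    for prop in list(com):  # iterate over a copy, since com is modified
        if about(prop, question):
            issues.append(question)  # push Q on ISSUES
            com.remove(prop)         # remove the outdated proposition
            return True
    return False


com = {("channel", "five")}
issues = []
reopened = reaccommodate_issue(("answer", ("channel", "six")), issues, com)
print(reopened, issues, com)
```

With the question back on ISSUES, the new answer can then be integrated as usual, replacing the earlier value.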

Revision by reaccommodation and reraising
S> What can I do for you?
U> add program channel five
S> Okay. Do you want to add a program?
U> yes
S> Lets see. channel five?
S> What date?
U> channel six
S> Okay. channel six… What date?

Issues and goals in Action-Oriented Dialogue (AOD)
Each goal action is associated with a dialogue plan.
Add field SHARED.ACTIONS : StackSet(Action).
New moves: request(Action), report(Action, Status).
ACTIONS has a similar role to ISSUES.
Adapt accommodation strategies to AOD.

AOD information state
PRIVATE :
  AGENDA : stack( Action )
  PLAN : stackset( Action )
  BEL : set( Prop )
  TMP : (same type as SHARED)
SHARED :
  COM : set( Prop )
  ACTIONS : stackset( Action )
  QUD : stackset( Question )
  ISSUES : stackset( Question )
  LU : …
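
As a rough illustration, the record structure of the AOD information state could be written down with Python dataclasses. Stacks and stacksets are approximated by plain lists, and the class names are hypothetical; the contents of LU are elided in the source, so the field is left as an untyped placeholder.

```python
from dataclasses import dataclass, field


@dataclass
class Shared:
    com: set = field(default_factory=set)       # COM : set(Prop)
    actions: list = field(default_factory=list)  # ACTIONS : stackset(Action)
    qud: list = field(default_factory=list)      # QUD : stackset(Question)
    issues: list = field(default_factory=list)   # ISSUES : stackset(Question)
    lu: object = None                            # LU : elided in the source


@dataclass
class Private:
    agenda: list = field(default_factory=list)   # AGENDA : stack(Action)
    plan: list = field(default_factory=list)     # PLAN : stackset(Action)
    bel: set = field(default_factory=set)        # BEL : set(Prop)
    tmp: Shared = field(default_factory=Shared)  # TMP : same type as SHARED


@dataclass
class InfoState:
    private: Private = field(default_factory=Private)
    shared: Shared = field(default_factory=Shared)


state = InfoState()
state.shared.actions.append("add_program")
print(state.shared.actions, state.private.tmp.actions)
```

Because every field uses its own default factory, SHARED and PRIVATE.TMP are separate objects of the same type, so an update to one does not leak into the other.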

Requests vs. answers
A request addresses a general question: “what shall I do next?”, “what can I do for you?” or similar (a “prompt”).
Semantics in AOD: ?x.action(x)
Dialogue:
  “What can I do for you?” raises ?x.action(x)
  “Search the phonebook” is interpreted as request(search_phonebook)
Rule: if ?x.action(x) is topmost on ISSUES and LM is request(a), then pop ISSUES.

Questions vs. answers
A question can be regarded as a special type of request, so in a specific sense questions are also answers.
A question addresses a general question: “what issue shall I address next?”, “what can I do for you?” or similar (a “prompt”).
Semantics: ?x.issue(x)
This question is not presupposed.
The semantics of “How can I help you?” in IOD (inquiry-oriented dialogue) is ask(?x.issue(x)).

Action Oriented Dialogue subtypes
“Command dialogue”: the user instructs the system to perform actions
  device control; menu-based
  travel agency booking
“Instructional dialogue”: the system instructs the user to perform actions
  “interactive manual”
  the system can explain how to perform some action
Joint planning: user and system jointly agree on a plan
  also involves negotiation

[Figure: system architecture. A control module coordinates the modules input (nuance), interpret, update, select, generate and output (nuance), which operate on the Information State. Resources: LEXICON (home, vcrphone), DOMAIN (home, vcrphone), DEVICES (alert, dim10, phone, vcr).]

Converting menus to plans
Menu construct                       Plan construct
multi-choice list P1, P2, ..., Pn    findout({?P1, ?P2, ..., ?Pn})
tick-box +/-P                        findout(?P)
dialogue window p=___                findout(?x.p(x))
pop-up message M                     inform(M)
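
The mapping from menu constructs to plan constructs is mechanical, which a short Python sketch can make concrete. The dictionary layout of menu items and the kind names (`multi_choice`, `tick_box`, `dialogue_window`, `popup`) are invented for this example and are not taken from any real menu format.

```python
def menu_item_to_action(item):
    """Translate one (hypothetical) menu item description into a plan action."""
    kind = item["kind"]
    if kind == "multi_choice":
        # findout({?P1, ..., ?Pn}): a set of alternative yes/no questions
        return ("findout", ["?" + p for p in item["options"]])
    if kind == "tick_box":
        # findout(?P): a single yes/no question
        return ("findout", "?" + item["prop"])
    if kind == "dialogue_window":
        # findout(?x.p(x)): a wh-question asking for the value of p
        return ("findout", "?x." + item["param"] + "(x)")
    if kind == "popup":
        # inform(M): the system simply tells the user M
        return ("inform", item["message"])
    raise ValueError("unknown menu item kind: " + kind)


menu = [
    {"kind": "multi_choice", "options": ["add_program", "change_channel"]},
    {"kind": "dialogue_window", "param": "channel"},
    {"kind": "tick_box", "prop": "repeat_weekly"},
    {"kind": "popup", "message": "The program has been added."},
]
plan = [menu_item_to_action(item) for item in menu]
for action in plan:
    print(action)
```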

Menu-based AOD
S: Welcome to the telephone and VCR manager!
U: Channel one today
S: Right, I assume you want to add a program. Let’s see. Okay. Channel one. Today. What time do you want to start recording?
U: What time is it?
S: It’s one forty five
U: Set the clock to two forty five
S: Okay, set the clock. Two forty five, is that correct?
U: Yes.
S: Okay. The clock was set. Returning to timer recording. So, what time do you want to start recording?
U: eight thirty to nine thirty
S: Okay. Starting at eight thirty. Ending at nine thirty. The program has been added.

Task and question accommodation in AOD
S> Lets see. What can I do for you?
U> channel five today
S> Alright. add a program… Lets see. Okay. channel five… today… What time do you want to start recording?
U> from six thirty to seven thirty
S> Okay. from six thirty… to seven thirty… The program has been added.
Signals accommodation (sequencing ICM)

7. Multilinguality and multiple domains
Language and domain resources can be switched on the fly.
This does not affect the rest of the system, thanks to the system’s modular structure.

Multilinguality
S> Lets see. What can I do for you?
U> add program
S> Okay. Do you want to add a program?
U> yes
S> Lets see. What channel do you want?
U> svenska [Swedish]
S> Okej. [Okay.]
S> Gå tillbaks till lägg till ett program. Vilken kanal vill du spela in? [Returning to add a program. Which channel do you want to record?]
U> kanal ett [channel one]
S> kanal ett , är det korrekt? [channel one, is that correct?]
S> Så, Vilken kanal vill du spela in? [So, which channel do you want to record?]
U> english
S> Okay...

Multiple domains
S> Welcome to the VCR manager!
S> Lets see. What can I do for you?
U> change domain
S> Okay. change domain… Lets see. What domain do you want?
U> telephone
S> Okay. This is your automatic telephone operator… Lets see. Please specify a function.
U> call
S> Okay. make a phone call.
S> Lets see. Please specify the destination of the call.
U> luis
S> Okay. Luis… Calling

Conclusions: GoDiS & Issue-Based Dialogue Management
general solutions to dealing with multiple tasks
sharing information between tasks
grounding and feedback
user initiative (accommodation)
menu-based dialogue
rapid prototyping of applications
dialogue plans
switching language and domain online

Current and future work
Extend to more complex dialogue types
  negotiation (theory exists, not yet implemented)
  tutorial dialogue
Explore use of QUD and ISSUES to assign proper focus intonation
  Stina Ericsson, forthcoming PhD thesis
In-home multimodal menu-based dialogue
  video player, mp3 player, lamps, agenda
Integrate with type-theoretical situation semantics

Dialogue systems and AAC
Speech-controlled devices, e.g. toys
  standard dialogue system
  speech recognition adapted for AAC needs
Communication aids
  the dialogue system tracks the dialogue between the user and other people and attempts to propose what to say

Using speech-controlled toys to train verbal interaction using AAC technology
The user communicates using some communication device.
User and device together communicate verbally with the toy via a dialogue system.
Information (text) can also be sent directly from the AAC device to the dialogue system; speech recognition is not essential.
[Figure: AAC device, dialogue system, Turtle]

An example: a Logo turtle (Papert)
For learning math: a robot is programmed to draw geometrical shapes.
What’s new: instead of typing commands to the turtle, the user speaks to it.

Multimodal dialogue systems for AAC users
multimodal menu-based dialogue allows user to adapt modalities to current needs

Another way of looking at dialogue systems:
Rather than a human-machine interface: an aid for communication between people and organisations.
Example: SJ’s dialogue system can be seen as a representative of SJ.
  Talking to the system is talking to SJ, mediated by a dialogue system.
  Compare a spokesperson for an organisation.
Perhaps this perspective is of interest for AAC?
  A user can choose to delegate part of their communication to a “representative” in the form of a dialogue system (which the user has pre-programmed with certain behaviours, utterances, stories...).
