Multimodal Interfaces [1] Introduction to Human-Computer Interaction Jacques Bapst
Content Overview
§ Human-Computer interaction basic paradigms
§ Human capabilities / Human modeling / Cognitive Framework
  Ø Input-Output channels / Human senses
  Ø Model Human Processor
  Ø Fitts' Law
  Ø Human memory (sensory, short-term, long-term)
  Ø Reasoning (deductive, inductive, abductive)
  Ø Human perception
  Ø Interaction process (action theory, affective aspects, emotion)
§ User-Computer Dialog / Interaction Styles
  Ø Mode / Temporal and Spatial Mode / Quasimode
  Ø Command language
  Ø Form fill-in / Spreadsheets
  Ø Menu selection
  Ø Natural language / Query language
  Ø WIMP / Point-and-Click
  Ø Direct manipulation / Indirect manipulation
  Ø 3D Interfaces / Brain computer interface
§ Appendix : HCI and GUI short history
EIA-FR / J. Bapst MMI_01 2
Multimodal Interfaces [1.1] Human-Computer Interaction "Bridging the Gap"
Human-Computer Interaction
"With teletype interface and the Fortran language, the computer will be easy to use"
From RAND Corporation, 1950 : a scientist shows what a "home computer" could look like in 2004
Human-Computer Interaction (HCI)
§ Human-Computer Interaction is also known as Man-Machine Interaction (MMI).
§ One possible definition (ACM) :
  Ø Human-computer interaction is a discipline concerned with the design, evaluation and implementation of interactive computing systems for human use and with the study of major phenomena surrounding them.
§ HCI is a large interdisciplinary science involving
  Ø Computer science and Engineering
  Ø Graphic design
  Ø Cognitive psychology
  Ø Ergonomics (user's physical capabilities)
  Ø Sociology (wider context of the interaction)
§ User interface design is an important subset of the HCI field of study.
Interaction Design and HCI
§ Relationship between Interaction Design, Human-Computer Interaction and other fields : academic disciplines, design practices and interdisciplinary fields.
[Diagram : Interaction Design at the centre, surrounded by three groups of fields]
§ Academic Disciplines
  Ø Ergonomics
  Ø Psychology / Cognitive sciences
  Ø Informatics
  Ø Engineering
  Ø Computer Science
  Ø Social Sciences (sociology, anthropology, ...)
§ Design Practices
  Ø Graphic Design
  Ø Product Design
  Ø Artist Design
  Ø Industrial Design
§ Interdisciplinary Fields
  Ø Human-Computer Interaction (HCI)
  Ø Human Factors (HF)
  Ø Cognitive Engineering
  Ø Cognitive Ergonomics
  Ø Computer Supported Cooperative Work (CSCW)
  Ø Information Systems
Multimodal Interfaces [1.2] Human Capabilities Communication Channels
Human Characteristics / Abilities
§ HCI is undoubtedly a multi-disciplinary topic.
§ In HCI design, it is really important to understand something about
  Ø Human information-processing characteristics
    ü Cognitive architecture, memory, perception, motor skills, …
  Ø How human action is structured
  Ø The nature of human communication
  Ø Human physical and physiological requirements
§ Humans are limited in their capacity to process information. This has important implications for interaction design.
§ Important aspects
  Ø Input-output channels (senses and effectors)
  Ø Memory
  Ø Learning (acquiring skills)
  Ø Reasoning / Problem solving (cognitive activity)
Input-Output Channels
§ Input in the human occurs mainly through the senses (sensory channels) and output through the motor control of the effectors.
§ Five major senses + Balance + Kinesthesis
  Ø Sight
  Ø Hearing
  Ø Touch
  Ø Taste and Smell (do not currently play a significant role in HCI except in specialized systems or virtual reality)
  Ø Balance : sense of equilibrium
  Ø Kinesthesis : proprioception
§ Effectors
  Ø Limbs (arms, legs, body position, …)
  Ø Fingers
  Ø Eyes
  Ø Head / Face
  Ø Body
  Ø Vocal system
Human Senses : Sight
§ Human vision is a highly complex activity with a range of physical and perceptual limitations.
§ Primary source of information.
§ Two main stages
  Ø Physical reception of the stimulus (photoreceptors of the retina)
  Ø Processing and interpretation of that stimulus
§ Visual perception
  Ø Size and depth (visual angle, stereoscopy, knowledge, …)
  Ø Color (hue, intensity, saturation), Color blindness
  Ø Brightness (luminance, contrast)
§ Reading
  Ø Visual pattern perception
  Ø Pattern decoding using internal representation of language
  Ø Syntactic and semantic analysis (phrases)
Human Senses : Hearing
§ Hearing is often considered secondary to sight (we tend to underestimate the amount of information that we receive through our ears).
§ Two main stages
  Ø Physical reception of the stimulus (sound wave propagated along the auditory canal, received by the tympanic membrane and transmitted to the cochlea)
  Ø Processing and interpretation of that stimulus
§ Hearing perception
  Ø Pitch (main frequency)
  Ø Loudness (amplitude)
  Ø Timbre (spectrum, envelope)
  Ø Location (stereophony)
§ Voice recognition
  Ø Perception, decoding, syntactic and semantic analysis
Human Senses : Touch
§ Touch is also known as haptic perception.
§ It provides us with vital information about our environment.
§ The skin contains three types of sensory receptors
  Ø Thermoreceptors respond to temperature
  Ø Nociceptors respond to intense pressure, heat and pain
  Ø Mechanoreceptors respond to pressure
§ Touch acts as
  Ø Sensory receptor : thermoreceptor, pressure receptor, pain
  Ø Warning : hot, sharp, …
  Ø Feedback : feel when in contact, necessary in prehension
§ A second aspect of haptic perception is the awareness of the position of the body and limbs. This conscious awareness of body position is known as kinesthesis or (if we include the sense of equilibrium) proprioception.
Multimodal Interfaces [1.3] Human Modeling Cognitive Framework
What is Cognition ? [1]
§ What goes on in the mind in our everyday activities ?
§ Different kinds of cognition.
What is Cognition ? [2]
§ Norman (1993) distinguishes between two general modes :
  Ø Experiential cognition
    ü Driving a car, reading a book, having a conversation, playing a game, …
  Ø Reflective cognition
    ü Designing, learning, problem solving, writing a book, …
§ Cognition may also be described in terms of specific kinds of processes :
  Ø Attention
  Ø Perception and recognition
  Ø Memory
  Ø Learning
  Ø Reading, speaking, listening
  Ø Problem-solving, planning, reasoning, decision-making
  Ø …
Human Modeling
§ Unfortunately, there is no general and unified theory.
§ Cognitive and interaction models attempt to represent the users as they interact with a system, modeling aspects of their understanding, knowledge, intentions or processing.
§ Human models can be divided into categories according to their abstraction level, from reasoning (highest level) down to reflex (lowest level) :
  Ø Theory of Action [Norman]
  Ø Rasmussen's model
  Ø Human processor [Card, ...]
§ Others : GOMS, ICS, CCT, Keystroke, …
Model Human Processor (MHP)
§ From Card, Moran and Newell (1983).
§ Describes the cognitive process that people go through between perception and action.
§ Simplistic view of human behavior
  Ø Ignores environment and other people (social interactions)
§ Low-level model
  Ø Performance oriented
  Ø Allows empirical tests
§ Basis of the GOMS model
Fitts' Law [1]
§ Fitts' law (1954) predicts the time required to move from a starting position to a final target area, based on the size and the distance of this target area.
§ It describes the behavior of aimed and rapid movement.
[Figure : movement toward a target of width W at distance D, with intermediate positions d0 … d4]
§ T = k · log2(2D / W)
  Ø k : constant value based on the cycle times τp and τm. Usually : 0.1 s
§ Examples (with k = 0.1 s) :
      D     10 cm     10 cm     30 cm
      W     1 cm      1 mm      2 mm
      T     0.43 s    0.76 s    0.82 s
Fitts' Law [2]
§ Fitts' law has been formulated in several different ways.
§ One common form is the Shannon formulation.
  T = a + b · log2(D / W + 1)
  a and b are empirical constants which depend on the pointing device and the user's dexterity. [a ≈ 0.1 s, b ≈ 0.1 s]
§ The logarithmic term of the formula, log2(D / W + 1), is called the index of difficulty (ID) ⇒ T = a + b · ID
§ Useful when
  Ø Designing user interfaces
  Ø Comparing pointing devices (determining a and b)
§ Test yourself : www.tele-actor.net/fitts/index.html
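The two formulations can be compared numerically. The following is a minimal sketch, assuming the typical constants quoted on the slides (k = 0.1 s, a ≈ 0.1 s, b ≈ 0.1 s); the function names are illustrative only, and real values of a and b must be measured for each device and user.

```python
import math

def fitts_original(distance, width, k=0.1):
    """T = k * log2(2D / W), with k in seconds (0.1 s is the usual value)."""
    return k * math.log2(2 * distance / width)

def fitts_shannon(distance, width, a=0.1, b=0.1):
    """T = a + b * log2(D / W + 1); a and b depend on device and user."""
    index_of_difficulty = math.log2(distance / width + 1)
    return a + b * index_of_difficulty

# (D, W) pairs from the example table, both expressed in centimetres
for d_cm, w_cm in [(10, 1), (10, 0.1), (30, 0.2)]:
    print(f"D={d_cm} cm, W={w_cm} cm: "
          f"original {fitts_original(d_cm, w_cm):.2f} s, "
          f"Shannon {fitts_shannon(d_cm, w_cm):.2f} s")
```

D and W only need to share the same unit, since the formulas depend on their ratio.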
Human Memory [1]
§ Much of our everyday activity relies on memory.
§ It is generally agreed that there are three types of memory :
  Ø Sensory buffers
  Ø Short-term memory or working memory
  Ø Long-term memory
§ These memories interact, with information being processed and passed between memory stores :
[Diagram : flow of information between the memory stores]
  Ø Sensory memories : iconic (visual), echoic (aural), haptic (touch)
  Ø Sensory memories → Attention → Short-term memory or working memory (maintained by repetition)
  Ø Short-term memory → Rehearsal or Encoding → Long-term memory
  Ø Long-term memory → Retrieval → Short-term memory
  Ø Losses : information not attended to, forgetting
Human Memory [2]
§ First stage : selection and encoding
  Ø Determines which information is attended to in the environment and how it is interpreted.
  Ø The more attention paid to something, and the more it is processed in terms of thinking about it and comparing it with other knowledge, the more likely it is to be remembered.
§ We don't remember everything : memory involves filtering and processing what is attended to.
§ Context is important in affecting our memory (i.e., where, when).
  Ø Sometimes it can be difficult to recall information that was encoded in a different context.
§ We are much better at recognizing things than at recalling them.
  Ø Better at remembering images than words
  Ø A reason why interfaces are largely visual
    ü GUIs provide visually-based options : recognition
    ü Command-line UIs require users to remember commands : recall
Sensory Memory
§ The sensory memories act as buffers for stimuli received through the senses (constantly overwritten by new information [0.1 … 0.5 s]).
§ A sensory memory exists for each sensory channel
  Ø Iconic memory for visual stimuli
  Ø Echoic memory for aural stimuli
  Ø Haptic memory for touch
§ Information is passed from sensory memory into short-term memory by attention.
§ Attention is the concentration of the mind on one out of a number of competing stimuli.
§ We can choose which stimuli to attend to (according to our needs, level of interest, …; this explains the noisy party phenomenon, also called the cocktail party effect).
§ Information received by sensory memories is quickly passed into a more permanent memory store (generally the short-term memory) or overwritten and lost.
Short-Term Memory
§ Short-term memory or working memory (a slightly different concept) acts as a scratch-pad for temporary recall of information.
§ Examples of use :
  Ø Calculation (e.g. 25 x 6)
  Ø Reading
§ Short-term memory access time is in the order of 70 ms.
§ Information can only be held temporarily : ~ 200 ms … 10 s
§ Short-term memory has a limited capacity. This was established in experiments by Miller ("The magical number seven, plus or minus two").
§ The average person can remember 7 ± 2 chunks of information.
§ Further studies say 4 ± 2 (if the items are unrelated to each other).
§ A chunk of information is not precisely defined. It is any meaningful unit (digits, words, people's faces, chess positions, etc.) and depends on the user's experience with the kind of information.
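Chunking can be illustrated with a string of digits. The sketch below is only an illustration: the 10-digit number and the helper name `chunk` are made up, and grouping by three is one arbitrary choice among many.

```python
def chunk(sequence, size):
    """Split a string into consecutive groups of at most `size` characters."""
    return [sequence[i:i + size] for i in range(0, len(sequence), size)]

digits = "0264267111"          # hypothetical 10-digit number
as_digits = chunk(digits, 1)   # 10 separate items, above the 7 ± 2 span
as_groups = chunk(digits, 3)   # the same digits grouped into 4 chunks

print(len(as_digits), as_digits)
print(len(as_groups), as_groups)
```

The information content is identical in both cases; only the number of units to be held in short-term memory changes.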
Miller's Magical Number : 7±2
§ George Miller's theory (1956) says that 7±2 chunks of information can be held in short-term memory at any one time.
§ It is one of the best-known and best-remembered findings in psychology.
§ But unfortunately, this theory is often misinterpreted by HCI designers.
§ Examples of inappropriate application of the theory :
  Ø Never have more than seven bullets in a list
  Ø Have no more than seven options on a pull-down menu
  Ø Display only seven icons on a menu-bar or tool-bar
  Ø Place no more than seven tabs at the top of a website page
  Ø and so on…
§ All of these are wrongly based on Miller's law because all the items can be scanned and rescanned visually and hence do not have to be recalled from short-term memory.
Long-Term Memory
§ The long-term memory is our permanent memory store intended for the long-term storage of information (everything that we know : experiential knowledge, procedural skills, etc.).
§ It has a huge capacity (if not unlimited).
§ Relatively slow access time (~0.1 s).
§ Forgetting, if at all, occurs much more slowly than in short-term memory (long-term recall after minutes is the same as that after days).
§ Today, most researchers distinguish three long-term memory sub-systems :
  Ø Episodic memory (declarative memory) : memory of events and experiences in a serial form (chronology)
  Ø Semantic memory (declarative memory) : structured record of facts and concepts that we have acquired
  Ø Procedural skills : "know-how" memory (skills, procedures)
Long-Term Memory Structure
[Diagram : structure of long-term memory]
§ Long-term memory
  Ø Declarative memory (facts)
    ü Semantic memory : e.g. Paul's address, meaning of words
    ü Episodic memory : e.g. last birthday party
  Ø Procedural memory (skills) : e.g. play piano, ride a bike
§ The information in semantic memory is derived from that in our episodic memory (we can learn new concepts from our experiences).
§ Memory structure and processes are very complex and cannot easily be reduced to a simple model.
Reasoning
§ Reasoning is the process by which we use our knowledge to draw conclusions or infer new information.
§ There are different types of reasoning :
  Ø Deductive reasoning / Deduction
  Ø Inductive reasoning / Induction
  Ø Abductive reasoning / Abduction
§ Humans are able to think about things of which they have no experience and solve problems they have never seen before.
§ Problem solving involves reasoning and vice versa.
§ Recurrent familiar situations allow people to acquire skills in a particular domain (better information structure).
§ People build their own theories (called mental models) to understand the causal behavior of systems.
§ Sometimes they are based on an incorrect interpretation of the facts and this can lead to the well-known "human error".
Deductive Reasoning
§ Deductive reasoning derives the logically necessary conclusion from the given premises.
  Ø If it is Monday, it rains
  Ø It is Monday
  Ø Therefore it rains
§ Note that the logical conclusion does not necessarily have to correspond to our notion of truth.
§ Deductive reasoning is therefore sometimes misapplied :
  Ø Some people are students
  Ø Some students attend a lecture about Multimodal interfaces
  Ø Therefore some people attend a lecture about Multimodal interfaces
  Is this logically correct ?
§ Human deduction is poor when there is a clash between truth and logical validity.
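The second argument has the form "some A are B, some B are C, therefore some A are C". This form is not logically valid, even though the conclusion happens to be true in the real world. One way to check this is to look for a counterexample, i.e. a world in which both premises hold and the conclusion fails. The sketch below builds such a world with three small sets; the element names are arbitrary placeholders.

```python
def some(first, second):
    """'Some FIRST are SECOND' holds when the sets share at least one member."""
    return len(first & second) > 0

# Hypothetical world chosen as a counterexample
a = {"x1", "x2"}
b = {"x2", "x3"}
c = {"x3"}

premise_1 = some(a, b)    # some A are B  (x2)
premise_2 = some(b, c)    # some B are C  (x3)
conclusion = some(a, c)   # some A are C ?

print(premise_1, premise_2, conclusion)
```

Both premises are true here and the conclusion is false, so the conclusion does not follow from the premises alone.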
Inductive Reasoning
§ Inductive reasoning is generalizing from cases we have seen to infer information about cases we have not seen.
§ With inductive reasoning, we draw conclusions by moving from one or more specific cases to a general rule (just the opposite of deductive reasoning).
§ Example :
  Ø Every EIA-FR student I have ever seen owns a desktop computer
  Ø Therefore I infer that all EIA-FR students own a desktop computer
§ Of course, this inference is unreliable and cannot be proved true (or only with great difficulty). It can only be proved false.
§ In spite of its unreliability, induction is a useful process, which we use constantly in learning about our environment.
Abductive Reasoning
§ Abductive reasoning is reasoning from observed facts to the action or state that caused them.
§ This is the method we use to derive explanations for the events we observe (we try to find a hypothesis that would explain the observed facts).
§ Example :
  Ø I know that Bob takes his car when he misses the bus
  Ø I see Bob driving his car
  Ø Therefore I may infer that he missed the bus
§ In spite of its unreliability, people do infer explanations in this way and hold onto them until they have evidence to support an alternative theory or explanation.
§ In interactive systems, if an event always follows an action, the user will infer that the event is caused by the action. If, in fact, the event and the action are unrelated, confusion and even error may result.
Human Perception [1]
§ Human senses cannot easily be compared with computer peripherals. Don't forget the functions of the brain, which heavily processes the sensory information before interpretation.
§ All human senses are subject to illusions in their perception process.
§ Illusion research can provide fundamental insights into general brain mechanisms.
Human Perception [3]
§ Stroop effect.
[Word list, translated from French; on the original slide the words are presumably printed in colors that do not match their meaning]
  Black, Paper, Eat, Green, Book, Red, Green, House, Dig, Blue, Text, Orange, Telephone, Red, Blue, Laugh, Diary, Orange, Golf, Black
Human Perception [4]
§ 2- and 3-dimensional interpretation
Human Perception [5]
§ Can you read this ?
  Ø Aoccdrnig to a rscheearch at Cmabrigde Uinervtisy, it deosn't mttaer in waht oredr the ltteers in a wrod are, the olny iprmoetnt tihng is taht the frist and lsat ltteer be at the rghit pclae. The rset can be a toatl mses and you can sitll raed it wouthit porbelm. Tihs is bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the wrod as a wlohe.
  Ø French version of the same text : Sleon une édtue de l'Uvinertisé de Cmabrigde, l'odrre des ltteers dnas un mot n'a pas d'ipmrotncae, la suele coshe ipmrotnate est que la pmeirère et la drenèire letrte soit à la bnnoe pclae. Le rsete peut êrte dnas un dsérorde ttoal et vuos puoevz tujoruos lrie snas porlblème. C'est prace que le creaveu hmauin ne lit pas chuaqe ltetre elle-mmêe, mias le mot cmome un tuot.
§ Probably a hoax, but nevertheless surprising.
Human Perception [6]
§ Face recognition
§ Inverted faces
Human-Machine Interaction Process
§ Action theory (D.A. Norman, 1986)
§ Action cycle composed of two main processes
  Ø Execution process
  Ø Evaluation process
§ Seven stages :
  1. Establish a goal
  2. Form an intention
  3. Specify an action sequence
  4. Execute an action
  5. Perceive the system state
  6. Interpret the state
  7. Evaluate the system state with respect to the goals and intentions
Action Cycle [1]
§ Action theory tries to deconstruct the process of translating an intention into an action.
  Ø The action itself has two major aspects : doing something and checking (execution and evaluation)
[Diagram : action cycle between the Goals and the World]
  Ø Execution side (gulf of execution) : Goals → Intention to act (What to do ?) → Sequence of actions (How to do it ?) → Execution → World
  Ø Evaluation side (gulf of evaluation) : World → Perception → Interpretation → Evaluation → Goals
  Ø Other labels on the diagram : Interaction, Comprehension
Action Cycle [2]
§ Forming the goal.
  Ø Can be stated in a very imprecise way, e.g. "Make a nice meal"
§ Execution :
  Ø Forming the Intention. Goals must be transformed into intentions, i.e., specific statements of what has to be done to satisfy the goal; e.g. "Make a chicken casserole using a can of prepared sauce"
  Ø Specifying an Action Sequence. What is to be done to the world. The precise sequence of operators that must be performed to effect the intention; e.g. "Defrost frozen chicken, open can, ..."
  Ø Executing an Action. Actually doing something. Putting the action sequence into effect on the world; e.g. actually opening the can.
Action Cycle [3]
§ Evaluation :
  Ø Perceiving the State of the World. Perceiving what has actually happened; e.g. the experience of smell, taste and look of the prepared meal.
  Ø Interpreting the State of the World. Trying to make sense of the perceptions available; e.g. putting those perceptions together to present the sensory experience of a chicken casserole.
  Ø Evaluating the Outcome. Comparing what happened with what was wanted; e.g. did the chicken casserole match up to the requirement of "a nice meal" ?
§ Design objective
  Ø To reduce the gulf of evaluation
  Ø To reduce the gulf of execution
§ Design principles :
  Ø Visibility
  Ø Good mapping
  Ø Good conceptual model
  Ø Feedback
Gulfs of Execution and Evaluation
§ The gulfs represent the gaps that exist between the user and the interface.
§ Gulf of execution
  Ø Distance from the user to the physical system
  Ø Interface mechanism ← Action sequence ← Intentions
§ Gulf of evaluation
  Ø Distance from the physical system to the user
  Ø Interface perception → Interpretation → Evaluation
§ The goal : reduce the gulfs in order to reduce the cognitive effort required to perform a task
Affective aspects / Emotions [1]
§ Human experience is far more complex and not limited to perceptual and cognitive abilities.
§ Our emotional response to situations affects how we perform.
§ Positive emotions enable us to think more creatively and to solve problems more quickly.
§ Negative emotions push us into narrow thinking. A problem that is easy to solve will become difficult if we are frustrated or stressed.
§ Emotion involves both physical and cognitive events.
§ That biological response - known as affect - changes the way we deal with different situations, and this has an impact on the way we interact with computer systems.
§ If we try to build interfaces that promote positive responses (e.g. by using aesthetics, reward, ergonomics, …) then they are likely to be more successful.
Affective aspects / Emotions [2]
§ Recent empirical studies [Tractinsky, 2000] show that the aesthetics of an interface can have a positive effect on people's perception of the system's usability.
§ The importance of the look and feel of an interface (and not only usability) is gaining acceptance within the HCI community.
§ Users are likely to be more tolerant when the interface is pleasing :
  Ø Beautiful graphics
  Ø Well-designed fonts and icons
  Ø Nice feel of the way the elements have been laid out
  Ø Elegant use of images and color
  Ø Good sense of balance
  Ø …
§ A key concern is to strike a balance between designing pleasurable interfaces and designing usable interfaces.
Affective aspects / Emotions [3]
§ Expressive interfaces can affect user attitude and behavior.
  Ø Reassuring, informative, fun
  Ø Intrusive, annoying, making the user angry
§ Anthropomorphism : pros and cons.
  Ø Well accepted by children
  Ø Some people may feel anxious, inferior or stupid
  Ø A controversial debate
§ Interface agents, avatars, virtual pets, interactive toys
  Ø Often considered trying and intrusive
  Ø Downright deceptive, deceiving and frustrating
Individual Differences
§ The psychological principles and properties apply to the majority of people.
§ Notwithstanding this, we should remember that, although we share common processes, humans (i.e. users) are not all the same.
§ We should be aware of individual differences
  Ø Long-term differences : sex, physical / intellectual capabilities, …
  Ø Short-term differences : emotion, stress, fatigue, …
  Ø Continuous changes : age, experience, skills, …
§ These differences should be taken into account in interface design
  Ø Define personas (artifacts)
  Ø Promote flexibility and adaptability
  Ø Universal accessibility (impaired people)
§ In summary : User-centered design (philosophy and process)
Multimodal Interfaces [1.4] User-Computer Dialog Interaction Styles Interaction Paradigms
Mode
§ A mode is a distinct state of the interface in which the same user input will produce a different result than it would in other states. The mode influences the effect of actions.
§ Typical examples :
  Ø Insert/Replace mode in word processing applications
  Ø Caps lock (keyboard physical mode)
  Ø Tool palettes in photo editing or drawing applications
  Ø Modal dialog boxes
§ Modal interfaces should be avoided, if at all possible, because they lead to confusion or input errors, known as mode errors (the user performs an action that is appropriate to a different mode and hence gets an unexpected response).
§ If modes must be used, there should be clear indicators of the current mode to help prevent mode errors.
§ Nevertheless… modes can sometimes be helpful to control and guide user input.
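The Insert/Replace example can be sketched in a few lines. This is a minimal illustration, not taken from any real text editor : the class `TextField` and its attributes are hypothetical names. The same keystroke produces two different results depending on the current mode.

```python
class TextField:
    """Hypothetical text field with an Insert/Replace mode."""

    def __init__(self, text=""):
        self.text = text
        self.cursor = 0
        self.mode = "insert"   # the other mode is "replace"

    def type_char(self, ch):
        before = self.text[:self.cursor]
        if self.mode == "insert":
            after = self.text[self.cursor:]        # keep everything after the cursor
        else:
            after = self.text[self.cursor + 1:]    # overwrite the next character
        self.text = before + ch + after
        self.cursor += 1

inserting = TextField("abc")
inserting.type_char("X")

replacing = TextField("abc")
replacing.mode = "replace"
replacing.type_char("X")

print(inserting.text, replacing.text)
```

A user who believes the field is in insert mode while it is in replace mode silently loses a character, which is a typical mode error.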
Spatial Mode / Temporal Mode
§ One can also present a mode as a multiplexer of user input that allows giving one precise meaning to an action.
§ Distinction between spatial modes and temporal modes (spatial multiplexing of input vs. temporal multiplexing).
§ A spatial mode uses the location of a user's action to determine the meaning of an event.
  Ø Example : Resizing handles in a drawing editor
§ A temporal mode uses the ordering of events in time to determine their meaning.
  Ø Example : Tools palette in a drawing editor
§ Drawback of temporal modes : they have no intrinsic feedback.
  Ø A possible workaround : change the shape of the cursor or another noticeable interface state
§ Avoid sub-modes and limit the number of states to what is strictly necessary.
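The difference between the two kinds of mode can be sketched as follows. All names (`spatial_meaning`, `Palette`, the tool names and cursor shapes) are assumptions made for the illustration and do not come from a particular drawing editor.

```python
def spatial_meaning(x, y, handle_rect):
    """Spatial mode: the location of the mouse press decides its meaning."""
    left, top, right, bottom = handle_rect
    inside = left <= x <= right and top <= y <= bottom
    return "resize" if inside else "move"

class Palette:
    """Temporal mode: an earlier event (the tool choice) decides the meaning."""

    def __init__(self):
        self.tool = "select"

    def choose(self, tool):
        self.tool = tool

    def press(self, x, y):
        return f"{self.tool} at ({x}, {y})"

    def cursor_shape(self):
        # Workaround for the missing intrinsic feedback: show the mode in the cursor
        return {"select": "arrow", "pencil": "crosshair"}.get(self.tool, "arrow")

handle = (0, 0, 10, 10)
print(spatial_meaning(5, 5, handle), spatial_meaning(50, 50, handle))

palette = Palette()
palette.choose("pencil")
print(palette.press(1, 2), palette.cursor_shape())
```

In the spatial case the meaning is visible where the user is pointing; in the temporal case it depends on hidden state, which is why the cursor shape is changed.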
Quasimode and Modeless Interfaces
§ A quasimode (or spring-loaded mode) is a mode that is kept in place only through some constant action on the part of the user (e.g. pressing the Shift key or Ctrl key).
§ A quasimode is a modeless interaction that allows for the benefits of a mode without the associated cognitive burden (the user is performing a conscious action).
§ Modifier keys used in interfaces usually start a quasimode.
§ An interface that doesn't use modes is known as a modeless interface (the same input from the user will always trigger the same perceived action).
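The contrast between a quasimode (Shift) and a true mode (Caps lock) can be sketched as below. The `Keyboard` class is a hypothetical simplification : Shift is active only while held, whereas Caps lock persists until it is toggled again.

```python
class Keyboard:
    """Hypothetical keyboard model contrasting a quasimode with a mode."""

    def __init__(self):
        self.shift_down = False   # quasimode: true only while the key is held
        self.caps_lock = False    # mode: stays on until toggled again

    def key(self, letter):
        # Upper case when exactly one of Shift / Caps lock is active
        upper = self.shift_down != self.caps_lock
        return letter.upper() if upper else letter.lower()

kb = Keyboard()
kb.shift_down = True          # user holds Shift
held = kb.key("a")
kb.shift_down = False         # user releases Shift: the quasimode ends by itself
released = kb.key("a")

kb.caps_lock = True           # user taps Caps lock once and forgets about it
later = kb.key("a")

print(held, released, later)
```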
Interaction Styles
§ Interaction can be seen as a dialog between the computer and the user.
§ The choice of interaction style can have a profound effect on the nature of this dialog.
§ The most common interaction styles :
  Ø Command language / Command line interface
  Ø Form-fills and spreadsheets
  Ø Menus
  Ø Natural language and query language
  Ø Question/answer dialog
  Ø WIMP
  Ø Point-and-click
  Ø Direct manipulation
  Ø 3D interfaces (→ virtual reality)
  Ø Brain-computer interface
Command Language [1]
§ User types in commands in response to a prompt
§ May use function keys, abbreviations or whole-word commands
§ Examples
  Ø OS commands
    ü MS-DOS
    ü Unix shell
  Ø Applications
    ü FTP
    ü Telnet
Command Language [2]
§ Earliest interaction style, and still widely used
§ Flexible and extensible interface (appealing for expert users)
§ Requires a formally defined syntax (should use the user's vocabulary)
§ Useful for repetitive tasks
§ Supports regular expressions and creation of user-defined scripts and macros
  Ø $> zip archive photo*.jpg
§ Suitable for interacting with networked computers (low bandwidth)
§ Poor usability :
  Ø Typing is tiring and error prone
  Ø Difficult to remember task names and parameters (bad learnability)
  Ø Difficult to remember correct syntax
  Ø Error messages and assistance are hard to provide
§ Not suitable for non-expert users
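The pattern `photo*.jpg` in the zip example is, strictly speaking, a shell wildcard (glob) pattern rather than a regular expression : the shell expands it into the list of matching file names before the command runs. A rough sketch of that expansion, using Python's standard `fnmatch` module and a made-up list of file names :

```python
from fnmatch import fnmatchcase

# Hypothetical directory content
files = ["photo1.jpg", "photo_beach.jpg", "photo.png", "notes.txt"]

# '*' matches any run of characters, as in a Unix shell
matches = [name for name in files if fnmatchcase(name, "photo*.jpg")]

print(matches)
```

One short pattern thus stands for an arbitrary number of arguments, which is part of what makes command languages attractive for repetitive tasks.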
Form Fill-in ("fill in the blanks")
§ Used primarily for data entry (but also useful in data retrieval)
§ Aimed at non-expert users
§ Paper form metaphor
§ Originally no need for a pointing device (Keyboard, Tab, Enter)
§ Easy movement from field to field
§ Form fill-in interfaces were (and still are) especially useful for routine, clerical work or for tasks that require a great deal of data entry
§ User already familiar with the actual form (often based on an actual paper form for compatibility)
[Figures : a classic form fill-in and a more modern form fill-in]
Spreadsheets Forms
§ Spreadsheets can be considered as a sophisticated variation of form filling
§ Grid of cells with formulas
§ System maintains consistency and updates values immediately
§ User can manipulate values (in any order) and observe effects
§ Sometimes blurred distinction between input and output fields
§ Attractive style for complex forms
§ Spreadsheets are an attractive and flexible medium for interaction
[Figure : spreadsheet with the cell formula = Qty * Unit Price]
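The idea that the system keeps values consistent can be sketched with a toy sheet in which formula cells are recomputed each time they are read. This is only an illustration under simplifying assumptions : the `Sheet` class and the cell names are invented, and real spreadsheets track dependencies between cells instead of recomputing on every read.

```python
class Sheet:
    """Toy spreadsheet: plain values plus formulas evaluated on read."""

    def __init__(self):
        self.values = {}
        self.formulas = {}

    def set_value(self, name, value):
        self.values[name] = value

    def set_formula(self, name, func):
        self.formulas[name] = func

    def get(self, name):
        if name in self.formulas:
            return self.formulas[name](self)   # always reflects current inputs
        return self.values[name]

sheet = Sheet()
sheet.set_value("qty", 3)
sheet.set_value("unit_price", 2.5)
sheet.set_formula("total", lambda s: s.get("qty") * s.get("unit_price"))

first_total = sheet.get("total")
sheet.set_value("qty", 4)          # the user edits an input cell
second_total = sheet.get("total")  # the formula cell follows immediately

print(first_total, second_total)
```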
Menu Selection [1]
§ A menu is a set of options displayed on the screen
§ Text-based (options possibly numbered) or GUI-based (mouse selection)
§ Selection and execution of one (or more) of the options results in a state change of the interface
§ Classical web pages are often mainly based on menu selection
Menu Selection [2]
§ Advantages
  Ø Affords exploration ("look around")
  Ø Relies on recognition rather than recall
  Ø Ideal for novice or intermittent users
  Ø Structures workflow and decision making (sequential, hierarchical, grouping)
  Ø User's input does not have to be parsed → easier error handling
§ Disadvantages
  Ø May be slow for frequent users (shortcuts should be implemented)
  Ø Too many menus may lead to information overload (discouraging the users)
  Ø Not always suited for small graphic displays (need adaptation)
  Ø Highly hierarchical menus may be tedious (where to drill down ?)
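The point about error handling can be illustrated with a numbered text menu : the only possible inputs are the listed numbers, so validation reduces to a range check. The sketch below is a hypothetical example; the function name `choose` and the option labels are invented.

```python
def choose(options, raw_input):
    """Return the selected option, or None if the input is not a listed number."""
    text = raw_input.strip()
    if text.isdecimal():
        index = int(text) - 1          # menu entries are numbered from 1
        if 0 <= index < len(options):
            return options[index]
    return None                        # anything else is simply rejected

options = ["Open", "Save", "Quit"]
for number, label in enumerate(options, start=1):
    print(f"{number}. {label}")

print(choose(options, "2"), choose(options, "9"), choose(options, "open"))
```

No grammar or command syntax is involved, in contrast with a command language where every input has to be parsed.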
Menu Selection / Guidelines [3]
§ Make menu options meaningful in the user's language
§ Logically group similar options to aid recognition
§ Use hierarchical organization where appropriate (menus/submenus)
§ Use sequential organization where appropriate (arrange options in order to suggest a workflow sequence)
Natural Language Interaction
§ Natural language understanding
§ Forms : speech or written input (smart command language)
§ Very attractive style of interaction (at least at first glance)
§ A very difficult task
  Ø Parsing language is very difficult (a language is vague and imprecise by its very nature)
  Ø Phrases and words are quite often ambiguous (homonyms, …)
  Ø Spelling errors and/or variations make written input harder to process
  Ø Synonyms make both written and speech input harder to process
  Ø Converting audio speech to machine-readable text is very difficult
§ Subject of considerable interest and research
§ Relatively successful in restricted domains or after an extensive learning process (still natural language ?)
§ A simpler approach : query language (restricted context, more formal)
Natural / Query / Command Language
§ The distinction between written natural language, query language and command language is sometimes blurred.
§ What appears as a natural language interface may simply be a front-end for a query sub-system.
§ The question is parsed into keywords to form a query
§ Other related example : web search engine
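A front-end of this kind can be very simple. The sketch below is an assumption-laden illustration, not a description of any real system : the stop-word list, the function names and the two sample documents are invented. The question is reduced to keywords, and the keywords form a query against a small set of documents.

```python
import re

# Hypothetical list of words carrying no query information
STOP_WORDS = {"what", "is", "the", "of", "a", "an", "in",
              "who", "where", "which", "are", "show", "me"}

def to_keywords(question):
    """Lower-case the question, split it into words, drop the stop words."""
    words = re.findall(r"[a-z0-9]+", question.lower())
    return [w for w in words if w not in STOP_WORDS]

def search(question, documents):
    """Return the documents that contain every keyword of the question."""
    keywords = to_keywords(question)
    return [doc for doc in documents
            if all(k in doc.lower() for k in keywords)]

documents = ["Bern is the capital of Switzerland",
             "Paris is the capital of France"]

print(to_keywords("What is the capital of Switzerland?"))
print(search("What is the capital of Switzerland?", documents))
```

Nothing in this sketch understands the question; it only looks like natural language interaction from the user's side.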
Question / Answer
§ Simple mechanism for providing input to an application.
§ The user is asked a series of questions and is led through the interaction step by step.
  Ø Yes/no response
  Ø Multiple choice
  Ø Codes
§ Examples : web questionnaires, web inquiries
§ Easy to learn but limited in functionality and power (appropriate for restricted domains and for novice users).
WIMP
§ WIMP : Windows + Icons + Menus + Pointing devices
§ Popularized by graphical user interfaces (GUIs)
§ Currently the most common environment for interacting with computers (sometimes simply called a windowing system)
§ Needs an appropriate visual representation of the objects to interact with
  Ø Symbolized pictorial representations (icons) may be difficult to interpret because of their small size
§ Typically based on metaphors
  Ø The essence of metaphor is understanding and experiencing one kind of thing in terms of another
§ Particularly suited to exploring an application

Point-and-Click Interaction
§ Point-and-click is closely related to WIMP but a little simpler
§ In information browsing or simple multimedia systems, most interactions require only a single click of a mouse button
§ Closely related to the hypertext idea
§ Popularized by the web (browser)
§ Not limited to the mouse; also used with touch screens (interactive information kiosks)

Direct Manipulation [1]
§ Direct manipulation (defined by Ben Shneiderman in 1982) is often closely related to WIMP but not limited to it.
§ Direct manipulation involves continuous representation of the object of interest, and rapid, incremental, reversible operations whose impact on the object is immediately visible (feedback). Auditory feedback may also be provided.
§ The intention is to allow users to directly manipulate the objects presented to them, using actions that correspond at least loosely to the physical world (metaphor).
§ Direct manipulation implies physical actions instead of complex syntax.
§ E.g. drag-and-drop operation

Direct Manipulation [2]
§ Features of a direct manipulation interface (highlighted by Ben Shneiderman) :
  Ø Visibility of the objects of interest
  Ø Incremental action with rapid feedback on all actions
  Ø Reversibility of all actions (allows exploration without penalties)
  Ø Syntactic correctness of all actions (every user action is a legal operation)
  Ø Replacement of complex command language with actions that directly manipulate the visible objects
§ With direct manipulation there is no clear distinction between input and output (e.g. the document icon is an output expression in the desktop metaphor, but that icon is used by the user to move the document).
§ Directness partly depends on the gap between the user's goal and the system image (the gulfs of execution and evaluation in action theory).

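Incremental, reversible actions are commonly implemented by recording enough state to step backwards. The sketch below is one simple way to do this, assuming a hypothetical `Icon` class that remembers its earlier positions; it is an illustration of the principle, not the mechanism of any specific toolkit.

```python
from typing import List, Tuple

class Icon:
    """A draggable object whose current position is always visible."""

    def __init__(self, x: int, y: int) -> None:
        self.x, self.y = x, y
        self._history: List[Tuple[int, int]] = []  # earlier positions, for undo

    def drag_by(self, dx: int, dy: int) -> None:
        """Incremental action: remember the old position, then move."""
        self._history.append((self.x, self.y))
        self.x += dx
        self.y += dy

    def undo(self) -> bool:
        """Reverse the most recent drag; return False if nothing to undo."""
        if not self._history:
            return False
        self.x, self.y = self._history.pop()
        return True

icon = Icon(10, 10)
icon.drag_by(5, 0)
icon.drag_by(0, 20)
after_drags = (icon.x, icon.y)
icon.undo()
after_undo = (icon.x, icon.y)
print(after_drags, after_undo)
```

Every drag is a legal operation and each one can be taken back, so the user can explore without penalty.
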
Direct Manipulation [3]
§ Two variants of direct manipulation
  Ø Program manipulation
  Ø Content manipulation
§ Program manipulation is typically focused on the management of the program and its interface.
  Ø Selection
  Ø Drag-and-drop
  Ø Control manipulation (button pushing, scrolling, ...)
  Ø Resizing, reshaping, repositioning
  Ø Connecting objects
§ Content manipulation is involved primarily with the direct manual creation, modification and movement of data with a pointing device.
  Ø Drawing, painting, sketching
  Ø 3D modeling

Direct Manipulation [4]
§ The tool seems to disappear.
§ The user can apply intellect directly to the task (not to the tool).
§ Advantages
  Ø Easy to learn and remember
  Ø Encourages exploration (reduced anxiety)
  Ø High subjective satisfaction (fun and entertaining)
  Ø Recognition memory
§ Drawbacks
  Ø Mouse operations may be slower than typing
  Ø Need to learn the meaning of components (visual representation)
  Ø Not so intuitive (most users don't discover it independently)
  Ø More difficult to program (is this relevant?)
  Ø Tedious for repeated actions
  Ø History keeping is harder

Indirect Manipulation
§ Not all tasks can be described using concrete objects and not all actions can be performed directly.
§ There is a continuum from "Do it yourself" to "Command control".
§ Most GUIs are a combination of direct and indirect manipulation.
  Ø Using a menu is rather indirect manipulation
  Ø Using a pop-up menu is more direct, but still less direct than dragging an element.
§ Example 1 : choosing a color
  Ø Using an "eye dropper" mouse pointer → [direct]
  Ø By typing the color values ([255, 255, 0] to get yellow) → [indirect]
§ Example 2 : defining a text size
§ The expression indirect manipulation is also used when the user interacts with the real world (instrument, plant) through an interface.

3D Interfaces [1]
§ We live in a three-dimensional world.
§ There is an increasing use of 3D effects in user interface design.
§ From simple techniques (shading, etching, sculptural effects, 3D icons, …) to more complex 3D workspaces.
§ 3D workspaces give extra space in a more natural way than iconizing windows (objects shrink when they are further away).

3D Interfaces [2]
§ In 3D workspaces, objects are displayed in perspective and their relative size, light, angle and occlusion provide an intuitive sense of distance.
§ The next step is virtual reality where the user can move within a simulated 3D world (will be discussed later).

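The relative-size cue can be illustrated with the standard pinhole perspective relation, in which the projected size of an object is inversely proportional to its distance from the viewer. The function name and the `focal_length` parameter below are assumptions chosen for this sketch; the slides do not prescribe any particular projection model.

```python
def projected_size(real_size: float, distance: float,
                   focal_length: float = 1.0) -> float:
    """Pinhole perspective: on-screen size = focal_length * real_size / distance."""
    if distance <= 0:
        raise ValueError("distance must be positive")
    return focal_length * real_size / distance

# The same 100-unit-wide window drawn at two depths in the workspace.
near = projected_size(100.0, 2.0)
far = projected_size(100.0, 4.0)
print(near, far)
```

Doubling the distance halves the drawn size, which is how a 3D workspace can keep many windows visible at once while still conveying which ones are "further away".
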
Man-Computer Symbiosis
§ J.C.R. Licklider (1960) outlined "man-computer symbiosis" :
"The hope is that, in not too many years, human brains and computing machines will be coupled together very tightly and that the resulting partnership will think as no human brain has ever thought and process data in a way not approached by the information-handling machines we know today."

Brain-Computer Interface
§ A direct brain-computer interface (BCI, or direct neural interface) would add a new dimension to human-machine interaction. It would represent one of the new frontiers in science and technology.
§ Cerebral electric activity is recorded via EEG : electrodes attached to the scalp measure the electric signals of the brain. These signals are amplified and transmitted to the computer, which transforms them into device control commands.
§ Interesting research work in this direction has already been initiated, mainly motivated by the hope of creating new communication channels for people with severe neuromuscular disorders.

Multimodal Interfaces [1.5] Appendix HCI and GUI Short History
Douglas Engelbart's Vision (≈1955)
§ "…I had the image of sitting at a big CRT screen with all kinds of symbols, new and different symbols, not restricted to our old ones. The computer could be manipulated, and you could be operating all kinds of things to drive the computer… ... I also had a clear picture that one's colleagues could be sitting in other rooms with similar work stations, tied to the same computer complex, and could be sharing and working and collaborating very closely. And also the assumption that there'd be a lot of new skills, new ways of thinking that would evolve…"
§ In 1962 Doug Engelbart developed a conceptual framework for augmenting human intellect ("boost the collective IQ").
§ He was a precursor in perceiving computers as facilitators for communication, rather than only computation.
§ He founded the Augmentation Research Center at the Stanford Research Institute (SRI), the precursor to Xerox PARC, and developed a working vision of a collaborative computing environment, with a graphic windowed interface, mouse, hypertext system, networking, and electronic mail.

First Mouse (Douglas Engelbart, 1964)

Xerox PARC Alto + Star Projects (≈1975-1981)
§ Concept of personal workstation
  Ø Local processor
  Ø Idea of a local area network to share resources
§ Modern graphical user interface (GUI)
  Ø Bit-mapped display, mouse
  Ø Windows, menus, scroll bars, mouse selection, etc.
  Ø Familiar user’s conceptual model (simulated desktop)
  Ø Promoted recognizing/pointing rather than remembering/typing
  Ø Property sheets to specify appearance/behavior of objects
  Ø What you see is what you get (WYSIWYG)
  Ø Modeless interaction
§ First system based upon usability engineering
§ Commercial flop

Apple Lisa (1983)
§ Predecessor of the Macintosh
§ Based upon many ideas of the Star computer (Xerox)
§ Commercial failure as well (price ≈ $10,000)

Apple Macintosh (1984)
§ “Old ideas” but well done!
§ Mistakes of Lisa corrected + aggressive pricing
§ Interface guidelines encouraged consistency between applications
§ Developer’s toolkit encouraged third-party software
§ Domination in desktop publishing because of affordable laser printer and excellent graphics

Windows 1.01 (1985)
§ Built on the cryptic MS-DOS operating system
§ Almost unusable
  Ø No overlapping windows (unsightly tiled windows)
  Ø No icons

Windows 2.03 (1987)
§ With overlapping windows and a Mac-like interface
  Ø Window-manipulation terminology : "Minimize", "Maximize"
  Ø Keyboard shortcuts (underlined mnemonics)
  Ø Introduction of Word for Windows and Excel
§ No commercial success
  Ø Developers still maintained DOS versions of their applications

Windows 3.1 (1992)
§ Follows Windows 3.0 (1990), a transition version which introduced a significantly revamped user interface and numerous technical improvements.
§ First serious and successful desktop platform
  Ø TrueType font system
  Ø File Manager
  Ø Program Manager
  Ø Minesweeper
§ Followed in 1993 by Windows for Workgroups (3.11) with native networking support

MacOS X
§ New Aqua GUI
  Ø Double-buffering
  Ø Minimized windows stretching and squeezing into the dock
  Ø Exposé feature to fit all applications on screen
  Ø Several eye-candy features

And a lot more...
§ Amiga Workbench (1985)
§ Digital Research GEM (1985)
§ NeXTstep (1988)
§ OS/2 (1992)

And more...
§ Linux KDE (1996)
§ Windows 95 (1995)
§ BeOS (1998)
§ Windows Vista (2006)

GUI History Timeline
§ A history of the GUI (by Jeremy Reimer) : arstechnica.com/articles/paedia/gui.ars
§ GUI Gallery (by Nathan Lineback) : toastytech.com/guis

Future GUIs / 3D Skins [1]
§ Microsoft Task Gallery research project
§ Video : research.microsoft.com/adapt/taskgallery/video.mpg

Future GUIs [2]
§ Sun Looking Glass research project
§ Video : www.sun.com/software/looking_glass/demo.xml

Key Points / What You Should Know
§ Human Communication Channels
  Ø Senses / Effectors
§ Human Modeling
  Ø Model Human Processor / Fitts' Law
  Ø Human Memory (Sensory / Short-term / Long-term)
  Ø Action Theory / Action Cycle Principle / Execution + Evaluation Gulfs
§ Reasoning
  Ø Deductive / Inductive / Abductive
  Ø Affective aspects / Emotions
§ Interaction Paradigms
  Ø Mode (Spatial / Temporal), Quasimode, Modeless
  Ø Most Common Interaction Styles
    ü Command-Line
    ü …
    ü Direct Manipulation / Indirect Manipulation
    ü …
    ü Brain-Computer Interface