PLAN-BASED ASSISTANCE IN THE WEBBROWSER FIREFOX
Thomas A. Bertz
Chair for Artificial Intelligence
University of Erlangen-Nuernberg
Haberstraße 2, D-91058 Erlangen
email: thomas.a.bertz@informatik.stud.uni-erlangen.de

Peter Reiss, M.A.
Chair for Artificial Intelligence
University of Erlangen-Nuernberg
Haberstraße 2, D-91058 Erlangen
email: reiss@informatik.uni-erlangen.de

ABSTRACT

We present pbifSystem, a function library that allows remote control of the web browser Firefox. Instead of controlling the browser through general PC input devices such as mouse or keyboard, clients call functions of the library. pbifSystem is meant to be a service layer that can be used by assistance systems. The service it offers consists of planning an action sequence that reaches the goal requested by the user (e.g. Increase, to enlarge the font of the currently shown website) and executing that sequence visibly and on its own. To demonstrate how pbifSystem operates without an assistance system, a toolbar (pbifBar) was built.

KEY WORDS

HMI, Planning, Application usability

1 Introduction

1.1 Motivation

In the early days of computing, handling a computer required special training. Available programs and commands had to be passed to a command interpreter (shell). Despite the abstraction from the underlying machine, users had, and still have, to know about a huge functional range. Functionality was hidden behind the interpreter or in the handbook. Graphical user interfaces (GUIs) added yet another abstraction layer, through which today's operating systems and applications can be operated much more intuitively. The idea is to enable novices, too, to solve some standard tasks with a personal computer on their own, without studying handbooks or attending courses. At least in principle, a simple mouse click has replaced complex shell commands.

Nevertheless, the growth in functionality of today's applications and the limited space on the desktop force developers to present functions in a clearly laid out form. This becomes apparent in the menu structure of many applications, where related functions are grouped together in menus. Their content (menuitems) is not shown until the user asks for it (e.g. by a mouse click). These menu structures provide a tidy, clear and still potentially complex desktop. The now widespread use of GUIs has, however, exposed new problems in control and generated new requirements. Mainly these include:

self-explanation: Using menu-based applications has become more intuitive and easier (the user simply browses the menus for a function), but the complete functional range is still hidden behind the menus, much as it was hidden behind the interpreter in console-based applications. By integrating documentation into the program itself, separate handbooks can be omitted. Moreover, the information it contains can be used for (interactive) demonstrations presented to the user. In this way an application could explain itself, its possible usage options and its modes.

usability: Because of the use of menu structures, many functions cannot be reached directly but only through sequences of actions (e.g. a sequence of mouse clicks). Assistance systems that have been made aware of the user's goals on the one hand (e.g. through speech dialog components) and that know how to read the current system state on the other hand can help to re-establish ergonomic usability without requiring special software training of the user.

accessibility: Novices, and with new programs or program versions expert users too, are hindered in learning new functions if these are hidden. Furthermore, control with mouse and keyboard may be suitable only to a limited extent for disabled people.

Assistance systems are a promising solution, because they adapt to the user and not the other way round.

1.2 Related Work

The UNIX Consultant (UC) [1] is an intelligent assistance system for the operating system UNIX. In a passive mode, unskilled users can pose questions about typical UNIX tasks to the system in natural language. UC answers with concrete instructions. In an active mode, UC can intervene and correct or optimize tasks started by the user. To do so, UC uses several modules, e.g. a speech dialog module to acquire the user's goals, a knowledge database that contains the syntax and semantics of the UNIX command domain and the possible user pragmatics, and an inference and planning
module that allows reasoning from internal and external system states to possible actions. This plan-based approach to finding solutions for user goals served as the model for the pbifSystem library developed in this work.

Another prominent example of an assistance system is the assistant of Microsoft's Office package [2]. It uses Bayesian networks to estimate the user's goal or problem from a free-text request in natural language and presents a suitable solution. This already marks a difference from the approach of UC and the pbifSystem. While the Office Assistant knows exactly the way (plan) to solve the problem or reach the goal (it is static, hardcoded), the goal itself is only estimated with the help of Bayesian networks. The premises of the present work are the opposite: here the goal is known exactly, because a quite rudimentary language is being developed with which goals can be addressed directly. The path to the goal, on the other hand, is unknown a priori and is calculated only at runtime by a planning component.

1.3 Goals

This work pursues three goals:

direct access: Make functionality that is hidden behind the menu structure directly accessible.

visualization: The system should be able to visualize system activity, so that it can also serve as an interactive control and teaching tool.

application in assistance systems: The developed system shall constitute a library that can serve as a basis for assistance systems for the web browser Firefox.

In conjunction with assistance systems, the pbifSystem should thereby meet the requirements stated above (self-explanation, usability and accessibility).

2 Tools

Firefox and this work use a number of interfaces and programming languages, which are introduced here:

XUL stands for XML User Interface Language [3]. With XUL, GUIs can be described in a platform-independent manner. As with XML, structure and design are strictly separated. A rendering engine (Gecko in Firefox) is then responsible for transforming XUL tags into graphical widgets like buttons, menus or scrollbars.

CSS stands for Cascading Style Sheets [4]. CSS describes the design (style) of XUL elements and is likewise interpreted by Gecko. In this work CSS is mainly employed to visualize system activity to the user (e.g. the menuitem currently selected by the system).

JavaScript: While XUL and CSS are pure description languages, JavaScript [5] can execute commands and actions, react to events [6] or perform calculations. To give programs access to structure and style, many browsers (Firefox among them) implement the DOM interface [7] in their JavaScript interpreters. The DOM interface defines how the structure of XUL documents is mapped to a tree. Trees are important in computer science because they are easy to navigate, read from and write to.

XPCOM: For security reasons, JavaScript only has access to unprivileged commands and calls. XPCOM (Cross Platform Component Object Model) [8] was created to allow system calls from JavaScript and to make bridging between different programming languages possible. First, the interfaces of the components to be implemented are defined via XPIDL (XPCOM Interface Definition Language). This definition is platform independent. Second, the components are implemented in a suitable programming language (C/C++ in this work). The resulting components (XPCOM components) can then be accessed from JavaScript through the so-called XPConnect technique.

PDDL stands for Planning Domain Definition Language [9] [10]. In a LISP-like syntax one can represent a section of the world (a situation) with objects, predicates and an initial situation. A goal situation describes the situation to be reached by a plan. So-called planoperators (actions) allow transitions from one situation to another. These enable a planner (here the Fast-Forward Planner [11] is used) to find a possible path (a plan) from the initial situation to the goal situation.

3 The pbifSystem

Firefox is a fairly slim web browser. It contains just the code necessary to fulfill its tasks. In exchange, every user can install additional features through so-called extensions (XUL, CSS and JavaScript code) or components (XPCOM code). In the scope of this work, the pbifSystem [12] was developed, a Firefox extension that can be used as a library or basic service layer by assistance systems to control the web browser. An example of a cooperating assistance system is CONALD [13], developed at our chair. In the pbifSystem we first concentrated on the functionality of the main menu. To be able to demonstrate the working pbifSystem without an assistance system, we developed a toolbar named pbifBar (see Figure 1). It is meant to simulate the function calls of an assistance system. In a listbox the user can currently choose between two assistance modes, namely pbifGuideMeByLabel and pbifGuideMeByUserGoal. The first addresses goals by the label name of the corresponding menuitem, while in the second they are addressed by self-defined goal
names that have been communicated to the system beforehand in a teaching mode. The middle listbox holds the goal name. With a click on the Execute button, the system starts to search for a valid plan and, if successful, returns an assistance sequence that is visualized and executed.

Figure 1. pbifBar: Toolbar in Firefox

3.1 Call for Assistance

3.1.1 Internal Program Cycle

Suppose a user in a typical assistance situation wants to enlarge the font of the currently displayed webpage. Without an assistance system, the user would have to execute the click sequence Edit | Text Size | Increase. Asking the pbifSystem to enlarge the font size would look like this: pbifGuideMeByLabel() as the mode and Increase as the goal name. If for some reason the menu holds several labels with the same name, the pbifSystem chooses the first occurrence (this is the behavior at the time of writing; it is easy to imagine other meaningful semantics that could be implemented).

Figure 2 shows the rough process of a typical call for assistance to the pbifSystem. In the initialization phase, the pbifSystem traverses the complete DOM tree of the main menu and transforms it into an internal representation that can be translated to PDDL later on. This represents the Firefox menu world. When a user addresses a request for a goal to the pbifSystem, it is passed to Firefox's JavaScript interpreter and likewise translated to PDDL. The two PDDL fragments are merged and sent to the planner as a planning call. If successful, the planner returns a valid plan for the planning problem. The plan is a click sequence of menu elements. This click sequence is returned to the JavaScript interpreter, which in turn executes and visualizes it. Visualization is realized through CSS by briefly changing the background color of the active elements (blinking).

Figure 2. Process of a call for assistance

3.1.2 Description of the Planning Domain

Listing 1 shows an extract of the Firefox menu domain PDDL code.

    (define (domain domain0)
      (:types MENUBAR_T MENU_T MITEM_T)

      (:predicates
        (isPARENTof ?pParent ?pChild)
        (isOpen ?x)
        (clicked ?x))

      (:action openMenu
        :parameters (?pMenu - MENU_T)
        :vars (?lParent)
        :precondition
          (and
            (isPARENTof ?lParent ?pMenu)
            (isOpen ?lParent)
            (not (isOpen ?pMenu)))
        :effect (isOpen ?pMenu))

      (:action clickMenuitem
        :parameters (?pItem - MITEM_T)
        :vars (?lParent - MENU_T)
        :precondition
          (and
            (isPARENTof ?lParent ?pItem)
            (isOpen ?lParent)
            (not (clicked ?pItem)))
        :effect
          (and
            (clicked ?pItem)
            (not (isOpen ?lParent)))))

Listing 1. Firefox menu domain PDDL code

Predicates and planoperators (actions) are hardcoded into the pbifSystem and remain constant throughout runtime. They describe the semantics of Firefox's menu world. A menu can only be opened if its parent is open and it is itself closed. These are the preconditions to
execute the action. The postcondition (effect) of this planoperator is the opened menu, as expected. The operator clickMenuitem is described similarly: it requires an open parent and an item that has not yet been clicked, and its effect marks the item as clicked and closes the parent menu.

3.1.3 Description of the Planning Problem

In contrast, the objects and the values of the predicates are calculated at runtime in the initialization phase and can therefore change before each planning phase. As mentioned above, the complete DOM tree is traversed and translated to PDDL for this purpose. An extract is shown in Listing 2.

    (define (problem problem0)
      (:domain domain0)
      (:objects
        main-menubar - MBAR_T   ; Root
        pbifID_0 - MENU_T       ; File
        pbifID_1 - MITEM_T      ; New Window
        pbifID_2 - MITEM_T      ; New Tab
        pbifID_12 - MITEM_T     ; Print...
        pbifID_15 - MITEM_T     ; Quit
        pbifID_16 - MENU_T      ; Edit
        pbifID_17 - MITEM_T     ; Undo
        pbifID_18 - MITEM_T     ; Redo
        ...
      )
      (:init
        (isOpen main-menubar)
        (isPARENTof main-menubar pbifID_0)
        (isPARENTof pbifID_0 pbifID_1)
        (isPARENTof pbifID_0 pbifID_2)
        ...
        (isPARENTof pbifID_0 pbifID_15)
        (isPARENTof main-menubar pbifID_16)
        (isPARENTof pbifID_16 pbifID_17)
        ...
      )
      (:goal (clicked pbifID_38))
    )

Listing 2. Firefox menu problem PDDL code

Each object in the Firefox menu world is associated with an identifier named pbifID so that it can be referenced without mix-ups. The label names could not be used instead because they are not necessarily unique. Moreover, labels may contain symbols (Unicode) that are not defined in the PDDL syntax. The pbifID is generated automatically while the DOM tree is read. Afterwards, the parent-child relationships of menus, submenus and menuitems are mapped into the planning world through the isPARENTof predicate. Translating the goal request to PDDL forms the last step. Together this constitutes a complete planning problem in a planning domain, which is sent to the planner. The plan that comes back is a list (sequence) of planoperators whose names were modeled on the corresponding JavaScript function names. Because of this, the plan returned by the planner can be executed directly with the JavaScript function eval().

4 Results

In this work we showed, using the web browser Firefox as an example, that a plan-based approach is helpful and promising in the development and application of assistance systems. UC for the console-based UNIX and the pbifSystem for the GUI-oriented Firefox show that the concept can be transferred to other programs. Preconditions such as read access to and control over the target program must be met. Firefox in particular allows extension and generalization to dialog windows or webpage content. For that purpose, more and more extensive planoperators need to be written. The pbifSystem offers two exemplary "intelligent" operation modes, a sensitive one (pbifGuideMeByLabel()) and an adaptive one (pbifGuideMeByUserGoal()). The pbifSystem enables assistance systems to control the web browser Firefox and allows further service layers of "higher intelligence" to be built on top of it.

References

[1] R. Wilensky, D. N. Chin, M. Luria, J. H. Martin, J. Mayfield, and D. Wu, "The Berkeley UNIX Consultant project," Computational Linguistics, vol. 14, no. 3, pp. 35–84, 1988.
[2] D. Heckerman and E. Horvitz, "Inferring informational goals from free-text queries: A Bayesian approach," 1998.
[3] unknown author, XULPlanet, "XUL reference," 1999. http://www.xulplanet.com/references/, 2006-04-19 12:27.
[4] H. W. Lie and B. Bos, "Cascading style sheets, level 1," 1999. http://www.w3.org/TR/REC-CSS1, 2006-04-19 12:23.
[5] D. Flanagan, JavaScript. Cambridge: O'Reilly, 1998.
[6] T. Pixley, "Document object model (DOM) level 2 events specification," 2000. http://www.w3.org/TR/2000/REC-DOM-Level-2-Events-20001113, 2006-04-11 10:04.
[7] L. Wood et al., "Document object model (DOM) level 1 specification (second edition)," 2000. http://www.w3.org/TR/2000/WD-DOM-Level-1-20000929/, 2006-04-11 10:03.
[8] D. Turner and I. Oeschger, "Creating XPCOM components," 2003. http://www.mozilla.org/projects/xpcom/book/, 2006-04-19 12:54.
[9] D. McDermott, M. Ghallab, and A. Howe, "PDDL - the planning domain definition language," tech. rep., AIPS'98 Planning Competition Committee, October 1998. Version 1.2.
[10] A. Gerevini et al., "Plan constraints and preferences in PDDL3," tech. rep., Department of Electronics for Automation, University of Brescia, Italy, August 2005.
[11] J. Hoffmann, "FF: The fast-forward planning system," The AI Magazine, 2001.
[12] T. Bertz, "Planbasierte Benutzerführung im Webbrowser Firefox," Studienarbeit, Universität Erlangen-Nürnberg, May 2006.
[13] M. Klarner, Hybride, pragmatisch eingebettete Realisierung mittels Bottom-Up-Generierung in einem natürlichsprachlichen Dialogsystem. PhD thesis, Universität Erlangen-Nürnberg, 2005.
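
As an illustration of the initialization step described in Section 3.1.3, the following JavaScript sketch may help. It is not the pbifSystem implementation: the menu tree literal, the function names flattenMenu, findFirstByLabel, toPddlProblem and sketchPlan, and the pre-order numbering of the identifiers are all assumptions made for this example. In particular, sketchPlan only walks the chain of ancestors as a stand-in for the Fast-Forward planner. Under these assumptions the sketch assigns a pbifID to each menu element, emits a PDDL problem in the shape of Listing 2, and derives a click sequence for the goal Increase.

```javascript
// Hypothetical stand-in for the DOM tree of the Firefox main menu.
// Type names follow Listing 1; the labels follow the example in Section 3.1.1.
const menuTree = {
  label: "Root", type: "MENUBAR_T", children: [
    { label: "File", type: "MENU_T", children: [
      { label: "New Window", type: "MITEM_T", children: [] },
      { label: "New Tab", type: "MITEM_T", children: [] },
    ] },
    { label: "Edit", type: "MENU_T", children: [
      { label: "Text Size", type: "MENU_T", children: [
        { label: "Increase", type: "MITEM_T", children: [] },
      ] },
    ] },
  ],
};

// Traverse the tree in pre-order and give every element below the root
// an automatically generated identifier (assumed numbering scheme).
function flattenMenu(root) {
  const nodes = [];
  let nextId = 0;
  function visit(node, parentId) {
    const id = parentId === null ? "main-menubar" : `pbifID_${nextId++}`;
    nodes.push({ id, parentId, type: node.type, label: node.label });
    for (const child of node.children) visit(child, id);
  }
  visit(root, null);
  return nodes;
}

// Label lookup in the spirit of pbifGuideMeByLabel: first occurrence wins.
function findFirstByLabel(nodes, label) {
  return nodes.find((n) => n.type === "MITEM_T" && n.label === label) || null;
}

// Emit a PDDL problem: objects, parent-child facts, and the goal.
function toPddlProblem(nodes, goalId) {
  const objects = nodes.map((n) => `    ${n.id} - ${n.type} ; ${n.label}`);
  const facts = nodes
    .filter((n) => n.parentId !== null)
    .map((n) => `    (isPARENTof ${n.parentId} ${n.id})`);
  return [
    "(define (problem problem0)",
    "  (:domain domain0)",
    "  (:objects",
    ...objects,
    "  )",
    "  (:init",
    "    (isOpen main-menubar)",
    ...facts,
    "  )",
    `  (:goal (clicked ${goalId}))`,
    ")",
  ].join("\n");
}

// Stand-in for the planner: in a tree, the plan is the chain of ancestors
// below the menubar (openMenu steps) followed by one clickMenuitem step.
function sketchPlan(nodes, goalId) {
  const byId = new Map(nodes.map((n) => [n.id, n]));
  const steps = [`clickMenuitem ${goalId}`];
  let current = byId.get(byId.get(goalId).parentId);
  while (current && current.parentId !== null) {
    steps.unshift(`openMenu ${current.id}`);
    current = byId.get(current.parentId);
  }
  return steps;
}

const demoNodes = flattenMenu(menuTree);
const demoGoal = findFirstByLabel(demoNodes, "Increase");
console.log(toPddlProblem(demoNodes, demoGoal.id));
console.log(sketchPlan(demoNodes, demoGoal.id).join("\n"));
```

The sketch returns plan steps as plain strings and does not evaluate them. A caller that wanted to avoid eval() could instead map each operator name to a function in a lookup table, which is a design choice of this example and not something the paper prescribes.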