Towards Using Promises for Multi-Agent Cooperation in Goal Reasoning
Daniel Swoboda, Till Hofmann, Tarik Viehmann, Gerhard Lakemeyer
Knowledge-Based Systems Group, RWTH Aachen University, Germany
{swoboda,hofmann,viehmann,lakemeyer}@kbsg.rwth-aachen.de

Abstract

Reasoning and planning for mobile robots is a challenging problem, as the world evolves over time and thus the robot's goals may change. One technique to tackle this problem is goal reasoning, where the agent not only reasons about its actions, but also about which goals to pursue. While goal reasoning for single agents has been researched extensively, distributed, multi-agent goal reasoning comes with additional challenges, especially in a distributed setting. In such a context, some form of coordination is necessary to allow for cooperative behavior. Previous goal reasoning approaches share the agent's world model with the other agents, which already enables basic cooperation. However, the agent's goals, and thus its intentions, are typically not shared. In this paper, we present a method to tackle this limitation. Extending an existing goal reasoning framework, we propose enabling cooperative behavior between multiple agents through promises, where an agent may promise that certain facts will be true at some point in the future. Sharing these promises allows other agents to consider not only the current state of the world, but also the intentions of other agents when deciding on which goal to pursue next. We describe how promises can be incorporated into the goal lifecycle, a commonly used goal refinement mechanism. We then show how promises can be used when planning for a particular goal by connecting them to timed initial literals (TILs) from PDDL planning. Finally, we evaluate our prototypical implementation in a simplified logistics scenario.

1 Introduction

While classical planning focuses on determining a plan that accomplishes a fixed goal, a goal reasoning agent reasons about which goals to pursue and continuously refines its goals during execution. It may decide to suspend a goal, re-prioritize its goals, or abandon a goal completely. Goals are represented explicitly, including goal constraints, relations between multiple goals, and tasks for achieving a goal. Goal reasoning is particularly interesting on mobile robots, as a mobile robot necessarily acts in a dynamic environment and thus needs to be able to react to unforeseen changes. Computing a single plan to accomplish an overall objective is often unrealistic, as the plan becomes unrealizable during execution. While planning techniques such as contingent planning (Hoffmann and Brafman 2005), conformant planning (Hoffmann and Brafman 2006), hierarchical planning (Kaelbling and Lozano-Pérez 2011), and continual planning (Brenner and Nebel 2009; Hofmann et al. 2016) can deal with some types of execution uncertainty, planning for the overall objective often simply is too time consuming. Goal reasoning solves these problems by splitting the objective into smaller goals that can be easily planned for, and allows the agent to react to changes dynamically by refining its goals. This becomes even more relevant in multi-agent settings, as not only is the environment dynamic, but the other agents may also act in a way that affects the agent's goals. While some work on multi-agent goal reasoning exists (Roberts et al. 2015, 2021; Hofmann et al. 2021), previous approaches focus on conflict avoidance rather than facilitating active cooperation between multiple agents.

In this paper, we propose an extension to multi-agent goal reasoning that allows effective collaboration between agents. To do this, we attach a set of promises to a goal, which intuitively describe a set of facts that will become true after the goal has been achieved. In order to make use of promises, we extend the goal lifecycle (Roberts et al. 2014) with goal operators, which, similar to action operators in planning, provide a formal framework that defines when a goal can be formulated and which objective it pursues. We use promises to evaluate the goal precondition not only in the current state, but also in a time window of future states (with some fixed time horizon). This allows the goal reasoning agent to formulate goals that are currently not yet achievable, but will be in the future. To expand the goal into a plan, we then translate promises into timed initial literals from PDDL2.2 (Edelkamp and Hoffmann 2004), thereby allowing a PDDL planner to make use of the promised facts to expand the goal. With this mechanism, promises enable active collaborative behavior between multiple, distributed goal reasoning agents.

The paper is organized as follows: In Section 2, we summarize goal reasoning and discuss related work. As the basis of our implementation, we summarize the main concepts of the CLIPS Executive (CX) in Section 3. In Section 4, we introduce a formal notion of a goal operator, which we will use in Section 5 to define promises. In Section 6, we evaluate our approach in a distributed multi-agent scenario, before we conclude in Section 7.
2 Background and Related Work

Goal Reasoning. In goal reasoning, agents can "deliberate on and self-select their objectives" (Aha 2018). Several kinds of goal reasoning have been proposed (Vattam et al. 2013): The goal-driven autonomy (GDA) framework (Muñoz-Avila et al. 2010; Cox 2016) is based on finite state-transition systems known from planning (Ghallab, Nau, and Traverso 2016) and defines a goal as a conjunction of first-order literals, similar to goals in classical planning. They define a goal transformation function β, which, given the current state s and goal g, formulates a new goal g'. In the GDA framework, the goal reasoner also produces a set of expectations, which are constraints that are predicted to be fulfilled during the execution of a plan associated with a goal. In contrast to promises, which we also intend as a model of constraints to be fulfilled, those expectations are used for discrepancy detection rather than multi-agent coordination. The Teleo-Reactive Executive (T-REX) architecture (McGann et al. 2008) is a goal-oriented system that employs multiple levels of reasoning abstracted in reactors, each of which operates in its own functional and temporal scope (from the entire mission duration to second-level operations). Reactors on lower levels manage the execution of subgoals generated at higher levels, working in synchronised timelines which capture the evolution of state variables over time. While through promises we also attempt to encode a timeline of (partial) objective achievement, our approach has no hierarchy of timelines and executives. Timing is not used to coordinate between levels of abstraction on one agent, but instead to indicate future world states to other, independent agents reasoning in the same temporal scope. A Hierarchical Goal Network (HGN) (Shivashankar et al. 2012) is a partial order of goals, where a goal is selected for execution if it has no predecessor. HGNs are used in the GoDeL planning system (Shivashankar et al. 2013) to decompose a planning problem, similar to HTN planning. Roberts et al. (2021) extend HGNs to Goal Lifecycle Networks (GLNs) to integrate them with the goal lifecycle. The goal lifecycle (Roberts et al. 2014) models goal reasoning as an iterative refinement process and describes how a goal progresses over time. As shown in Figure 1, a goal is first formulated, merely stating that it may be relevant. Next, the agent selects a goal that it deems useful. The goal is then expanded, e.g., by querying a PDDL planner for a plan that accomplishes the goal. A goal may also be expanded into multiple plans; the agent then commits to one particular plan. Finally, it dispatches a goal by starting to execute the plan. The goal lifecycle has been implemented in ActorSim (Roberts et al. 2016) and in the CX (Niemueller, Hofmann, and Lakemeyer 2019), which we extend in this work with promises. Most goal reasoning approaches focus on scenarios with single agents or scenarios where a central coordinating instance is available. In contrast, our approach focuses on a distinctly decentralized multi-agent scenario with no central coordinator.

Multi-Agent Coordination. In scenarios where multiple agents interact in the same environment, two or more agents might require access to the same limited resources at the same time. In such cases, multi-agent coordination is necessary to avoid potential conflicts, create robust solutions, and enable cooperative behavior between the agents (Dias et al. 2006). Agents may coordinate implicitly by recognizing or learning a model of another agent's behavior (Sukthankar et al. 2014; Albrecht and Stone 2018). Particularly relevant is plan recognition (Carberry 2001), where the agent tries to recognize another agent's goals and actions. ALLIANCE (Parker 1998) also uses an implicit coordination method based on impatience and acquiescence, where an agent eventually takes over a goal if no other agent has accomplished the goal (impatience), and eventually abandons a goal if it realizes that it is making little progress (acquiescence). In contrast to such approaches, we propose that the agents explicitly communicate (parts of) their goals, so other agents may directly reason about them. To obtain a conflict-free plan, each agent may start with its own individual plan and iteratively resolve any flaws (Cox and Durfee 2005), which may also be modeled as a distributed constraint optimization problem (Cox, Durfee, and Bartold 2005). Instead of starting with individual plans, Jonsson and Rovatsos (2011) propose to start with some initial (sub-optimal) shared plan and then let each agent improve its own actions. Alternatively, agents may also negotiate their goals (Davis and Smith 1983; Kraus 2001), e.g., using argumentation (Kraus, Sycara, and Evenchik 1998), which typically requires an explicit model of the mental state. Verma and Ranga (2021) classify coordination mechanisms based on properties such as static vs dynamic, weak vs strong, implicit vs explicit, and centralized vs decentralized. Here, we focus on dynamic, decentralized, and distributed multi-robot coordination. Alternatively, coordination approaches can be classified based on organization structures (Horling and Lesser 2004), e.g., coalitions, where agents cooperate but each agent attempts to maximize its own utility, or teams, where the agents work towards a common goal. Such teams may also form dynamically, e.g., by dialogues (Dignum, Dunin-Keplicz, and Verbrugge 2001). Here, we are mainly interested in fixed teams, where a fixed group of robots has a common goal. Role assignment approaches attempt to assign fixed and distinct roles to each of the participating agents and thereby fix each agent's plan. This can be achieved either by relying on a central instance or through distributed approaches (Iocchi et al. 2003; Vail and Veloso 2003; Jin et al. 2019). Intention sharing approaches allow agents to incorporate the objectives of other agents into their own reasoning in an effort to preemptively avoid conflicting actions (Holvoet and Valckenaers 2006; Grant, Kraus, and Perlis 2002; Sarratt and Jhala 2016). SharedPlans (Grosz and Kraus 1996) use complex actions that involve multiple agents. They use an intention sharing mechanism, formalize collaborative plans, and explicitly model the agents' mental states, forming mutual beliefs. Similar to promises, an agent may share its intention with intending-that messages, which allow an agent to reason about another agent's actions. Depending on the context, agents must have some form of trust (Yu et al. 2013; Pinyol and Sabater-Mir 2013) before they cooperate with other agents. Market-based approaches as surveyed by Dias et al. (2006) use a bidding process to allocate different conflict-free tasks between the agents of a multi-robot system. This can be extended to auctioning off planning problems in an effort to optimize temporal multi-agent planning (Hertle and Nebel 2018). In the context of multi-agent goal reasoning, Wilson and Aha (2021) perform the goal lifecycle for an entire multi-robot system on each agent separately and then use optimization methods to prune the results.
3 The CLIPS Executive

The CLIPS Executive (CX) (Niemueller, Hofmann, and Lakemeyer 2019) is a goal reasoning system implemented in the CLIPS rule-based production system (Wygant 1989). Goals follow the goal lifecycle (Roberts et al. 2014), with domain-specific rules guiding their progress. The CX uses PDDL to define a domain model, which is parsed into the CLIPS environment to enable continuous reasoning on the state of the world, using a representation that is compatible with a planner. Therefore, the notions of plans and actions are the ones known from planning with PDDL. Multiple instances of the CX (e.g., one per robot) can be coordinated through locking and domain sharing mechanisms, as demonstrated in the RoboCup Logistics League (RCLL) (Hofmann et al. 2021). However, these mechanisms only serve to avoid conflicts, as each agent still reasons independently, without considering the other agents' intents and goals.

Goals. Goals are the basic data structure which models certain objectives to be achieved. Each goal comes with a set of preconditions which must be satisfied by the current world state before it can be formulated. In the CX, these preconditions are modelled through domain-specific CLIPS rules. Each instance of a goal belongs to a class, which represents a certain objective to be achieved by the agent. Additionally, the goal can hold parameters that might be required for planning and execution. A concrete goal is obtained by grounding all parameters of a goal class. Depending on the current state of the world, each robot may formulate multiple goals and multiple instances of the same goal class.

Each goal follows a goal lifecycle, as shown in Figure 1. The agent formulates a set of goals based on the current state of the world. These can be of different classes, or multiple differently grounded versions of the same class. After formulation, one specific goal is selected, e.g., by prioritizing goals that accomplish important objectives. The selection mechanism may be modelled in a domain-specific way, which allows adapting the behavior according to the domain-specific requirements. In the next step, the selected goal is expanded by computing a plan with an off-the-shelf PDDL planner. Alternatively, a precomputed plan can be fetched from a plan library. Once a plan for a goal is determined, the executive attempts to acquire its required resources. Finally, the plan is executed on a per-action basis.

Goals Example. To illustrate goals and goal reasoning in the CX, let us consider a simple resource mining scenario which we will evolve throughout this paper and finally use as the basis for our conceptual evaluation. Different case examples based on the given setting are presented in Figure 2. Up to two robots (WALL-E and R2D2) produce the material Xenonite by first mining Regolith in a mine, refining the Regolith to Processite in a refinery machine, and then using Processite to produce Xenonite in a production machine. Let us assume the following setting as the basis for our example: one machine is filled with Regolith, one machine is empty, and the robots currently pursue no goals. In this scenario, each robot may pursue five classes of goals: (a) FILLCONTAINER to fill a container with Regolith at the mine or with Processite, (b) CLEANMACHINE to fill a container with Processite or Xenonite at a machine, (c) DELIVER a container to a machine, (d) STARTMACHINE to use a refinery or production machine to process the material, and (e) DELIVERXENONITE to deliver the finished Xenonite to the storage. Because one machine is filled, a goal of class STARTMACHINE can be formulated. The parameters of the goal class are grounded accordingly, i.e., the target machine is set to the filled machine. Should both machines be filled at the same time, two goals of class STARTMACHINE could be formulated, one for each of the machines. The agent now selects one of the formulated goals for expansion. Since in our case only one goal is formulated, it will be selected. Should there be STARTMACHINE goals for each machine, then, e.g., the one operating on the second machine in the production chain might be prioritized.

Execution Model. After a goal has been expanded (i.e., a plan has been generated), the CX commits to a certain plan and then dispatches the goal by executing the plan. For sequential plans, whenever no action of the plan is executed, the CX selects the next action of the plan. It then checks whether the action is executable by evaluating the action's precondition, and if so, sends the action to a lower-level execution engine. If the action's precondition is not satisfied, it remains in the pending state until either the precondition is satisfied or a timeout occurs. In a multi-agent scenario, each CX instance performs its own planning and plan execution independently of the other agents.

The execution of two goals with one robot in our example scenario is visualized in Figure 2 as SCENARIO 1. After finishing the goal of class STARTMACHINE, a goal of class CLEANMACHINE is formulated and dispatched, clearing the machine's output.

Multi-Agent Coordination. The CX supports two methods for multi-agent coordination: locking actions acquire a lock for exclusive access, e.g., to a location, so that no two robots try to drive to the same location. They are part of a plan and executed similar to regular actions. Goal resources allow coordination on the goal rather than the action level. Each goal may have a set of required resources that it needs to hold exclusively while the goal is dispatched. If a second goal requires a resource that is already acquired by another goal, the goal is rejected. This allows for coordination of conflicting goals, e.g., two robots picking up the same container. Both methods are intended to avoid conflicts and overlapping goals between the agents, rather than fostering active cooperation.
To illustrate coordination by means of goal resources, let us extend the previous example SCENARIO 1 by adding the second robot. In SCENARIO 2, illustrated in Figure 2, both WALL-E and R2D2 formulate and select STARTMACHINE for M1, which requires the machine as a resource. WALL-E starts formulating first and manages to acquire the resources. Therefore, it is able to dispatch its goal. R2D2, on the other hand, must reject the goal, as it is not able to acquire the machine resource. After its goal has been rejected, it may select a different goal. Since the machine is now occupied, it has to wait until the preconditions of some goal class are met and a new goal can be formulated. This is the case once the effects of STARTMACHINE are applied, which causes R2D2 to formulate and dispatch CLEANMACHINE to clear the machine.

Execution Monitoring. The CX includes execution monitoring functionality to handle exogenous events and action failure (Niemueller, Hofmann, and Lakemeyer 2018). Since most actions interact with other agents or the physical world, it is critical to monitor their execution for failure or delays. Actions may either be retried or failed, depending on domain-specific evaluations of the execution behavior. Additionally, timeouts are used to detect potentially stuck actions. The failure of an action might lead to reformulation of a goal. This means that the plan of the goal might be adapted w.r.t. the partial change in the world state since its original formulation. Alternatively, a completely new objective might be followed by the agent.

Figure 1: The CX goal lifecycle (Niemueller, Hofmann, and Lakemeyer 2019) of two goals and their interaction by means of promises. When the first goal is dispatched, it makes a set of promises, which can be used by the second goal for its precondition check. If the goal precondition is satisfied based on the promises, the goal can already be formulated (and subsequently selected, expanded, etc.) even if the precondition is not satisfied yet. Additionally, promises are also used as timed initial literals (TILs) for planning in order to expand a goal. (Lifecycle states: FORMULATED, SELECTED, EXPANDED, COMMITTED, DISPATCHED, FINISHED, EVALUATED, RETRACTED.)

4 Goal Reasoning with Goal Operators

To define promises and how promises affect goal formulation in the context of goal reasoning, we first need to formalize goals. Similar to Cox (2016), we base our definition of a goal on classical planning problems. However, as we need to refer to time later, we assume that each state is timed. Formally, given a finite set of logical atoms A, a state is a pair (s, t) ∈ 2^A × Q, where s is the set of logical atoms that are currently true, and t is the current global time. Using the closed-world assumption, logical atoms not mentioned in s are implicitly considered to be false. A literal l is either a positive atom (denoted a) or a negative atom (denoted ¬a). For a literal l, we denote the complementary literal by ¬l (i.e., ¬¬l = l). A set of atoms s satisfies a literal l, denoted s |= l, if (1) l is a positive literal a and a ∈ s, or (2) l is a negative literal ¬a and a ∉ s. Given a set of literals L = {l_1, ..., l_n}, the state s satisfies L, denoted s |= L, if s |= l_i for each l_i ∈ L.

We define a goal operator similar to a planning operator in classical planning (Ghallab, Nau, and Traverso 2016):

Definition 1 (Goal Operator). A goal operator is a tuple γ = (head(γ), pre(γ), post(γ)), where
• head(γ) is an expression of the form goal(z_1, ..., z_k), where goal is the goal name and z_1, ..., z_k are the goal parameters, which include all of the variables that appear in pre(γ) and post(γ),
• pre(γ) is a set of literals describing the condition when a goal may be formulated,
• post(γ) is a set of literals describing the objective that the goal pursues, akin to a PDDL goal.

A goal is a ground instance of a goal operator. A goal g can be formulated if s |= pre(g). When a goal is finished, its postcondition holds, i.e., s |= post(g). Note that in contrast to an action operator (or a macro operator), the effects of a goal are not completely determined; any sequence of actions that satisfies the objective post(g) is deemed feasible, and additional effects are possible. Thus, the action sequence that accomplishes a goal is not pre-determined but computed by a planner and may depend on the current state s. Consider the goal CLEANMACHINE from Listing 1. Its objective defines that the robot carries a container and that the container is filled. However, to achieve this objective given the preconditions, the robot also moves away from its current location, which will be an additional effect of any plan that achieves the goal.

We have extended the CX with goal operators (shown in Listing 1). For each goal operator and every possible grounding of the operator, the goal reasoner instantiates the corresponding goal precondition and tracks its satisfaction while the system is evolving. Once the goal precondition is satisfied, the goal is formulated. Afterwards, the goal follows the goal lifecycle as before (Figure 1).

(goal-operator (class CleanMachine)
  (param-names r side machine c mat)
  (param-types robot loc machine cont material)
  (param-quantified)
  (lookahead-time 20)
  (preconditions "
    (and
      (robot-carries ?r ?c)
      (container-can-be-filled ?c)
      (location-is-free ?side)
      (location-is-machine-output ?side)
      (location-part-of-machine ?side ?machine)
      (machine-in-state ?machine READY)
      (machine-makes-material ?machine ?mat)
      (not (storage-is-full))
    )
  ")
  (objective
    "(and (robot-carries ?r ?c)
          (container-filled ?c ?mat)
    )")
)

Listing 1: The definition of the goal operator CLEANMACHINE. The precondition defines when a goal may be formulated; the objective defines the PDDL goal to plan for once the goal has been selected.
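For readers more familiar with imperative code than with CLIPS, the following Python sketch mirrors Definition 1 and the formulation check s |= pre(g). All names are ours and purely illustrative; the CX itself instantiates and tracks goal preconditions as CLIPS rules, as in Listing 1.

from dataclasses import dataclass

# A ground atom is a tuple such as ("machine-in-state", "M1", "READY");
# a state is a set of ground atoms (closed-world assumption).

@dataclass(frozen=True)
class Literal:
    positive: bool
    atom: tuple  # may contain variables (strings starting with "?") before grounding

@dataclass
class GoalOperator:
    name: str
    params: list      # parameter names, e.g. ["?r", "?machine"]
    pre: list         # list of Literal over the parameters
    post: list        # list of Literal over the parameters (the objective)

def ground(literals, binding):
    """Substitute parameters by constants, yielding the literals of a ground goal."""
    return [Literal(l.positive, tuple(binding.get(x, x) for x in l.atom))
            for l in literals]

def satisfies(state, literals):
    """Closed-world satisfaction check: s |= L."""
    return all((l.atom in state) == l.positive for l in literals)

def can_formulate(op, binding, state):
    """A ground goal g may be formulated iff s |= pre(g)."""
    return satisfies(state, ground(op.pre, binding))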
Figure 2: Multiple scenarios of interaction between two goals and robots in the CX with and without promises, as discussed in the running example. SCENARIO 1: one robot (WALL-E) and no promises; it successively formulates and executes the applicable goals. SCENARIO 2: two robots (WALL-E and R2D2) and no promises; both attempt to dispatch STARTMACHINE, illustrating the interaction between multiple agents and goal resources. SCENARIO 3: two robots, with promises; R2D2 formulates CLEANMACHINE earlier from the promises, but the action collect-processite is pending until the effects from WALL-E are applied. SCENARIO 4: two robots, promises, and a failed resource handover; collect-processite is pending until the resource M1 is handed over, a timeout occurs, and the goal fails.
5 Promises

A goal reasoning agent relying on goal operators has a model of the conditions that have to be met before it executes a goal g (pre(g)) and a partial model of the world after the successful execution of its plan (its objective post(g)). Thus, when the agent dispatches g, it assumes that it will accomplish the objective at some point in the future (i.e., that eventually s |= post(g)). However, the other agents are oblivious to g's objectives and the corresponding changes to the world and therefore need to wait until the first agent has finished its goal. To avoid this unnecessary delay, we introduce promises, which intuitively give some guarantee for certain literals to be true in a future state:

Definition 2 (Promise). A promise is a pair (l, t), stating that the literal l will be satisfied at (global) time t.

Promises are made to the other agents by a promising agent when it dispatches a goal. The receiving agent may now use those promises to evaluate whether a goal can be formulated, even if its precondition is not satisfied by the world. More precisely, given a set of promises, we can determine the point in time when a goal precondition will be satisfied:

Definition 3 (Promised Time). Given a state (s, t) and a set of promises P = {(l_i, t_i)}_i, we define From(l, s, t, P) as the timepoint when a literal l is satisfied and Until(l, s, t, P) as the timepoint when l is no longer satisfied:

From(l, s, t, P) =
  t                            if s |= l
  min_{(l, t_i) ∈ P} t_i       if ∃ t_i : (l, t_i) ∈ P
  ∞                            otherwise

Until(l, s, t, P) =
  t                            if s |= ¬l
  min_{(¬l, t_i) ∈ P} t_i      if ∃ t_i : (¬l, t_i) ∈ P
  ∞                            otherwise

We extend the definition to a set of literals L:

From(L, s, t, P) = max_{l_i ∈ L} From(l_i, s, t, P)
Until(L, s, t, P) = min_{l_i ∈ L} Until(l_i, s, t, P)

Promises are tied to goals and can either be based on the effects of the concrete plan actions of an expanded goal, or simply be based on a goal's postcondition. They can be extracted automatically from these definitions, or hand-crafted in simple domains. Though the concept can be applied more flexibly (e.g., dynamic issuing of promises based on active plans and world states), in our implementation promises are static and created during goal expansion.

Goal Formulation with Promises

We continue by describing how promises and promise times for literals can be used for goal formulation. As shown in Listing 1, we augment the goal operator with a lookahead time, which is used to evaluate the goal precondition for future points in time. Given a set of promises P, a lookahead time t_l, and a goal precondition pre(γ), the goal reasoner formulates a goal in state (s, t) iff the precondition will be satisfied within the next t_l time units, i.e., iff

From(pre(γ), s, t, P) ≤ t + t_l.

Note that if s |= pre(γ), then From(pre(γ), s, t, P) = t and thus the condition is trivially satisfied. Also, if the lookahead time is set to t_l = 0, then the goal is formulated iff its precondition is satisfied in the current state, thus effectively disabling the promise mechanism. Furthermore, this condition results in an optimistic goal formulation, as the goal will still be formulated even if the goal's precondition is promised to be no longer satisfied at a future point in time. To ensure that a goal is only formulated if its precondition is satisfied for the whole lookahead time, we can additionally require:

Until(pre(γ), s, t, P) ≥ t + t_l.

In our implementation, Until is currently not considered, as we are mainly interested in optimistic goal formulation. Incorporating promises into goal formulation leads to more cooperative behavior between the agents, as goals that build on other goals can be considered by an agent during selection, thereby enabling parallelism. Thus, it is a natural extension of the goal lifecycle with an intention sharing mechanism.

Promises in the CX. In the CX, promises are continuously evaluated to compute Until and From for each available grounding of a goal precondition formula. The results are stored for each formula and computed in parallel to the normal formula satisfaction evaluation. Therefore, promises do not directly interfere with the world model, but rather are integrated into the goal formulation process by extending the precondition check to also consider From for a certain lookahead time t_l. The lookahead time is manually chosen based on the expected time to accomplish the objective and is specific to each goal. In scenarios with constant execution times for actions, those times can be used directly as the estimate. If execution times for actions are not constant (e.g., driving actions with variable start and end positions), average recorded execution times or more complex estimators might be used, at the loss of promise accuracy. By choosing a lookahead time of 0, promises can be ignored.
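The computation behind Definition 3 and the lookahead check can be summarized in a few lines. The sketch below is a simplified, illustrative rendering of that logic (literals and promises follow the representation of the earlier sketch); the CX computes the same quantities incrementally with CLIPS rules.

import math

# A promise is a pair (literal, time); a literal is a (positive, atom) pair and
# a state is a set of ground atoms together with the current global time `now`.

def from_time(lit, state, now, promises):
    positive, atom = lit
    if (atom in state) == positive:
        return now                                   # already satisfied: From = t
    times = [t for (l, t) in promises if l == lit]
    return min(times) if times else math.inf         # earliest promise for l, else never

def until_time(lit, state, now, promises):
    positive, atom = lit
    complement = (not positive, atom)
    if (atom in state) != positive:
        return now                                   # complement already holds: Until = t
    times = [t for (l, t) in promises if l == complement]
    return min(times) if times else math.inf

def from_time_set(literals, state, now, promises):
    return max((from_time(l, state, now, promises) for l in literals), default=now)

def formulate_optimistically(pre, state, now, promises, lookahead):
    """Optimistic formulation check: From(pre(g), s, t, P) <= t + t_l."""
    return from_time_set(pre, state, now, promises) <= now + lookahead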
Promises are shared between the agents through the shared world model of the CX. For now, we hand-craft promises for each goal operator γ based on post(γ). However, extracting promises automatically from postconditions and plans may be considered in future work. If a goal is completed or fails, its promises are removed from the world model.

Using Promises for Planning

With promises, an agent is able to formulate a goal in expectation of another agent's actions. However, promises also need to be considered when expanding a goal with a PDDL planner. To do so, a promise is translated into a timed initial literal (TIL) (Edelkamp and Hoffmann 2004). Similar to a promise, a TIL states that some literal will become true at some point in the future, e.g., (at 5 (robot-at M1)) states that (robot-at M1) will become true in 5 time steps. We extended the PDDL plan expansion of the CX to translate promises into TILs and to use POPF (Coles et al. 2010), a temporal PDDL planner that supports TILs.
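As a rough sketch of this translation step (our own illustration, not the CX's code), each promise can be rendered as one timed initial literal in the :init section of the generated PDDL problem; we assume here that TIL times are given relative to the current global time.

def promise_to_til(promise, now):
    """Render a promise as a PDDL2.2 timed initial literal, e.g. the promise
    ((True, ("machine-in-state", "M1", "READY")), 25) at now = 20
    becomes '(at 5 (machine-in-state M1 READY))'."""
    (positive, atom), t = promise
    fact = "(" + " ".join(atom) + ")"
    if not positive:
        fact = "(not " + fact + ")"
    return "(at " + str(t - now) + " " + fact + ")"

def init_block(current_facts, promises, now):
    """Build the :init block of a PDDL problem from current facts and promises."""
    lines = ["(" + " ".join(atom) + ")" for atom in current_facts]
    lines += [promise_to_til(p, now) for p in promises if p[1] > now]
    return "(:init\n  " + "\n  ".join(lines) + "\n)"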
Goal Execution with Promises

When executing a goal that depends on promises (i.e., it has been formulated based on promises), it is necessary to check whether the promise has actually been realized. To do so, we build on top of the execution engine of the CX (Niemueller, Hofmann, and Lakemeyer 2018). For each plan action, the CX continuously checks the action's precondition and only starts executing the action once the precondition is satisfied. While the goal's preconditions can be satisfied (in parts or fully) by relying on promises, promises are not applied to action preconditions. Thus, the action execution blocks until the precondition is satisfied. This ensures that actual interactions with the world only operate under the assumption of correct and current world information. Otherwise, actions might fail or lead to unexpected behavior.

Let us revise our examples from SCENARIO 1 and SCENARIO 2 by introducing promises, but ignoring the role of goal resources for now. Similarly to the previous scenarios, WALL-E formulates, selects, and executes a goal of class STARTMACHINE on machine M1. With promises, the expected outcome of that goal (machine-in-state(M1, READY)) will be announced to the other agents when the goal gets dispatched (t1). R2D2 now uses the provided information to formulate the goal CLEANMACHINE ahead of time, even though the effects of STARTMACHINE have not been applied to the world yet. This behavior is shown in SCENARIO 3 of Figure 2. The first action of the goal's plan (move) can be executed directly. However, the precondition of the action collect-processite is not yet satisfied, since the action start-machine is still active on WALL-E until t2. The executive of R2D2 notices this, and collect-processite is pending until the effects have been applied. Once the effects are applied, R2D2 can start executing its action. An example of this behavior occurring in simulation is visualized in Figure 3.

Execution Monitoring with Promises

Goals that issue promises might take longer than expected, get stuck, or fail completely, possibly leading to promises that are never realized. Therefore, goals that have been formulated based on promises may have actions whose preconditions will never be satisfied. To deal with this situation, we again rely on execution monitoring. If a pending action is not executable for a certain amount of time, a timeout occurs and the action and its corresponding goal are aborted. However, the timeout spans are increased such that the potentially longer wait times from promise-based execution can be accounted for.

Promises cannot cause deadlocks, as a deadlock could only appear if two goals depended on each other w.r.t. promises. However, this is not possible, as promises are used to formulate a goal, but are only issued when the goal is dispatched. Thus, cycles of promises are not possible.

Recall our previous example SCENARIO 3. The action collect-processite remains in the state pending until WALL-E finishes the execution of the action start-machine and its effects are applied. Should WALL-E not complete the action (correctly), execution monitoring will detect the timeout on the side of R2D2 and will trigger new goal formulation. If the promised-from time has elapsed, the promises will not be considered by R2D2 anymore when formulating new goals, leading to a different objective being chosen by the agent. Should WALL-E manage to fulfill its goal, then the effects will be applied, and R2D2's original goal's preconditions are satisfied through the actual world state.

Resource Locks with Promises

In the CX, resource locks at the goal level are used to coordinate conflicting goals. Naturally, goals that are formulated based on promises often rely on some of the same resources that are required by the goal that issued the promise. A goal that fills a machine requires the machine as a resource. Another goal that is promise-dependent and operates on the same machine might require it as a resource as well. In this scenario, the second goal may be formulated but would immediately be rejected, as the required resource is still held by the promising goal. To resolve this issue, promise-dependent goals delay the resource acquisition for any resource that is currently held by the promising goal. As soon as the promising goal is finished and the resource is therefore released, the resource is acquired by the promise-dependent goal. To make sure that there is no conflict between two different promise-dependent goals, for each such delayed resource a promise resource is acquired, which is simply the main resource prefixed with promised-. Effectively, a promise-dependent goal first acquires the promise resource, then, as soon as the promising goal releases the resource, the promise-dependent goal acquires the main resource, and then releases the promise resource.

To illustrate the mechanism, let us once again consider the example from SCENARIO 3. When dispatched, the goal STARTMACHINE holds the resource M1, and R2D2 formulates the goal CLEANMACHINE based on WALL-E's promise. Since the goal operates on the same machine, it needs to acquire the same goal resource. As the goal is promise-dependent, it first acquires the resource promised-M1. This resource is currently held by no other agent, thus it can be acquired and the goal dispatched. R2D2 first executes the action move(BASE, M1). As before, the next action collect-processite(M1) remains pending until WALL-E has completed its goal. At this point, WALL-E releases the resource M1, which is then acquired by R2D2. Should this resource handover fail, e.g., because the goal STARTMACHINE by WALL-E never releases its resources, the action eventually times out and R2D2's goal is aborted, as shown in SCENARIO 4 of Figure 2.
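The handover protocol can be pictured as in the following sketch (illustrative only; resource and goal names are hypothetical, and the real CX performs these steps through its lock and world-model mechanisms rather than busy-waiting).

import time

def acquire_as_promise_dependent(goal_id, resource, locks, timeout_s):
    """Two-stage acquisition for a promise-dependent goal.
    `locks` is a shared table mapping resource names to the goal holding them."""
    shadow = "promised-" + resource
    if shadow in locks:
        return False                      # another promise-dependent goal already waits
    locks[shadow] = goal_id               # 1. acquire the promise resource
    deadline = time.monotonic() + timeout_s
    while resource in locks:              # 2. wait for the promising goal to release it
        if time.monotonic() > deadline:
            del locks[shadow]             # timeout: the goal is aborted
            return False
        time.sleep(0.1)
    locks[resource] = goal_id             # 3. acquire the main resource
    del locks[shadow]                     # 4. release the promise resource
    return True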
Figure 3: A comparison of two exemplary runs in the Xenonite domain, with three robots (R2D2, EVE, WALL-E) executing goals of the classes FILL-CONTAINER (FC), DELIVER (D), START-MACHINE (SM), CLEAN-MACHINE (CM), and DELIVER-XENONITE (DX) over a time axis from 0 to 300. (a) A run without promises: each goal formulation must wait until the previous goal has finished. (b) A run with promises: the striped goals have been formulated based on promises. Compared to the run without promises, the promise-dependent goals (e.g., the first CLEANMACHINE goal by EVE) are dispatched earlier, because a future effect of another goal (i.e., the second STARTMACHINE goal by WALL-E) is considered for goal formulation and expansion, thus leading to an overall better performance.

6 Evaluation

We evaluate our prototypical implementation [1] of promises for multi-agent cooperation as an extension of the CX in the same simplified production logistics scenario that we used as a running example throughout this paper. In our evaluation scenario, three robots were tasked with filling n = 5 containers with the material Xenonite and delivering them to a storage area. The number of containers was intentionally chosen to be larger than the number of robots to simulate an oversubscription problem and to highlight the effects of better cooperation on task efficiency. The three robots first collect raw materials and then refine them stepwise by passing them through a series of two machines in sequence. Once all containers are filled and stored, the objective is completed.

We compare the performance of the three robots using POPF to expand goals. Figure 2 shows how promises can lead to faster execution of goals by starting a goal (which always starts with a move action) while the goal's precondition (e.g., that the machine is in a certain state) is not fulfilled yet. Figure 3 shows exemplary runs of three robots producing and delivering 5 containers of Xenonite. The expected behavior as highlighted in Figure 2 can indeed be observed in the simulation runs.

To compare the performance, we simulated each scenario five times. All experiments were carried out on an Intel Core i7 1165G7 machine with 32 GB of RAM. Without promises, the three robots needed 303 sec to fill and store all containers, compared to (284.40 ± 0.55) sec with promises. At each call, POPF needed less than 1 sec to generate a plan for the given goal. This shows, at least in this simplified scenario, that promises lead to more effective collaboration, and that the planner can make use of promises when expanding a goal.

[1] https://doi.org/10.5281/zenodo.6610426
7 Conclusion

In multi-agent goal reasoning scenarios, effective collaboration is often difficult, as one agent is unaware of the goals pursued by the other agents. We have introduced promises, a method for intention sharing in a goal reasoning framework. When dispatching a goal, each agent promises a set of literals that will be true at some future timepoint, which can be used by another agent to formulate and dispatch goals early. We have described how promises can be defined based on goal operators, which describe goals similar to how action operators describe action instances. By translating promises to timed initial literals, we allow the PDDL planner to make use of promises during goal expansion. The evaluation showed that using promises with our prototypical implementation improved the performance of a team of three robots in a simplified logistics scenario. For future work, we plan to use and evaluate promises in a real-world scenario from the RoboCup Logistics League (RCLL) (Niemueller et al. 2016) and compare it against our distributed multi-agent goal reasoning system (Hofmann et al. 2021).

Acknowledgements

This work is funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – 2236/1, the EU ICT-48 2020 project TAILOR (No. 952215), and Germany's Excellence Strategy – EXC-2023 Internet of Production – 390621612.

References

Aha, D. W. 2018. Goal Reasoning: Foundations, Emerging Applications, and Prospects. AI Magazine, 39(2): 3–24.
Albrecht, S. V.; and Stone, P. 2018. Autonomous Agents Modelling Other Agents: A Comprehensive Survey and Open Problems. Artificial Intelligence, 258: 66–95.
Brenner, M.; and Nebel, B. 2009. Continual Planning and Acting in Dynamic Multiagent Environments. Autonomous Agents and Multi-Agent Systems, 19(3).
Carberry, S. 2001. Techniques for Plan Recognition. User Modeling and User-Adapted Interaction, 11(1): 31–48.
Coles, A.; Coles, A.; Fox, M.; and Long, D. 2010. Forward-Chaining Partial-Order Planning. In Proceedings of the 20th International Conference on Automated Planning and Scheduling (ICAPS), 42–49. AAAI Press.
Cox, J. S.; and Durfee, E. H. 2005. An Efficient Algorithm for Multiagent Plan Coordination. In Proceedings of the Fourth International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS), 828–835. New York, NY, USA: Association for Computing Machinery.
Cox, J. S.; Durfee, E. H.; and Bartold, T. 2005. A Distributed Framework for Solving the Multiagent Plan Coordination Problem. In Proceedings of the Fourth International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS), 821–827. New York, NY, USA: Association for Computing Machinery.
Cox, M. T. 2016. A Model of Planning, Action, and Interpretation with Goal Reasoning. In Proceedings of the 4th Annual Conference on Advances in Cognitive Systems, 48–63. Palo Alto, CA, USA: Cognitive Systems Foundation.
Davis, R.; and Smith, R. G. 1983. Negotiation as a Metaphor for Distributed Problem Solving. Artificial Intelligence, 20(1): 63–109.
Dias, M.; Zlot, R.; Kalra, N.; and Stentz, A. 2006. Market-Based Multirobot Coordination: A Survey and Analysis. Proceedings of the IEEE, 94(7): 1257–1270.
Dignum, F.; Dunin-Keplicz, B.; and Verbrugge, R. 2001. Agent Theory for Team Formation by Dialogue. In Intelligent Agents VII: Agent Theories, Architectures and Languages, Lecture Notes in Computer Science, 150–166. Berlin, Heidelberg: Springer.
Edelkamp, S.; and Hoffmann, J. 2004. PDDL2.2: The Language for the Classical Part of the 4th International Planning Competition. Technical Report 195, University of Freiburg.
Ghallab, M.; Nau, D.; and Traverso, P. 2016. Automated Planning and Acting. Cambridge University Press.
Grant, J.; Kraus, S.; and Perlis, D. 2002. A Logic-Based Model of Intentions for Multi-Agent Subcontracting. In Proceedings of the Eighteenth National Conference on Artificial Intelligence, 320–325.
Grosz, B. J.; and Kraus, S. 1996. Collaborative Plans for Complex Group Action. Artificial Intelligence, 86(2): 269–357.
Hertle, A.; and Nebel, B. 2018. Efficient Auction Based Coordination for Distributed Multi-agent Planning in Temporal Domains Using Resource Abstraction. In KI 2018: Advances in Artificial Intelligence, volume 11117, 86–98. Cham: Springer International Publishing.
Hoffmann, J.; and Brafman, R. I. 2005. Contingent Planning via Heuristic Forward Search with Implicit Belief States. In Proceedings of the 15th International Conference on Automated Planning and Scheduling (ICAPS), 71–80.
Hoffmann, J.; and Brafman, R. I. 2006. Conformant Planning via Heuristic Forward Search: A New Approach. Artificial Intelligence, 170(6-7).
Hofmann, T.; Niemueller, T.; Claßen, J.; and Lakemeyer, G. 2016. Continual Planning in Golog. In Proceedings of the 30th Conference on Artificial Intelligence (AAAI). Phoenix, AZ, USA.
Hofmann, T.; Viehmann, T.; Gomaa, M.; Habering, D.; Niemueller, T.; and Lakemeyer, G. 2021. Multi-Agent Goal Reasoning with the CLIPS Executive in the RoboCup Logistics League. In Proceedings of the 13th International Conference on Agents and Artificial Intelligence (ICAART).
Holvoet, T.; and Valckenaers, P. 2006. Beliefs, Desires and Intentions through the Environment. In Proceedings of the Fifth International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS), 1052. Hakodate, Japan: ACM Press.
Horling, B.; and Lesser, V. 2004. A Survey of Multi-Agent Organizational Paradigms. The Knowledge Engineering Review, 19(4): 281–316.
Iocchi, L.; Nardi, D.; Piaggio, M.; and Sgorbissa, A. 2003. Distributed Coordination in Heterogeneous Multi-Robot Systems. Autonomous Robots, 15(2): 155–168.
Jin, L.; Li, S.; La, H. M.; Zhang, X.; and Hu, B. 2019. Dynamic Task Allocation in Multi-Robot Coordination for Moving Target Tracking: A Distributed Approach. Automatica, 100: 75–81.
Jonsson, A.; and Rovatsos, M. 2011. Scaling Up Multiagent Planning: A Best-Response Approach. In Proceedings of the Twenty-First International Conference on Automated Planning and Scheduling (ICAPS).
Kaelbling, L. P.; and Lozano-Pérez, T. 2011. Hierarchical Task and Motion Planning in the Now. In 2011 IEEE International Conference on Robotics and Automation, 1470–1477.
Kraus, S. 2001. Automated Negotiation and Decision Making in Multiagent Environments. In Multi-Agent Systems and Applications, Lecture Notes in Computer Science, 150–172. Berlin, Heidelberg: Springer.
Kraus, S.; Sycara, K.; and Evenchik, A. 1998. Reaching Agreements through Argumentation: A Logical Model and Implementation. Artificial Intelligence, 104(1): 1–69.
McGann, C.; Py, F.; Rajan, K.; Thomas, H.; Henthorn, R.; and McEwen, R. 2008. A Deliberative Architecture for AUV Control. In 2008 IEEE International Conference on Robotics and Automation, 1049–1054.
Muñoz-Avila, H.; Jaidee, U.; Aha, D. W.; and Carter, E. 2010. Goal-Driven Autonomy with Case-Based Reasoning. In Case-Based Reasoning. Research and Development, Lecture Notes in Computer Science, 228–241. Berlin, Heidelberg: Springer.
Niemueller, T.; Hofmann, T.; and Lakemeyer, G. 2018. CLIPS-based Execution for PDDL Planners. In 2nd ICAPS Workshop on Integrated Planning, Acting, and Execution (IntEx). Delft, Netherlands.
Niemueller, T.; Hofmann, T.; and Lakemeyer, G. 2019. Goal Reasoning in the CLIPS Executive for Integrated Planning and Execution. In Proceedings of the 29th International Conference on Automated Planning and Scheduling (ICAPS), 754–763. Berkeley, CA, USA.
Niemueller, T.; Karpas, E.; Vaquero, T.; and Timmons, E. 2016. Planning Competition for Logistics Robots in Simulation. In 4th ICAPS Workshop on Planning and Robotics (PlanRob).
Parker, L. 1998. ALLIANCE: An Architecture for Fault Tolerant Multirobot Cooperation. IEEE Transactions on Robotics and Automation, 14(2): 220–240.
Pinyol, I.; and Sabater-Mir, J. 2013. Computational Trust and Reputation Models for Open Multi-Agent Systems: A Review. Artificial Intelligence Review, 40(1): 1–25.
Roberts, M.; Apker, T.; Johnson, B.; Auslander, B.; Wellman, B.; and Aha, D. W. 2015. Coordinating Robot Teams for Disaster Relief. In Proceedings of the 28th Florida Artificial Intelligence Research Society Conference. Hollywood, FL: AAAI Press.
Roberts, M.; Hiatt, L.; Coman, A.; Choi, D.; Johnson, B.; and Aha, D. 2016. ActorSim, a Toolkit for Studying Cross-Disciplinary Challenges in Autonomy. In AAAI Fall Symposium on Cross-Disciplinary Challenges for Autonomous Systems.
Roberts, M.; Hiatt, L. M.; Shetty, V.; Brumback, B.; Enochs, B.; and Jampathom, P. 2021. Goal Lifecycle Networks for Robotics. In The International FLAIRS Conference Proceedings, volume 34.
Roberts, M.; Vattam, S.; Alford, R.; Auslander, B.; Karneeb, J.; Molineaux, M.; Apker, T.; Wilson, M.; McMahon, J.; and Aha, D. W. 2014. Iterative Goal Refinement for Robotics. In 1st ICAPS Workshop on Planning in Robotics (PlanRob).
Sarratt, T.; and Jhala, A. 2016. Policy Communication for Coordination with Unknown Teammates. In The Workshops of the Thirtieth AAAI Conference on Artificial Intelligence, 567–573.
Shivashankar, V.; Alford, R.; Kuter, U.; and Nau, D. 2013. The GoDeL Planning System: A More Perfect Union of Domain-Independent and Hierarchical Planning. In Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence (IJCAI), 2380–2386. Beijing, China: AAAI Press.
Shivashankar, V.; Kuter, U.; Nau, D.; and Alford, R. 2012. A Hierarchical Goal-based Formalism and Algorithm for Single-agent Planning. In Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 981–988. Richland, SC: International Foundation for Autonomous Agents and Multiagent Systems.
Sukthankar, G.; Geib, C.; Bui, H.; Pynadath, D.; and Goldman, R. P. 2014. Plan, Activity, and Intent Recognition: Theory and Practice. Newnes.
Vail, D.; and Veloso, M. 2003. Dynamic Multi-Robot Coordination. In Multi-Robot Systems: From Swarms to Intelligent Automata, volume 2, 87–100.
Vattam, S.; Klenk, M.; Molineaux, M.; and Aha, D. W. 2013. Breadth of Approaches to Goal Reasoning: A Research Survey. In 1st CogSys Workshop on Goal Reasoning.
Verma, J. K.; and Ranga, V. 2021. Multi-Robot Coordination Analysis, Taxonomy, Challenges and Future Scope. Journal of Intelligent & Robotic Systems, 102(1): 10.
Wilson, M.; and Aha, D. W. 2021. A Goal Reasoning Model for Autonomous Underwater Vehicles. In Proceedings of the Ninth Goal Reasoning Workshop.
Wygant, R. M. 1989. CLIPS — A Powerful Development and Delivery Expert System Tool. Computers & Industrial Engineering, 17(1-4): 546–549.
Yu, H.; Shen, Z.; Leung, C.; Miao, C.; and Lesser, V. R. 2013. A Survey of Multi-Agent Trust Management Systems. IEEE Access, 1: 35–50.