HTN Planning and Game State Management in Warcraft II
Noah Brickman, University of California at Santa Cruz, noah@superlight.net
Nishant Joshi, University of California at Santa Cruz, nishant@cse.ucsc.edu

ABSTRACT

A Hierarchical Task Network (HTN) is a planning system that uses a hierarchy of primitive and compound tasks to define a planning domain. An HTN is created using the JSHOP2 system, which implements a high level planning system for Warcraft II build orders. The resulting plans are executed in the ABL/Wargus environment, a reactive planner coupled with a developer interface for the game Warcraft II. An ABL agent implements resource and unit tracking, handling the low level execution of the generated plan, and remains sensitive to changes in the environment by making real-time decisions.

1. INTRODUCTION

There are a variety of ways that one may approach the problem of developing an AI for a real time strategy (RTS) game. As with many AI design approaches for an RTS, the problem can be divided into high and low level tasks. At the high level we have general game strategy: what buildings and units to build, the level of resource gathering, upgrade priorities, etc. At the low level are the specific instructions given to units: move a given unit to a specific position, build a building at a specific location, train a unit at a specific building, etc.

One traditional approach to constructing the high level AI for an RTS domain has been to construct a finite state machine (FSM) which encapsulates the various game states and the appropriate commands to effect the desired transitions between states. Such an approach has the advantage of being relatively simple to implement and understand. At any given time the game will be in one of the various ‘states’ encoded in the FSM, and various pre-planned scripted actions are carried out depending on the game state and the state of the FSM.

The problem with this approach is that only those states hardcoded in the FSM are available to the AI, and only those actions pre-scripted by the developer can be executed. In this sense, the AI is rather limited in its ability to generate intelligent plans with respect to the game world. At first glance, the AI may be able to play effectively against a novice player, but it will be limited in its ability to generate novel sequences of commands for an arbitrary game state. This may result in predictable actions carried out by the AI, and a less compelling game experience for the player. A more ‘intelligent’ AI that performs real planning would make a more compelling opponent.

Our project takes the approach of using a Hierarchical Task Network (HTN) to encode the game domain and the various commands and operations a player may have available to them. HTNs are a type of automated planning algorithm that encodes a problem domain and then produces a plan to accomplish a given task from an initial world state. An HTN encodes a problem domain as a hierarchy of primitive and compound tasks. At the highest level a compound task could be ‘build-town’. The ‘build-town’ task can be broken down into subtasks like ‘build-castle’ or ‘build-barracks’. Each of these tasks has unique preconditions requiring the availability of resources and worker units. Eventually compound tasks are broken down into primitive operators, which give specific commands such as training an individual unit at a building or commanding a worker to build a specific type of building.

The HTN for this project encodes some basic rules about unit training, resource gathering, and building construction. It produces a plan consisting of a sequence of commands that the game engine should carry out.

    (:operator gather-resource
      :parameters (?w ?l)
      :precondition (:and (worker ?w)
                          (:not (worker-full ?w))
                          (worker-location ?l ?w)
                          (resource-at ?l))
      :effect (worker-full ?w))

Figure 1: A UCPOP Operator

A lot of game AI research is focused on making the characters and the narrative exhibit behavioral patterns that have meaning within the context of the game. We can imagine a bot in Unreal exhibiting vengeful behavior and repeatedly pursuing the human player once attacked by her, or a defensive one who prefers to maintain a low profile.

One tool for authoring such behaviors is ABL, a behavior description language. It provides the author with a useful abstraction, namely, authoring character AI in terms of goals and behavioral patterns. An agent written in ABL pursues a high-level goal which can be achieved by pursuing predefined behaviors. These behaviors in turn define sub-goals which have their own associated behaviors. “Pick up gun” would therefore be a sub-goal for the parent behavior “kill enemy”.
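The task decomposition just described can be sketched in a few lines of code. This is an illustrative sketch only, not the project's JSHOP2 domain: the method table, the subtask breakdown, and the primitive command names (borrowed from the build order vocabulary) are assumptions made for the example.

```python
# Illustrative sketch of HTN-style task decomposition (not the JSHOP2 code
# used in the project). Compound tasks are expanded via methods until only
# primitive operators remain; primitives become the steps of the plan.

METHODS = {  # compound task -> list of subtasks (assumed for illustration)
    "build-town": ["gather-resources", "build-barracks", "train-army"],
    "gather-resources": ["harvest gold", "harvest lumber"],
    "build-barracks": ["train peon", "build barracks"],
    "train-army": ["train grunt", "train grunt"],
}

def decompose(task, plan):
    """Recursively expand compound tasks; append primitives to the plan."""
    if task in METHODS:
        for subtask in METHODS[task]:
            decompose(subtask, plan)
    else:
        plan.append(task)  # primitive operator: a concrete game command

plan = []
decompose("build-town", plan)
print(plan)
```

A real HTN planner such as SHOP2 additionally checks preconditions against a world state before choosing a method; this sketch shows only the hierarchical expansion.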
HTN planners like SHOP2 provide us with the ability to use a domain theory specified in terms of an initial state, tasks, and methods (with preconditions and effects), and to get as a result a plan that would achieve the goal given the initial state and the operators and methods available.

We can therefore have a set of behaviors implemented in ABL and let the HTN treat them as action operators available for making state transitions. Planning is traditionally employed in the action operator space, but it is equally applicable in the behavior space, which is much closer to the way humans formulate plans.

ABL can be used to create simple (no sub-goals) and complex behaviors based on the primitive actions of the domain. These behaviors can be realized in the game world by communicating with the game engine. This allows us to delegate the task of plan actuation to ABL, and to worry about plan formulation using the HTN planner. Using this approach, the entire planner could be replaced with a different system, provided it outputs its plan in the format that ABL is expecting.

However, ABL does more than just realize the plan in the game world; it is a reactive planning language which allows us to respond to changes in the game world in real time. Thus, using an HTN planner like SHOP2 with a reactive planner like ABL allows us to separate the high-level strategic planning from the real-time reactive planning that is necessary for successfully executing the plan in the game world.

2. RELATED WORK

Automated planning algorithms have been used in other gaming environments. In the game F.E.A.R.[2], a first-person shooter released in 2005, the AI made use of an automated solver similar to the STRIPS system developed at Stanford in the early 70s[1]. In a STRIPS style planning system, an initial state and a goal state for the world are specified. A series of transformation rules is defined, each with a set of pre- and post-conditions. The transformation rules take world state predicates as variable arguments and, if the preconditions are satisfied, result in a transformed world state. The algorithm attempts to find a sequence of operations on the world state that will transform the initial state into the goal state. In F.E.A.R. this technique was used to plan actions and animation sequences that would result in realistic behavior of enemy agents in the world.

[Figure 2: A soldier executing the ‘patrol’ task in F.E.A.R. [2]]

Though this approach was useful in F.E.A.R., a STRIPS style system is itself not sufficiently expressive to encapsulate the planning system designed for this project. An attempt was made to implement the RTS planning system using UCPOP (Fig. 1), a STRIPS style planning system. In UCPOP, a series of operators is specified, each with variable arguments, preconditions, and world effects. Though some interesting results were achieved with this system, it was not sufficiently expressive to encapsulate the RTS planning domain we desired. Features like tracking time and resource inventories cannot be expressed in the UCPOP language. Because the algorithm plans using backwards chaining, the evolution of such numerical features cannot be planned by the algorithm. Additionally, all the operators exist on the same hierarchical level, making it impossible to break a problem into high and low level tasks and operators.

    harvest gold
    train peon
    harvest lumber
    train peon
    harvest gold
    train peon
    harvest lumber
    train peon
    build farm
    build barracks
    build lumber_mill
    train grunt
    build blacksmith
    train grunt
    upgrade blacksmith weapons
    train grunt
    upgrade blacksmith shield

Figure 3: The game commands generated by the JSHOP2 planner.

Another planning system we studied was SquadSmart[3]. In SquadSmart an HTN (called a Hierarchical Transition Network) was used to plan actions for the squad as a whole. High level tasks like ‘patrol’ and ‘defend’ were broken down into individual unit assignments at a lower level. In this case it did not matter to the high level operators which squad member was assigned to a given task; this assignment was handled by lower level task operators that carried out the high level goals. SquadSmart was an example of an HTN planning system applied to a first-person shooter environment (Unreal Tournament).
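The STRIPS-style state transformation described above can be sketched as follows. This is a simplified illustration, not UCPOP itself: predicates are modeled as tuples, negative preconditions like (:not (worker-full ?w)) are omitted, and the helper name apply_operator is our own.

```python
# Minimal sketch of a STRIPS-style operator application: if the operator's
# preconditions hold in the current state, the delete list is removed and
# the add list inserted, producing the transformed world state.

def apply_operator(state, preconditions, delete_list, add_list):
    """Return the transformed state, or None if a precondition fails."""
    if not all(p in state for p in preconditions):
        return None
    return (state - set(delete_list)) | set(add_list)

# A grounded, simplified version of the Figure 1 gather-resource operator.
state = {("worker", "peon1"), ("worker-location", "mine", "peon1"),
         ("resource-at", "mine")}
new_state = apply_operator(
    state,
    preconditions=[("worker", "peon1"), ("resource-at", "mine")],
    delete_list=[],
    add_list=[("worker-full", "peon1")],
)
print(("worker-full", "peon1") in new_state)  # the effect now holds
```

A full planner searches for a sequence of such applications that reaches the goal state; as the text notes, this flat representation has no task hierarchy and no way to track numerical quantities like time or resource stockpiles.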
3. METHODS

The first step in developing our planning system was the selection of a planner. For this project we chose JSHOP2, a Java based implementation of the SHOP2[4] (Simple Hierarchical Ordered Planner) planning system developed at the University of Maryland. A planning domain was developed to issue a sequence of commands that implement a basic build order to construct a town. In the SHOP2 planner, as implemented in JSHOP2, a developer writes methods, operators, and axioms using a range of logical expressions, variable assignments, and function calls. A SHOP2 problem consists of a high level task to accomplish.

SHOP2 methods are higher level tasks consisting of a set of preconditions and associated sub-task lists. The first precondition group for a given method that is found to be true has its associated task list executed. The task list for a method can contain other tasks or operators, which are the lowest level of method type. Operators take a set of variable parameters and have a precondition that must be satisfied by a set of grounded variables in order for the operator to be able to execute. Each operator has an add and a delete list. Logical predicates from the add list are added to the world state, and predicates from the delete list are removed from it. In this way, much like in the STRIPS automated planning system, the world state is transformed to accomplish the goal task.

The goal task developed for this planning domain implements a build order for the game Warcraft II. We had available to us a version of Warcraft II implemented in the Stratagus[5] open-source gaming engine. The Stratagus implementation of Warcraft II, called Wargus, allows the AI that was released with the game to be replaced with a user developed engine. This allowed us to send our own commands to the game engine and query the engine for game state.

A proxy implemented in Java is used to handle all communication between ABL and Wargus. This involves retrieving game state (maps, units, resources, etc.) and sending action commands (moving units, training units, harvesting resources, etc.) to the game engine. All this information is stored inside Working Memory Elements (WMEs) which an agent can access when it is making decisions. Each WME has a corresponding sensor that is responsible for sensing information from the game world at fixed time intervals and storing it inside the WME.

Thus, an agent defined in ABL uses its knowledge of the game world to make decisions, and acts accordingly. The sensors are responsible for maintaining currency, which makes the agent reactive to changes in the game world.

A JSHOP2 planning domain was developed to implement the ‘build-town’ task. Individual methods and operators were written to implement sub-tasks like ‘build-building’, ‘train-unit’, and ‘research-upgrade’. The JSHOP2 problem specified the game state, generally ‘unit peon’ to indicate that the player has a single worker unit. The goal task was the ‘build-town’ task. The planner would then generate a plan (Figure 3) consisting of a sequence of grounded operators that represented commands to the Wargus engine. These commands consisted of orders to train units, gather resources, and build specific buildings in pursuit of the goal task of building a town.

    ; harvest resources
    (:operator ; head
        (!harvest ?res_type)
        ; pre
        ((unit peon free))
        ; delete list
        ((unit peon free))
        ; add list
        ((unit peon harvest ?res_type))
    )

Figure 4: A JSHOP2 Operator

The generated sequence of game commands was then passed to the ABL interface to the Wargus engine. The planner output is parsed to get the individual commands, which are behaviors defined in the ABL agent. A command like “harvest gold” triggers an ABL behavior that finds a worker and assigns it to mining gold.

[Figure 5: A town built with the JSHOP2 plan.]
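The parsing of the planner's flat command output (Figure 3) into agent behaviors might look like the following sketch. The dispatch list stands in for the ABL agent's behaviors, which the project defines in ABL rather than in a general-purpose language; parse_plan and the recorded strings are assumptions for illustration.

```python
# Sketch of how a planner's flat command output might be parsed and
# dispatched to named behaviors. A real agent would trigger an ABL behavior
# for each command; here we only record which behavior would fire.

def parse_plan(text):
    """Split planner output into (verb, argument) command pairs."""
    commands = []
    for line in text.strip().splitlines():
        verb, _, arg = line.strip().partition(" ")
        commands.append((verb, arg))
    return commands

plan_text = """
harvest gold
train peon
build farm
"""

dispatched = []
for verb, arg in parse_plan(plan_text):
    # e.g. 'harvest gold' would trigger the behavior that finds a free
    # worker and assigns it to mining gold.
    dispatched.append(f"behavior:{verb}({arg})")
print(dispatched)
```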
These behaviors take care of the execution-time decision making necessary to successfully execute a plan. For example, a behavior that builds a barracks checks for the availability of sufficient resources, places a hold on the resources required (to ensure that the resources don’t get used up by some other behavior), finds a worker to do the construction, searches the map for a good place to build the barracks, and then sends a command to the Wargus game engine to assign the selected worker to do the construction, releasing the hold on the resources.

An important thing to observe is that the model of the game available to the HTN is at a higher level of abstraction than the one available to the ABL agent, and so the actions that the planner outputs are not as accurately specified as their execution in the game world demands. This is another advantage of the approach, since ABL can pursue these actions using multiple behaviors depending on the current state of the world, each with a different set of preconditions. The HTN, in turn, does not need to worry about low level operations such as selecting which of the various free workers to assign to a task, or which specific location to select for constructing a building. The HTN concerns itself with the high level planning, and the ABL interface manages the execution of the planner’s commands.
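The hold-then-release pattern from the barracks example can be sketched as follows. The class name ResourcePool, its methods, and the resource costs are illustrative assumptions; the project implements this pattern with ABL working memory elements rather than code like this.

```python
# Sketch of the 'hold resources, build, release' pattern used by the build
# behaviors. The 700-gold barracks cost and all names are illustrative
# assumptions, not the project's actual ABL code.

class ResourcePool:
    def __init__(self, gold):
        self.gold = gold
        self.held = 0  # gold reserved by in-progress behaviors

    def available(self):
        return self.gold - self.held

    def hold(self, amount):
        """Reserve resources so another behavior cannot spend them."""
        if self.available() < amount:
            return False
        self.held += amount
        return True

    def spend_held(self, amount):
        """Consume previously held resources once the build command is sent."""
        self.held -= amount
        self.gold -= amount

pool = ResourcePool(gold=1000)
assert pool.hold(700)          # barracks behavior reserves its cost
assert not pool.hold(400)      # a second behavior cannot over-commit
pool.spend_held(700)           # construction starts; hold is released
print(pool.gold, pool.held)    # 300 gold left, nothing held
```

Without the reservation step, the second behavior's availability check would pass and both behaviors would race for the same gold, which is exactly the failure the ResourceHoldWME described below was introduced to prevent.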
4. RESULTS

Simple plans for a basic build order problem were generated by the HTN planner. The planner was given the initial state of the game world, a domain theory in the form of methods, operators, and axioms, and a goal to pursue. It used these inputs to come up with a fully ordered sequence of actions, which were then executed by the ABL agent.

Originally it was planned to add features like time and resource tracking to the HTN. It was later determined that such features were counterproductive unless two-way communication with the game engine was also implemented; otherwise, the planner would make assumptions about the game state and the availability of resources that would not necessarily be correct. The HTN that was eventually implemented does not plan at such a low level, instead relying on the ABL interface to track game state. The planner focuses on producing a sequence of high level commands, much as one person might describe a game strategy to another, without planning with respect to time or resource availability.

The execution phase was successful in that it handled the run-time decision making correctly to actually produce the behavior that the higher level planning was expected to produce. Thus a train command issued by the HTN planner translated into a host of tasks performed at execution time, such as checking for the availability of the required building, waiting for the required resources, etc.

Every behavior had a set of preconditions that was checked before the behavior was chosen for execution. These preconditions had to be defined carefully to ensure that the assumptions the HTN planner was making were correctly represented in the agent code.

We faced a few problems while implementing the ABL behaviors for the actions; solving them made the agent more robust. One of these was that resources allocated for use in one behavior were inadvertently being used up in another, which sometimes caused the other behavior to fail. This was solved by ‘holding’ resources for an operation while it is underway. For example, while doing a build task, a hold was placed on the resources required for the specific building, and released later when the building was complete. This was possible because ABL allows us to create Working Memory Elements that are not sensed; that is, they do not retrieve any data from the game world, but instead serve as local memory for the agent. We used one such WME, called a ResourceHoldWME, to maintain a hold on resources that were required for some task.

Another problem was the agent’s inability to make correct decisions about the location of buildings while planning their construction. It often ended up trying to build at an invalid location, which would cause the game engine to fail to actuate the command, and hence the behavior would fail without even realizing it. Another WME, called a TerrainWME, was used in this case to search the map for a good location for the construction. Unlike the ResourceHoldWME, this WME has a sensor that is responsible for periodically updating it with information from the game engine.

We are currently able to simulate basic build orders in the game world. Given a town with one villager and a town hall, we can get to a point where the town is significantly bigger, with most of the basic buildings (mills, farms, barracks, blacksmith), a small army (4 grunts), and the basic weapon and shield researches. Figure 5 shows the town as it looks once the plan execution is complete. We tried out different planning problems, assuming a different initial state each time, which resulted in plans tailored to specific situations.

The success that we achieved in this experimental setting has motivated us to further explore this approach (see future work) and leads us to the conclusion that it can be successfully implemented in RTS games.
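The two kinds of WMEs described in this section, unsensed local memory like the ResourceHoldWME and sensed state like the TerrainWME, can be contrasted in a short sketch. The fields, the toy map, and the sensing logic are assumptions for illustration, not the agent's actual code.

```python
# Sketch contrasting the two WME kinds described above: an unsensed WME that
# is pure local agent memory, and a sensed WME that a sensor periodically
# refreshes from the game engine. All fields are illustrative assumptions.

class ResourceHoldWME:
    """Unsensed WME: written only by the agent, never by a sensor."""
    def __init__(self, gold=0, lumber=0):
        self.gold = gold
        self.lumber = lumber

class TerrainWME:
    """Sensed WME: a sensor periodically copies map data from the engine."""
    def __init__(self):
        self.buildable = set()

    def sense(self, engine_map):
        # Called at fixed intervals; keeps the WME current with the world.
        self.buildable = {pos for pos, tile in engine_map.items()
                          if tile == "grass"}

    def find_build_site(self):
        return min(self.buildable) if self.buildable else None

# A toy 'engine' map: only grass tiles are valid construction sites.
engine_map = {(0, 0): "water", (0, 1): "grass", (1, 0): "forest",
              (1, 1): "grass"}
terrain = TerrainWME()
terrain.sense(engine_map)
print(terrain.find_build_site())  # a valid location instead of a blind guess
```

Consulting the sensed terrain data before issuing a build command avoids the silent failure described above, where the engine rejects construction at an invalid location.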
5. FUTURE WORK

The JSHOP2/ABL/Wargus system developed here successfully develops and implements a build order plan. However, the overall flexibility of the system is rather limited, and it lacks some key features necessary to turn it into a robust AI capable of playing against human opponents. Several additional features should be added to the overall system to facilitate such game play. Currently, the HTN planner does not have two-way communication with the ABL agent. A complete AI using the HTN planner needs to be able to receive game state from the game engine through the agent and incorporate it into its planning approach, which is critical for re-planning in the event of plan failures. The planner also needs to be able to model the passage of time and the accumulation and consumption of resources.

Ideally the planner should operate at the strategic level, not concerning itself with the minutiae of moving specific units around the game environment. Still, the planner needs some knowledge of the low level game state in order to make useful plans. Knowing how long a given command will take to execute and how many resources it will consume is important to the planner’s ability to make efficient and effective plans. Tracking features like time and resource consumption will allow more detailed and optimized plans to be generated.

Plan invalidation and re-planning are crucial to any good game AI implementation. The game should be able to invoke the planner multiple times with different sets of goals to plan for, and the planner should be able to look at the current state of the game and come up with an optimal strategy for achieving those goals. Incorporating this re-planning model requires two-way communication between the HTN planner and the ABL agent, which is the most obvious next step. One of the reasons the JSHOP2 system was chosen was that it is written in Java and allows external Java functions to be called in plan operators. Though not implemented in this project, such function calls would allow the planner to talk directly to the ABL agent in order to execute a plan.

6. REFERENCES

[1] R. Fikes and N. Nilsson. 1971. STRIPS: A New Approach to the Application of Theorem Proving to Problem Solving. Artificial Intelligence, 2:189-208.

[2] J. Orkin. 2006. Three States and a Plan: The AI of F.E.A.R. Game Developers Conference 2006.

[3] P. Gorniak and I. Davis. 2007. SquadSmart: Hierarchical Planning and Coordinated Plan Execution for Squads of Characters. AAAI Press.

[4] D. Nau, T. Au, O. Ilghami, et al. 2003. SHOP2: An HTN Planning System. Journal of Artificial Intelligence Research 20 (2003) 379-404.

[5] Stratagus. http://www.stratagus.org/

[6] M. Mateas and A. Stern. In H. Prendinger and M. Ishizuka (Eds.), Life-like Characters: Tools, Affective Functions and Applications. Springer, 2004.

[7] ABL Documentation. http://mothership.cc.gt.atl.ga.us/abl/index.php/Main_Page