Evolving Behaviour Trees for the Mario Bros Game Using Grammatical Evolution
Miguel Nicolau¹ and Diego Perez²

¹ Natural Computing Research & Applications Group, University College Dublin, Dublin, Ireland
² University Carlos III, Madrid, Spain
Miguel.Nicolau@ucd.ie, diego.perez.liebana@gmail.com

Abstract.

1 Introduction

2 The 2010 Mario AI Competition

The 2010 Mario AI Competition is a contest organized by Julian Togelius and Sergey Karakovskiy, and it is the successor to the competition held by the same organizers in 2009 [JSR10]. The 2010 competition took place at four different events: EvoStar 2010, World Congress on Computational Intelligence (WCCI) 2010, Conference on Computational Intelligence and Games (CIG) 2010 and Games Innovation Conference (GIC) 2010.

The participants of the competition are requested to submit a bot that can participate in up to three different tracks: gameplay, learning and level generation. The bot presented by the authors for this competition took part in the gameplay track of CIG 2010, where the bots are evaluated in levels that have not been seen previously by the competitors.

The score of each bot is based on the distance run by Mario (the bot), plus the sum of some other factors, such as collected items, enemies killed and time left. The evaluation is made over several executions, varying level length, enemy types and difficulty, so the final score is the sum of all these evaluations. The bot that gets the highest score becomes the winner of the competition.

2.1 The Mario Bros Benchmark

The Mario Bros Benchmark, used for running the competition, is open source software written in Java and developed by Julian Togelius, Sergey Karakovskiy, Tom Schaul and Jan Koutnik [MBW].

This benchmark allows one to create an agent that plays the Mario Bros game by just writing a small Java class that overrides two methods: one to retrieve information about the level enemies and geometry, and the other to specify the actions used to move the bot.
Both functions are called by the engine, in this order, every execution cycle.

Environment information. All the information that can be used to analyse the world around Mario is given in two two-dimensional arrays (21x21): one provides data about the geometry of the level, the other about the enemies that populate it. These arrays are centred on Mario, so 10 grid cells in each direction from Mario's position can be processed every cycle. Additionally, three different levels of detail can be specified to retrieve data in both arrays, depending on the information we are looking for:

– Zoom 2: The data is represented in a binary array. For enemies, 0 means that there is no enemy at that position, while 1 means there is some enemy. Likewise, for the level scene, 1 means that there is an obstacle and 0 that Mario can pass through.
– Zoom 1: This zoom level represents the data with an integer, gathering groups of objects under a common identifier. For the enemy information, 0 means no enemy at all, 2 represents an enemy that Mario can kill by jumping on it, and 9 represents enemies that can be killed only by shooting at them. For the level scene, different identifiers represent types of blocks, such as those that can or cannot be broken, contain hidden items or can spawn enemies.
– Zoom 0: This zoom level is a very close view of the internal representation of the engine, where every kind of enemy or block in the level has its own identifier, different from any other entity in the game.

Apart from this information, more useful input can be used to represent the current state of the game:

– Mario position: A pair of float values that indicates the coordinates of Mario in the level.
– Mario status: The state of the game: running, win or dead.
– Mario mode: Mario can be small or big, with or without being able to fire.
– Mario state indicators: Facts such as the ability of Mario to shoot and jump, the time left for the level and whether Mario is on the ground or not.
– Mario kills: Statistics about the enemies killed by Mario, indicating how they were killed (by stomp, by fire or by hitting them with a shell).

Mario effectors. The actions that can be performed by Mario are all the different inputs that a human player could use with a control pad. They are represented as a boolean array, where each control is assigned a fixed index. The controls are the following:

– Directions: One control for each direction: Left, Right, Up and Down.
– Jumping: Makes Mario jump.
– Speed and Fire: Mario can fire, if he is in the proper mode, by using this control. This input can also be used to make Mario go faster, but it only works if he is moving right or left. Jumping with this button pressed lets Mario reach farther places.
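
As a rough illustration of how the two observation arrays and the boolean action array fit together, the following sketch builds an action from Zoom 2 (binary) grids. It is written in Python for brevity rather than in the benchmark's Java, and it does not reproduce the benchmark's API: the control indices, the `[row][column]` grid layout, the `look_ahead` distance and all function names are assumptions made for this example only.

```python
GRID_SIZE = 21            # the observation arrays are 21x21
CENTRE = GRID_SIZE // 2   # Mario sits in the middle cell, index 10

# Hypothetical control indices; the real benchmark defines its own.
KEY_LEFT, KEY_RIGHT, KEY_UP, KEY_DOWN, KEY_JUMP, KEY_SPEED = range(6)


def empty_grid():
    """Return a 21x21 grid filled with zeros (nothing observed)."""
    return [[0] * GRID_SIZE for _ in range(GRID_SIZE)]


def enemy_ahead(enemies, look_ahead=3):
    """True if a Zoom 2 enemy grid marks an enemy in Mario's row,
    within `look_ahead` cells to his right (assumed [row][column] layout)."""
    row = enemies[CENTRE]
    return any(row[CENTRE + dx] == 1 for dx in range(1, look_ahead + 1))


def obstacle_ahead(scene):
    """True if the cell directly to the right of Mario is an obstacle."""
    return scene[CENTRE][CENTRE + 1] == 1


def choose_action(scene, enemies):
    """Build the boolean action array for one execution cycle."""
    action = [False] * 6
    action[KEY_RIGHT] = True          # always keep moving right
    if enemy_ahead(enemies):
        action[KEY_SPEED] = True      # Speed and Fire shares one control
    if obstacle_ahead(scene):
        action[KEY_JUMP] = True
    return action
```

With both grids empty, `choose_action` presses only the Right control; marking an enemy two cells to the right additionally presses Speed and Fire.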

3 Grammatical Evolution

Grammatical Evolution (GE) [OR03] is an evolutionary approach that specifies the syntax of possible solutions through a context-free grammar, which is then used to map binary strings onto functional and syntactically correct solutions. Those binary strings can therefore be evolved by any search algorithm; typically, a variable-length genetic algorithm is used.

One of the main advantages of GE is that the syntax of the resulting solutions is specified through a grammar. This facilitates the application of GE to a variety of problems with relative ease, and explains its usage for the current application.

GE basically works as a genotype-to-phenotype mapping process. Variable-length integer strings are created by an evolutionary process (typically a genetic algorithm [Hol75,Gol89]), and then used to choose production rules from a grammar, which creates a functional program, syntactically correct for the problem domain. Finally, this program is evaluated, and its fitness is returned to the evolutionary algorithm.

3.1 Example Mapping Process

To illustrate the mapping process, consider the grammar shown in Fig. 1, specifying a generic gameplay behaviour, and the following integer string: (4, 5, 3, 6, 8, 5, 9, 1). Each integer selects a production for the first unmapped symbol, as the integer modulo the number of productions available for that symbol. The first integer is used to choose one of the two productions of the start symbol: 4%2 = 0, so the first production is chosen. The following integer is then used with the first unmapped symbol in the resulting mapping string: 5%2 = 1, so that symbol is replaced by its second production.

The mapping process continues in this fashion: in the next step, 3%2 = 1 again selects a second production, and 6%5 = 1 then selects the second of five productions, moveRight;. After all symbols are mapped, the final program becomes moveRight; if(enemyAhead) then shoot;, which could be executed in an endless loop.
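
The mapping above can be reproduced with a short sketch. The following Python code is a minimal illustration and not the authors' implementation: the non-terminal names `<BT>`, `<Node>`, `<Condition>` and `<Action>` are labels assumed here, the function name `map_genotype` is hypothetical, and the semicolon that follows the action slot in the condition rules is left out so that the output matches the program quoted in the text. The function returns `None` if the integers run out before every symbol is mapped.

```python
from typing import List, Optional

# Grammar as a dict: non-terminal -> list of productions (each a list of symbols).
# Non-terminal names are assumed labels for this sketch.
GRAMMAR = {
    "<BT>": [["<BT>", "<Node>"], ["<Node>"]],
    "<Node>": [["<Condition>"], ["<Action>"]],
    "<Condition>": [
        ["if(obstacleAhead) then", "<Action>"],
        ["if(enemyAhead) then", "<Action>"],
    ],
    "<Action>": [["moveLeft;"], ["moveRight;"], ["jump;"], ["crouch;"], ["shoot;"]],
}


def map_genotype(codons: List[int], grammar=GRAMMAR, start="<BT>") -> Optional[str]:
    """Map an integer string onto a program by repeatedly expanding the
    leftmost non-terminal with production number (codon % production count)."""
    symbols = [start]
    used = 0
    while True:
        # Position of the leftmost symbol that is still a non-terminal.
        idx = next((i for i, s in enumerate(symbols) if s in grammar), None)
        if idx is None:
            return " ".join(symbols)      # fully mapped program
        if used == len(codons):
            return None                   # ran out of integers: mapping fails
        productions = grammar[symbols[idx]]
        choice = codons[used] % len(productions)
        used += 1
        symbols[idx:idx + 1] = productions[choice]
```

With the integer string from the example, `map_genotype([4, 5, 3, 6, 8, 5, 9, 1])` consumes seven of the eight integers and yields `moveRight; if(enemyAhead) then shoot;`.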

Sometimes the integer string may not have enough values to fully map a syntactically valid program; several options are available, such as reusing the same integers (in a process called wrapping [OR03]), assigning the individual the worst possible fitness, or replacing it with a legal individual. In this study, an unmapped individual is replaced by its parent.

<BT>        ::= <BT> <Node> | <Node>
<Node>      ::= <Condition> | <Action>
<Condition> ::= if(obstacleAhead) then <Action>; | if(enemyAhead) then <Action>;
<Action>    ::= moveLeft; | moveRight; | jump; | crouch; | shoot;

Fig. 1. Example grammar for simple approach to generic shooting game.

4 Behaviour Trees

4.1 Introduction

Behaviour Trees (BTs) were introduced a few years ago as a means to encode formal system specifications [Dro04,Col07]. Recently, they have been shown to provide a means to encode game AI in a modular, scalable and reusable manner [CDC10]. They have been used in high-revenue commercial games, such as “Façade” [MS04], “Halo 2” [Isl05], “Halo 3” and “Spore” [Mch07], as well as in many other unpublished commercial projects [CDC10], which illustrates their flexibility and growing importance in the commercial game AI world.

BTs are simply a hierarchical way of organising behaviours in descending order of complexity; broad behavioural tasks are at the top of the tree, and these are broken down into several sub-tasks. For example, a soldier in a first-person shooter game might have a behaviour AI that breaks down into patrol, investigate and attack tasks. Each of these can then be further broken down: attacking, for example, will no doubt require moving tactics, weapon management and aiming algorithms. These can be further broken down, right down to the level of playing sounds or animation sprites.

4.2 Behaviour Trees for Mario

FIXME DIEGO

4.3 Incorporation into GE

FIXME Grammar encoding. First option giving full syntax to GE was not good, second option fixing the structure of the grammar to resemble an and-or tree (ref) much more successful.

Extensions to standard GE. FIXME MIGUEL

Another novel approach was the encoding of crossover points in the grammar. This is a technique presented recently for GE [ND06], in which a specific symbol is used in the grammar to label crossover points; the evolutionary algorithm then only slices an individual according to these points. This made a lot of sense in the work presented here: many of the parameters passed to jenn3d specify styling options, which can therefore be exchanged as a whole between different structures (a 2-point crossover operator was used). This makes crossover act solely as an exploitation operator; standard point mutation still ensures the exploration of novel parameter values.
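
The idea of restricting crossover to labelled points can be sketched as follows. This Python code is only an illustration of a 2-point crossover that cuts at marked positions; it is not the operator of [ND06]. How the marker positions are recorded during mapping is not shown, and the representation of an individual as a list of integers plus a list of allowed cut indices, as well as the function name, are assumptions made for this example.

```python
import random


def two_point_marker_crossover(parent_a, markers_a, parent_b, markers_b, rng=random):
    """Swap the segment lying between two marked cut points of each parent.

    parent_*  : list of integers (the genotype)
    markers_* : list of distinct indices at which cutting is allowed
    Returns two children; parents with fewer than two markers are copied.
    """
    if len(markers_a) < 2 or len(markers_b) < 2:
        return list(parent_a), list(parent_b)
    # Choose two distinct allowed cut points in each parent.
    a1, a2 = sorted(rng.sample(markers_a, 2))
    b1, b2 = sorted(rng.sample(markers_b, 2))
    # Exchange the marked segments as a whole.
    child_a = parent_a[:a1] + parent_b[b1:b2] + parent_a[a2:]
    child_b = parent_b[:b1] + parent_a[a1:a2] + parent_b[b2:]
    return child_a, child_b
```

Because cuts fall only on marked positions, each exchanged segment corresponds to a complete labelled block, so crossover recombines existing blocks while mutation remains the source of new values.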

5 Experiments

FIXME MIGUEL

The experimental parameters used are shown in Table 1. Note that to ensure all individuals in the initial population were valid, a form of population initialisation known as Sensible Initialisation [RA03] was used. A variation of tournament selection was used, which ensures that each individual participates in at least one tournament event. Also, the mutation rate was set such that, on average, one mutation event occurs per individual (its probability is therefore variable, and dependent on the length of each individual). Finally, note that there is no maximum number of generations; evolution always continues until the user decides to terminate the execution.

Table 1. Experimental Setup

Parameter                                     Value
Initial Population Size                       20
Evolutionary Population Size                  10
Derivation-tree Depth (for initialisation)    10
Tail Ratio (for initialisation)               20%
Selection Tournament Size                     2
Elitism (for generational replacement)        20%
Crossover Ratio                               90%
Average Mutation Events per Individual        1

5.1 Results

6 Conclusions

References

[MBW] Mario AI Benchmark, http://code.google.com/p/marioai/
[MAI10] 2010 Mario AI Championship, http://www.marioai.org
[JSR10] Togelius, J., Karakovskiy, S., Baumgarten, R.: The 2009 Mario AI Competition. In: IEEE Congress on Evolutionary Computation, Proceedings. pp. FIXME–FIXME. IEEE Press (2010)
[CDC10] Champandard, A., Dawe, M., Cerpa, D. H.: Behavior Trees: Three Ways of Cultivating Strong AI. In: Game Developers Conference, Audio Lecture. (2010)
[Col07] Colvin, R., Hayes, I. J.: A Semantics for Behavior Trees. ARC Centre for Complex Systems, tech. report ACCS-TR-07-01. (2007)
[Dro04] Dromey, R. G.: From Requirements to Design: Formalizing the Key Steps. In: International Conference on Software Engineering and Formal Methods, Proceedings. (2004)
[Gol89] Goldberg, D. E.: Genetic Algorithms in Search, Optimization and Machine Learning. Addison Wesley (1989)
[Hol75] Holland, J. H.: Adaptation in Natural and Artificial Systems. University of Michigan Press (1975)
[Isl05] Isla, D.: Managing Complexity in the Halo 2 AI System. In: Game Developers Conference, Proceedings. (2005)
[Mch07] McHugh, L.: Three Approaches to Behavior Tree AI. In: Game Developers Conference, Proceedings. (2007)
[MS04] Mateas, M., Stern, A.: Managing Intermixing Behavior Hierarchies. In: Game Developers Conference, Proceedings. (2004)
[ND06] Nicolau, M., Dempsey, I.: Introducing Grammar Based Extensions for Grammatical Evolution. In: IEEE Congress on Evolutionary Computation, Proceedings. pp. 2663–2670. IEEE Press (2006)
[OR03] O’Neill, M., Ryan, C.: Grammatical Evolution: Evolutionary Automatic Programming in an Arbitrary Language. Kluwer Academic Publishers (2003)
[RA03] Ryan, C., Azad, R.M.A.: Sensible initialisation in grammatical evolution. In: Barry, A.M. (ed.) GECCO 2003: Proceedings of the Bird of a Feather Workshops, Genetic and Evolutionary Computation Conference. pp. 142–145. AAAI (July 2003)