Sushil J. Louis1
Chris Miles1
Nicholas Cole1
John McDonnell2
We use case-injected genetic algorithms to learn how to play computer strategy games competently when those games involve long-range planning and complex dynamics [11]. Such games inspire, and are inspired by, military training simulations on which trainees spend considerable time honing their skills. By instrumenting the game interface, we unobtrusively acquire knowledge in the form of cases from human experts playing the game, and we use case-injected genetic algorithms to incorporate this knowledge when evolving competent game players. The games we have focused on have two sides, Blue and Red, so an evolved player can serve two purposes. A competent player for Blue serves as a decision aid for a Blue trainee. At the same time, a competent Blue player serves as a training opponent for a Red trainee. Results in the context of a strike force planning game show that, with an appropriate representation, case injection is effective at biasing the genetic algorithm toward producing competent plans that contain important strategic elements used by human players.
Our research focuses on a strike force asset allocation game which maps to a broad category of resource allocation problems in industry and the military. Genetic algorithms can be used in our game to robustly search for effective allocation strategies, but these strategies may not approach real-world optima, because the game (the evaluation function) is an imperfect reflection of reality. Humans with past experience playing the real-world game tend to include external knowledge (gained through that experience) when producing strategies for the real-world problem underlying the game. Acquiring and incorporating knowledge from the way these humans play should allow us to carry some of this external knowledge over into our own play. Results show that case injection combined with a flexible representation biases the genetic algorithm toward producing strategies similar to those learned from human players. Beyond playing similarly on the same mission, the genetic algorithm can use strategic knowledge across a range of similar missions to continue to play as the human would. The left side of Figure 1 shows a screenshot of the game we are designing, while the right side shows the scenario considered in this abstract.
Our Genetic Algorithm Player (GAP) develops strategies for the attacking strike force, including routing and weapon targeting for all available aircraft. When confronted with popups, GAP responds by replanning with the case-injected genetic algorithm to produce a new plan of action. Beyond producing near-optimal strategies, we would like to bias GAP toward producing solutions similar to those it has seen humans use when playing Blue in the past. Generating competent opponents is hard, and using case-injected genetic algorithms to acquire and use knowledge from human experts should allow us to develop better training opponents and decision aids. Specifically, we want to show that when using case injection the genetic algorithm more frequently produces plans similar to those a human has generated in the past on the same mission. We would also like to show that GAP can continue to use this information even when confronted with a different mission.
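The injection step described above can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not GAP's actual implementation: the names `hamming` and `inject_cases` are hypothetical, plans are assumed to be encoded as equal-length bit lists, and the rule of replacing the worst individuals with the stored cases closest (by Hamming distance) to the current best is an assumption, since the abstract does not specify how cases are selected.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def inject_cases(population, fitness, case_base, n_inject):
    """Replace the n_inject worst individuals with cases similar to the best.

    Assumption: similarity is Hamming distance to the current best individual;
    the abstract does not state the selection rule.
    """
    ranked = sorted(population, key=fitness, reverse=True)  # best first
    best = ranked[0]
    # Pick the stored human cases that are closest to the current best plan.
    closest = sorted(case_base, key=lambda c: hamming(c, best))[:n_inject]
    keep = len(ranked) - len(closest)
    # Copy the cases so later mutation cannot alter the case base.
    return ranked[:keep] + [list(c) for c in closest]


# Toy data: fitness is simply the number of ones in the bit list.
population = [[1, 1, 1, 1], [1, 1, 0, 0], [0, 0, 0, 0]]
case_base = [[1, 0, 1, 1], [0, 0, 0, 1]]
new_population = inject_cases(population, sum, case_base, n_inject=1)
```

In this toy run the worst individual is dropped and the case nearest to the best plan takes its place, so later crossover can mix human-derived material into the population even when that material does not score highest under the evaluator.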
Previous work in strike force asset allocation has concentrated on optimizing the allocation of assets to targets, the majority of it focusing on static pre-mission planning [4,10,21,13]. A large body of work exists in which evolutionary methods have been applied to games [3,16,15,6,17]. However, most of this work has addressed board, card, and other well-defined games, which differ in many ways from popular real-time strategy (RTS) computer games such as Starcraft, Total Annihilation, and Homeworld [1,2,5]. Entities in our game exist and interact over time in continuous three-dimensional space. Sets of algorithms control these entities, seeking to fulfill goals set by the player leading them. This adds a level of abstraction not found in traditional games. In most RTS games, players not only have incomplete knowledge of the game state, but even identifying the domains of this incomplete knowledge is difficult. John Laird [7,9,8] surveys the state of research in using Artificial Intelligence (AI) techniques in interactive computer games. He describes the importance of such research and provides a taxonomy of games. Note that several military simulations share some of our game's properties [20,18,14], but the purpose of our game is not to provide a perfect reflection of reality. Rather, we want our strike planning game both to provide a platform for research in strategic planning and training and to be fun.
The simple mission being played out is shown on the right side of
Figure 1. This mission was chosen to be simple
enough to have easily analyzable results, and to allow the GA to learn
external knowledge from the human. As many games show similar
dynamics, this mission is a good arena for examining the general
effectiveness of using case injection for learning from humans. The
mission takes place in Northern Nevada and California; Lake Tahoe is
visible near the bottom of the map. Blue possesses one platform armed
with eight (8) assets (weapons) and the platform takes off from and
returns to the lower left hand corner of the map. Red possesses eight
(8) targets distributed in the top right region of the map, and six
threats to defend these targets. The first stage in Blue's planning
is determining the allocation of the eight assets to the eight
targets. Each asset can be allocated to any of the eight targets,
giving $8^8 = 16{,}777{,}216$ possible
allocations. The second stage in Blue's
planning involves finding routes for each of the platforms to follow
during their mission. These routes should be short and simple but
still minimize exposure to risk. We categorize Blue's possible routes
into two categories. Yellow routes fly through the corridor between
the threats, while green routes fly around them (see Figure
1). The evaluator has no direct knowledge of the
potential danger to platforms inside the corridor area; it
has no conception of traps. Because of this, the evaluator-optimal
solution is the yellow route, as it is the shortest. The human expert,
however, understands the potential for danger in the corridor.
Knowing this, the human prefers the green route.
Teaching GAP to learn from the human and produce green strategies even
though yellow strategies have higher fitness is one of the goals of
our current work.
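The size of the allocation space and the gap between evaluator and human preference can be illustrated with a short sketch. Everything here is a hedged toy model: the integer encoding in `decode`, the `ROUTE_LENGTH` values, and the `evaluator_fitness` function are hypothetical and are not taken from the game. The only facts drawn from the text are that eight assets each choose among eight targets and that the evaluator favours the shorter yellow route because it does not model traps.

```python
N_ASSETS = 8
N_TARGETS = 8

# Each asset independently chooses one target, so the space has 8**8 members.
n_allocations = N_TARGETS ** N_ASSETS


def decode(index, n_assets=N_ASSETS, n_targets=N_TARGETS):
    """Map an integer in [0, n_targets**n_assets) to one target index per asset."""
    allocation = []
    for _ in range(n_assets):
        index, target = divmod(index, n_targets)
        allocation.append(target)
    return allocation


# Hypothetical route lengths in arbitrary units; the real values are not given.
ROUTE_LENGTH = {"yellow": 100.0, "green": 140.0}


def evaluator_fitness(route):
    """Toy evaluator: shorter is better, and corridor traps are not modelled."""
    return -ROUTE_LENGTH[route]


# The evaluator's preferred route under this toy model.
evaluator_choice = max(ROUTE_LENGTH, key=evaluator_fitness)
```

Under these assumptions the evaluator selects the yellow route, while the human expert would choose green. That mismatch is exactly the information that has to reach the genetic algorithm from outside the fitness function.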
This paper considered two broad issues in developing competent opponents for computer gaming: knowledge acquisition from expert players, and biasing genetic search using information from outside the evaluation function (acquired from humans). We showed that GAP learns to prefer solutions similar to injected cases acquired from humans and to deal with the concept of a trap, which lies outside the evaluator's knowledge. Our preliminary results thus indicate that case injection is a promising approach to acquiring knowledge and biasing genetic search toward human-preferred solutions. We are currently using these techniques in developing a training simulation (computer game) for strike planning.
The translation was initiated by Sushil Louis on 2005-01-06