1) PAPER TITLE GP-Rush: Using Genetic Programming to Evolve Solvers for the Rush Hour Puzzle 2) AUTHORS Ami Hauptpman, amiha@cs.bgu.ac.il Achiya Elyasaf, achiya.e@gmail.com Moshe Sipper, sipper@cs.bgu.ac.il Assaf Karmon, assafkar@gmail.com Physical address for all: Department of Computer Science, Ben-Gurion University, Beer-Sheva 84105, ISRAEL 3) CORRESPONDING AUTHOR Moshe Sipper 4) ABSTRACT We evolve heuristics to guide IDA* search for the 6x6 and 8x8 versions of the Rush Hour puzzle, a PSPACE-Complete problem, for which no efficient solver has yet been reported. No effective heuristic functions are known for this domain, and---before applying any evolutionary thinking---we first devise several novel heuristic measures, which improve (non-evolutionary) search for some instances, but hinder search substantially for many other instances. We then turn to genetic programming (GP) and find that evolution proves immensely efficacious, managing to combine heuristics of such highly variable utility into composites that are nearly always beneficial, and far better than each separate component. GP is thus able to beat both the human player of the game and also the human designers of heuristics. 5) CRITERIA (B) The result is equal to or better than a result that was accepted as a new scientific result at the time when it was published in a peer-reviewed scientific journal. (D) The result is publishable in its own right as a new scientific result---independent of the fact that the result was mechanically created. (F) The result is equal to or better than a result that was considered an achievement in its field at the time it was first discovered. (G) The result solves a problem of indisputable difficulty in its field. (H) The result holds its own or wins a regulated competition involving human contestants (in the form of either live human players or human-written computer programs). 6) WHY THE RESULT SATISFIES THE CRITERIA Why the result satisfies criterion (B) -------------------------------------- We were able to evolve a killer application for the Rush Hour Puzzle, a game proven to be PSPACE-complete. Our evolved strategy requires far less nodes AND less time to solve the most difficult known instances, as compared to any other published, unpublished, or breathing method: (a) Binary decision diagrams (BDDs), (b) Novel hand-crafted heuristics, and (c) human contestants. (The reader is welcome to try their hand at http://www.puzzles.com/products/rushhour.htm) Why the result satisfies criteria (D,F,G) ----------------------------------------- Our evolved solvers for the Rush Hour Puzzle are the first successful reported attempt to solve this difficult problem with search, and thus the result is publishable in its own right. Our method is better than other methods for several reasons: (1) Scalability: Our algorithms for solving 6x6 instances of Rush Hour are directly applicable to solving 8x8 (and possibly larger) instances. We solved instances of these problems with great success---no mean feat for a PSPACE-complete problem. This is in contrast to other methods, such as BDDs, which are not directly scalable (e.g., Colette et al.: "symbolic model-checking of board games... leads to the explosion of the size of symbolic data-structures.") (2) Generating new game instances: Our search method not only solves given instances, but easily finds new, more difficult ones, as shown in the paper. Thus, we were able to evolve the most difficult known Rush Hour instances, and we did so efficiently (in terms of time & space). Why the result satisfies criterion (H) -------------------------------------- Victory over humans is three-fold: (1) We have developed the best-known algorithm for the Rush Hour problem, better than any algorithm designed by humans. (2) Evolution beats the human designers of heuristics (i.e., the authors--- whose self-proclaimed intelligence is manifested in the hand-crafted game policies, described in the paper). These heuristics are themselves better than brute force, which is better than BDDs (another approach applied to Rush Hour). (3) Our evolved strategies' performance far surpasses that of human players, in terms of game time: Over 10 times faster. This is important because it means that the algorithm is efficient in every way, both using less nodes during the search AND running in less time. 7) CITATION A. Hauptman, A. Elyasaf, M. Sipper, and A. Karmon. "GP-Rush: Using Genetic Programming to Evolve Solvers for the Rush Hour Puzzle." Genetic and Evolutionary Computation Conference. July 2009. Montreal, Canada. GECCO 2009. 8) STATEMENT OF PRIZE DISTRIBUTION Any prize money, if any, is to be given to Moshe Sipper. 9) COMPARISON TO OTHER HUMAN-COMPETITIVE ENTRIES PUSHING EVOLUTION FURTHER ------------------------- This is the most difficult single-player search (i.e., *planning*) problem solved (so successfully) with evolution thus far. Even 6x6 Rush Hour is more difficult than all other planning problems solved (with evolution)---mainly due to the difficulty of designing a suitable representation coupled with the huge (and difficult to navigate) search space. Moreover, we even solved yet harder 8x8 boards, never tackled before to our knowledge. SEVERAL DEGREES (AND MODALITIES) OF IMPROVEMENT ----------------------------------------------- The popular Enhanced Iterative Deepening algorithm was surpassed by our hand-crafted heuristics, which were in turn surpassed by hand-crafted policies---all of which were beaten by GP-evolved policies. Evolution managed to take our best designed ingredients of limited performance and transform them into HIGHLY successful strategies. GP not only beat human AI researchers but also all human players of Rush Hour on record. SOLVING A DIFFICULT PROBLEM WITH A LONG HISTORY ----------------------------------------------- Difficult puzzles---involving search and planning problems---have a longstanding tradition in the AI community. Rush Hour was until recently considered an open problem for which no efficient solvers had been designed, although the field of planning is a very fertile one (See Kendall et al.'s review paper from 2008, cited in our paper; of note is not only Rush Hour's open status but also its complexity--- PSPACE-complete---superseding 23 other games described in the paper, which are "only" NP-Complete). Our evolutionary algorithm "closed" Rush Hour's open status, in addition exhibiting the ability to scale up to new, more difficult problems---themselves discovered through evolution.