(1) List of papers ================== [a] Constructing low star discrepancy point sets with genetic algorithms [b] Evolutionary optimization of low-discrepancy sequences [c] A new randomized algorithm to approximate the star discrepancy based on threshold accepting (2) List of authors [author of which papers listed in (1)] ========================================================== - François-Michel De Rainville [a,b] Département de génie électrique et de génie informatique Université Laval Québec, Québec, Canada G1V 0A6 francois-michel.de-rainville.1@ulaval.ca 1 (418) 656-2131 x4786 - Carola Doerr (née Winzen) [a,c] Max Planck Institute for Informatics Université Paris Diderot - Paris 7 Department 1: Algorithms and Complexity LIAFA, 4th floor, room B 458 Campus E1 4, Room 313 Case 7014 66123 Saarbrücken 75205 Paris, Cedex 13 Germany France carola.doerr@mpi-inf.mpg.de +49 681 9325 1013 - Christian Gagné [b] Département de génie électrique et de génie informatique Université Laval Québec, Québec, Canada G1V 0A6 christian.gagne@gel.ulaval.ca 1 (418) 656-2131 x3556 - Michael Gnewuch [c] Technische Universität Kaiserslautern Fachbereich Mathematik Postfach 3049 67653 Kaiserslautern Germany mig@numerik.uni-kiel.de +49 631 205 5325 - Denis Laurendeau [b] Département de génie électrique et de génie informatique Université Laval Québec, Québec, Canada G1V 0A6 denis.laurendeau@gel.ulaval.ca 1 (418) 656-2979 - Olivier Teytaud [b] Équipe TAO -- INRIA Saclay -- Île-de-France LRI, Bat 490 Univ. Paris-Sud 91405 Orsay Cedex France olivier.teytaud@inria.fr +33 1 69 15 63 97 - Magnus Wahlström [c] Max Planck Institute for Informatics Department 1: Algorithms and Complexity Campus E1 4, Room 311B 66123 Saarbrücken Germany wahl@mpi-inf.mpg.de +49 681 9325 1056 (3) Corresponding Author ======================== François-Michel De Rainville francois-michel.de-rainville.1@ulaval.ca (4) Abstracts of the Papers =========================== [a] Constructing low star discrepancy point sets with genetic algorithms Geometric discrepancies are standard measures to quantify the irregularity of distributions. They are an important notion in numerical integration. One of the most important discrepancy notions is the so-called star discrepancy. Roughly speaking, a point set of low star discrepancy value allows for a small approximation error in quasi-Monte Carlo integration. It is thus the most studied discrepancy notion. In this work we present a new algorithm to compute point sets of low star discrepancy. The two components of the algorithm (for the optimization and the evaluation, respectively) are based on evolutionary principles. Our algorithm clearly outperforms existing approaches. To the best of our knowledge, it is also the first algorithm which can be adapted easily to optimize inverse star discrepancies. [b] Evolutionary optimization of low-discrepancy sequences Low-discrepancy sequences provide a way to generate quasi-random numbers of high dimensionality with a very high level of uniformity. The nearly orthogonal Latin hypercube and the generalized Halton sequence are two popular methods when it comes to generate low-discrepancy sequences. In this article, we propose to use evolutionary algorithms in order to find optimized solutions to the combinatorial problem of configuring generators of these sequences. Experimental results show that the optimized sequence generators behave at least as well as generators from the literature for the Halton sequence and significantly better for the nearly orthogonal Latin hypercube. [c] A new randomized algorithm to approximate the star discrepancy based on threshold accepting We present a new algorithm for estimating the star discrepancy of arbitrary point sets. Similar to the algorithm for discrepancy approximation of Winker and Fang [SIAM J. Numer. Anal., 34 (1997), pp. 2028–2042] it is based on the optimization algorithm threshold accepting. Our improvements include, amongst others, a nonuniform sampling strategy, which is more suited for higher-dimensional inputs and additionally takes into account the topological characteristics of given point sets, and rounding steps which transform axis-parallel boxes, on which the discrepancy is to be tested, into critical test boxes. These critical test boxes provably yield higher discrepancy values and contain the box that exhibits the maximum value of the local discrepancy. We provide comprehensive experiments to test the new algorithm. Our randomized algorithm computes the exact discrepancy frequently in all cases where this can be checked (i.e., where the exact discrepancy of the point set can be computed in feasible time). Most importantly, in higher dimensions the new method behaves clearly better than all previously known methods. (5) Criteria Satisfied by the Work ================================== B, D, E, F, and G (6) Statement on the Human Competitiveness Criteria =================================================== Since low discrepancy point sets enable the ubiquitous task of numerical integration with small approximation error, they are of major interest both in industrial and in academic applications. They are, for example, an important component in state-of-the-art tools to quote financial derivatives. It is, since the early works of Weyl in 1916, one of the most challenging questions in this area to compute point sets of low star discrepancy value. A long research track record on this subject exists, and geometric discrepancies are studied intensively in the mathematics and computer science literature. One of the difficulties that complicates the generation of low discrepancy point sets is the fact that the computation of star discrepancy values is NP-hard [1] and even W(1)-hard in the dimension [2]. It is therefore unlikely already for moderate dimensions (e.g., d=10) that they can be computed efficiently. In our most recent work [a], we develop --by adapting and combining the two genetic algorithms from [b] and [c]-- a new tool to create and to evaluate low discrepancy point sets. The results are very compelling: our algorithm clearly outperforms previous approaches by magnitudes. By providing explicit low discrepancy point sets, we are also able to answer some of the open questions posted in the literature (Open Problems 42 in [3]). It therefore nicely proves the strengths of general-purpose genetic algorithms in an research area that has been dominated so far by strongly problem-tailored algorithmic approaches. For example, since the introduction of the so-called generalized Halton sequences by Braaten and Wheller in 1979 many authors proposed successively better and better configurations, crafted using highly sophisticated mathematical tools and clever intuitions, to name but a few works: Kocis and Whiten (1997), Wang and Hickernell (2000), Atanassov and Durchova (2003), Chi et al. (2005), Vandewoestyne and Cools (2006), Faure and Lemieux (2009). The work in [b] (which was previously presented at GECCO 2009 and is the winner of a best paper award in the 2009 RWA track) gave already extraordinary results for point sets of low L2-discrepancy: in all tested dimensions, the computed L2-discrepancies are lower by a large margin than previous figures reported in the literature. In dimension 100, the results are up to six times better than the previously best known values. In addition, the produced sequences are also competitive with sequences from the literature when applied to a standard numerical integration benchmark found in [4, 5 and 6]. As mentioned above, computing star discrepancies is proven to be a difficult task. Several algorithms to attack this problem have been developed, see, e.g., the works by Winker and Fang (1997), Thiémard (2000 and 2001), and Shah (2010). However, it is only since the development of our algorithm proposed in [c] that such values can be approximated efficiently also in moderate and in high dimensions. As shown in [c], our (1+1) evolutionary algorithm outperforms previous works by magnitudes that seem to grow exponentially with the problem size. The computed star discrepancy values are the exact values on all tested instances for which this can be checked, and they are clearly much better than previous estimated for all other test instances. (An adapted version of) the algorithm from [b] is used in [a] to generate candidate point sets for the discrepancy setting at hand. The algorithm from [c] is used to guide this search; it enables the fitness evaluation of the candidates. As shown in [a], by combining the two algorithms, we are able to create point sets that are much better than previously known ones. They beat, in particular, the Faure (0, m, s)-nets by Niederreiter (1987), the scrambled Faure nets by Thiémard (1998), the scrambled Sobol sequence by Press et al. (1988), and the dependent randomized rounding point sets of Doerr et al. (2010). The results from [a] are available on http://qrand.gel.ulaval.ca. We intend to maintain this database in future, adding to it new point configurations on the demand of industrial and academic applications. Summarizing the above, our work has proven to be human-competitive in the following five of the eight criteria: (B) The results are equal to or better than a result that was accepted as a new scientific result at the time when it was published in a peer-reviewed scientific journal. (D) The results are publishable in its own right as new scientific results independent of the fact they were mechanically created. (E) The results are equal to or better than the most recent human-created solution to a long-standing problem for which there has been a succession of increasingly better human-created solutions. (F) The results are equal to or better than a result that was considered an achievement in its field at the time it was first discovered. (G) The result solves a problem of indisputable difficulty in its field. References ---------- [1] Gnewuch, Srivastav, Winzen Finding optimal volume subintervals with k points and calculating the star discrepancy are NP-hard problems. Journal of Complexity, Volume 25, pages 115-127, Elsevier, 2009. [2] Giannopoulos, Knauer, Wahlström, Werner Hardness of discrepancy computation and epsilon-net verification in high dimension Journal of Complexity, Volume 28, pages 162-176, Elevier, 2009. [3] Novak, Wozniakowski Tractability of Multivariate Problems. Vol. 2: Standard Information for Functionals EMS Tracts in Mathematics, European Mathematical Society (EMS), 2010. [4] Atanassov, Durchova Generating and testing the modified Halton sequences Numerical Methods and Applications, Lecture Notes in Computer Science Volume 2542, pages 91-98, Springer, 2003. [5] Faure, Lemieux Generalized Halton sequences in 2008: A comparative study ACM Transactions on Modeling and Computer Simulation, Volume 19, Number 4, pages 1-43, 2009. [6] Papageorgiou, Traub Faster evaluation of multidimensional integrals Computers in Physics Volume 11, Issue 6, pages 574–579, American Institute of Physics Inc., 1997. (7) Full Citation of the Papers =============================== [a] Carola Doerr, François-Michel De Rainville Constructing low star discrepancy point sets with genetic algorithms In: Proceedings of the Genetic and Evolutionary Computation Conference, GECCO ’13 Association for Computing Machinery (ACM), 2013. To appear [b] François-Michel De Rainville, Christian Gagné, Olivier Teytaud, Denis Laurendeau Evolutionary optimization of low-discrepancy sequences In: ACM Transactions on Modeling and Computer Simulation, 22(2):9:1--9:25 Association for Computing Machinery (ACM), 2012 [c] Michael Gnewuch, Magnus Wahlström, Carola Winzen A new randomized algorithm to approximate the star discrepancy based on threshold accepting In: SIAM Journal on Numerical Analysis, 50:781--807 Society for Industrial and Applied Mathematics (SIAM), 2012 (8) Distribution of the Prize Money =================================== Any prize money, if any, is to be divided equally among François-Michel De Rainville and Carola Doerr. (9) Why Should the Judges Consider This Entry ============================================= The proposed entry establishes important progress in several disciplines of computer science and mathematics (e.g., numerical analysis, optimization, and learning). This success is based heavily on the power of genetic and evolutionary principles, which are used in several components of our algorithm. Since search heuristics have not been studied intensively in the previous literature on geometric discrepancies, our work also paves the way for further interdisciplinary research in this direction. In fact, our algorithms are in use already in several applications, and we strongly believe that our work has spurred further research on using genetic algorithms for constructing uniformly distributed point sets and for evaluating geometric discrepancies. Our entry distinguish itself from other human-competitive entries not only by the quality of the results obtained and the difficulty of the problems solved, but also by the originality of its conception. In fact, it took two evolutionary computation techniques working in symbiosis to accomplish these impressive results. Both methods did very well on their own as shown independently in several publications, but it is only through their combination that significant progress could be made for what is one of the most important open problems in numerics: the computation of point sets that allow integration with good error guarantees.