1) PAPER TITLE Evolving Human Competitive Spectra-Based Fault Localisation Techniques 2) AUTHOR Shin Yoo, shin.yoo@ucl.ac.uk Centre for Research on Evolution, Search, and Testing Department of Computer Science, University College London Gower Street, London, WC1E 6BT United Kingdom Tel: +44 207 679 2031 3) CORRESPONDING AUTHOR Shin Yoo, shin.yoo@ucl.ac.uk 4) ABSTRACT We evolve risk evaluation formulas that are used in Spectra-Based Fault Localisation (SBFL). SBFL is an automated debugging technique that applies risk evaluation formulas to execution traces and ranks program statements according to the predicted risk. Effectiveness of SBFL techniques depends heavily on the formula used. Designing a risk evaluation formula has been a manual process driven by intuition: we present a genetic programming approach. Our empirical evaluation using 92 faults from four Unix utilities showed that GP evolved equations can consistently outperform many of the widely studied human-designed formulas, such as Tarantula, Ochiai, Jaccard, Ample, and Wong1/2, by up to 5.9 times. More importantly, the performance of GP evolved formulas were equal or better than that of Op2, a state of the art risk evaluation formula that has been proved to be optimal against a specific program structure (If-Then-Else-2). 5) CRITERIA (B) The result is equal to or better than a result that was accepted as a new scientific result at the time when it was published in a peer-reviewed scientific journal. (D) The result is publishable in its own right as a new scientific result 3⁄4 independent of the fact that the result was mechanically created. (E) The result is equal to or better than the most recent human-created solution to a long-standing problem for which there has been a succession of increasingly better human-created solutions. (F) The result is equal to or better than a result that was considered an achievement in its field at the time it was first discovered. (G) The result solves a problem of indisputable difficulty in its field. 6) WHY THE RESULT SATISFIES THE CRITERIA Why the result satisfies criteria B, F -------------------------------------- We were able to evolve multiple risk evaluation formulas that consistently outperformed many of the widely studied formulas designed manually. These include Tarantula, Ochiai, Jaccard, Wong1, Wong2, and Wong3. All of these metrics have been extensively studied in software engineering community for automated debugging. Tarantula has been particularly successful and applied to various types of faults, including concurrency bugs. Why the result satisfies criterion D ------------------------------------ The evolved formulas stand alone in terms of their performance, regardless of the fact that they are automatically generated by Genetic Programming. The empirical study in our paper confirms their relative performance. In fact, the empirical study concerns subject programs that are significantly larger and more complext than the usual candidates in the literature (e.g. Siemens suite). Why the result satisfies criterion E ------------------------------------ The most recent development in SBFL is that of Naish et al. ("A model for spectra-based software diagnosis", ACM TOSEM, 20(3):11:1–11:32, 2011): they have proven that two formulas (Op1 and Op2) they presented are "optimal" when the fault lies in a specific program structure called If- Then-Else-2 (ITE2). Their empirical evaluation of Op1 and Op2 showed that the optimal formulas can indeed outperform most of the existing formulas, even though the optimality was proven against a specific structure. The evolved formulas in our paper can either perform equally as Op1 and Op2 or outperform them for specific faults. Why the result satisfies criterion G ------------------------------------ Fault localisation is an active research topic in software engineering. Novel techniques are still being investigated (e.g. Baah et al. 2010, Xu et al. 2011, Naish et al. 2011). Despite the partial optimality proof of Naish et al., no formula has been known to be optimal against all possible combinations of faults, programs, and test suites. 7) CITATION Shin Yoo "Evolving Human Competitive Spectra-Based Fault Localisation Techniques", Technical Report RN-12-03 Department of Computer Science University College London http://www-typo3.cs.ucl.ac.uk/research/research_notes/ (Currently under peer review at the 4th International Symposium on Search Based Software Engineering, notification due on 15th June) 8) STATEMENT OF PRIZE DISTRIBUTION Any prize money, if any, is to be received by Shin Yoo. 9) COMPARISON TO OTHER HUMAN-COMPETITIVE ENTRIES SOLVING A DIFFICULT PROBLEM WITH A LONG HISTORY ----------------------------------------------- Fault localisation in the context of automated debugging has been one of the long standing goals in software engineering. It is not only important for human debuggers, but also increasingly important for enabling other automated techniques, such as the automatic patch generation by Forrest et al. (2009 HUMIES winner). To date, desinging risk evaluation formulas for Spectra-Based Fault Localisation has been an entirely labour intensive manual process, based on either intuition or significant endeavour in human intelligence for mathematical proofs. We present a novel alternative: automated, data-driven evolution using GP. MORE EFFICIENT DESIGN PROCESS ----------------------------- We believe that GP is the right way to go forward when it comes to designing risk evaluation formulas for SBFL. GP can effectively perform the same iterative design process adopted by human designer, but with a significantly larger volume of program spectrum datasets. Human designers can still contribute their intuitions by introducing novel GP operators or sub-trees. TAILORED FAULT LOCALISATION --------------------------- Our work also opens up the possibility of evolving project specific risk evaluation formulas. The predominant effort in automated debugging community has been put to finding a technique that works well universally. The use of genetic programming opens up the possibility of evolving tailored fault localisation techniques for specific projects. GOING BEYOND STATE Of THE ART ----------------------------- GP allows the possibility of evolving a fault localisation technique that goes beyond program spectra: any other metric or signal from software artefacts can be used by GP. For example, GP can incorporate last modification date or program dependency fault localisation techniques. We only need to make the data available to the evolutionary process. REPLICABILITY & OPEN RESEARCH ----------------------------- We have made the program spectra data used in the empirical study publicly available. Additionally, all the plots and statistical details that did not fit in the paper are available online. Please visit the following URL: http://www.cs.ucl.ac.uk/staff/s.yoo/evolving-sbfl.html