1) PAPER TITLE 

Evolving Human Competitive Spectra-Based Fault Localisation Techniques
   
2) AUTHOR

Shin Yoo, shin.yoo@ucl.ac.uk

Centre for Research on Evolution, Search, and Testing
Department of Computer Science,
University College London
Gower Street, London, WC1E 6BT
United Kingdom

Tel: +44 207 679 2031 

3) CORRESPONDING AUTHOR

Shin Yoo, shin.yoo@ucl.ac.uk

4) ABSTRACT

We evolve risk evaluation formulas that are used in Spectra-Based 
Fault Localisation (SBFL). SBFL is an automated debugging technique 
that applies risk evaluation formulas to execution traces and ranks 
program statements according to the predicted risk. Effectiveness 
of SBFL techniques depends heavily on the formula used.
 Designing a risk evaluation formula has been a manual process driven 
by intuition: we present a genetic programming approach. Our empirical 
evaluation using 92 faults from four Unix utilities showed that GP 
evolved equations can consistently outperform many of the widely studied 
human-designed formulas, such as Tarantula, Ochiai, Jaccard, Ample, and 
Wong1/2, by up to 5.9 times. More importantly, the performance of GP 
evolved formulas were equal or better than that of Op2, a state of 
the art risk evaluation formula that has been proved to be optimal 
against a specific program structure (If-Then-Else-2).


5) CRITERIA
 
(B) The result is equal to or better than a result that was accepted as a 
new scientific result at the time when it was published in a peer-reviewed 
scientific journal.

(D) The result is publishable in its own right as a new scientific result 
3⁄4 independent of the fact that the result was mechanically created.

(E) The result is equal to or better than the most recent human-created 
solution to a long-standing problem for which there has been a succession 
of increasingly better human-created solutions.

(F) The result is equal to or better than a result that was considered an 
achievement in its field at the time it was first discovered.

(G) The result solves a problem of indisputable difficulty in its field.


6) WHY THE RESULT SATISFIES THE CRITERIA

Why the result satisfies criteria B, F
--------------------------------------
We were able to evolve multiple risk evaluation formulas that consistently 
outperformed many of the widely studied formulas designed manually. These 
include Tarantula, Ochiai, Jaccard, Wong1, Wong2, and Wong3. All of these 
metrics have been extensively studied in software engineering community 
for automated debugging. Tarantula has been particularly successful and
applied to various types of faults, including concurrency bugs.

Why the result satisfies criterion D
------------------------------------
The evolved formulas stand alone in terms of their performance, regardless 
of the fact that they are automatically generated by Genetic Programming. 
The empirical study in our paper confirms their relative performance. In 
fact, the empirical study concerns subject programs that are significantly 
larger and more complext than the usual candidates in the literature (e.g. 
Siemens suite).

Why the result satisfies criterion E
------------------------------------
The most recent development in SBFL is that of Naish et al. ("A model for 
spectra-based software diagnosis", ACM TOSEM, 20(3):11:1–11:32, 2011): 
they have proven that two formulas (Op1 and Op2) they presented are 
"optimal" when the fault lies in a specific program structure called If-
Then-Else-2 (ITE2). Their empirical evaluation of Op1 and Op2 showed that 
the optimal formulas can indeed outperform most of the existing formulas, 
even though the optimality was proven against a specific structure.

The evolved formulas in our paper can either perform equally as Op1 and 
Op2 or outperform them for specific faults.

Why the result satisfies criterion G
------------------------------------
Fault localisation is an active research topic in software engineering. 
Novel techniques are still being investigated (e.g. Baah et al. 2010, 
Xu et al. 2011, Naish et al. 2011). Despite the partial optimality proof 
of Naish et al., no formula has been known to be optimal against all 
possible combinations of faults, programs, and test suites.

     
7) CITATION

Shin Yoo
"Evolving Human Competitive Spectra-Based Fault Localisation Techniques", 
Technical Report RN-12-03
Department of Computer Science
University College London
http://www-typo3.cs.ucl.ac.uk/research/research_notes/

(Currently under peer review at the 4th International Symposium on Search
Based Software Engineering, notification due on 15th June)

8) STATEMENT OF PRIZE DISTRIBUTION

Any prize money, if any, is to be received by Shin Yoo.


9) COMPARISON TO OTHER HUMAN-COMPETITIVE ENTRIES

SOLVING A DIFFICULT PROBLEM WITH A LONG HISTORY
-----------------------------------------------
Fault localisation in the context of automated debugging has been one of 
the long standing goals in software engineering. It is not only important 
for human debuggers, but also increasingly important for enabling other 
automated techniques, such as the automatic patch generation by Forrest 
et al. (2009 HUMIES winner). To date, desinging risk evaluation formulas 
for Spectra-Based Fault Localisation has been an entirely labour intensive 
manual process, based on either intuition or significant endeavour in human 
intelligence for mathematical proofs. We present a novel alternative: 
automated, data-driven evolution using GP.

MORE EFFICIENT DESIGN PROCESS
-----------------------------
We believe that GP is the right way to go forward when it comes to designing 
risk evaluation formulas for SBFL. GP can effectively perform the same 
iterative design process adopted by human designer, but with a significantly 
larger volume of program spectrum datasets. Human designers can still contribute 
their intuitions by introducing novel GP operators or sub-trees.

TAILORED FAULT LOCALISATION
---------------------------
Our work also opens up the possibility of evolving project specific risk 
evaluation formulas. The predominant effort in automated debugging community
has been put to finding a technique that works well universally. The use of 
genetic programming opens up the possibility of evolving tailored fault 
localisation techniques for specific projects.

GOING BEYOND STATE Of THE ART
-----------------------------
GP allows the possibility of evolving a fault localisation technique that goes 
beyond program spectra: any other metric or signal from software artefacts can 
be used by GP. For example, GP can incorporate last modification date or program 
dependency fault localisation techniques. We only need to make the data available 
to the evolutionary process.

REPLICABILITY & OPEN RESEARCH
-----------------------------
We have made the program spectra data used in the empirical study publicly 
available. Additionally, all the plots and statistical details that did 
not fit in the paper are available online. Please visit the following URL:

http://www.cs.ucl.ac.uk/staff/s.yoo/evolving-sbfl.html