Addressing the Saturation Effect in Compiler Testing

Abstract

Compiler testing techniques such as Csmith have shown remarkable results, with hundreds of bugs being found in mature compilers such as GCC and LLVM. Despite their success, these techniques show signs of saturation, i.e., they are becoming less able to generate programs that trigger new compiler bugs.

In the context of compilers for languages with extensive undefined behaviour, such as C/C++, we identify two key reasons for this saturation effect: the restrictive nature of the programs these techniques generate, which must satisfy various constraints (for example, to remain free of undefined behaviour); and the blackbox nature of the testing techniques, which receive no feedback from the compilers being tested.

In this talk, I present two approaches that address this saturation effect in compiler testing. First, we show that by relaxing the constraints imposed during generation, we can create programs that find bugs which are beyond the reach of the original techniques. Second, we show that greybox fuzzing of compilers for languages with extensive undefined behaviour, particularly C/C++, is possible, by devising custom semantics-aware mutations.
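
As a rough illustration of the second idea, the following is a minimal sketch of a feedback-driven (greybox) loop, not the implementation discussed in the talk. Everything in it is a hypothetical stand-in: `toy_coverage` replaces the coverage feedback an instrumented compiler would provide, and `mutate` only preserves the well-formedness of a toy expression, whereas a real semantics-aware mutation for C/C++ would also have to avoid introducing undefined behaviour.

```python
import random

# Hypothetical toy setting: a "program" is a token list that alternates
# single-digit literals and binary operators, e.g. ["1", "+", "2", "+", "3"].
OPERATORS = ["+", "-", "*", "&", "|", "^"]


def mutate(tokens, rng):
    """Replace one token with another of the same kind (operator or literal),
    so the mutated expression is still well formed."""
    mutated = list(tokens)
    i = rng.randrange(len(mutated))
    if mutated[i] in OPERATORS:
        mutated[i] = rng.choice(OPERATORS)
    else:
        mutated[i] = str(rng.randrange(10))
    return mutated


def toy_coverage(tokens):
    """Stand-in for compiler coverage feedback (an assumption of this sketch):
    the set of consecutive operator pairs appearing in the program."""
    ops = [t for t in tokens if t in OPERATORS]
    return set(zip(ops, ops[1:]))


def greybox_fuzz(seed, iterations, rng):
    """Keep a mutated program in the corpus only if it reaches new coverage."""
    corpus = [list(seed)]
    seen = set(toy_coverage(seed))
    for _ in range(iterations):
        candidate = mutate(rng.choice(corpus), rng)
        new = toy_coverage(candidate) - seen
        if new:
            corpus.append(candidate)
            seen |= new
    return corpus, seen
```

A blackbox generator would correspond to dropping the `if new:` check and never reusing earlier programs; the feedback step is what lets the loop build on inputs that reached previously unseen behaviour.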

This is based on joint work with Karine Even-Mendoza, Arindam Sharma and Alastair Donaldson.