CSMITHEDGE: More Effective Compiler Testing by Handling Undefined Behaviour Less Conservatively


Compiler fuzzing techniques require a means of generating programs that are free from undefined behaviour (UB) in order to reliably reveal miscompilation bugs. Existing program generators such as CSMITH achieve UB-freedom by heavily restricting the form of generated programs. The idiomatic nature of the resulting programs risks limiting the test coverage they can offer, and thus the compiler bugs they can discover. We investigate the idea of adapting existing fuzzers to be less restrictive concerning UB, in the practical setting of C compiler testing, via a new tool, CSMITHEDGE, which extends CSMITH. CSMITHEDGE probabilistically weakens the constraints used to enforce UB-freedom, so generated programs are no longer guaranteed to be UB-free. It then employs several off-the-shelf UB detection tools and a novel dynamic analysis to (a) detect cases where the generated program exhibits UB and (b) determine where CSMITH has been too conservative in its use of safe-math wrappers that guarantee UB-freedom for arithmetic operations, removing the redundant ones. The resulting UB-free programs can be used to test for miscompilation bugs via differential testing. The programs that do exhibit UB can still be used to check that the compiler under test does not crash or hang. Our experiments on recent versions of GCC, LLVM and the Microsoft Visual Studio Compiler show that CSMITHEDGE was able to discover 7 previously unknown miscompilation bugs (5 already fixed in response to our reports) that could not be found via intensive testing using CSMITH, as well as 2 compiler-hang bugs that were fixed independently shortly before we considered reporting them. CSMITHEDGE also achieves substantial differences in code coverage on GCC and LLVM compared with regular CSMITH.
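To make the notion of a redundant safe-math wrapper concrete, the following is a minimal sketch in C. It is an illustration only, not code from CSMITH or CSMITHEDGE: the names safe_add_i32, add_would_overflow_i32, sum_wrapped and sum_plain are hypothetical, and the wrapper is merely written in the general style of a guard that avoids signed overflow by returning a fallback value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical check: would the plain expression a + b overflow int32_t
   (and therefore be undefined behaviour) for these operand values? */
static int add_would_overflow_i32(int32_t a, int32_t b) {
    return (b > 0 && a > INT32_MAX - b) || (b < 0 && a < INT32_MIN - b);
}

/* Hypothetical safe-math wrapper: never overflows; when the addition
   would overflow it returns the first operand instead of adding. */
static int32_t safe_add_i32(int32_t a, int32_t b) {
    return add_would_overflow_i32(a, b) ? a : a + b;
}

/* Conservative form: the addition always goes through the wrapper. */
static int32_t sum_wrapped(int32_t x) { return safe_add_i32(x, 1); }

/* Relaxed form: plain addition, UB-free only when x != INT32_MAX. */
static int32_t sum_plain(int32_t x) { return x + 1; }

/* Small self-check: the wrapper is redundant for x = 41 (both forms
   agree and no overflow occurs) but is needed for x = INT32_MAX. */
static void self_check(void) {
    assert(!add_would_overflow_i32(41, 1));
    assert(sum_plain(41) == sum_wrapped(41));
    assert(add_would_overflow_i32(INT32_MAX, 1));
    assert(sum_wrapped(INT32_MAX) == INT32_MAX);
}
```

Under these assumptions, if a run of the program only ever reaches the addition with values such as 41, the wrapper can be dropped for that operation and the compiler sees the plain expression x + 1. If the operand can be INT32_MAX, the plain form exhibits UB and the program is unsuitable for differential testing, though it can still be used to check for compiler crashes or hangs.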

Joint work with Cristian Cadar and Alastair Donaldson.