We study the effect of relaxing the overly conservative conditions that Csmith applies, both at code-generation time and at code-execution time, to keep its generated compiler test cases free from undefined behaviour (UB).

Overview

Methods for randomized testing of compilers to find miscompilation bugs typically require a way to generate programs that are free from undefined behaviour (UB). Tools such as Csmith achieve UB-freedom by heavily restricting the form of generated programs. This leads to highly idiomatic programs, and we hypothesise that this limits the thoroughness with which compilers are tested. Our idea is that researchers should investigate ways to generate less restricted programs that are still UB-free: programs that get closer to the edge of undefined behaviour, but that do not quite cross it. We present experiments investigating one instance of this idea via a prototype tool, CsmithEdge. Csmith guards arithmetic operations with “safe math” wrappers that guarantee UB-freedom; CsmithEdge uses a simple dynamic analysis to detect where these wrappers are more conservative than necessary, and eliminates the redundant ones. By reducing the use of safe math wrappers, CsmithEdge discovered two new miscompilation bugs in GCC that could not be found via intensive testing using regular Csmith, and achieved substantial differences in code coverage on GCC compared with regular Csmith.
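To make the notion of a redundant wrapper concrete, here is a minimal sketch in C. It is an illustration only, not code from Csmith or CsmithEdge: the names safe_add_i32 and add_is_defined_i32 are hypothetical, and the choice to return the first operand on the fallback path is an assumption made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical "safe math" style wrapper (name and fallback are assumptions):
   signed 32-bit addition that never overflows. If the mathematical result
   would not fit in int32_t, it returns the first operand instead. */
static int32_t safe_add_i32(int32_t a, int32_t b)
{
    if ((b > 0 && a > INT32_MAX - b) || (b < 0 && a < INT32_MIN - b))
        return a;      /* fallback path: the raw a + b would be UB */
    return a + b;      /* the raw addition is well defined here */
}

/* Hypothetical dynamic check: true when the raw expression a + b is well
   defined for these particular operand values, i.e. the guard inside the
   wrapper above would not fire. */
static bool add_is_defined_i32(int32_t a, int32_t b)
{
    return !((b > 0 && a > INT32_MAX - b) || (b < 0 && a < INT32_MIN - b));
}
```

Under these assumptions, if add_is_defined_i32 holds every time a given call site is reached during a deterministic run of the program, then replacing safe_add_i32(a, b) with the raw a + b at that site leaves the program's behaviour unchanged while giving the compiler a less guarded expression to work on. Whenever the check fails, the wrapper has to stay.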

Download CsmithEdge

CsmithEdge is available here. Instructions on how to install and use the tool and scripts are available here.

Research Support

This work is supported by the EPSRC through grant EP/R011605/1.

Publications

Talks

  • CsmithEdge: More Effective Compiler Testing by Handling Undefined Behaviour Less Conservatively

    Karine Even-Mendoza

    Talk @ ASE JF 2022

  • Closer to the Edge: Testing Compilers More Thoroughly by Being Less Conservative About Undefined Behaviour

    Karine Even-Mendoza

    Talk @ ASE NIER 2020