We propose a coverage-directed, mutation-based approach for fuzzing C compilers and code analysers, inspired by the success of this type of greybox fuzzing in other application domains.
Fuzzing of compilers and code analysers has led to a large number of bugs being found and fixed in widely-used frameworks such as LLVM, GCC and Frama-C. Most such fuzzing techniques have taken a blackbox approach, with compilers and code analysers starting to become relatively immune to such fuzzers. The challenge of applying mutation-based fuzzing in this context is that naive mutations are likely to generate programs that do not compile. Hence, we have designed a novel greybox fuzzer for C compilers and analysers, to address this challenge, by: (1) developing a new set of mutations to target common C constructs, (2) controlling the aggressiveness of the mutation activation so that generated programs mostly pass compilation, and (3) transforming fuzzed programs so that they produce meaningful output, allowing differential testing to be used as a test oracle, and paving the way for fuzzer-generated programs to be integrated into compiler and code analyser regression test suites. We have implemented our approach in GrayC, a new open-source LibFuzzer-based tool, and present experiments showing that it provides more coverage on the middle- and back-end stages compared to other mutation-based approaches. We have used GrayC to identify 29 confirmed compiler and code analyser bugs: 24 previously unknown bugs (with 22 of them already fixed in response to our reports) and 5 confirmed bugs reported independently shortly before we found them. Apart from the results above, we have contributed 23 simplified versions of coverage-enhancing test cases produced by GrayC to the Clang/LLVM test suite, targeting 86 previously uncovered functions in the LLVM codebase.
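The differential-testing oracle mentioned in the abstract can be illustrated with a minimal Python sketch. This is not GrayC's implementation: the function names `run_config` and `differential_verdict`, the status labels, and the majority-vote rule are all assumptions made for illustration. The idea is to compile the same fuzzed program under several compiler configurations, run each binary, and flag any configuration whose output disagrees with the majority. A mismatch only points to a possible compiler bug if the program is free of undefined behaviour.

```python
import subprocess
import tempfile
from collections import Counter
from pathlib import Path


def run_config(source, compiler_cmd, timeout=10.0):
    """Compile `source` with `compiler_cmd` (e.g. ["gcc", "-O2"]) and run it.

    Hypothetical helper: returns (status, output), where status is one of
    "ok", "compile_error", or "timeout".
    """
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "test.c"
        exe = Path(tmp) / "test.out"
        src.write_text(source)
        build = subprocess.run(
            list(compiler_cmd) + [str(src), "-o", str(exe)],
            capture_output=True,
        )
        if build.returncode != 0:
            return ("compile_error", "")
        try:
            run = subprocess.run(
                [str(exe)], capture_output=True, text=True, timeout=timeout
            )
        except subprocess.TimeoutExpired:
            return ("timeout", "")
        # Record the exit code together with stdout so both are compared.
        return ("ok", f"{run.returncode}:{run.stdout}")


def differential_verdict(results):
    """Compare per-configuration results (illustrative majority vote).

    `results` maps a configuration name to a (status, output) pair.
    Returns (verdict, outliers): verdict is "agree", "mismatch", or
    "inconclusive" (fewer than two configurations ran successfully).
    """
    ok = {name: out for name, (status, out) in results.items() if status == "ok"}
    if len(ok) < 2:
        return ("inconclusive", [])
    majority_output, _ = Counter(ok.values()).most_common(1)[0]
    outliers = sorted(name for name, out in ok.items() if out != majority_output)
    return ("mismatch" if outliers else "agree", outliers)
```

In use, one would call `run_config` once per compiler and optimisation level, collect the results in a dictionary keyed by configuration name, and pass that dictionary to `differential_verdict`. Majority voting is only one possible policy; with exactly two configurations it cannot tell which side is wrong, so any disagreement would need manual inspection.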
GrayC is available here. Instructions on how to install and use the tool and scripts will be made available shortly.
This work was supported by EPSRC (EP/R011605/1 and EP/R006865/1).
GrayC: Greybox Fuzzing of Compilers and Analysers for C
Karine Even-Mendoza, Arindam Sharma, Cristian Cadar, Alastair Donaldson
ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2023)