Compiler Fuzzing: How Much Does It Matter?

Abstract

Although compilers are central to software development, they are rarely among the usual suspects when the cause of a software failure must be investigated. Nevertheless, hundreds of bugs are fixed each month in popular compilers like GCC and LLVM. Such bugs can lead to miscompilations and thus make the miscompiled application fail. As a consequence, recent years have seen the development of several fuzzing tools (like Csmith or EMI) aimed at extensively searching compilers for bugs. These tools, which implement various flavours of random compiler testing, have reported no fewer than 2K bugs in GCC and LLVM. However, whether the bugs they identify affect real applications in practice remains an open question. This is a significant threat to the validity of compiler fuzzers as worthwhile tools, not only because common sense suggests that compiler bugs seldom cause software failures, but also because fuzzers expose bugs with artificial programs that have been randomly generated or modified. In this work (in progress), we propose a practical approach to estimate the impact of a compiler bug over hundreds of common applications. We apply this approach to study the actual impact of bugs found by state-of-the-art fuzzers and to compare it with the impact of bugs manually reported by compiler end-users.

Talk at S-REPLS 10.