Repair-Driven Greybox Fuzzing

Abstract

We present repair-driven greybox fuzzing, a new approach to greybox fuzzing that combines the strengths of unstructured, byte-level input mutation and grammar-guided input generation. By increasing input diversity while preserving input validity, repair-driven fuzzing promises to improve bug-finding ability for systems under test such as programming language interpreters that consume highly structured inputs.

Our idea is to first mutate an input using a standard byte-level mutator, typically leading to an invalid input, and then repair the input using a grammar. Aggressively breaking and then repairing an input provides an effective way to reach parts of the input space that would be left unexplored by both byte-level and idiomatic grammar-based mutations.

We put this idea into practice via RepairFuzz, a new greybox fuzzer based on AFL++, leveraging the byte level mutations of AFL++ and using the CPCT+ error recovery algorithm for input repair. We present experiments applying RepairFuzz to six SUTs covering four programming language input formats (Lua, PHP, JavaScript, and Ruby), and present an experimental comparison with AFL++, Grammarinator, and Nautilus, the state-of-the-art in standard greybox fuzzing and grammar-guided fuzzing.

Our evaluation shows that RepairFuzz was able to find 13 confirmed bugs that were previously-unknown, including 9 that could not be found using AFL++, Grammarinator or Nautilus. Further, RepairFuzz yields absolute increases in code coverage for several SUTs and substantial complementary code coverage across all.