Grammar Mutation for Testing Input Parsers (Registered Report)

Abstract

Grammar-based fuzzing is an effective method for testing programs that consume structured inputs, particularly input parsers. The method requires a specification of the input format in the form of a grammar, so the success of a grammar-based fuzzing campaign depends heavily on the grammar available. If the grammar does not accurately represent the input format, or if the system under test (SUT) does not conform strictly to that grammar, there may be an impedance mismatch between the inputs generated via grammar-based fuzzing and the inputs accepted by the SUT. Even if the SUT has been designed to conform strictly to the grammar, its parser may contain vulnerabilities that are triggered only by slightly invalid inputs. Grammar-based fuzzing, by construction, will not yield such edge-case inputs.

To overcome these limitations, we present Gmutator, an approach that mutates an input grammar and leverages the Grammarinator fuzzer to produce inputs conforming to the mutated grammars. As a result, Gmutator can find inputs that do not conform to the original grammar but are (wrongly) accepted by an SUT. In addition, Gmutator-generated inputs have the potential to increase SUT code coverage compared with standard grammar-based fuzzing. We present preliminary results from applying Gmutator to two JSON parsing libraries, where we identify a few inconsistencies and observe an increase in covered code. We propose a plan for a full experimental evaluation over four input formats (JSON, XML, URL, and Lua) and twelve SUTs (three per input format).
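
The general idea of mutating a grammar and generating inputs from the mutant can be illustrated with a minimal Python sketch. It is not Gmutator's implementation and does not use Grammarinator: the toy grammar, the generate and mutate helpers, and the two mutation operators (delete a symbol, duplicate a symbol) are all hypothetical stand-ins chosen for illustration, and Python's built-in json module plays the role of the SUT.

```python
import json
import random

# Toy JSON-like grammar (an assumption for illustration, not the paper's grammar).
# Each nonterminal maps to a list of alternatives; each alternative is a list of
# symbols. Symbols that are not keys of the grammar are emitted as literals.
GRAMMAR = {
    "value": [["array"], ["number"], ["true"], ["null"]],
    "array": [["[", "]"], ["[", "value", "]"], ["[", "value", ",", "value", "]"]],
    "number": [["0"], ["1"], ["42"]],
}


def generate(grammar, symbol, rng, depth=0, max_depth=6):
    """Expand `symbol` into a string by picking random alternatives."""
    if symbol not in grammar:
        return symbol  # literal token
    alternatives = grammar[symbol]
    if depth >= max_depth:
        # At the depth limit, keep only alternatives made of literals
        # so that the expansion is guaranteed to stop.
        alternatives = [alt for alt in alternatives
                        if not any(s in grammar for s in alt)]
        if not alternatives:
            return ""
    chosen = rng.choice(alternatives)
    return "".join(generate(grammar, s, rng, depth + 1, max_depth) for s in chosen)


def mutate(grammar, rng):
    """Return a copy of `grammar` with one symbol deleted or duplicated.

    These two operators are hypothetical examples of grammar mutations.
    """
    mutated = {name: [list(alt) for alt in alts] for name, alts in grammar.items()}
    name = rng.choice(sorted(mutated))
    alt = rng.choice(mutated[name])
    if not alt:
        return mutated  # nothing to mutate in an empty alternative
    index = rng.randrange(len(alt))
    if rng.random() < 0.5:
        del alt[index]                 # deletion, e.g. drop a closing bracket
    else:
        alt.insert(index, alt[index])  # duplication, e.g. repeat a comma
    return mutated


def accepted_by_sut(text):
    """Use Python's json module as a stand-in SUT."""
    try:
        json.loads(text)
        return True
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False


if __name__ == "__main__":
    rng = random.Random(0)
    mutant = mutate(GRAMMAR, rng)
    samples = {generate(mutant, "value", rng) for _ in range(20)}
    for text in sorted(samples):
        print(repr(text), "accepted" if accepted_by_sut(text) else "rejected")
```

An input generated from a mutant is only interesting if the SUT accepts it while the original grammar does not. The sketch omits that second check; a fuller version would also need a recognizer for the original grammar, or a trusted reference parser, to separate genuine inconsistencies from inputs that happen to remain valid.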