Language-Based Software Testing

with José Antonio Zamudio Amaya, Marius Smytzek, Valentin Huber, Addison Crump, Alexi Turcotte, and many others

Random test input generators (fuzzers) have become the prime detectors of vulnerabilities in software. While generic fuzzers easily adapt to arbitrary programs under test, they offer very few possibilities to control or shape the generated inputs. In this talk, I present FANDANGO, a novel language-based fuzzer that combines grammars with predicates over input elements to produce inputs that satisfy all the given predicates. Examples of what such predicates can express include

  • input format constraints (“The length field should be equal to the length of the payload”)
  • code features (“Any variable used must be declared beforehand”)
  • statistical distributions (“Across all inputs, the voltage field must follow a Gaussian distribution, but never exceed 20 mV”)
  • data collections (“The credit-card-number field should come from the Python faker library”)

and more: in fact, any property that can be expressed as a Python expression.
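
To make the idea of combining a grammar with a predicate concrete, here is a minimal Python sketch of the first example above (the length field must equal the payload length). It is not FANDANGO's API or its solving algorithm: the toy packet format, the names random_packet, length_matches, and generate, and the use of plain rejection sampling are all assumptions made for illustration.

```python
import random
import string

# Hypothetical toy grammar (an assumption, not a FANDANGO spec):
#   <packet> ::= <length> <payload>
def random_packet(rng):
    """Produce one grammar-conforming packet; the fields are unconstrained."""
    length = rng.randint(0, 9)  # <length>: a single digit
    payload = "".join(
        rng.choice(string.ascii_lowercase) for _ in range(rng.randint(0, 9))
    )  # <payload>: zero to nine lowercase letters
    return {"length": length, "payload": payload}

def length_matches(packet):
    """Predicate: the length field equals the length of the payload."""
    return packet["length"] == len(packet["payload"])

def generate(predicate, rng, max_tries=10_000):
    """Rejection sampling: draw packets until one satisfies the predicate."""
    for _ in range(max_tries):
        packet = random_packet(rng)
        if predicate(packet):
            return packet
    raise RuntimeError("no packet satisfied the predicate")

# Seeded generator so repeated runs draw the same sequence of packets.
packet = generate(length_matches, random.Random(0))
```

The predicate is an ordinary Python function over input elements, which mirrors the claim that any property expressible in Python can serve as a constraint. Rejection sampling is only workable here because roughly one packet in ten already satisfies the predicate; tighter constraints need a smarter search.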

In our experiments, FANDANGO efficiently handled complex file formats and satisfied demanding predicates, scaling up to full-fledged programming languages as test inputs for compilers. This opens the door to personalized fuzzing, where testers can combine their own knowledge with LLM knowledge to fuzz systems very effectively. Includes live demos!

Fandango is available at https://fandango-fuzzer.github.io/

Andreas Zeller is faculty at the CISPA Helmholtz Center for Information Security and professor of Software Engineering at Saarland University. His research on automated debugging, mining software archives, specification mining, and security testing has won several awards for its impact in academia and industry. Zeller is an ACM Fellow, holds an ACM SIGSOFT Outstanding Research Award, and has won two ERC Advanced Grants, Europe’s highest funding for individual researchers.