Strengthening the Evolving Software Stack: From Compiler Fuzzing to Patch Testing

Modern software development hinges on the correctness and robustness of the software stack. Yet as software evolves at an ever-accelerating pace, even minor flaws within the stack can introduce pernicious bugs that slip through conventional quality-assurance pipelines. This thesis presents techniques to bolster compiler reliability and to automate patch testing at scale, targeting two important components of that stack. We begin by designing and implementing GrayC, a novel mutation-based, greybox fuzzing system that integrates syntactic and semantic program transformations to systematically explore deep parts of the compiler. Unlike existing mutation-based approaches, GrayC guarantees the generation of well-formed C programs that reach deep compiler internals, uncovering subtle defects that evade purely random or black-box mutators. Building on this foundation, we introduce a unified patch-testing methodology comprising two components: PaZZer, which accelerates reachability-guided fuzzing within continuous integration pipelines by leveraging lightweight, incremental analyses; and P3, an automatic product program generator that merges pre- and post-patch code with patch specifications to enable differential testing with off-the-shelf engines. Together, these tools automate the analysis of a given code change. The contributions of this thesis establish a practical, scalable path toward a more trustworthy software stack, with more reliable compiler ecosystems and safer software evolution.

Arindam Sharma is a PhD student at Imperial College London. His research focuses on improving the reliability of software updates and compilers through scalable patch testing and compiler fuzzing techniques.