Testing and Analysis in the AI Era - Software Reliability Group

Testing and Analysis in the AI Era

Software development is undergoing a profound transformation, with developers increasingly relying on AI assistants and incorporating AI-generated code. These are seismic changes: software systems are evolving faster than ever before, and the core activity of software developers is shifting more and more from writing code to reviewing and validating it.

However, human-scale code review and validation cannot keep up with the speed and volume of AI-driven development. As a result, automated software analysis methods — ranging from fuzzing to symbolic execution to formal verification — are becoming critical safeguards in this AI era. To remain effective, these methods must scale along multiple dimensions: they must handle larger codebases, keep up with increasingly rapid development cycles, operate across a growing diversity of programming languages and system architectures, and infer and reason about higher-level semantic properties.

In this talk, I will argue that testing and analysis techniques can themselves take advantage of AI. I will present our recent research efforts in this direction, including our work on designing an agentic concolic executor and an LLM-based technique for inferring the intent of software patches. These systems build on traditional analysis techniques but leverage large language models and agentic AI frameworks to extend their capabilities along the dimensions discussed above. I will end the talk by reflecting on the comparative strengths and weaknesses of traditional and AI-based testing and analysis techniques, and on what this means for building trustworthy software in an era increasingly shaped by AI-generated code.

This is a dry run for an upcoming keynote at FM 2026.

Cristian Cadar is a Professor in the Department of Computing at Imperial College London, where he leads the Software Reliability Group (http://srg.doc.ic.ac.uk), working on automatic techniques for increasing the reliability and security of software systems. Prof. Cadar’s research has been recognised by several prestigious awards, including the EuroSys Jochen Liedtke Award, HVC Award, BCS Roger Needham Award, IEEE TCSE New Directions Award, Humboldt Research Award, and two test-of-time awards. Many of the research techniques he co-authored have been open-sourced and used in both academia and industry. In particular, he is co-author and maintainer of the KLEE symbolic execution system, a popular system with a large user base. Prof. Cadar has a PhD in Computer Science from Stanford University, and undergraduate and Master’s degrees from the Massachusetts Institute of Technology.