Two empirical studies a decade apart, and the underlying infrastructure for analyzing code, test and coverage evolution in C/C++ software.
Overview
Software repositories provide rich information about the construction and evolution of software systems. While static data that can be mined directly from version control systems has been extensively studied, dynamic metrics concerning the execution of the software have received much less attention, due to the inherent difficulty of running and monitoring a large number of software versions.
We present Covrig, a flexible infrastructure that can be used to run each version of a system in isolation and collect static and dynamic software metrics, using a lightweight virtual machine environment that can be deployed on a cluster of local or cloud machines.
We use Covrig to conduct two empirical studies, a decade apart:
1) A study published in 2014 examining how code and tests co-evolve in six popular open-source systems. We report the main characteristics of software patches, analyse the evolution of program and patch coverage, assess the impact of nondeterminism on the execution of the test suite, and investigate whether the coverage of code containing bugs and bug fixes is higher than average.
2) A study published in 2025 which significantly expands the analysis to nine mature C/C++ projects and a combined period of 78 years of development time. Our focus is on understanding how development practices have changed and how these changes have impacted the way in which software is tested. We report on the co-evolution of code and tests; the adoption of CI, coverage, and fuzzing services; the changes to the overall code coverage achieved by developer test suites; the distribution of patch coverage across revisions; how different kinds of code changes impact coverage; and the occurrence and evolution of flaky tests. Our large-scale study paints a mixed picture in terms of how software development and testing have changed over the past decade. While developers put more emphasis on software testing and the overall code coverage achieved by developer test suites has increased in most projects, coverage and fuzzing services are not widely adopted, many patches are still poorly tested, and the fraction of flaky tests has increased.
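Both studies report patch coverage alongside overall program coverage. As a rough illustration of the metric, the sketch below assumes patch coverage is the fraction of executable lines added or modified by a patch that the test suite executes. The function and type names (`patch_coverage`, `PatchCoverage`) and the dictionary-of-sets input format are hypothetical and are not Covrig's actual interface; in practice the inputs would come from a version-control diff and a coverage tool.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set

LineMap = Dict[str, Set[int]]  # file path -> set of line numbers


@dataclass(frozen=True)
class PatchCoverage:
    """Hypothetical result type: counts of executable and covered patch lines."""
    executable: int
    covered: int

    @property
    def ratio(self) -> Optional[float]:
        # A patch with no executable lines (e.g. comments only) has no
        # meaningful coverage ratio, so return None instead of dividing by 0.
        if self.executable == 0:
            return None
        return self.covered / self.executable


def patch_coverage(changed: LineMap, executable: LineMap, covered: LineMap) -> PatchCoverage:
    """Compute patch coverage under the assumed definition.

    changed:    lines added or modified by the patch (from a diff)
    executable: lines the coverage tool considers executable
    covered:    lines executed at least once by the test suite
    """
    n_executable = 0
    n_covered = 0
    for path, lines in changed.items():
        # Only executable lines touched by the patch count towards the metric.
        patch_exec = lines & executable.get(path, set())
        n_executable += len(patch_exec)
        n_covered += len(patch_exec & covered.get(path, set()))
    return PatchCoverage(n_executable, n_covered)


if __name__ == "__main__":
    result = patch_coverage(
        changed={"a.c": {10, 11, 12, 13}},
        executable={"a.c": {10, 11, 12, 20}},
        covered={"a.c": {10, 11, 20}},
    )
    print(result.executable, result.covered, result.ratio)
```

In this example, line 13 is changed but not executable (say, a comment), so it is ignored; of the three executable patch lines, two are covered.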
Download
First study: The raw data for the study is available, but as it is large, please email Cristian Cadar for the link.
Second study: Our artifact can be found at https://zenodo.org/records/10937123
Research Support
The first study was generously sponsored by the UK EPSRC through a DTA studentship and the grant EP/J00636X/1, and by Google through a European Doctoral Fellowship.
The second study was generously sponsored through an ERC Consolidator Grant (grant agreement 819141).
Publications
- Code, Test and Coverage Evolution in Mature Software Systems: Changes over the Past Decade
  Thomas Bailey, Cristian Cadar
  IEEE International Conference on Software Testing, Verification, and Validation (ICST 2025)
- Covrig: A Framework for the Analysis of Code, Test, and Coverage Evolution in Real Software
  Paul Dan Marinescu, Petr Hosek, Cristian Cadar
  International Symposium on Software Testing and Analysis (ISSTA 2014)
Talks
- Covrig: A Framework for the Analysis of Code, Test, and Coverage Evolution in Real Software
  Conference talk @ International Symposium on Software Testing and Analysis (ISSTA 2014)