Skeletal Program Enumeration for Rigorous Compiler Testing
A program can be viewed as a syntactic structure P (syntactic skeleton) parameterized by a collection of the identifiers V (variable names). This paper introduces the skeletal program enumeration (SPE) problem: Given a fixed syntactic skeleton P and a set of variables V , enumerate a set of programs P exhibiting all possible variable usage patterns within P. It proposes an effective realization of SPE for systematic, rigorous compiler testing by leveraging three important observations: (1) Programs with different variable usage patterns exhibit diverse control- and data-dependence information, and help exploit different compiler optimizations; (2) most real compiler bugs were revealed by small tests (i.e., small-sized P) — this “small-scope” observation opens up SPE for practical compiler validation; and (3) SPE is exhaustive w.r.t. a given syntactic skeleton and variable set, offering a level of guarantee absent from all existing compiler testing techniques.
The key challenge of SPE is how to eliminate the enormous amount of equivalent programs w.r.t. α-conversion. Our main technical contribution is a novel algorithm for computing the canonical (and smallest) set of all non-α-equivalent programs. To demonstrate its practical utility, we have applied the SPE technique to test C/C++ and Scala compilers using syntactic skeletons derived from their own regression test-suites. Our evaluation results are extremely encouraging. In less than six months, our approach has led to 217 confirmed GCC/Clang bug reports, 119 of which have already been fixed, and the majority are long latent bugs despite the extensive prior testing efforts (e.g., Csmith and EMI). The results also show that our algorithm for enumerating non-α-equivalent programs provides six orders of magnitude reduction, enabling processing the GCC test-suite in under a month versus 40K+ years with a naïve enumeration. Moreover, in about three weeks, our technique has also found 29 CompCert crashing bugs and 42 bugs in two Scala optimizing compilers. These results demonstrate our SPE technique’s generality and further illustrate its effectiveness.
Mon 19 JunDisplayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna change
16:10 - 17:50 | Dynamic Analysis and TestingPLDI Research Papers at Actes, Civil Engineering Chair(s): Michael Pradel TU Darmstadt | ||
16:10 25mTalk | Achieving High Coverage for Floating-point Code via Unconstrained Programming PLDI Research Papers Media Attached | ||
16:35 25mTalk | Instruction Punning: Lightweight Instrumentation for x86-64 PLDI Research Papers Buddhika Chamith Indiana University, Luke Dalessandro Indiana University, Bo Joel Svensson Chalmers University of Technology, Sweden, Ryan R. Newton Indiana University Media Attached | ||
17:00 25mTalk | Low Overhead Dynamic Binary Translation on ARM PLDI Research Papers Amanieu d'Antras University of Manchester, Cosmin Gorgovan University of Manchester, Jim Garside University of Manchester, Mikel Luján Media Attached | ||
17:25 25mTalk | Skeletal Program Enumeration for Rigorous Compiler Testing PLDI Research Papers Qirun Zhang University of California, Davis, Chengnian Sun University of California, Davis, Zhendong Su University of California, Davis Media Attached |