Similarity of Binaries through re-Optimization
We present a scalable approach for establishing similarity between stripped binaries (with no debug information). The main challenge is to establish similarity even when the code has been compiled using different compilers, with different optimization levels, or has been modified. Overcoming this challenge, while avoiding false positives, is invaluable to the process of reverse engineering, locating vulnerable code, and identifying intellectual property (IP) theft and plagiarism.
Finding similarity in binaries presents a natural tradeoff between the scalability of an approach and its ability to identify semantic similarity, which is crucial for precision. Previous techniques have mostly been heavily biased towards one end of this spectrum. We present a technique that is scalable, precise, and architecture-agnostic. It works by decomposing binary procedures into comparable segments, lifting the segments to a canonical, optimized form that allows for efficient semantic comparison, and then focusing comparisons on the segments that are statistically significant for establishing similarity.
We have implemented our technique in a tool called GitZ and performed an extensive evaluation. We show that GitZ is able to perform millions of comparisons efficiently, and find similarity with high accuracy.
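The last step of the pipeline described above, focusing on segments that are statistically significant, can be illustrated with a small sketch. This is not GitZ's implementation: it assumes each procedure has already been decomposed and lifted into a set of canonical segment strings (the strings below are made up), and it uses a smoothed inverse-frequency weight as a stand-in for whatever significance measure the tool actually applies. The names `segment_weight` and `similarity` are hypothetical.

```python
import math
from collections import Counter


def segment_weight(segment, doc_freq, corpus_size):
    """Smoothed inverse-frequency weight (an assumption, not the paper's measure).

    A segment that appears in every procedure gets weight 0;
    rarer segments get larger weights.
    """
    return math.log((corpus_size + 1) / (doc_freq.get(segment, 0) + 1))


def similarity(query, target, doc_freq, corpus_size):
    """Fraction of the query's total segment weight that the target also contains."""
    total = sum(segment_weight(s, doc_freq, corpus_size) for s in query)
    if total == 0:
        return 0.0
    shared = sum(segment_weight(s, doc_freq, corpus_size) for s in query & target)
    return shared / total


# Toy corpus: procedure name -> set of canonical segments (hypothetical strings).
corpus = {
    "proc_a": {"ret = arg0", "t = arg0 * 4 + arg1"},
    "proc_b": {"ret = arg0", "t = arg0 ^ arg1"},
    "proc_c": {"ret = arg0", "t = load(arg0 + 8)"},
}

# Count in how many procedures each segment occurs.
doc_freq = Counter(s for segments in corpus.values() for s in segments)
n = len(corpus)

query = {"ret = arg0", "t = arg0 * 4 + arg1"}
score_a = similarity(query, corpus["proc_a"], doc_freq, n)
score_b = similarity(query, corpus["proc_b"], doc_freq, n)
```

In this toy corpus the segment `ret = arg0` occurs in every procedure, so it contributes nothing to the score. `proc_b` shares only that segment with the query and scores zero, while `proc_a` shares the rare segment and scores highest.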
Mon 19 Jun (displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna)
10:50 - 12:30 | Learning and Probabilistic | PLDI Research Papers at Actes, Civil Engineering | Chair(s): Swarat Chaudhuri (Rice University)
10:50 | 25m | Talk | DemoMatch: API Discovery from Demonstrations | PLDI Research Papers
11:15 | 25m | Talk | Similarity of Binaries through re-Optimization | PLDI Research Papers
11:40 | 25m | Talk | Synthesizing Program Input Grammars | PLDI Research Papers | Osbert Bastani (Stanford University), Rahul Sharma (Microsoft Research), Alex Aiken (Stanford University), Percy Liang (Stanford University)
12:05 | 25m | Talk | Compiling Markov Chain Monte Carlo Algorithms for Probabilistic Modeling | PLDI Research Papers | Daniel Huang (Harvard University), Jean-Baptiste Tristan (Oracle Labs), Greg Morrisett (Cornell University)