Mon 19 Jun 2017 11:15 - 11:40 at Actes, Civil Engineering - Learning and Probabilistic Chair(s): Swarat Chaudhuri

We present a scalable approach for establishing similarity between stripped binaries (with no debug information). The main challenge is to establish similarity even when the code has been compiled using different compilers, with different optimization levels, or has been modified. Overcoming this challenge, while avoiding false positives, is invaluable to the process of reverse engineering, locating vulnerable code, and identifying intellectual property (IP) theft and plagiarism.

Finding similarity in binaries presents a natural tradeoff between the scalability of an approach and its ability to identify semantic similarity, which is crucial for precision. Previous techniques have mostly been heavily biased towards one end of this spectrum. We present a technique that is scalable, precise, and architecture-agnostic. It works by decomposing binary procedures into comparable segments, lifting the segments to a canonical, optimized form that allows for efficient semantic comparison, and then focusing comparisons on segments that are statistically significant for establishing similarity.
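The three steps just described (decompose, canonicalize, weight by significance) can be illustrated with a small toy sketch. This is not the GitZ implementation: the instruction tuples, the register-renaming stand-in for the canonical, optimized form, the inverse-frequency weighting, and all function names below are assumptions chosen only to show the shape of the idea.

```python
import hashlib
import math
from collections import Counter


def canonicalize(segment):
    """Toy stand-in for lifting to a canonical form (assumption):
    rename register-like operands (those starting with "r") to
    v0, v1, ... in order of first appearance."""
    names = {}
    lines = []
    for op, *args in segment:
        canon_args = []
        for arg in args:
            if arg.startswith("r"):
                if arg not in names:
                    names[arg] = f"v{len(names)}"
                canon_args.append(names[arg])
            else:
                canon_args.append(arg)
        lines.append(" ".join([op, *canon_args]))
    return "\n".join(lines)


def segment_hash(segment):
    """Hash the canonical text so that segments compare by equality."""
    return hashlib.sha256(canonicalize(segment).encode()).hexdigest()[:16]


def procedure_signature(segments):
    """Represent a procedure as the set of its canonical segment hashes."""
    return {segment_hash(s) for s in segments}


def segment_weights(corpus):
    """Weight each segment hash by log(N / count) over a corpus of
    signatures, so segments found in every procedure weigh nothing.
    This weighting is a hypothetical proxy for statistical significance."""
    n = len(corpus)
    counts = Counter(h for signature in corpus for h in signature)
    return {h: math.log(n / c) for h, c in counts.items()}


def similarity(query, target, weights):
    """Sum the weights of the segments shared by two signatures."""
    return sum(weights.get(h, 0.0) for h in query & target)
```

In this sketch, a segment shared by every procedure in the corpus (a common prologue, say) contributes zero to the score, while a rarer shared segment contributes more, so comparisons are dominated by the segments that actually distinguish procedures.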

We have implemented our technique in a tool called GitZ and performed an extensive evaluation. We show that GitZ is able to perform millions of comparisons efficiently, and find similarity with high accuracy.

Mon 19 Jun
10:50 - 12:30: PLDI Research Papers - Learning and Probabilistic at Actes, Civil Engineering
Chair(s): Swarat Chaudhuri, Rice University
