Function Clone Detection Evaluation

Abstract

When reverse engineering unknown binaries, one often encounters functions that are already known or duplicated: functions in the unknown binary that are similar or identical to functions in other, known binaries. Because the original source code from which the binary was compiled is not available, this comparison and detection can only be done at the assembly code level. This makes detection difficult, as the assembly code can change often and heavily. With the rise of machine learning, many promising new approaches for detecting these function clones have been published, which can support and speed up the manual work during binary analysis. However, many of these approaches have no publicly available implementation and are not used by common reverse engineering and binary analysis tools. In this work I implement and compare four state-of-the-art function clone detection approaches, selected from the many recently proposed ones, and integrate the best one into open-source software (OSS) tools so that it can be used in practice for reverse engineering.

Publication
TU Wien