GIANLUCA CAPOZZI

PhD graduate

cycle: XXXVII

supervisor: Giuseppe Antonio Di Luna
co-supervisor: Irene Amerini

Thesis title: Attacking Binary Function Similarity Systems

The widespread diffusion of IoT devices and the growing availability of open-source software are amplifying the demand for autonomous solutions in the field of binary code analysis. As the number of devices running software continues to increase, they generate growing amounts of data that must be analyzed for purposes such as security assessment, performance optimization, and code validation. This increasing volume of data, combined with the complexity of modern software and firmware, calls for efficient tools and methodologies to assist researchers and analysts in binary code analysis. To address these challenges, the scientific community is moving towards Deep Learning-based solutions for binary analysis. These solutions typically provide end-to-end capabilities for handling complex tasks and can alleviate the workload of human analysts. Among these tasks, Binary Function Similarity (BFS) detection, which assesses whether two binary functions are compiled from the same source code, is gaining increasing importance.
Despite the effort devoted to proposing and systematizing Deep Neural Network (DNN)-based solutions for BFS, their resilience to adversarial attacks remains unclear, even though sensitivity to such attacks is a major drawback of DNN-based solutions. This thesis investigates the robustness of BFS systems against adversarial attacks, presenting two main contributions.
First, we introduce black-box and white-box approaches to assess the resilience of BFS systems against adversarial attacks when comparing two functions directly. Our findings demonstrate that these systems are vulnerable to both targeted and untargeted attacks with respect to similarity objectives. We conduct extensive experiments on three state-of-the-art BFS solutions, revealing that they are more susceptible to black-box attacks than to white-box ones, while exhibiting greater resilience against targeted attacks.
Second, we conduct a comprehensive evaluation of eight state-of-the-art BFS systems, assessing their resilience to adversarial attacks when used as search engines to retrieve, from a given pool, the functions most similar to a given query. Here, we propose a simple black-box method that alters both the topology and the content of the Control Flow Graph (CFG) of the attacked functions. Our findings reveal a critical insight: top performance on clean data does not necessarily correlate with superior robustness, underscoring the performance-robustness trade-offs that must be carefully considered when deploying such models.

Scientific output

11573/1739397 - 2025 - On the Lack of Robustness of Binary Function Similarity Systems

Capozzi, Gianluca; Tang, Tong; Wan, Jie; Yang, Ziqi; D'Elia, Daniele Cono; Di Luna, Giuseppe Antonio; Cavallaro, Lorenzo; Querzoni, Leonardo - 04b Conference paper in proceedings volume

conference: IEEE European Symposium on Security and Privacy (Venice)

book: 10th IEEE European Symposium on Security and Privacy

11573/1724676 - 2024 - Adversarial Attacks against Binary Similarity Systems

Capozzi, Gianluca; D'Elia, Daniele Cono; Di Luna, Giuseppe Antonio; Querzoni, Leonardo - 01a Journal article

journal: IEEE ACCESS (Piscataway NJ: Institute of Electrical and Electronics Engineers) pp. 161247-161269 - issn: 2169-3536 - wos: WOS:001349752400001 (0) - scopus: 2-s2.0-85208236138 (1)