Machine Unlearning aims to enable trained models to “forget” specific data upon request, a growing requirement motivated by privacy regulations and the pursuit of trustworthy AI. A widely used approach is gradient ascent, which is intuitively believed to reverse the learning effect of a data point by stepping in the opposite direction of its training gradient. While the method is simple and effective in practice, its theoretical foundations remain largely unexplored, offering an exciting opportunity to connect applied methods with a principled understanding of what it truly means for a model to forget.
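As a rough, self-contained illustration of this idea (a sketch only, not the speaker's construction), the snippet below performs a single gradient ascent step on a forget set for a linear classifier with logistic loss; the function names, the step size eta, and the choice of model are all hypothetical:

    import numpy as np

    def forget_loss_grad(w, X, y):
        # Gradient of the mean logistic loss for a linear model; labels y in {-1, +1}.
        margins = y * (X @ w)
        # d/dw of mean_i log(1 + exp(-y_i * x_i.w)) = mean_i of (-y_i * x_i) / (1 + exp(m_i))
        coeffs = -y / (1.0 + np.exp(margins))
        return (coeffs[:, None] * X).mean(axis=0)

    def unlearn_step(w, X_forget, y_forget, eta=0.1):
        # One gradient *ascent* step: move against the training gradient of the forget set.
        return w + eta * forget_loss_grad(w, X_forget, y_forget)

In practice such a step is often followed by fine-tuning on the retained data; the result presented in the talk concerns what a single ascent step alone already achieves under their definition.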
In this talk, I will begin with a brief overview of the applied achievements and theoretical progress in machine unlearning, and then present our theoretical framework that bridges these perspectives. We define successful unlearning through the lens of Karush–Kuhn–Tucker (KKT) conditions, which characterize the solutions that training naturally converges to. Under this definition, a model has effectively forgotten data when it reaches a KKT point of the retained dataset. We prove that a single gradient ascent step on the forgotten data is sufficient to achieve this goal. I believe this represents an important step toward providing theoretical guarantees for certifiable and trustworthy unlearning mechanisms in real-world systems.
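For readers unfamiliar with the terminology: the generic first-order KKT conditions for a constrained problem of the form min_w f(w) subject to g_i(w) <= 0 read as follows (a schematic statement only; the precise objective and constraints in the speaker's framework are induced by the retained dataset and may differ):

    \nabla f(w^\star) + \sum_i \lambda_i \, \nabla g_i(w^\star) = 0,
    \qquad \lambda_i \ge 0,
    \qquad g_i(w^\star) \le 0,
    \qquad \lambda_i \, g_i(w^\star) = 0 \quad \text{for all } i.

Under the definition described above, a model has forgotten the deleted points when its weights satisfy such conditions for the problem defined by the retained data alone.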
5 November 2025
When and where: Wednesday, 5 November at 14:00, Room 101, Building D RM112, Via Regina Elena
Odelia Melamed is a PhD student advised by Professor Adi Shamir in the Department of Applied Mathematics and Computer Science at the Weizmann Institute of Science, Israel. Her research interests include adversarial examples in machine learning and theoretical machine learning. In particular, she studies adversarial examples and the theoretical properties of neural networks that give rise to them, as well as the geometric and algebraic properties that trained neural networks acquire from their architecture, their data set, and the training process. She is also interested in the transferability of adversarial examples and in the nature of defense methods against them. Her work focuses on data that lies on implicit low-dimensional subsets of a high-dimensional input space, and combines theoretical and applied research.