Alessandro Epasto - "Privacy and fairness in clustering"


Clustering is a fundamental unsupervised machine learning problem that lies at the core of several applications. With the increasing use of clustering in real applications it is becoming apparent the need to include privacy and fairness considerations in the design of clustering algorithms. In this talk, I will review some work on the recent area of designing clustering algorithms that include constraints on protecting the privacy of the data and on ensuring a fair representation of the users. First, I will review recent work (SODA22, KDD21) on clustering enforcing size-constraints which has applications in real-world anonymization applications (such as the Chrome Privacy Sandbox project at Google). I will present an O(1)-approximate algorithm for the r-gather problem in O(log n) MPC rounds. This result is based on a connection between the r-gather problem and maximal independent set / ruling set of the power graph in the MPC model which I will discuss. I will also show evidence on the effectiveness of the algorithm on real advertisement applications. Then, I will briefly review work (WSDM22, Neurips20, AISTATS20) on clustering where fairness constraints require rethinking classical approaches to protect the interest of the users. References: M. Almanza, A. Epasto, A. Panconesi, G. Re. "k-Clustering with Fair Outliers" WSDM, 2022 A. Epasto, M. Mahdian, V. Mirrokni, P. Zhong "Massively Parallel and Dynamic Algorithms for Minimum Size Clustering". SODA, 2022 A. Epasto, A. Muñoz Medina, et al. "Clustering for Private Interest-based Advertising". KDD Ads Track, 2021 S. Ahmadian, A. Epasto, M. Knittel, R. Kumar, M. Mahdian, B. Moseley, P. P., S. Vassilvitskii, Y. Wang, "Fair Hierarchical Clustering". NeurIPS, 2020 S. Ahmadian, A. Epasto, M. Mahdian, R. Kumar, "Fair Correlation Clustering". AISTATS 2020

24/03/2022



© Università degli Studi di Roma "La Sapienza" - Piazzale Aldo Moro 5, 00185 Roma