Strong coresets for k-median and subspace clustering: Goodbye dimension

I will start my talk with an introduction to the area of coresets for clustering problems, give basic definitions and explain how coresets can be used to obtain distributed and streaming algorithms. During my talk, I will mostly focus on a commonly used coreset definition that has been introduced by Har-Peled and Mazumdar: A coreset is a weighted set S of points such that for all sets C of k centers we have (1-eps) cost(P,C) <= cost(S,C) <= (1+eps) cost(P,C), where cost(P,C) is the sum of distances of the points in P to their closest center in C. Then I will present a new algorithm to compute coresets that have a similar for-all guarantee as in the above definition and that consist of a number of points that is independent of the size of the input point set P and the dimension of the input space d. Joint work with David Woodruff.


11am, building F, 2nd floor (viale Regina Elena 295/B).

Christian Sohler received his PhD from University of Paderborn, Germany. After positions as assistant professor in Paderborn and associate Professor in Bonn and Dortmund, he has been promoted to full professor at TU Dortmund. Currently, he is visiting professor at Google Zurich. His research interests include the development and analysis of algorithms and data structures for processing very large data sets. His focus lies on the design of randomized algorithms for the analysis of very large graphs, streaming algorithms and clustering algorithms.

© Università degli Studi di Roma "La Sapienza" - Piazzale Aldo Moro 5, 00185 Roma