Algorithmic Frontiers of Big Data
Instructors: Chris Schwiegelshohn and Francesco d'Amore
Duration: 20 hours
When: 27-28-29/10 from 10 to 13:30, 3-4-5-6-7/11 from 10:30 to 12:30
Where: Room B203, DIAG, Via Ariosto 25
Abstract
Large, high-dimensional data sets are a staple of modern data analysis. There are two main approaches to the challenge of dealing with them. To make existing algorithms viable, we can reduce the size and dimensionality of the data as much as possible. Alternatively, we can use different computational models, as studied in distributed or streaming algorithms, to mitigate the limitations of our hardware. In this PhD class, we will offer an introduction to both of these directions.
In the first part of the course, we will see sparsification techniques for memory-efficient algorithms. The techniques we introduce, such as random projections, PCA, and sensitivity sampling, also lend themselves directly to data analysis, which we will cover as well.
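To give a flavour of one of these techniques, here is a minimal sketch of a Johnson–Lindenstrauss-style Gaussian random projection. The data, dimensions, and the single pair checked at the end are illustrative, not from the course material:

```python
import numpy as np

rng = np.random.default_rng(0)

# 1000 points in 10,000 dimensions (synthetic, for illustration).
n, d, k = 1000, 10_000, 200
X = rng.normal(size=(n, d))

# Gaussian random projection: entries N(0, 1/k), so that squared norms
# (and hence pairwise distances) are preserved in expectation.
G = rng.normal(scale=1.0 / np.sqrt(k), size=(d, k))
Y = X @ G  # projected data, shape (n, k)

# Check the distortion on one pair of points.
i, j = 0, 1
orig = np.linalg.norm(X[i] - X[j])
proj = np.linalg.norm(Y[i] - Y[j])
print(proj / orig)  # close to 1 when k = O(log n / eps^2)
```

The same projection matrix preserves all pairwise distances up to a 1 ± eps factor with high probability, which is what makes it usable as a preprocessing step for downstream algorithms.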
The second part of the course focuses on distributed computing, a concept that is at the core of modern computational infrastructures—ranging from the internet and data centers to blockchain networks. We will present the LOCAL model of computation (Linial, FOCS '87), a synchronous distributed model in which the cost of an algorithm is measured by the number of communication rounds, under the assumption of unlimited local computation and unbounded message size. The LOCAL model captures the following fundamental question: how far must information propagate in a network to solve a given task? We will introduce the basic concepts of distributed algorithms and the principles of the LOCAL model, review classical and more modern complexity results, and explore recent developments in its quantum and even “super-quantum” extensions.
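The round structure of the LOCAL model can be simulated in a few lines. In this hypothetical sketch (the graph, the IDs, and the toy task are illustrative, not taken from the course), each node learns the smallest ID within distance r, which makes concrete the point that after r synchronous rounds a node's output can depend only on its radius-r neighbourhood:

```python
# Synchronous LOCAL-model simulation on a path graph 0-1-2-3-4.
# In each round, every node sends its current knowledge to all
# neighbours simultaneously; messages and local computation are free.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
knowledge = {v: {v} for v in adj}  # set of IDs each node has seen

def local_round(adj, knowledge):
    # All messages of a round are exchanged simultaneously.
    incoming = {v: set() for v in adj}
    for v in adj:
        for u in adj[v]:
            incoming[u] |= knowledge[v]
    return {v: knowledge[v] | incoming[v] for v in adj}

r = 2
for _ in range(r):
    knowledge = local_round(adj, knowledge)

# After r = 2 rounds, node 4 has only seen IDs within distance 2.
print({v: min(ids) for v, ids in knowledge.items()})  # {0: 0, 1: 0, 2: 0, 3: 1, 4: 2}
```

Node 4 outputs 2 rather than the global minimum 0: information from node 0 simply has not had time to propagate that far, which is exactly the question the LOCAL model measures.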
The course is aimed at PhD students and is self-contained, with no particular background necessary other than familiarity with some basic linear algebra and probability theory.
Short Bios
Chris Schwiegelshohn is an associate professor at Aarhus University who is nostalgic about his time as part of the Sapienza faculty. He works mainly in algorithms, with a focus on approximation algorithms and learning theory.
Francesco d’Amore is a postdoctoral researcher at the Gran Sasso Science Institute (GSSI). His research primarily focuses on the theory of distributed computation. He received his PhD from Université Côte d’Azur and Inria, and later held postdoctoral positions at Aalto University and Bocconi University, before joining GSSI.
---
Physics-Informed Statistical Learning for Spatial and Functional Data
Abstract. This course offers an introduction to a family of physics-informed statistical learning methods designed for spatial and functional data. These models build upon nonparametric and semiparametric regression frameworks with roughness penalties. The penalties incorporate differential operators—ranging from simple second derivatives to more complex Partial Differential Equations—encoding the physics of the underlying phenomena, and complying with the geometry of the domain over which the data are observed. The methods can handle spatial and spatio-temporal data, as well as functional data, observed over multidimensional domains that can have complex shapes, such as non-convex planar regions, curved surfaces, irregular volumes, and linear networks. Moreover, the use of unstructured mesh discretization endows the methods with high flexibility, enabling the capture of highly localized signals, strong anisotropies, and non-stationary patterns.
The course will explore these methods through real-world applications in environmental and life sciences, demonstrating their effectiveness in modeling intricate spatial and functional data structures. Practical lab sessions will utilize the Python package fdaPDE.
When: February 2026
Where:
Zoom – id: 867 5387 8695 – code: 627930 [link]
Sapienza University of Rome
Department of Computer, Control and Management Engineering (DIAG)
Via Ariosto, 25, 00185 Roma
Sala riunioni B101 – 1st floor
Schedule
09.00 – 10.00 | Session 1
10.00 – 10.20 | break
10.20 – 11.20 | Session 2
11.20 – 11.40 | break
11.40 – 12.40 | Session 3
12.40 – 14.20 | break
14.20 – 15.20 | Session 4
---
Title: Federated Learning - from data harmonization to federated AI training
Abstract:
This hackathon introduces participants to the process of developing and training AI models in a distributed, privacy-compliant research environment. During the event, students gain hands-on experience with cutting-edge federated learning technologies. They will learn how to build a decentralized infrastructure and harmonize heterogeneous clinical datasets using ontology-based methods. Participants will also build and operate a real federated network based on the insights of the Horizon Europe projects dAIbetes (grant agreement number 101136305) and Microb-AI-ome (grant agreement number 101079777). Participants will further learn how secure data exchange enables cross-institutional collaboration. Finally, participants will train machine learning models on harmonized, clinic-local data without centralizing sensitive information. By the end of the hackathon, participants will have a solid understanding of the technical fundamentals, challenges, and research potential of federated AI for biomedical applications, as well as an implemented, working end-to-end federated learning workflow.
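A minimal sketch of the federated-averaging idea underlying such workflows, using synthetic data and a plain linear model (the hackathon's actual infrastructure, datasets, and tooling are not assumed here):

```python
import numpy as np

# Federated averaging (FedAvg) in miniature: each site fits a model on
# its own local data, and only model parameters — never raw records —
# leave the site and reach the aggregating server.
rng = np.random.default_rng(2)
true_w = np.array([2.0, -1.0])

def local_fit(n):
    # Each site solves its own least-squares problem locally.
    X = rng.normal(size=(n, 2))
    y = X @ true_w + rng.normal(scale=0.1, size=n)
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w, n

# Three sites with differently sized local datasets.
sites = [local_fit(n) for n in (50, 80, 120)]

# Server-side aggregation: average parameters weighted by sample count.
total = sum(n for _, n in sites)
w_global = sum(n * w for w, n in sites) / total

print(w_global)  # close to [2.0, -1.0]
```

Real deployments iterate this exchange over many rounds and add harmonization and privacy layers on top, but the core pattern — local training, parameter-only communication, weighted aggregation — is the one shown here.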
Lecturers: Simon Süwer and Julian Klemm
Short bio for Simon:
Simon has been a PhD student at CoSyBio since 2024, working on the Horizon Europe project dAIbetes. He completed a Bachelor of Science in Applied Computer Science at the University of Applied Sciences and Arts Hannover, followed by a Master of Science in Computer Science at the University of Vienna, specialising in Data Science. In his master's thesis he worked on combining session- and sequence-based recommender systems in a dynamic Graph Neural Network (GNN), in particular on the development of hierarchical dynamic GNNs.
Simon's current research aims at developing privacy-preserving tools that make federated collaboration not only easier, but almost effortless. By merging theory and practice, his goal is to create innovative solutions to complex challenges that redefine the way we think about data collaboration and privacy.
Short bio for Julian:
Julian has been a PhD student at CoSyBio since April 2023 as part of the Horizon Europe projects FeatureCloud and Microb-AI-ome. He received a Bachelor of Science in Biology as well as a Master of Science in Bioinformatics from the University of Hamburg. In his master's thesis he focused on privacy-enhancing techniques for federated learning, especially differential privacy, while in his PhD his focus is on a federated data warehouse that allows easier data sharing in a privacy-preserving manner. More generally, Julian works on privacy-preserving tools that make federated collaboration almost effortless.