ANDREA MONTERUBBIANO

Dottore di ricerca

ciclo: XXXVI

co-supervisore: Prof. Salvatore Pontarelli

Titolo della tesi: Advanced Data Structures for efficient and precise Per-Flow Measurements

A vast scientific literature has been produced in the last decades that investigates optimal ways to monitor network traffic. Due to the rise in interest, scale, and complexity of telecommunication networks, this task has become more challenging and exciting at the same time. The requirements to perform such a task often collide with the limited amount of resources available on modern network devices, like switches and routers, confronted with the amount of network flows to be monitored. Part of the academic and industrial research then focused on the usage of approximated data structures: instead of keeping per-flow reserved statistics, these structures make their computational and memory resources shared among many flows or they keep track of just a limited subset of the overall traffic, exchanging a reduced memory footprint for the accuracy of their estimate. The recent blooming of Machine Learning research has raised attention on the possibility of combining machine learning models with approximate data structures, with new promising solutions already proposed to the academy. In this thesis we try to analyze the existing state of the art on these "learned" data structures for network monitoring; we then propose a novel system that tries to employ a single ML model that, beyond enhancing the performances of a flow monitoring system, attempts to optimize other networking tasks. By distributing the burden of the ML model overhead over multiple applications, we make users more willing to pay the price for the increased cost and complexity due to ML introduction. Besides this research direction, the analysis of the approximated data structure literature led to the conclusion that they are, in practice, heavily underutilized, i.e., sparse, thus wasting a significant amount of memory. We introduce a solution to such waste in the form of a data structure representation that leverages sparsity to reduce the memory footprint of per-flow monitoring systems while preserving their original accuracy. Ultimately, part of our research effort dealt with classical data structures: as part of this work, we show a novel low-error flow size estimate system based on a classical approximate data structure.

Produzione scientifica

ANDREA MONTERUBBIANO

Connessione ad iris non disponibile