Thesis title: Toward Robust and Fair Personalized Federated Learning under Quantified Non-IID Data
In the landscape of the Fourth Industrial Revolution, information flows across institutions (banks, hospitals, public agencies) and IoT devices such as smartphones, wearables, home assistants, and connected vehicles. However, centralizing data to build machine learning (ML) models is often constrained by privacy laws, regulatory requirements, bandwidth limits, and governance policies. Federated Learning (FL) enables collaborative model training without centralizing data, but real deployments face non-independent and identically distributed (non-IID) data, which slows or destabilizes convergence and degrades the performance of both global and per-client models. This thesis develops a metrics-driven, fairness-aware approach to robust (personalized) FL under non-IID conditions. First, it presents FedArtML, a toolkit that partitions centralized datasets into federated clients with tunable levels of non-IID data across label, attribute (feature), quantity, and spatiotemporal skews, and that includes metrics to quantify the level of non-IID data. Using FedArtML, a large-scale empirical study over eight datasets shows that label and spatiotemporal skews are the most damaging, while attribute and quantity skews are comparatively less harmful. For label skew, measured with the Hellinger distance (HD), the study identifies two degradation regimes, at HD≈0.5 and HD≥0.75, where accuracy drops accelerate and the number of rounds needed to reach a target accuracy increases; transfer-learning models are particularly sensitive.
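To make the label-skew quantification above concrete, the following is a minimal sketch of one plausible way to score label skew across clients. It assumes that HD denotes the Hellinger distance and that the score is the mean pairwise distance between per-client label distributions; the function names are hypothetical, and this is not FedArtML's API or necessarily its exact definition.

```python
import math
from collections import Counter
from itertools import combinations


def label_distribution(labels, classes):
    """Empirical label probabilities of one client, in `classes` order."""
    counts = Counter(labels)
    return [counts[c] / len(labels) for c in classes]


def hellinger(p, q):
    """Hellinger distance between two discrete distributions, in [0, 1]."""
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared / 2.0)


def mean_pairwise_hd(client_labels, classes):
    """Assumed aggregation: average HD over all pairs of clients."""
    dists = [label_distribution(labels, classes) for labels in client_labels]
    pairs = list(combinations(dists, 2))
    return sum(hellinger(p, q) for p, q in pairs) / len(pairs)


classes = [0, 1]
iid_clients = [[0, 1, 0, 1], [1, 0, 1, 0]]     # identical label mixes
skewed_clients = [[0, 0, 0, 0], [1, 1, 1, 1]]  # disjoint labels

print(mean_pairwise_hd(iid_clients, classes))     # 0.0
print(mean_pairwise_hd(skewed_clients, classes))  # 1.0
```

Under this reading, a score of 0 corresponds to clients with identical label mixes and a score of 1 to clients with disjoint labels, so the regimes at HD≈0.5 and HD≥0.75 sit in the upper half of the scale.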
Building on these findings, the thesis proposes three mitigation methods. FedLECC combines client clustering with loss-aware selection, improving accuracy while reducing communication overhead by up to 15× relative to FedAvg and selecting about 20% of clients without loss of performance. PSI-PFL adapts the Population Stability Index (PSI) to label skew with data-driven thresholds; its weighted variant (WPSI) is more discriminative than other state-of-the-art metrics and yields higher global accuracy and stronger client fairness. Clust-PSI-PFL forms PSI-guided client clusters and trains cluster-specific models, consistently improving global accuracy and reducing local disparity (up to a 37% reduction in average distance) across Dirichlet and Similarity partitions. Two real-world case studies validate feasibility and document performance and runtime trade-offs under IID and non-IID regimes: (i) multi-hospital 12-lead ECG arrhythmia classification, and (ii) a public-administration chatbot leveraging large language models (LLMs). Overall, the results demonstrate that quantifying and characterizing non-IID data, and aligning client selection and personalization with its measured level, produces more robust and fair FL.
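As an illustration of how the Population Stability Index can be applied to label skew, the sketch below scores each client's label distribution against the pooled (global) label distribution using the standard PSI formula. The smoothing constant `EPS`, the sample-size weighting used for the aggregate, and all function names are assumptions made for this example; the thesis's exact WPSI definition and its data-driven thresholds may differ.

```python
import math
from collections import Counter

EPS = 1e-6  # assumed smoothing constant so empty classes do not yield log(0)


def distribution(labels, classes):
    """Smoothed, renormalized label probabilities in `classes` order."""
    counts = Counter(labels)
    raw = [counts[c] / len(labels) + EPS for c in classes]
    norm = sum(raw)
    return [r / norm for r in raw]


def psi(actual, expected):
    """Standard PSI: sum over classes of (a - e) * ln(a / e)."""
    return sum((a - e) * math.log(a / e) for a, e in zip(actual, expected))


def client_psi_scores(client_labels, classes):
    """PSI of each client's labels against the pooled label distribution."""
    pooled = [y for labels in client_labels for y in labels]
    global_dist = distribution(pooled, classes)
    return [psi(distribution(labels, classes), global_dist)
            for labels in client_labels]


def weighted_psi(client_labels, classes):
    """Assumed aggregate: client PSIs weighted by each client's sample share."""
    total = sum(len(labels) for labels in client_labels)
    scores = client_psi_scores(client_labels, classes)
    return sum(len(labels) / total * s
               for labels, s in zip(client_labels, scores))


classes = [0, 1]
balanced = [[0, 1, 0, 1], [1, 0, 1, 0]]
skewed = [[0, 0, 0, 0], [1, 1, 1, 1]]

print(client_psi_scores(skewed, classes))
print(weighted_psi(balanced, classes), weighted_psi(skewed, classes))
```

Each PSI term is non-negative, so a client whose labels match the pooled distribution scores 0 and the score grows as its labels diverge; clients could then be grouped or filtered by comparing these scores to a threshold.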