PhD graduate

cycle: XXXVI

supervisor: Laura Palagi
co-supervisor: Stefano Leonardi

Thesis title: Toward Personalized Medicine and Transparent Decision-Making with Machine Learning Models for Complex Diseases: A Focus on HIV-1 and Type 2 Diabetes

Machine learning (ML) applications have significantly impacted healthcare, opening the way to a predictive approach in precision medicine. Time-series data sets, such as electronic medical records and registries, provide valuable information covering the patient's entire lifespan, capturing genetic and lifestyle risks, disease onset, comorbidities, treatment plans, and their effectiveness. This thesis focuses on applications of ML to complex diseases, in particular human immunodeficiency virus type 1 (HIV-1) and type 2 diabetes (T2D). For HIV-1, the thesis leverages the EuResist Integrated Database, one of the world's largest repositories of HIV drug resistance data. The proposed ML system combines the strengths of rule-based genotypic interpretation systems (GIS) and ML algorithms to predict the outcome of antiretroviral therapies. It integrates information from historical and current genotypic tests, unlike the GIS in the literature, which use only the mutations detected in the most recent genotypic resistance test. A weighting scheme for mutations is proposed: for each patient, mutation-specific weights are computed from several factors that experts consider influential for drug resistance, such as the time at which the mutation was observed, the viral load level at the time of detection, and the Stanford scores for mutation-drug pairs, which indicate the susceptibility of the virus to the drug. This method not only enhances predictive accuracy but also provides a framework for investigating the significance of the factors used to construct the mutation weights. However, additional challenges arise when predicting the outcomes of antiretroviral therapies that involve drugs not included in the training dataset, because data are insufficient for therapies that contain such drugs or that have only recently been launched.
To address this problem, this thesis develops and assesses a novel joint-fusion model, the MIX model, which combines a Graph Neural Network (GNN) with a Fully Connected Neural Network. The GNN is used to encode the clinical knowledge contained in the Stanford drug-mutation resistance scores. In this way, the MIX model integrates drug-mutation relationships and genotypic data, improving the accuracy of therapy outcome predictions and enabling accurate predictions for therapies that contain drugs on which the model has not been trained.
In the context of T2D, the thesis uses a dataset from the Italian Associazione Medici Diabetologi to focus on primary prevention, with the aim of short-term prediction of diabetic retinopathy (DR), one of the most common complications of T2D. We developed an Extreme Gradient Boosting model that accounts for the imbalance of the dataset. Further, to address the crucial issue of interpretability, which is often compromised in tree ensemble (TE) methods, we introduce MIRET, a model based on mixed-integer linear programming (MILP). It builds an optimal tree that mimics the target TE and improves interpretability while maintaining predictive accuracy.
In summary, this thesis demonstrates the capabilities of ML in healthcare when tackling complex, long-lasting diseases such as HIV-1 and T2D. By incorporating historical data, knowledge bases, and model interpretability, the research paves the way for more accurate treatment predictions and individualized care strategies.

Research output

11573/1669860 - 2024 - Unboxing Tree ensembles for interpretability: A hierarchical visualization tool and a multivariate optimal re-built tree
Di Teodoro, Giulia; Monaci, Marta; Palagi, Laura - 01a Journal article
journal: EURO JOURNAL ON COMPUTATIONAL OPTIMIZATION (Heidelberg: Springer) pp. - - issn: 2192-4406 - wos: WOS:001166643700001 (0) - scopus: 2-s2.0-85182604515 (0)

11573/1697819 - 2024 - Incorporating temporal dynamics of mutations to enhance the prediction capability of antiretroviral therapy's outcome for HIV-1
Di Teodoro, Giulia; Pirkl, Martin; Incardona, Francesca; Vicenti, Ilaria; Sönnerborg, Anders; Kaiser, Rolf; Palagi, Laura; Zazzi, Maurizio; Lengauer, Thomas - 01a Journal article
journal: BIOINFORMATICS ([Oxford] : Oxford University Press) pp. - - issn: 1367-4811 - wos: (0) - scopus: (0)

11573/1676841 - 2023 - A machine-learning based bio-psycho-social model for the prediction of non-obstructive and obstructive coronary artery disease
Raparelli, Valeria; Romiti, Giulio Francesco; Di Teodoro, Giulia; Seccia, Ruggiero; Tanzilli, Gaetano; Viceconte, Nicola; Marrapodi, Ramona; Flego, Davide; Corica, Bernadette; Cangemi, Roberto; Pilote, Louise; Basili, Stefania; Proietti, Marco; Palagi, Laura; Stefanini, Lucia - 01a Journal article
journal: CLINICAL RESEARCH IN CARDIOLOGY (Heidelberg, Darmstadt: Springer Medizin) pp. 1263-1277 - issn: 1861-0684 - wos: WOS:000962357500001 (3) - scopus: 2-s2.0-85168781941 (3)

11573/1682789 - 2023 - Cohort Profile: A European Multidisciplinary Network for the Fight against HIV Drug Resistance (EuResist Network)
Rossetti, Barbara; Incardona, Francesca; Di Teodoro, Giulia; Mommo, Chiara; Saladini, Francesco; Kaiser, Rolf; Sönnerborg, Anders; Lengauer, Thomas; Zazzi, Maurizio - 01a Journal article
journal: TROPICAL MEDICINE AND INFECTIOUS DISEASE (Basel: MDPI AG, 2016-) pp. - - issn: 2414-6366 - wos: WOS:000997064900001 (2) - scopus: 2-s2.0-85160402248 (3)

11573/1644089 - 2022 - Spectrum of Atazanavir‐Selected Protease Inhibitor‐Resistance Mutations
Rhee, S. -Y.; Boehm, M.; Tarasova, O.; Di Teodoro, G.; Abecasis, A. B.; Sonnerborg, A.; Bailey, A. J.; Kireev, D.; Zazzi, M.; Shafer, R. W. - 02a Book chapter or article
book: Current Research on HIV Drug Resistance

11573/1656623 - 2022 - Molecular Epidemiology of HIV-1 in Eastern Europe and Russia
Van De Klundert, Maarten A. A.; Antonova, Anastasiia; Di Teodoro, Giulia; Ceña Diez, Rafael; Chkhartishvili, Nikoloz; Heger, Eva; Kuznetsova, Anna; Lebedev, Aleksey; Narayanan, Aswathy; Ozhmegova, Ekaterina; Pronin, Alexander; Shemshura, Andrey; Tumanov, Alexandr; Pfeifer, Nico; Kaiser, Rolf; Saladini, Francesco; Zazzi, Maurizio; Incardona, Francesca; Bobkova, Marina; Sönnerborg, Anders - 01a Journal article
journal: VIRUSES (Basel: MDPI) pp. - - issn: 1999-4915 - wos: WOS:000875409700001 (9) - scopus: 2-s2.0-85140776976 (9)

11573/1644111 - 2021 - A machine-learning-based bio-psycho-social model for the prediction of non-obstructive and obstructive coronary artery disease
Raparelli, V; Proietti, M; Romiti, G. F.; Seccia, R; Di Teodoro, G.; Tanzilli, G; Marrapodi, R; Flego, D; Corica, B; Cangemi, R; Palagi, L; Basili, S; Stefanini, L - 01h Abstract in journal
journal: EUROPEAN HEART JOURNAL (Oxford : Oxford University Press; United Kingdom: Harcourt Publishers) pp. 3064-3064 - issn: 0195-668X - wos: WOS:000720456903365 (0) - scopus: (0)

© Università degli Studi di Roma "La Sapienza" - Piazzale Aldo Moro 5, 00185 Roma