PhD graduate

cycle: XXXVI

supervisor: Gian Gaetano Tartaglia

Thesis title: Machine learning methods applied to classify complex diseases using genomic data

Complex diseases are difficult to predict because of their multifactorial nature. In this work, I explored the prediction of four complex diseases (multiple sclerosis, Alzheimer’s disease, schizophrenia, and Parkinson’s disease) using machine learning methods. The primary objective of this research is to investigate the robustness and variability of machine learning models built on genomic data for predicting complex diseases. Different models are tested to classify affected and healthy individuals, and their performance is compared with the results obtained using polygenic risk scores. The secondary goal is to apply explainability methods to extract the features that the models consider most informative. Understanding which genomic variants the models learn to treat as informative for disease discrimination can provide insights into the genetic basis of these diseases and identify potential targets for further investigation.

Publications

11573/1666765 - 2023 - The PRALINE database: protein and Rna humAn singLe nucleotIde variaNts in condEnsates
Vandelli, Andrea; Arnal Segura, Magdalena; Monti, Michele; Fiorentino, Jonathan; Broglia, Laura; Colantoni, Alessio; Sanchez De Groot, Natalia; Torrent Burgas, Marc; Armaos, Alexandros; Tartaglia, Gian Gaetano - 01a Journal article
journal: BIOINFORMATICS ([Oxford] : Oxford University Press) pp. - - issn: 1367-4811 - wos: WOS:001025519200032 (0) - scopus: 2-s2.0-85145954607 (0)

© Università degli Studi di Roma "La Sapienza" - Piazzale Aldo Moro 5, 00185 Roma