Thesis title: Multimodal Deep Learning for Medical Imaging
Since the interpretation of medical findings is multimodal by its very nature, artificial intelligence must be able to interpret multimodal biomedical data in order to understand the complex mechanisms underlying a disease and to move towards more informative clinical decision-making that supports diagnosis and prognosis. In this respect, this work presents novel multimodal deep learning (DL) methodologies for interpreting multiple data modalities. In particular, it investigates when, which, and how shared representations of multimodal data should be learned. Furthermore, it introduces an optimization algorithm that automatically selects, from a pool of candidates, the learning paradigms to be fused, advancing beyond typical handcrafted approaches in which researchers and practitioners combine learners on the basis of the nature of the data, the task at hand, the networks’ structure, and their own experience. Finally, it studies methods to explain the decisions made by the multimodal networks. The proposed methodologies are validated on multimodal data from patients affected by COVID-19, fusing radiomic and clinical data for diagnostic and prognostic targets and achieving state-of-the-art results.
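To make the notion of learning a shared representation of multimodal data concrete, the sketch below shows a generic intermediate-fusion network in PyTorch that concatenates the learned embeddings of a clinical-feature branch and a radiomic-feature branch before a joint classification head. It is an illustrative assumption only: the layer sizes, the `IntermediateFusionNet` name, and the concatenation-based fusion are not the specific architectures proposed in this thesis.

```python
# Illustrative sketch only: a generic intermediate-fusion network joining a
# clinical-feature branch and a radiomic-feature branch by concatenating their
# learned embeddings. Layer sizes and branch structure are assumptions for
# illustration, not the architectures proposed in the thesis.
import torch
import torch.nn as nn


class IntermediateFusionNet(nn.Module):
    def __init__(self, n_clinical: int, n_radiomic: int,
                 n_classes: int = 2, embed_dim: int = 32):
        super().__init__()
        # Modality-specific encoders map each modality to an embedding of the same size.
        self.clinical_encoder = nn.Sequential(
            nn.Linear(n_clinical, 64), nn.ReLU(),
            nn.Linear(64, embed_dim), nn.ReLU(),
        )
        self.radiomic_encoder = nn.Sequential(
            nn.Linear(n_radiomic, 128), nn.ReLU(),
            nn.Linear(128, embed_dim), nn.ReLU(),
        )
        # Joint head operates on the concatenated (fused) representation.
        self.classifier = nn.Sequential(
            nn.Linear(2 * embed_dim, 32), nn.ReLU(),
            nn.Linear(32, n_classes),
        )

    def forward(self, clinical: torch.Tensor, radiomic: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([self.clinical_encoder(clinical),
                           self.radiomic_encoder(radiomic)], dim=1)
        return self.classifier(fused)


if __name__ == "__main__":
    # Toy forward pass on random data (batch of 4 patients).
    model = IntermediateFusionNet(n_clinical=20, n_radiomic=100)
    logits = model(torch.randn(4, 20), torch.randn(4, 100))
    print(logits.shape)  # torch.Size([4, 2])
```

Concatenation is only one of several possible fusion points (early, intermediate, or late); the choice of when and how to fuse is precisely the design question the thesis investigates.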