ALESSIO FAGIOLI

PhD graduate

cycle: XXXV


co-supervisor: Prof. Luigi Cinque

Thesis title: Towards Human-like Neural Networks: A Neuroscience Approach

Developing ever-improving neural network architectures is paramount to addressing complex tasks in heterogeneous fields such as medical image analysis, emotion recognition, and person re-identification. However, designing a model from scratch (or defining an upgrade for an existing one) can be a daunting task. Current approaches tackle this problem through new design choices such as manipulating the loss functions, employing different learning strategies, exploiting gradient evolution at training time, optimizing the network hyper-parameters, or increasing the architecture depth. Although most of these solutions adequately address the tasks mentioned above, they still fall short of expert human abilities. This is unsurprising, since neural networks are based on an approximation of human neurons. In this context, this thesis focuses on implementing more accurate neuroscience concepts to improve existing architectures or create new high-performing ones, independently of their application field. In detail, the biological workings of the human brain provided novel ideas to implement (or update) several models, advancing the state of the art across a variety of topics such as signature verification, 3D hand and shape reconstruction, object recognition, and others. The main idea behind this work was to define a general model that leverages several neuroscience notions encompassing distinct cognitive functionalities, with a focus on the human visual system, and to use this model to derive new ideas and solutions for different topics. More specifically, images observed by a human are broken down into low-level features by neurons residing in the eye. Intermediate-level and high-level features are then derived through a series of abstractions performed by groups of neurons, similar to a deep neural network structure.
Such representations enable a person, for instance, to recognize known objects or infer how to use new ones simply by looking at their appearance. All these steps were used to define three modules inside the general model, one for each abstraction level, so that neuronal activities could be reproduced up to the highest cognitive functions. Moreover, the general model also contains additional components that "supervise" this entire process and have a fundamental role in the maintenance of neurons: glial cells. These cells can spawn or destroy neurons, substantially altering neural connections in the brain and directly regulating its plasticity and its capacity to retain memories, abilities generally not included in neural networks. These aspects were collected into a fourth, transversal module affecting all of the previous components, so that both plasticity and memory could be included in some of the devised solutions. In conclusion, all ideas were tested according to the addressed task and, in all cases, improved upon existing state-of-the-art approaches under task-specific evaluation metrics, strongly indicating that neuroscience concepts can, in fact, inspire new solutions given a proper modeling phase.

© Università degli Studi di Roma "La Sapienza" - Piazzale Aldo Moro 5, 00185 Roma