FEDERICO FONTANA

PhD Graduate

PhD program: XXXVIII


Supervisor: Prof. Luigi Cinque

Thesis title: Efficient Deep Learning For Computer Vision

The advancement of modern computer vision, while driven by the remarkable performance of Deep Neural Networks (DNNs), is increasingly confronted by the practical limitations imposed by their own escalating complexity. A prevailing trend toward ever-larger and more computationally demanding models has created a significant barrier to the deployment of sophisticated artificial intelligence on resource-constrained platforms, such as mobile devices, wearables, and autonomous drones. This thesis directly confronts this critical challenge by investigating and advancing techniques for efficient deep learning, with a central focus on two powerful and aggressive model compression strategies: Binary Neural Networks (BNNs) [29] and neural network pruning. The core hypothesis guiding this work is that extreme model compression can be achieved while maintaining high performance across a diverse set of real-world computer vision problems, thereby enabling the deployment of advanced AI in practical, resource-limited scenarios. Methodologically, this research first addresses the foundational problems that have impeded the widespread adoption of these efficient models. This includes the development of CycleBNN [50], a novel cyclic precision training methodology designed to significantly reduce the substantial computational overhead associated with training BNNs. In a complementary investigation into sparsity, this work also introduces Distilled Gradual Pruning with Pruned Fine-tuning (DG2PF) [51], a comprehensive algorithm that synergizes network pruning with knowledge distillation to achieve high levels of model compression with minimal degradation in accuracy. Having established these core techniques, the thesis proceeds to demonstrate their practical viability through their application to a series of challenging, latency-sensitive tasks.
Highly efficient BNN-based solutions are presented for hand gesture recognition (BNNAction-Net) [47], deepfake detection (Faster Than Lies) [107], and cross-view geolocalization for UAVs (BiCrossNet) [49]. In each case, the proposed models achieve performance competitive with their full-precision counterparts while offering reductions in computational complexity and memory footprint that span orders of magnitude. Furthermore, this research extends the study of deepfake detection beyond the paradigm of static classification by reframing it as a continual learning problem. Through a novel chronological evaluation framework that simulates the real-world evolution of generative technologies, a fundamental limitation in the generalization capabilities of current detectors is identified. This critical finding leads to the proposal of the Non-Universal Deepfake Distribution Hypothesis [48], which posits that each deepfake generator imprints a unique, non-transferable signature, thereby underscoring the absolute necessity of continuous model adaptation for any robust, long-term detection strategy. The collective results of this thesis empirically validate that BNNs and neural network pruning are not merely theoretical concepts but are powerful, practical tools for developing high-performance, resource-efficient computer vision systems. By systematically bridging the gap between state-of-the-art accuracy and real-world deployability, this work contributes to a more accessible, sustainable, and scalable future for the field of artificial intelligence.

Research products

11573/1732914 - 2025 - Semantically Guided Representation Learning for Action Anticipation
Diko, Anxhelo; Avola, Danilo; Prenkaj, Bardh; Fontana, Federico; Cinque, Luigi - 04b Conference paper in volume
conference: European Conference on Computer Vision (Milan)
book: 18th European Conference on Computer Vision, ECCV 2024 - (978-3-031-73390-1)

11573/1759307 - 2025 - BiCrossNet: resource-efficient cross-view geolocalization with binary neural networks
Fontana, Federico; Jantos, Thomas; Steinbrener, Jan; Cinque, Luigi; Foresti, Gian Luca; Rinner, Bernhard - 01a Journal article
paper: MACHINE LEARNING: SCIENCE AND TECHNOLOGY (Bristol: IOP Publishing) pp. - - issn: 2632-2153 - wos: WOS:001555826100001 (0) - scopus: 2-s2.0-105013760106 (0)

11573/1727934 - 2025 - SATEER: Subject-Aware Transformer for EEG-Based Emotion Recognition
Lanzino, Romeo; Avola, Danilo; Fontana, Federico; Cinque, Luigi; Scarcello, Francesco; Foresti, Gian Luca - 01a Journal article
paper: INTERNATIONAL JOURNAL OF NEURAL SYSTEMS (Singapore: World Scientific Publishing Company) pp. 2550002-1-2550002-18 - issn: 0129-0657 - wos: WOS:001357863800001 (4) - scopus: 2-s2.0-85209733647 (5)

11573/1748952 - 2024 - UAV Geo-Localization for Navigation: A Survey
Avola, Danilo; Cinque, Luigi; Emam, Emad; Fontana, Federico; Foresti, Gian Luca; Marini, Marco Raoul; Mecca, Alessio; Pannone, Daniele - 01a Journal article
paper: IEEE ACCESS (Piscataway NJ: Institute of Electrical and Electronics Engineers) pp. 125332-125357 - issn: 2169-3536 - wos: WOS:001316171100001 (0) - scopus: 2-s2.0-85203542361 (4)

11573/1713999 - 2024 - BNNAction-Net: Binary Neural Network on Hands Gesture Recognitions
Fontana, Federico; Di Matteo, Alessandro; Cinque, Luigi; Placidi, Giuseppe; Marini, Marco Raoul - 04b Conference paper in volume
conference: ACM SIGGRAPH 2024 Posters (Denver; USA)
book: ACM SIGGRAPH 2024 Posters

11573/1706715 - 2024 - Distilled Gradual Pruning with Pruned Fine-tuning
Fontana, Federico; Lanzino, Romeo; Marini, Marco Raoul; Avola, Danilo; Cinque, Luigi; Scarcello, Francesco; Foresti, Gian Luca - 01a Journal article
paper: IEEE TRANSACTIONS ON ARTIFICIAL INTELLIGENCE (Piscataway NJ: IEEE) pp. 4269-4279 - issn: 2691-4581 - wos: (0) - scopus: 2-s2.0-85185387701 (15)

11573/1713996 - 2024 - Faster Than Lies: Real-time Deepfake Detection using Binary Neural Networks
Lanzino, Romeo; Fontana, Federico; Diko, Anxhelo; Marini, Marco Raoul; Cinque, Luigi - 04b Conference paper in volume
conference: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (Seattle; USA)
book: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) - (979-8-3503-6547-4; 979-8-3503-6548-1)

11573/1696518 - 2023 - Hand Gesture Recognition Exploiting Handcrafted Features and LSTM
Avola, D.; Cinque, L.; Emam, E.; Fontana, F.; Foresti, G. L.; Marini, M. R.; Pannone, D. - 04b Conference paper in volume
conference: Proceedings of the 22nd International Conference on Image Analysis and Processing, ICIAP 2023 (ita)
book: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) - (978-3-031-43147-0; 978-3-031-43148-7)

11573/1672470 - 2022 - Use of an antagonist of HMGB1 in mice affected by malignant mesothelioma: a preliminary ultrasound and optical imaging study
Venturini, Massimo; Mezzapelle, Rosanna; La Marca, Salvatore; Perani, Laura; Spinelli, Antonello; Crippa, Luca; Colarieti, Anna; Palmisano, Anna; Marra, Paolo; Coppola, Andrea; Fontana, Federico; Carcano, Giulio; Tacchetti, Carlo; Bianchi, Marco; Esposito, Antonio; Crippa, Massimo P. - 01a Journal article
paper: EUROPEAN RADIOLOGY EXPERIMENTAL (Springer) pp. - - issn: 2509-9280 - wos: WOS:000752295400001 (4) - scopus: 2-s2.0-85124295028 (4)

11573/1696623 - 2021 - Hemorrhoids Embolization: State of the Art and Future Directions
Rebonato, Alberto; Maiettini, Daniele; Patriti, Alberto; Giurazza, Francesco; Tipaldi, Marcello Andrea; Piacentino, Filippo; Fontana, Federico; Basile, Antonio; Venturini, Massimo - 01a Journal article
paper: JOURNAL OF CLINICAL MEDICINE (MDPI Publishing, Basel, Switzerland) pp. - - issn: 2077-0383 - wos: WOS:000689297200001 (5) - scopus: 2-s2.0-85112145367 (7)

© Università degli Studi di Roma "La Sapienza" - Piazzale Aldo Moro 5, 00185 Roma