MASSIMILIANO MANCINI

PhD Graduate

PhD program: XXXII


Supervisor: Barbara Caputo

Thesis title: Towards Recognizing New Semantic Concepts in New Visual Domains

Deep learning is the leading paradigm in computer vision. However, deep models rely heavily on large-scale annotated datasets for training. Unfortunately, labeling data is a costly and time-consuming process, and datasets cannot capture the infinite variability of the real world. Therefore, deep neural networks are inherently limited by the restricted visual and semantic information contained in their training set. In this thesis, we argue that it is crucial to design deep neural architectures that can operate in previously unseen visual domains and recognize novel semantic concepts.

In the first part of the thesis, we describe different solutions to enable deep models to generalize to new visual domains by transferring knowledge from one or more labeled source domains to a target domain where no labeled data are available. We first address the problem of unsupervised domain adaptation, assuming that both source and target datasets are available but only as mixtures of multiple latent domains. In this scenario, we propose to discover the multiple domains by introducing a domain prediction branch in the deep architecture and to perform adaptation through a weighted version of batch normalization (BN). We also show how variants of this approach can be effectively applied to other scenarios such as domain generalization and continuous domain adaptation, where we have no access to target data during training but can exploit either multiple sources or a stream of target images at test time. Finally, we demonstrate that deep models equipped with graph-based BN layers are effective in predictive domain adaptation, where information about the target domain is available only in the form of metadata.

In the second part of the thesis, we show how to extend the knowledge of a pre-trained deep model to incorporate new semantic concepts, without having access to the original training set. We first consider the problem of adding new tasks to a given network, and we show that using simple task-specific binary masks to modify the pre-trained filters suffices to achieve performance comparable to that of task-specific models. We then focus on the open-world recognition scenario, where we are interested not only in learning new concepts but also in detecting unseen ones, and we demonstrate that end-to-end training and clustering are fundamental components for addressing this task. Finally, we study the problem of incremental class learning in semantic segmentation, and we find that the performance of standard approaches is hampered by the fact that the semantics of the background change across learning steps. We then show that a simple modification of standard entropy-based losses can largely mitigate this problem.

In the final part of the thesis, we tackle a more challenging problem: given images of multiple domains and semantic categories (with their attributes), how can we build a model that recognizes images of unseen concepts in unseen domains? We propose an approach based on domain and semantic mixing of inputs and features, which is a first, promising step towards solving this problem.

Research products

11573/1434918 - 2020 - Modeling the background for incremental learning in semantic segmentation
Cermelli, Fabio; Mancini, Massimiliano; Rota Bulò, Samuel; Ricci, Elisa; Caputo, Barbara - 04b Conference paper in volume
conference: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020 (Seattle, WA, USA; Virtual)
book: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) - (978-1-7281-7168-5; 978-1-7281-7169-2)

11573/1442996 - 2020 - Boosting Deep Open World Recognition by Clustering
Fontanel, Dario; Cermelli, Fabio; Mancini, Massimiliano; Rota Bulò, Samuel; Ricci, Elisa; Caputo, Barbara - 01a Journal article
paper: IEEE ROBOTICS AND AUTOMATION LETTERS (USA, Piscataway, NJ: IEEE Robotics and Automation Society) pp. 5985-5992 - issn: 2377-3766 - wos: WOS:000554894900022 (1) - scopus: 2-s2.0-85089352603 (4)

11573/1434915 - 2020 - Boosting binary masks for multi-domain learning through affine transformations
Mancini, Massimiliano; Ricci, Elisa; Caputo, Barbara; Rota Bulò, Samuel - 01a Journal article
paper: MACHINE VISION AND APPLICATIONS (Springer-Verlag New York Incorporated:175 Fifth Avenue:New York, NY 10010:(212)460-1500, EMAIL: orders@springer-ny.com, INTERNET: http://www.springer-ny.com, Fax: (212)533-3503) pp. - - issn: 0932-8092 - wos: WOS:000542826800001 (1) - scopus: 2-s2.0-85087046274 (1)

11573/1334128 - 2019 - The RGB-D Triathlon: Towards Agile Visual Toolboxes for Robots
Cermelli, Fabio; Mancini, Massimiliano; Ricci, Elisa; Caputo, Barbara - 04b Conference paper in volume
conference: 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (Macau; China)
book: 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) - (978-1-7281-4004-9)

11573/1331861 - 2019 - Knowledge is Never Enough: Towards Web Aided Deep Open World Recognition
Mancini, Massimiliano; Karaoguz, Hakan; Ricci, Elisa; Jensfelt, Patric; Caputo, Barbara - 04b Conference paper in volume
conference: 2019 International Conference on Robotics and Automation (ICRA) (Montreal, QC; Canada)
book: 2019 International Conference on Robotics and Automation (ICRA) - (978-153866026-3)

11573/1331867 - 2019 - Discovering Latent Domains for Unsupervised Domain Adaptation Through Consistency
Mancini, Massimiliano; Porzi, Lorenzo; Cermelli, Fabio; Caputo, Barbara - 04b Conference paper in volume
conference: 20th International Conference on Image Analysis and Processing, ICIAP 2019 (Trento; Italy)
book: Image Analysis and Processing – ICIAP 2019 - (978-3-030-30645-8)

11573/1331869 - 2019 - Inferring Latent Domains for Unsupervised Deep Domain Adaptation
Mancini, Massimiliano; Porzi, Lorenzo; Rota Bulò, Samuel; Caputo, Barbara; Ricci, Elisa - 01a Journal article
paper: IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (IEEE / Institute of Electrical and Electronics Engineers Incorporated:445 Hoes Lane:Piscataway, NJ 08854:(800)701-4333, (732)981-0060, EMAIL: subscription-service@ieee.org, INTERNET: http://www.ieee.org, Fax: (732)981-9667) pp. 1-1 - issn: 0162-8828 - wos: WOS:000607383300008 (16) - scopus: 2-s2.0-85099721367 (18)

11573/1331876 - 2019 - Adding new tasks to a single network with weight transformations using binary masks
Mancini, Massimiliano; Ricci, Elisa; Caputo, Barbara; Rota Bulò, Samuel - 04b Conference paper in volume
conference: 15th European Conference on Computer Vision, ECCV 2018 (Munich; Germany)
book: Computer Vision – ECCV 2018 Workshops - (978-3-030-11011-6; 978-3-030-11012-3)

11573/1331873 - 2019 - AdaGraph: Unifying Predictive and Continuous Domain Adaptation Through Graphs
Mancini, Massimiliano; Rota Bulò, Samuel; Caputo, Barbara; Ricci, Elisa - 04b Conference paper in volume
conference: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (Long Beach; United States)
book: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

11573/1331859 - 2019 - Structured Domain Adaptation for 3D Keypoint Estimation
Osterno Vasconcelos, Levi; Mancini, Massimiliano; Boscaini, Davide; Caputo, Barbara; Ricci, Elisa - 04b Conference paper in volume
conference: 2019 International Conference on 3D Vision (3DV) (Quebec; Canada)
book: 2019 International Conference on 3D Vision (3DV) - (978-172813131-3)

11573/1189882 - 2018 - Kitting in the Wild through Online Domain Adaptation
Mancini, Massimiliano; Karaoguz, Hakan; Ricci, Elisa; Jensfelt, Patric; Caputo, Barbara - 04b Conference paper in volume
conference: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2018 (Madrid; Spain)
book: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) - (978-153868094-0)

11573/1189877 - 2018 - Boosting Domain Adaptation by Discovering Latent Domains
Mancini, Massimiliano; Porzi, Lorenzo; Rota Bulò, Samuel; Caputo, Barbara; Ricci, Elisa - 04b Conference paper in volume
conference: 31st Meeting of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018 (Salt Lake City; United States)
book: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition - (978-153866420-9)

11573/1189848 - 2018 - Robust Place Categorization With Deep Domain Generalization
Mancini, Massimiliano; Rota Bulò, Samuel; Caputo, Barbara; Ricci, Elisa - 01a Journal article
paper: IEEE ROBOTICS AND AUTOMATION LETTERS (USA, Piscataway, NJ: IEEE Robotics and Automation Society) pp. 2093-2100 - issn: 2377-3766 - wos: (0) - scopus: 2-s2.0-85063005745 (40)

11573/1189853 - 2018 - Best Sources Forward: Domain Generalization through Source-Specific Nets
Mancini, Massimiliano; Rota Bulò, Samuel; Caputo, Barbara; Ricci, Elisa - 04b Conference paper in volume
conference: 25th IEEE International Conference on Image Processing, ICIP 2018 (Athens; Greece)
book: 2018 25th IEEE International Conference on Image Processing (ICIP) - (978-147997061-2)

11573/975557 - 2017 - Embedding Words and Senses Together via Joint Knowledge-Enhanced Training
Mancini, Massimiliano; Camacho Collados, José; Iacobacci, Ignacio Javier; Navigli, Roberto - 04b Conference paper in volume
conference: 21st Conference on Computational Natural Language Learning (CoNLL 2017) (Vancouver; Canada)
book: CoNLL 2017. The 21st Conference on Computational Natural Language Learning. Proceedings of the Conference, August 3 - August 4, 2017 Vancouver, Canada - (978-1-945626-54-8)

11573/1016995 - 2017 - Learning Deep NBNN Representations for Robust Place Categorization
Mancini, Massimiliano; Rota Bulò, Samuel; Ricci, Elisa; Caputo, Barbara - 01a Journal article
paper: IEEE ROBOTICS AND AUTOMATION LETTERS (USA, Piscataway, NJ: IEEE Robotics and Automation Society) pp. 1794-1801 - issn: 2377-3766 - wos: WOS:000413739500074 (20) - scopus: 2-s2.0-85046444166 (26)

© Università degli Studi di Roma "La Sapienza" - Piazzale Aldo Moro 5, 00185 Roma