RICCARDO ORLANDO

PhD graduate

cycle: XXXVII

supervisor: Roberto Navigli

Thesis title: Enhancing Semantic Understanding Across Multiple Dimensions: Towards A Unified Framework for Semantic Knowledge Extraction

Natural Language Processing (NLP) is a subfield of Artificial Intelligence (AI) that investigates the interaction between computers and human languages. Despite the tremendous progress that we have witnessed in recent years, largely driven by increasingly sophisticated Deep Learning techniques, NLP systems are still a long way from truly understanding what they process. Within NLP, Natural Language Understanding (NLU) is the area that seeks to enable machine comprehension of human language. One of the key roles of NLU is transforming unstructured text into explicit semantic knowledge, with applications beyond NLP. Nonetheless, modern NLP systems face several challenges that prevent us from achieving true NLU across languages, domains, and applications. These challenges range from the performance disparities between high-resource and low-resource languages, to the increasing model complexity that requires specialized expertise, to the heterogeneous mixture of approaches that limits the interaction between different semantic abstractions. In this thesis, we aim to contribute to the field of NLU by addressing each of these challenges. First, we propose novel efficient systems that tackle NLU tasks across multiple languages in order to narrow the gap in multilingual performance. Second, we introduce a unified framework to orchestrate different NLU systems while maximizing inference speed and usability, and promoting the integration of semantic knowledge across different domains and applications. Third, we take the first steps towards moving from a unified framework to a unified model for Semantic Knowledge Extraction with an efficient architecture that is capable of handling multiple semantic tasks simultaneously on an academic budget, while also setting a new state of the art. Finally, we also address the multilingual gap from a resource perspective by introducing a novel large-scale multilingual dataset for semantic knowledge extraction.
Through these contributions, this thesis makes significant progress towards overcoming the aforementioned challenges, not only from the perspective of benchmark results but also, and perhaps more importantly, in terms of enhancing usability, inference speed, and language coverage, which are key to enabling applications in more domains. We hope our work will foster large-scale research and innovation in NLP, paving the way for the integration of semantics into real-world settings with more accessible, comprehensive, and robust NLU systems.

Research output

11573/1726493 - 2024 - MOSAICo: a Multilingual Open-text Semantically Annotated Interlinked Corpus
Conia, Simone; Barba, Edoardo; Martinez Lorenzo, Abelardo Carlos; Huguet Cabot, Pere Lluis; Orlando, Riccardo; Procopio, Luigi; Navigli, Roberto - 04b Conference paper in volume
conference: North American Association for Computational Linguistics (Mexico City; Mexico)
book: Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers) - (9798891761148)

11573/1727951 - 2024 - ZEBRA: Zero-Shot Example-Based Retrieval Augmentation for Commonsense Question Answering
Molfese, Francesco Maria; Conia, Simone; Orlando, Riccardo; Navigli, Roberto - 04b Conference paper in volume
conference: Empirical Methods in Natural Language Processing (Miami; United States)
book: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing - (979-8-89176-164-3)

11573/1726492 - 2024 - ReLiK: Retrieve and LinK, Fast and Accurate Entity Linking and Relation Extraction on an Academic Budget
Orlando, Riccardo; Huguet Cabot, Pere Lluis; Barba, Edoardo; Navigli, Roberto - 04b Conference paper in volume
conference: Association for Computational Linguistics (Bangkok; Thailand)
book: Findings of the Association for Computational Linguistics: ACL 2024 - ()

11573/1728121 - 2024 - Minerva LLMs: The First Family of Large Language Models Trained from Scratch on Italian Data
Orlando, Riccardo; Moroni, Luca; Huguet Cabot, Pere-Lluís; Barba, Edoardo; Conia, Simone; Orlandini, Sergio; Fiameni, Giuseppe; Navigli, Roberto - 04b Conference paper in volume
conference: Italian Conference on Computational Linguistics (Pisa; Italy)
book: Proceedings of the Tenth Italian Conference on Computational Linguistics (CLiC-it 2024) - ()

11573/1672374 - 2023 - Universal Semantic Annotator
Navigli, R.; Orlando, R.; Campagnano, C.; Conia, S. - 02a Chapter or article
book: European Language Grid - (978-3-031-17257-1; 978-3-031-17258-8)

11573/1685067 - 2023 - Exploring Non-Verbal Predicates in Semantic Role Labeling: Challenges and Opportunities
Orlando, Riccardo; Conia, Simone; Navigli, Roberto - 04b Conference paper in volume
conference: Association for Computational Linguistics (Toronto; Canada)
book: Findings of the Association for Computational Linguistics: ACL 2023. Proceedings of the Annual Meeting of the Association for Computational Linguistics - (978-1-959429-62-3)

11573/1652975 - 2022 - Universal Semantic Annotator: the First Unified API for WSD, SRL and Semantic Parsing
Orlando, Riccardo; Conia, Simone; Faralli, Stefano; Navigli, Roberto - 04b Conference paper in volume
conference: Language Resources and Evaluation Conference (Marseille; France)
book: Proceedings of the Thirteenth Language Resources and Evaluation Conference - (9791095546726)

11573/1604131 - 2021 - InVeRo-XL: Making Cross-Lingual Semantic Role Labeling Accessible with Intelligible Verbs and Roles
Conia, Simone; Orlando, Riccardo; Brignone, Fabrizio; Cecconi, Francesco; Navigli, Roberto - 04b Conference paper in volume
conference: Empirical Methods in Natural Language Processing (Punta Cana; Dominican Republic)
book: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations - (978-195591711-7)

11573/1603234 - 2021 - AMuSE-WSD: an all-in-one multilingual system for easy word sense disambiguation
Orlando, Riccardo; Conia, Simone; Brignone, Fabrizio; Cecconi, Francesco; Navigli, Roberto - 04b Conference paper in volume
conference: Empirical Methods in Natural Language Processing (Punta Cana; Dominican Republic)
book: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations - ()

© Università degli Studi di Roma "La Sapienza" - Piazzale Aldo Moro 5, 00185 Roma