CESARE CAMPAGNANO

PhD Graduate

Cycle: XXXVI


Supervisor: Prof. Gabriele Tolomei
Co-supervisor: Prof. Fabrizio Silvestri

Thesis title: Foundational Advancements of Large Language Models: Current and Future Implications

The advent of Large Language Models (LLMs) marks a pivotal shift in the field of Artificial Intelligence. These models have opened a new era in which machines can understand and generate language on par with humans in some tasks, showing remarkable proficiency across a wide spectrum of linguistic nuances. However, especially for open models, the focus has been predominantly on English. This thesis explores the current state and potential future trajectories of LLMs. Specifically, it presents a series of contributions: developing and implementing a new foundational model, assessing models' performance on multilingual downstream tasks, and discussing the prospects of integrating AI models into Human-Computer Interaction. The first contribution, DanteLLM, highlights the disparity in language model resources and attempts to bridge this gap by introducing an Italian-centric LLM, setting a new standard for language-specific model development. The second work, XL-WA, introduces a novel benchmark for word alignment, facilitating progress in cross-lingual understanding and translation. Furthermore, SRL4E advances the field of structured emotion classification by proposing a standardized formulation and framework for emotion-based semantic role labeling, with a unified emotion taxonomy. Finally, Prompt-to-OS envisions a future where operating systems and user interfaces are fundamentally redefined through the integration of generative AI, emphasizing the transformative potential of LLMs beyond traditional applications. By providing foundational tools for non-English LLMs, this dissertation not only showcases the technical advancements and applications of multilingual LLMs but also aims to lay the initial groundwork for bridging the linguistic gap in Natural Language Processing. We critically examine the practical, societal, and ethical implications of these technologies, hoping to pave the way for a more inclusive, democratized, and ethically aware AI future.


© Università degli Studi di Roma "La Sapienza" - Piazzale Aldo Moro 5, 00185 Roma