Thesis title: Un lessico emotivo per l’italiano: dalla teoria linguistica alla pratica computazionale (An Emotion Lexicon for Italian: From Linguistic Theory to Computational Practice)
Language and emotions are inextricably linked. This interconnection has long fascinated linguists, psychologists, and data scientists. Psycholinguistic studies have extensively demonstrated that emotions permeate every aspect of language. In recent years, the advent of natural language processing techniques has allowed for a deeper understanding of this relationship, leading to the development of tools and resources for automatically analyzing and recognizing emotions in written texts.
This doctoral dissertation introduces ELIta (Emotion Lexicon for Italian), a novel linguistic resource dedicated to the analysis of emotions in Italian texts. ELIta is the first Italian lexicon annotated entirely by hand by native speakers; it covers more than 6,000 items and provides a solid foundation for emotion research on natural language.
The dissertation describes the lexicon creation process, including word selection, annotation methodologies, and comparisons with existing resources. By offering an unaggregated and fully manual lexicon, ELIta addresses a critical gap in Italian language research and can be applied to various fields of study, from psycholinguistics to natural language processing.
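To make the term "unaggregated" concrete, the following minimal sketch assumes that each annotator's judgment is stored as a separate record and shows how an aggregated score per word and emotion could be derived from such records when needed. The record layout, the words, the annotator identifiers, the emotion labels, and the scores are all invented for illustration and do not reproduce ELIta's actual format or data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical unaggregated annotations: one row per (word, annotator, emotion).
# All values below are invented for illustration; they are not ELIta data.
ratings = [
    ("gioia", "ann1", "joy", 1.0),
    ("gioia", "ann2", "joy", 0.5),
    ("gioia", "ann1", "sadness", 0.0),
    ("gioia", "ann2", "sadness", 0.0),
    ("lutto", "ann1", "sadness", 1.0),
    ("lutto", "ann2", "sadness", 1.0),
]


def aggregate(rows):
    """Average the individual ratings into one score per (word, emotion)."""
    grouped = defaultdict(list)
    for word, _annotator, emotion, score in rows:
        grouped[(word, emotion)].append(score)
    return {key: mean(scores) for key, scores in grouped.items()}


scores = aggregate(ratings)
print(scores[("gioia", "joy")])
```

Keeping the individual rows means the mean is only one possible summary: the same records could also be grouped by annotator attributes or used to measure disagreement, which an already aggregated lexicon would not allow.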
ELIta covers the basic vocabulary of Italian and also includes emojis, enabling the analysis of a wide range of texts, including informal registers and youth language.
The dissertation uses ELIta to investigate the relationships between emotions, gender, and age. It presents an innovative analysis of the connection between basic emotions and emotion dimensions, a pioneering contribution to research on Italian, where resources offering both kinds of annotation are scarce. Furthermore, the dissertation conducts a qualitative analysis of word-emotion associations, providing empirical support for grounded cognition theories and distributional semantics.
Finally, the dissertation presents a novel study on oxymorons, examining how the emotional values of their constituents relate to the emotional perception of the whole expression, from both psycholinguistic and computational perspectives.