MICHELE DUSI

PhD Graduate

PhD program: XXXVIII


Supervisor: Alfonso Emilio Gerevini
Co-supervisor: Ivan Serina

Thesis title: Analysis and Detection of Social Biases in Deep Neural Language Models

The rising adoption of deep Neural Language Models (NLMs) has raised concerns about the presence of social biases and their potential societal impact. While a substantial body of work has documented biased behaviors in model outputs, the question of where such biases originate within the language modeling pipeline remains only partially understood. This thesis introduces a method for detecting and quantifying social biases in pre-trained language models that aims to balance interpretability and statistical rigor. The proposed approach tests whether the information encoded in embedded representations of protected attributes (e.g., gender, nationality, religion) can be used to predict stereotyped attributes through a simple supervised classification task. Requiring only a minimal labeled dataset, this method provides an accessible way to probe representational biases. Experimental results on several Transformer-based models reveal consistent associations between protected and stereotyped properties. In addition, a complementary visualization-based technique is introduced to support qualitative inspection of bias patterns. Building on this methodological framework, the thesis also conducts a systematic analysis of recent literature on bias in language models, with the aim of exploring the context and mechanisms through which the phenomenon manifests. A central motivation is to move beyond the common assumption that bias is solely a reflection of training data; the resulting analysis is therefore organized around four complementary perspectives. First, it examines the role of training and fine-tuning corpora in shaping measurable bias. Second, it analyzes how bias can emerge, evolve, or be amplified during the training process itself. Third, it considers model-internal factors, investigating how biases are encoded and propagated through parameters, representations, and architectural components. Finally, it discusses evidence on model scale and complexity, assessing whether certain forms of bias appear or intensify only beyond specific thresholds of capacity or model size. Overall, this thesis provides a structured synthesis of current knowledge on bias origins in language models, while offering a practical tool for the empirical assessment of such biases. The ultimate goal is to provide actionable insights that support the development of fairer and more inclusive NLP technologies.

Research products

11573/1725661 - 2024 - Discrimination Bias Detection through Categorical Association in Pre-trained Language Models
Dusi, M.; Arici, N.; Gerevini, A. E.; Putelli, L.; Serina, I. - 01a Journal article
paper: IEEE ACCESS (Piscataway NJ: Institute of Electrical and Electronics Engineers) pp. 162651-162667 - issn: 2169-3536 - wos: WOS:001351490800001 (2) - scopus: 2-s2.0-85207634263 (5)

11573/1725848 - 2024 - Supervised Bias Detection in Transformers-based Language Models
Dusi, M.; Gerevini, A. E.; Putelli, L.; Serina, I. - 04b Conference paper in volume
conference: 2023 International Conference of the Italian Association for Artificial Intelligence Doctoral Consortium, AIxIA-DC 2023 (Rome; Italy)
book: AIxIA-DC 2023 AIxIA Doctoral Consortium 2023

11573/1725668 - 2023 - Quick Subset Construction
Dusi, M.; Lamperti, G. - 01a Journal article
paper: SOFTWARE-PRACTICE & EXPERIENCE (John Wiley & Sons Limited) pp. 2092-2132 - issn: 0038-0644 - wos: WOS:001041210000001 (0) - scopus: 2-s2.0-85166549081 (0)

11573/1725875 - 2022 - Graphical Identification of Gender Bias in BERT with a Weakly Supervised Approach
Dusi, M.; Arici, N.; Gerevini, A. E.; Putelli, L.; Serina, I. - 04b Conference paper in volume
conference: 6th Workshop on Natural Language for Artificial Intelligence, NL4AI 2022 (Udine; Italy)
book: NL4AI 2022 Sixth Workshop on Natural Language for Artificial Intelligence. Proceedings of the Sixth Workshop on Natural Language for Artificial Intelligence (NL4AI 2022) co-located with 21th International Conference of the Italian Association for Artificial Intelligence (AI*IA 2022)

11573/1725876 - 2021 - Fixing Nondeterminism in Large Discrete-Event Knowledge
Dusi, M.; Lamperti, G. - 04c Conference paper in journal
paper: PROCEDIA COMPUTER SCIENCE (Amsterdam : Elsevier) pp. 407-416 - issn: 1877-0509 - wos: WOS:000720289000041 (0) - scopus: 2-s2.0-85116910555 (0)
conference: 25th International Conference on Knowledge-Based and Intelligent Information & Engineering Systems (KES 2021) (Szczecin; Poland)

11573/1725867 - 2020 - Conservative Determinization of Translated Automata by Embedded Subset Construction
Dusi, M.; Lamperti, G. - 04b Conference paper in volume
conference: KES International Conference on Intelligent Decision Technologies (Online; Virtual)
book: Intelligent Decision Technologies. Proceedings of the 12th KES International Conference on Intelligent Decision Technologies (KES-IDT 2020) - (9789811559242; 9789811559259)

© Università degli Studi di Roma "La Sapienza" - Piazzale Aldo Moro 5, 00185 Roma