ELISA GUGLIOTTA

PhD Graduate

PhD program: XXXIV



Thesis title: Tunisian Arabizi: Linguistic Analyses and Corpus Building using Natural Language Processing

This thesis presents a project on Tunisian Arabic encoded in Arabizi, the Latin-based writing system used in digital conversations. To address the lack of resources supporting research on Tunisian Arabic, the project produced two integrated yet independent tools: a linguistic corpus, and a neural network architecture built to annotate that corpus with several levels of linguistic information (word classification, transliteration, tokenization, POS-tagging, lemmatization). The thesis discusses the computational and linguistic methodological choices and the strategies adopted to improve the computational results. It also outlines the analyses carried out on the corpus data to investigate Tunisian Arabizi.

Research products

11573/1501335 - 2020 - TArC: Incrementally and semi-automatically collecting a Tunisian arabish corpus
Gugliotta, E.; Dinarelli, M. - 04b Conference paper in volume
conference: 12th International Conference on Language Resources and Evaluation, LREC 2020 (Palais du Pharo, Marseille)
book: LREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings - (979-10-95546-34-4)

11573/1604925 - 2020 - TArC. Un corpus d’arabish tunisien
Gugliotta, Elisa; Dinarelli, Marco - 04c Conference paper in journal
paper: REVUE TAL, pp. 232-240 - issn: 1965-0906 - wos: (0) - scopus: (0)
conference: JEP - TALN - RECITAL 2020 - les 33ème Journées d’Études sur la Parole et la 27ème conférence sur le Traitement Automatique des Langues Naturelles (Nancy, France)

11573/1604927 - 2020 - Multi-Task sequence prediction for Tunisian Arabizi multi-level annotation
Gugliotta, Elisa; Dinarelli, Marco; Kraif, Olivier - 04b Conference paper in volume
conference: The Fifth Arabic Natural Language Processing Workshop (WANLP), co-located Online with COLING'2020 (Barcelona, Spain)
book: Proceedings of the Fifth Arabic Natural Language Processing Workshop - (978-1-952148-38-5)

11573/1275840 - 2018 - Arabish come supporto all’apprendimento dei dialetti arabi come LS
Gugliotta, Elisa - 02a Chapter or article
book: Didattica dell’arabo e certificazione linguistica: riflessioni e iniziative - (978-88-94885-82-8)

11573/1275667 - 2018 - Lahajat: a rule-based converter of standard Arabic lexical databases into spoken Arabic forms
Lancioni, Giuliano; Gugliotta, Elisa; Pettinari, Valeria - 04b Conference paper in volume
conference: 4th IEEE International Colloquium on Information Science and Technology (CiSt) (Tangier, Morocco)
book: 2016 4th IEEE International Colloquium on Information Science and Technology (CiSt) - (978-1-5386-4385-3)

© Università degli Studi di Roma "La Sapienza" - Piazzale Aldo Moro 5, 00185 Roma