ELISA GUGLIOTTA

PhD graduate

cycle: XXXIV



Thesis title: Tunisian Arabizi: Linguistic Analyses and Corpus Building using Natural Language Processing

This thesis presents a project on Tunisian Arabic encoded in Arabizi, the Latin-based writing system used in digital conversations. To address the lack of tools supporting research on Tunisian Arabic, the project produced two integrated yet independent resources: a linguistic corpus, and a neural network architecture built to annotate that corpus at several levels of linguistic information (word classification, transliteration, tokenization, POS tagging, lemmatization). The thesis discusses the computational and linguistic methodological choices and the strategies adopted to improve the computational results. It also outlines the analyses carried out on the corpus data to investigate Tunisian Arabizi.

Publications

11573/1501335 - 2020 - TArC: Incrementally and semi-automatically collecting a Tunisian arabish corpus
Gugliotta, E.; Dinarelli, M. - 04b Conference paper in proceedings volume
conference: 12th International Conference on Language Resources and Evaluation, LREC 2020 (Palais du Pharo, Marseille)
book: LREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings - (979-10-95546-34-4)

11573/1604925 - 2020 - TArC. Un corpus d’arabish tunisien
Gugliotta, Elisa; Dinarelli, Marco - 04c Conference paper in journal
journal: REVUE TAL () pp. 232-240 - issn: 1965-0906 - wos: (0) - scopus: (0)
conference: JEP - TALN - RECITAL 2020 - les 33ème Journées d’Études sur la Parole et la 27ème conférence sur le Traitement Automatique des Langues Naturelles (Nancy, France)

11573/1604927 - 2020 - Multi-Task sequence prediction for Tunisian Arabizi multi-level annotation
Gugliotta, Elisa; Dinarelli, Marco; Kraif, Olivier - 04b Conference paper in proceedings volume
conference: The Fifth Arabic Natural Language Processing Workshop (WANLP), co-located online with COLING'2020 (Barcelona, Spain)
book: Proceedings of the Fifth Arabic Natural Language Processing Workshop - (978-1-952148-38-5)

11573/1275840 - 2018 - Arabish come supporto all’apprendimento dei dialetti arabi come LS
Gugliotta, Elisa - 02a Chapter or article
book: Didattica dell’arabo e certificazione linguistica: riflessioni e iniziative - (978-88-94885-82-8)

11573/1275667 - 2018 - Lahajat: a rule-based converter of standard Arabic lexical databases into spoken Arabic forms
Lancioni, Giuliano; Gugliotta, Elisa; Pettinari, Valeria - 04b Conference paper in proceedings volume
conference: 4th IEEE International Colloquium on Information Science and Technology (CiSt) (Tangier, Morocco)
book: 2016 4th IEEE International Colloquium on Information Science and Technology (CiSt) - (978-1-5386-4385-3)

© Università degli Studi di Roma "La Sapienza" - Piazzale Aldo Moro 5, 00185 Roma