Research Statement and Research Interests
Fabio Massimo Zanzotto is an Associate Professor at the Department of Enterprise Engineering of the University of Rome “Tor Vergata”. He has been pursuing research in Artificial Intelligence since 1998. He is active in the areas of Natural Language Processing and Machine Learning, working mainly on four topics:
distributed/distributional models for NLP, AI applied to precision medicine, recognizing textual entailment, and syntactic parsing for Italian. Additionally, he has recently become interested in the field of Ethics and AI [14].
Area of distributed/distributional models for NLP
Representing syntax and meaning in vectors and tensors has become a very active research area in recent years, owing to the renaissance of deep neural networks. Zanzotto is therefore interested in understanding the implications of distributed representations and compositional distributional semantics models (CDSMs) for natural language processing [19, 11, 16, 22]. He started by proposing one of the first models to estimate full additive CDSMs [19]. He then investigated the relation between syntax and semantics in CDSMs and, as a result, hypothesized that CDSMs have a clear separation between syntax and semantics [6, 18]. Hence, he began to investigate how to encode [22] and decode [?] syntactic structures in small vectors called distributed trees. He then proposed a way to merge syntactic and semantic information explicitly in the same tensor [7]. Finally, he has started to investigate how these vectors can be produced directly from sentences, by trying to replace syntactic parsers with distributed syntactic parsers [17, ?]. For his work in this research area, he has served on the program committees of GEMS: GEometrical Models of Natural Language Semantics (2009, 2010, 2011) and *Sem (2012).
Area of AI applied to precision medicine
In recent years, the approach to medicine has changed substantially: global approaches have come under pressure from the growing availability of electronic health records (EHR) and from the consequent demand for precision medicine. Hence, Zanzotto and colleagues proposed an approach that differs from the one generally used to develop risk assessment models, which is based on the arbitrary assignment of a score according to association analyses. To this end, Zanzotto used kernel learning machines and random optimization (RO) to produce venous thromboembolism (VTE) risk predictors in a population of consecutive ambulatory cancer patients. The risk predictors exploit significant patterns in the data and can be used in the development of clinical decision support systems [9, 8, 10].
Area of Recognizing Textual Entailment
The term recognizing textual entailment (RTE) was introduced in NLP to systematically foster studies on computational models that replicate the human ability to determine whether or not texts imply sentences. A simple RTE example is determining whether or not “Acme bought BigT” entails “Acme owns BigT”. Zanzotto has been interested in this scientific endeavor since its beginning in 2005 and published a book on the topic [5], co-authored with Ido Dagan, Mark Sammons, and Dan Roth. He is mainly interested in studying the application of machine learning models to RTE. After a first exploratory year in 2005 [13], while investigating the application of ML models to the RTE task, Zanzotto produced a major innovation for the RTE field itself [21, 20, 15] as well as for the related fields of natural language processing, machine learning [12], and graph analysis applied to text [23]. Whether the innovation can also generate interest in graph theory is still under analysis, as it involves graph isomorphism on a particular class of graphs. The innovation is the following: in the context of machine learning models based on kernel functions, he proposed a novel class of feature spaces encoding first-order rewrite rules. With this new class of feature spaces and the related kernel functions, Zanzotto and his colleagues produced a system that ranked 3rd in the 2006 worldwide RTE Challenge (1st among the academic systems) and 5th in the 2007 RTE Challenge. Exploring the full potential of this idea is still ongoing research, as the class of first-order rewrite rule feature spaces can be applied in many areas of natural language processing (machine translation, document summarization, stylistic control systems, question answering, and dialogue models) as well as in many other research areas.
For this work, Zanzotto has been invited to the program committees of all the textual entailment recognition workshops and challenges after the first. In 2009, Zanzotto co-chaired the ACL workshop on Applied Textual Inference, whose program committee included researchers in the areas of textual entailment recognition and natural language processing. In the same year, he co-organized the Italian chapter of the RTE challenge in Evalita 2009. Together with Ido Dagan and Dan Roth, Zanzotto gave a tutorial titled “Textual Entailment Recognition” at the 45th Annual Meeting of the Association for Computational Linguistics (ACL). He co-organized two editions of the TextGraphs Workshop series (2010, 2011). He has regularly served on the program committee of TextGraphs (2007, 2008, 2012, 2013), and he served on the program committee of the workshop MLG-2010: Mining and Learning from Graphs. He was area chair for Textual Entailment at *Sem (2013).
Area of Syntactic Parsing
In the early stages of his career, Zanzotto developed a robust natural language syntactic parser [1, 2, 3, 4] based on two principles: lexicalization and modularization. He proposed a novel model for representing syntactic information in modular parsing systems that combines two alternative theories: the dependency-based and the constituency-based models. The resulting syntactic parser is still one of the few available syntactic parsing systems for Italian. For this work, Zanzotto was invited as a program committee member to the two workshops on RObust Methods for Analyzing Natural language Data (ROMAND) (2004, 2006). Owing to his expertise in Italian parsing technologies, he participated as co-organizer in the “parsing track” of the Evalita 2007 Challenge, wrote the parsing technology chapter of an Italian Artificial Intelligence book, and gave a tutorial at the Italian Association for Artificial Intelligence in 2001.
General information
Zanzotto is the author of more than 100 publications. He is a reviewer for the major conferences in the areas of NLP and AI (ACL, NAACL, EACL, EMNLP, COLING, LREC, IJCAI, ECAI, CLEF) and for journals (Computational Linguistics, Journal of Natural Language Engineering, Cognitive Computation Journal, ACM Transactions on Intelligent Systems, IEEE Data and Knowledge Engineering). He is a member of ACL and of the Italian Association for Artificial Intelligence (AIIA).