FEDERICO D'ONOFRIO

PhD (Dottore di ricerca)

Cycle: XXXVI


Supervisor: Laura Palagi

Thesis title: Novel Mixed-Integer Optimization Models and Algorithms for Interpretable SVMs

In the age of Big Data and complex black-box Machine Learning (ML) models, the pursuit of inherently interpretable models has moved from a desirable attribute to a critical necessity. Indeed, the lack of explanations for black-box ML decisions is a major obstacle to their adoption in real-world decision domains. Interpretable ML aims to develop models that provide explanatory insights into their decision-making process. However, developing such interpretable models often involves solving complex optimization problems that demand significant computational resources, particularly on large datasets.

Support Vector Machines (SVMs) are a powerful ML tool for supervised classification: they construct a hyperplane that trades off the misclassification error against the distance between the points of the two classes (the margin), yielding robust and accurate decision boundaries even on complex datasets. In linear SVMs the hyperplane is constructed directly in the input space of the data, while in nonlinear SVMs the data is mapped into a higher-dimensional space where such a hyperplane can be constructed, resulting in nonlinear decision boundaries in the input space. SVM classifiers are usually not interpretable, even in the linear case, because they are dense with respect to the number of features used.

This thesis focuses on Mixed-Integer Programming formulations and algorithms that mitigate the challenges of training interpretable SVMs for supervised classification. In particular, we study the case in which a cardinality constraint on the maximum number of selected data features is embedded in the standard optimization models. For linear SVMs, we propose two Mixed-Integer Quadratic models and develop both heuristic and exact algorithms that leverage efficient relaxations to solve them faster than off-the-shelf commercial solvers.
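As a reference point, a standard way to embed a cardinality constraint in the soft-margin linear SVM is a big-M Mixed-Integer Quadratic model of the following generic form (a sketch only, not the two specific models proposed in the thesis; here C is the regularization parameter, B the feature budget, M a sufficiently large constant, and z_j indicates whether feature j is selected):

```latex
\begin{aligned}
\min_{w,\,b,\,\xi,\,z}\quad & \tfrac{1}{2}\lVert w\rVert_2^2 + C\sum_{i=1}^{m}\xi_i \\
\text{s.t.}\quad & y_i\,(w^{\top}x_i + b) \ge 1 - \xi_i, \qquad \xi_i \ge 0, && i = 1,\dots,m, \\
& -M z_j \le w_j \le M z_j, && j = 1,\dots,n, \\
& \sum_{j=1}^{n} z_j \le B, \qquad z_j \in \{0,1\}, && j = 1,\dots,n.
\end{aligned}
```

When z_j = 0 the big-M constraint forces w_j = 0, so at most B features can receive a nonzero weight, which is the source of the model's interpretability.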
For the nonconvex Mixed-Integer Nonlinear Programming problem that arises when feature selection is embedded in nonlinear SVM models, we develop both local-search metaheuristics, which find good solutions in reasonable computational times for general kernels, and decomposition approaches based on linearization techniques or submodular function optimization when polynomial kernels are adopted. The value of these algorithms extends beyond SVMs for classification: in particular, they can be extended to other ML models embedding the SVM framework, or to other nonlinear kernel-based ML models for regression or clustering tasks. As an example, we also propose an extension of one of the heuristic methods for linear SVM feature selection to Optimal Classification Trees that employ SVM splits.
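The local-search idea for kernel SVM feature selection can be sketched as a swap-based hill climb over feature subsets of fixed cardinality. The sketch below is a minimal illustration under assumed simplifications: `score` is a placeholder for an evaluation oracle (e.g., validation accuracy of a kernel SVM trained on the given subset), and the thesis's actual metaheuristics are not reproduced here.

```python
import random

def local_search_feature_subset(score, n_features, budget, iters=100, seed=0):
    """Swap-based local search over feature subsets of size `budget`.

    `score(subset)` is assumed to return a quality value (higher is better)
    for training on that subset -- a placeholder for, e.g., the validation
    accuracy of a kernel SVM fitted on the selected features.
    """
    rng = random.Random(seed)
    # Start from a random subset of the required cardinality.
    current = set(rng.sample(range(n_features), budget))
    best_val = score(current)
    for _ in range(iters):
        # Propose swapping one selected feature for one unselected feature.
        inside = rng.choice(sorted(current))
        outside = rng.choice(sorted(set(range(n_features)) - current))
        candidate = (current - {inside}) | {outside}
        val = score(candidate)
        if val > best_val:  # accept only improving swaps (hill climbing)
            current, best_val = candidate, val
    return sorted(current), best_val
```

The cardinality constraint is maintained by construction: every move exchanges exactly one selected feature for one unselected feature, so the subset size never changes.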

Scientific output

11573/1657211 - 2024 - Margin Optimal Classification Trees
D’Onofrio, Federico; Grani, Giorgio; Monaci, Marta; Palagi, Laura - 01a Journal article
journal: COMPUTERS & OPERATIONS RESEARCH (Elsevier) - issn: 0305-0548 - wos: WOS:001097562000001 (4) - scopus: 2-s2.0-85175167637 (4)

11573/1675899 - 2023 - Optimization-based approaches for learning Optimal Classification Trees
D'Onofrio, Federico; Monaci, Marta; Palagi, Laura - 01a Journal article
journal: IFORS NEWS (International Federation of Operational Research Societies) pp. 5-7 - issn: 2223-4373

© Università degli Studi di Roma "La Sapienza" - Piazzale Aldo Moro 5, 00185 Roma