ILARIA BOMBELLI

PhD graduate

cycle: XXXVI


supervisor: Maurizio Vichi

Thesis title: Fuzzy clustering for complex data structures

Multidimensional phenomena are often represented by complex data structures. With the rapid growth of data availability and complexity, new methodologies are needed to handle this kind of data. Among complex data structures, considerable interest has been devoted to three-dimensional data and network data, since many applications can be represented as such. Among methodological techniques, cluster analysis is one of the most popular and successful for data exploration and characterization. However, existing methodologies for describing and analyzing such complex data take a hard approach to clustering, even though many applications call for a fuzzy approach, which allows better interpretation of the results and brings them closer to reality. This thesis proposes new methodologies for applying fuzzy clustering to complex data structures such as three-way data and network data. The fuzzy approach to clustering proves extremely useful in the simulations and real-world applications discussed throughout the chapters. The first chapter introduces the notion of complex data structures and frames the problem, highlighting the rationale behind the proposed methodologies through theoretical discussion and real-world practical examples. The second chapter provides the terminology used throughout the thesis and definitions of basic concepts. Chapters three to six present four research works. The first work introduces the notion of three-way three-mode data: a data array made up of several units-by-variables matrices, each of which refers to a specific occasion (usually a time point); by applying hierarchical clustering techniques to each units-by-variables data matrix, a set of hierarchies (dendrograms) is obtained.
The new methodology obtains a fuzzy partition of the set of hierarchies and simultaneously identifies a consensus hierarchy within each class of the partition. The second work can be considered an extension of the first. Given a set of hierarchies, the proposed methodology obtains a fuzzy partition of them and identifies a parsimonious consensus dendrogram within each class of the partition. The notion of parsimony is discussed extensively in the corresponding chapter; here it suffices to recall that a parsimonious dendrogram gives a clear and direct picture of how units aggregate into clusters, highlighting only the most important aggregations and removing misleading ones. The third work introduces a new methodological proposal to obtain a fuzzy partition of a three-way three-mode data array, with a corresponding consensus matrix for each class in the partition, while simultaneously reducing the dimension of the variables in the consensus matrices by applying a disjoint second-order factor analysis. The motivation and theoretical background are discussed in the corresponding chapter. Finally, the last work focuses on how to apply different fuzzy clustering techniques to a set of networks. The main issue in this kind of problem is how to represent networks so that they can be given as input to the clustering algorithms. Several representations of networks involving probability distributions and graph embedding techniques are presented and discussed. The last chapter summarizes the main contents of the thesis, recalling the methodological proposals and emphasizing their relevance and contribution, especially their strength when applied to real scenarios. It closes by emphasizing the need for a fuzzy approach to clustering and its main advantages.

Publications

11573/1689871 - 2024 - Representing ensembles of networks for fuzzy cluster analysis: a case study.
Bombelli, I.; Manipur, I.; Guarracino, M. R.; Ferraro, M. B. - 01a Journal article
journal: DATA MINING AND KNOWLEDGE DISCOVERY (Kluwer Academic Publishers, Dordrecht, Netherlands) pp. 725-747 - issn: 1384-5810 - wos: WOS:001079644300003 (1) - scopus: 2-s2.0-85173968576 (0)

11573/1713239 - 2024 - Parsimonious consensus hierarchies, partitions and fuzzy partitioning of a set of hierarchies
Bombelli, Ilaria; Vichi, Maurizio - 01a Journal article
journal: STATISTICS AND COMPUTING (Heidelberg : Springer Dordrecht Netherlands: Kluwer Academic Publishers) pp. 1-19 - issn: 0960-3174 - wos: WOS:001201354600001 (0) - scopus: 2-s2.0-85190304917 (0)

11573/1678389 - 2023 - Consensus and fuzzy partition of dendrograms from a three-way dissimilarity array
Bombelli, I.; Ferraro, M. B.; Vichi, M. - 01a Journal article
journal: INFORMATION SCIENCES (Amsterdam; Boston: Elsevier 1968-) pp. 1-21 - issn: 0020-0255 - wos: WOS:000990696300001 (1) - scopus: 2-s2.0-85152604910 (1)

11573/1688911 - 2023 - Mobility trends in Italy during the first wave of Covid-19 pandemic: analysis on Google data
Bombelli, Ilaria; De Rocchi, Daniele - 04b Conference paper in volume
conference: SIS 2023 - Statistical Learning, Sustainability and Impact Evaluation (Ancona; Italy)
book: SEAS IN Book of the Short Papers - (9788891935618)

11573/1688913 - 2023 - Cluster analysis for networks using a fuzzy approach
Bombelli, Ilaria; Manipur, Ichcha; Ferraro, Maria Brigida - 04b Conference paper in volume
conference: 14-th Scientific Meeting Classification and Data Analysis Group (Salerno)
book: CLADAG 2023. Book of abstract and short papers - (9788891935632)

11573/1678388 - 2023 - Children's Online Safety: Predictive Factors of Cyberbullying and Online Grooming Involvement
Tintori, Antonio; Ciancimino, Giulia; Bombelli, Ilaria; De Rocchi, Daniele; Cerbara, Loredana - 01a Journal article
journal: SOCIETIES (Basel: MDPI) pp. 1-18 - issn: 2075-4698 - wos: WOS:000941481200001 (6) - scopus: 2-s2.0-85148723724 (6)

11573/1657394 - 2022 - Community detection in networks: a heuristic version of Girvan Newman algorithm
Bombelli, Ilaria; Di Rocco, Lorenzo - 04b Conference paper in volume
conference: SIS2022 - 51ST SCIENTIFIC MEETING OF THE ITALIAN STATISTICAL SOCIETY (Caserta; Italy)
book: Book of the short papers - (9788891932310)

11573/1577171 - 2021 - Graph nodes clustering: a comparison between algorithms
Bombelli, Ilaria - 04b Conference paper in volume
conference: 50th edition of the Scientific Meeting of the Italian Statistical Society. (Pisa; Italy)
book: Book of short papers - SIS 2021 - (9788891927361)

© Università degli Studi di Roma "La Sapienza" - Piazzale Aldo Moro 5, 00185 Roma