LORENZO LASTILLA

PhD Graduate

PhD program: XXXIV


Supervisor: Silvia Ferrara

Thesis title: Digital Palaeography as a matter of sign representation: from the reconstruction of the objective shape of the inscriptions to their analysis through Deep Neural Networks

Digital palaeography can be concisely defined as the combination of traditional palaeographic methods (aimed at studying historic writing systems and at analyzing, deciphering, and dating handwritten texts on the basis of very specific stylistic features) with techniques for digitizing inscriptions, tools and algorithms for data processing and (shared) analysis, and databases for data management. As part of the broader field of Digital Humanities, this discipline has benefited in recent years, on the one hand, from increasingly advanced digitization techniques (such as Reflectance Transformation Imaging, Multispectral Imaging, X-ray Computed Tomography, and 3D modelling) and, on the other, from the development of sophisticated algorithms for analyzing already digitized inscriptions, mainly data-driven methodologies and, more precisely, machine and deep learning solutions. This thesis belongs to this relatively recent research field and focuses on the fundamental problem of the correct, high-quality representation of signs, inscriptions, and handwritten texts in general. In particular, the concept of representation of the signs (that is, their re-elaboration through a simplifying model) is explored from two different perspectives. On the one hand, if we consider the problem of inscription digitization, a high-level representation of an inscription is a replica of the signs that is as accurate as possible from a metric and morphological point of view, so that close inspection is possible even remotely and even the smallest details, including those not immediately visible to the naked eye, are highlighted.
On the other hand, the same concept has been studied from the point of view of the automatic analysis of inscriptions that are already available in digital format: in this case, an optimal representation is a transformation of the original data (for example, from an image of a sign to a feature vector) that maximizes the performance of an algorithm solving a task involving that data. The first interpretation was explored through the digitization of four undeciphered writing systems: three in use in the Aegean in the second millennium BCE (Cretan Hieroglyphic, Linear A, and Cypro-Minoan), plus Rongorongo (of uncertain dating), used on Easter Island. This research direction was pursued within the ERC project INSCRIBE, with Prof. Silvia Ferrara as Principal Investigator. Since these writing systems are generally “three-dimensional”, typically carved on very small objects (of the order of 1-5 cm) and characterized by signs a few tenths of a millimeter deep, their digitization was mainly carried out through accurate, high-resolution 3D modelling techniques, such as macro-photogrammetry (combined with focus stacking) and structured light scanning, which can guarantee an accuracy of the order of 1-2 hundredths of a millimeter. During the Ph.D., a total of 166 3D models were acquired and used to assess accuracy, precision, texture quality, and legibility. The results clearly demonstrate the high potential of the adopted methodologies (refined with specific measures) for producing representations as faithful as possible to the original inscriptions. The second line of investigation, representation from the point of view of the automatic analysis of inscriptions, was explored through two case studies.
The first involves the signs of the undeciphered Cretan Hieroglyphic writing system: here, the goal is to train an encoder of the signs (based on a deep residual network) to produce representations, or feature vectors, suitable for sign classification according to the Cretan Hieroglyphic sign repertoire. For the second case study, a group of digitized medieval and modern manuscripts from the Vatican Apostolic Library was selected, with the aim of training an encoder, again based on a deep residual network, to extract from the manuscript pages representations useful for handwriting identification, that is, the subdivision of the manuscripts into parts written by distinct scribes on the basis of their handwriting style. For both case studies, however, a sufficient amount of annotated or labeled data was missing (a typical problem in the palaeographic domain, known as “data scarcity”). This made the direct use of supervised learning techniques impossible, although until recently they have been the most efficient strategy for extracting high-level representations from data. To cope with this, a self-supervised pretraining solution was tested in both cases (using a method available in the literature and based on the reconstruction of Bags of Visual Words), capable of leveraging large amounts of unlabeled data to learn high-level representations. For both case studies, including a self-supervised pretraining phase was shown to benefit performance on the task itself, and therefore the extraction of useful representations from the data. In summary, this thesis has focused on the concept of optimal representation of inscriptions and handwritten texts in the context of digital palaeography.
Among the main contributions are the definition of 3D modelling techniques and practices for the accurate, high-resolution reconstruction of inscriptions (plus some innovations in the processing and post-processing stages of the 3D models), and the identification of self-supervised learning as a powerful means of solving problems of interest in palaeography through data-driven approaches, even though this research field is intrinsically affected (at least at present) by a lack of annotated data.

Research products

  • 11573/1284553 - 2019 - FOSS4G DATE for DSM generation: Sensitivity analysis of the semi-global block matching parameters (04c Conference paper in journal)
    LASTILLA, LORENZO; RAVANELLI, ROBERTA; FRATARCANGELI, FRANCESCA; DI RITA, MARTINA; NASCETTI, ANDREA; CRESPI, MATTIA GIOVANNI
  • 11573/1122667 - 2018 - 3D modelling of archaeological small finds by the structure sensor range camera: comparison of different scanning applications (01a Journal article)
    RAVANELLI, ROBERTA; LASTILLA, LORENZO; NASCETTI, ANDREA; DI RITA, MARTINA; NIGRO, LORENZO; MONTANARI, DARIA; SPAGNOLI, FEDERICA; CRESPI, MATTIA GIOVANNI
  • 11573/1122691 - 2017 - 3D modelling by low-cost range camera: software evaluation and comparison (04c Conference paper in journal)
    RAVANELLI, ROBERTA; LASTILLA, LORENZO; CRESPI, MATTIA GIOVANNI
  • 11573/1290635 - 2019 - 3D high-quality modeling of small and complex archaeological inscribed objects: Relevant issues and proposed methodology (04c Conference paper in journal)
    LASTILLA, LORENZO; RAVANELLI, ROBERTA
  • 11573/1611118 - 2022 - Self-supervised learning for medieval handwriting identification. A case study from the Vatican Apostolic Library (01a Journal article)
    LASTILLA, LORENZO; FIRMANI, DONATELLA; SCARDAPANE, SIMONE
  • 11573/1345994 - 2019 - Orthoimage Generation by GÖKTÜRK-1: A Test Case in Rome (04c Conference paper in journal)
    RAVANELLI, ROBERTA; LASTILLA, LORENZO; CRESPI, MATTIA GIOVANNI
  • 11573/1490844 - 2019 - 3D modelling of the Mamari tablet from the Rongorongo corpus. Acquisition, processing issues, and outcomes (04c Conference paper in journal)
    LASTILLA, LORENZO; RAVANELLI, ROBERTA
  • 11573/1495932 - 2021 - DSM Generation from Single and Cross-Sensor Multi-View Satellite Images Using the New Agisoft Metashape: The Case Studies of Trento and Matera (Italy) (01a Journal article)
    LASTILLA, LORENZO; BELLONI, VALERIA; RAVANELLI, ROBERTA; CRESPI, MATTIA GIOVANNI
  • 11573/1565846 - 2020 - First Test of Agisoft Metashape Satellite Image Processing for DSM Generation. A Case Study in Trento with Pléiades Imagery (04b Conference paper in volume)
    LASTILLA, LORENZO; RAVANELLI, ROBERTA; CRESPI, MATTIA GIOVANNI
  • 11573/1581676 - 2021 - Sharing soil and building geophysical data for seismic characterization of cities using CLARA Webgis. A case study of Matera (southern Italy) (01a Journal article)
    LASTILLA, LORENZO; BELLONI, VALERIA; RAVANELLI, ROBERTA
  • 11573/1611101 - 2021 - Modelling the Rongorongo tablets. A new transcription of the Échancrée tablet and the foundation for decipherment attempts (01a Journal article)
    LASTILLA, LORENZO; RAVANELLI, ROBERTA
  • 11573/1611110 - 2022 - A high-resolution photogrammetric workflow based on focus stacking for the 3D modeling of small Aegean inscriptions (01a Journal article)
    RAVANELLI, ROBERTA; LASTILLA, LORENZO
  • 11573/1670826 - 2023 - CycleDRUMS: automatic drum arrangement for bass lines using CycleGAN (01a Journal article)
    TRAPPOLINI, GIOVANNI; LASTILLA, LORENZO; CAMPAGNANO, CESARE; SILVESTRI, FABRIZIO
  • 11573/1552590 - 2021 - POSE-ID-on—A Novel Framework for Artwork Pose Clustering (01a Journal article)
    MARSOCCI, VALERIO; LASTILLA, LORENZO

© Università degli Studi di Roma "La Sapienza" - Piazzale Aldo Moro 5, 00185 Roma