Thesis title: Safeguarding Confidentiality through Usage Control over the Process Mining Lifecycle
Data represents a key resource for modern organizations, but its protection has become increasingly challenging. During collection, storage, processing, and dissemination, information is repeatedly exposed to potential breaches of confidentiality and misuse. Traditional security mechanisms, mainly focused on access control, prove insufficient once data is legitimately accessed, as they no longer regulate how information is subsequently used. This limitation becomes particularly critical in collaborative and data-driven contexts, where analytics and optimization tasks are increasingly outsourced to third parties. Although such collaborations foster innovation and efficiency, they also introduce a significant trust gap, since data owners lose visibility and control over how their information is processed and reused beyond their domain.
This tension is especially evident in process mining, a discipline that extracts insights from event data to analyze and optimize business processes. While process mining provides visibility into operational performance, the event logs on which it relies often contain highly sensitive or personal information. Conventional techniques such as anonymization, pseudonymization, and encryption only partially mitigate disclosure risks and often compromise data fidelity.
Ensuring end-to-end protection therefore requires an approach that extends confidentiality, integrity, and accountability across the entire data lifecycle, from input to output. This vision aligns with privacy regulations such as the General Data Protection Regulation (GDPR), which emphasize continuous control, traceability, and responsible data usage.
This thesis investigates how Confidential Computing can be leveraged to enable end-to-end data protection and governance in the context of process analysis. The main research question guiding the work is therefore formulated as follows:
RQ: Confidential Computing for End-to-End Data Governance
How can Confidential Computing enable end-to-end data protection and governance for sensitive information treatment in process analysis?
To address this question, we structured the research into three objectives, each targeting a specific phase of the process mining lifecycle.
RO1: Input data
Ensure the confidentiality of shared and exchanged data, protecting it from unauthorized access or exposure.
This objective focuses on guaranteeing that data confidentiality is preserved from the moment data leaves its source, ensuring that only authorized entities can access it within controlled environments.
Once the confidentiality of input data is ensured, the second objective extends protection to the computational phase, where data is actively processed and analyzed.
RO2: Data processing
Enable secure processing of shared data, ensuring that sensitive information remains protected and confidential during computation.
The objective is to guarantee that computation itself does not compromise secrecy, maintaining the confidentiality of the data even when it is in use.
After computation, confidentiality must continue to hold for the results produced. The third objective therefore focuses on the protection and controlled dissemination of the derived information.
RO3: Output information
Regulate the utilization of output information to ensure its confidentiality and prevent unauthorized disclosure.
This objective addresses the fact that confidentiality risks persist even after computation, ensuring that the produced outputs are securely handled until their controlled release.
Together, these objectives establish a continuous protection chain that covers the entire process mining lifecycle, forming the foundation for a comprehensive and verifiable approach to end-to-end data governance through Confidential Computing.
This thesis explores how Confidential Computing can be leveraged to preserve data confidentiality end to end throughout the process mining lifecycle. Confidential Computing extends data protection beyond data at rest to also include data in use, by relying on isolated, hardware-based, and cryptographically protected environments that prevent unauthorized access, even from privileged system software.
In this regard, the thesis presents a set of approaches designed to ensure end-to-end data confidentiality in cooperative and collaborative settings, where multiple parties jointly pursue a common objective but remain reluctant to share their data. This hesitation arises from the lack of control over the data once it has been transmitted, which exposes it to potential misuse or leakage.
This thesis introduces frameworks that incrementally cover all stages of the process mining lifecycle: from input preservation through a Solid-based architecture and ReGov, to secure data processing with CONFINE, and finally to confidentiality-preserving output management with ProMiSe. Collectively, these contributions provide a coherent pathway for achieving end-to-end data confidentiality across the entire process mining lifecycle.
The Solid-based approach extends the Solid protocol with blockchain and Trusted Execution Environments (TEEs) to ensure that data shared from Solid personal online datastores is accessed and used only within secure environments under verifiable policies. ReGov extends and generalizes this concept into a decentralized governance framework, separating on-chain policy management from off-chain confidential enforcement. Together, these solutions enable controlled, auditable, and privacy-preserving data sharing while maintaining full data confidentiality and usage control during the input phase of the process mining lifecycle. The publications related to these contributions are as follows:
-Basile, D., Di Ciccio, C., Goretti, V., Kirrane, S.: A blockchain-driven architecture for usage control in Solid. In: International Conference on Distributed Computing Systems, ICDCS 2023 - Workshops. pp. 19–24. IEEE (2023). doi: 10.1109/ICDCSW60045.2023.00009, https://doi.org/10.1109/ICDCSW60045.2023.00009
-Basile, D., Di Ciccio, C., Goretti, V., Kirrane, S.: Blockchain based resource governance for decentralized web environments. Frontiers in Blockchain 6, 1141909 (May 2023). doi: 10.3389/fbloc.2023.1141909, https://doi.org/10.3389/fbloc.2023.1141909
In CONFINE, we move from secure data provision to the confidential execution of process mining techniques. The framework enables multiple organizations to collaboratively analyze process models without revealing their raw event data. CONFINE achieves this by executing all mining operations inside Trusted Execution Environments, to which event logs are transmitted and in which they are processed under verified confidentiality guarantees. The architecture enables each participant to act both as a data provider and as a data requester within the collaborative process analysis. Leveraging Trusted Execution Environments, users can securely share their event data and request those of other collaborators for processing, without disclosing any sensitive information. These interactions are coordinated by the CONFINE protocol, which manages attestation, secure data exchange, and computation. In this way, CONFINE ensures that execution data remain protected throughout the processing phase, enabling accurate process discovery while preserving privacy across organizational boundaries. The CONFINE-related publications produced as part of this thesis are listed below.
-Goretti, V., Basile, D., Barbaro, L., Di Ciccio, C.: Trusted execution environment for decentralized process mining. In: CAiSE. vol. 14663, pp. 509–527. Springer (2024). doi: 10.1007/978-3-031-61057-8_30, https://doi.org/10.1007/978-3-031-61057-8_30
-Goretti, V., Basile, D., Barbaro, L., Di Ciccio, C.: CONFINE: preserving data secrecy in decentralized process mining. In: Doctoral Consortium and Demo Track ICPM 2024. vol. 3783 (2024), https://ceur-ws.org/Vol-3783/paper_324.pdf
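The three steps attributed to the CONFINE protocol above (attestation, secure data exchange, computation) can be illustrated with a minimal, purely simulated sketch. It is not the CONFINE implementation: no real TEE is involved, the "measurement" is simply a hash of the miner code, and all names (`Provider`, `SimulatedEnclave`, `measure`, `directly_follows`) are hypothetical. The directly-follows count stands in for an arbitrary process mining computation.

```python
import hashlib
import hmac
from collections import Counter


def measure(code: bytes) -> str:
    """Stand-in for a TEE measurement: a hash of the code loaded in the enclave."""
    return hashlib.sha256(code).hexdigest()


class Provider:
    """A data provider that releases its event log only after attestation succeeds."""

    def __init__(self, name, traces, trusted_measurement):
        self.name = name
        self._traces = traces
        self._trusted = trusted_measurement

    def release_log(self, reported_measurement):
        # Attestation step (simulated): compare the reported measurement
        # with the one this provider has agreed to trust.
        if not hmac.compare_digest(reported_measurement, self._trusted):
            raise PermissionError(f"{self.name}: attestation failed")
        # Data exchange step: a real system would send the log over an
        # encrypted channel terminating inside the enclave.
        return [list(trace) for trace in self._traces]


def directly_follows(traces):
    """Count how often activity a is immediately followed by activity b."""
    counts = Counter()
    for trace in traces:
        counts.update(zip(trace, trace[1:]))
    return counts


class SimulatedEnclave:
    """Hypothetical enclave: collects logs from providers and mines them."""

    def __init__(self, code: bytes):
        self.report = measure(code)

    def mine(self, providers):
        collected = []
        for provider in providers:
            collected.extend(provider.release_log(self.report))
        # Computation step: only the aggregate result leaves this method.
        return directly_follows(collected)


MINER_CODE = b"directly-follows miner v1"  # assumed, illustrative payload
trusted = measure(MINER_CODE)
org_a = Provider("OrgA", [["register", "check", "pay"]], trusted)
org_b = Provider("OrgB", [["register", "pay"]], trusted)
dfg = SimulatedEnclave(MINER_CODE).mine([org_a, org_b])
```

In this sketch, an enclave started with different code reports a different measurement, so `release_log` raises `PermissionError` and no event data is handed over.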
With ProMiSe, we extend the work to the confidential management of process mining results, ensuring that privacy guarantees are maintained even after computation. The framework enables organizations to perform process discovery within a trusted environment, where data owners can define and automatically enforce fine-grained rules regulating data access, algorithm execution, and result disclosure. By combining usage-control mechanisms with Trusted Execution Environments, ProMiSe ensures that all analyses and outputs remain compliant with established confidentiality policies. In doing so, it extends end-to-end protection to the output phase, completing the confidentiality chain initiated with the previous contributions. The publication related to ProMiSe is the following.
-Goretti, V., Kirrane, S., Di Ciccio, C.: Usage control for process discovery through a trusted execution environment. In: ICSOC (2025), to appear
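The kind of rule described for ProMiSe above, covering data access, algorithm execution, and result disclosure, can be sketched as a small policy object checked before a result is released. This is an illustrative assumption, not the ProMiSe policy language: the class `UsagePolicy`, its fields, and the disclosure cap are hypothetical, and in a real deployment the check would run inside the trusted environment.

```python
from dataclasses import dataclass


@dataclass
class UsagePolicy:
    """Hypothetical owner-defined rule set for one event log."""

    allowed_requesters: frozenset
    allowed_algorithms: frozenset
    max_disclosures: int
    disclosures: int = 0

    def authorize(self, requester: str, algorithm: str) -> bool:
        # Data access and algorithm execution are both constrained.
        return (
            requester in self.allowed_requesters
            and algorithm in self.allowed_algorithms
        )

    def disclose(self, requester: str, algorithm: str, result):
        # Result disclosure is checked separately and counted, so the
        # policy keeps applying after the computation has finished.
        if not self.authorize(requester, algorithm):
            raise PermissionError("requester or algorithm not permitted")
        if self.disclosures >= self.max_disclosures:
            raise PermissionError("disclosure limit reached")
        self.disclosures += 1
        return result


policy = UsagePolicy(
    allowed_requesters=frozenset({"auditor"}),
    allowed_algorithms=frozenset({"inductive_miner"}),
    max_disclosures=1,
)
model = policy.disclose("auditor", "inductive_miner", {"model": "placeholder"})
```

Keeping the disclosure counter inside the policy object reflects the point made above that confidentiality obligations persist after computation, rather than ending once access has been granted.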