Karolinska Institutet

Bridging process mining and clinical epidemiology : a framework for discovering disease trajectories in real-world data

thesis
posted on 2025-11-19, 12:55 authored by Kaile Chen
<p dir="ltr">Process mining bridges data mining and process models by extracting knowledge from event logs to discover, monitor, and improve real-world processes. While widely applied in business and manufacturing, its potential in clinical epidemiology remains underexplored. This doctoral thesis addresses this gap by progressively developing and applying process mining methodology to uncover meaningful disease trajectories in chronic kidney disease using real-world clinical data.</p><p dir="ltr">We first conducted a systematic review of process mining applications in chronic disease research, revealing that while process mining had matured in healthcare workflow optimisation, its use in population-based epidemiology remained limited. The review identified key methodological gaps and established that integration with statistical methods is essential for moving beyond descriptive visualisation to comparative inference (Paper I). We then assessed the feasibility of applying process mining to a validated epidemiological cohort, confirming that process mining can reliably reconstruct estimated glomerular filtration rate pathways and reveal dynamic, bidirectional disease progression patterns that traditional time-to-event methods overlook (Paper II).</p><p dir="ltr">With feasibility confirmed by the first two studies, we proposed and validated an eight-step framework integrating epidemiological design with data-driven process mining techniques encompassing process discovery, statistical testing, and prediction (Paper III). To address interpretation challenges inherent in complex process maps, we developed HealthProcessAI (Paper IV), a generative AI framework that integrates large language models (LLMs) with existing process mining libraries (PM4Py, bupaR) to automate interpretation and generate clinical reports.</p><p dir="ltr">Papers V and VI applied the developed methodology to clinical questions. 
Paper V examined how proton pump inhibitor versus H2 blocker use affects trajectories of chronic kidney disease, cardiovascular events, and death. Process models revealed that proton pump inhibitor users experience faster progression to chronic kidney disease and cardiovascular disease, with chronic kidney disease appearing as an intermediate step linking medication use to cardiovascular outcomes. However, the analysis also demonstrated that the overall effect diminishes when accounting for the competing risk of death, illustrating how process mining makes competing pathways visible in ways that standard time-to-event analysis does not. Paper VI analysed 29,901 adults surviving myocardial infarction or heart failure, stratified by chronic kidney disease severity. The study demonstrated that increasing chronic kidney disease severity was associated with greater care complexity, more frequent accumulation of circulatory diagnoses, and high-risk trajectories involving multiple organ systems, all strongly associated with mortality. These findings reveal not just which comorbidities occur, but how they accumulate in temporal sequence, providing potential targets for early intervention.</p><p dir="ltr">Together, these studies demonstrate the feasibility and added value of integrating process mining with clinical epidemiology. This approach offers a useful, data-driven methodology for understanding disease dynamics that complements traditional epidemiological methods.</p><h3>List of scientific papers</h3><p dir="ltr">This PhD thesis is based on the following papers.</p><p dir="ltr">I. <b>Chen K,</b> Abtahi F, Carrero JJ, Fernandez-Llatas C, Seoane F. Process mining and data mining applications in the domain of chronic diseases: a systematic review. Artif Intell Med. 144, 102645 (2023). <a href="https://doi.org/10.1016/j.artmed.2023.102645" rel="noreferrer" target="_blank">https://doi.org/10.1016/j.artmed.2023.102645</a></p><p dir="ltr">II. 
<b>Chen K,</b> Abtahi F, Xu H, Fernandez-Llatas C, Carrero JJ, Seoane F. The Assessment of the Association of Proton Pump Inhibitor Usage with Chronic Kidney Disease Progression through a Process Mining Approach. Biomedicines. 12, 1362 (2024). <a href="https://doi.org/10.3390/biomedicines12061362" rel="noreferrer" target="_blank">https://doi.org/10.3390/biomedicines12061362</a></p><p dir="ltr">III. <b>Chen K,</b> Abtahi F, Carrero JJ, Fernandez-Llatas C, Xu H, Seoane F. Validation of an interactive process mining methodology for clinical epidemiology through a cohort study on chronic kidney disease progression. Sci Rep. 14, 27997 (2024). <a href="https://doi.org/10.1038/s41598-024-79704-5" rel="noreferrer" target="_blank">https://doi.org/10.1038/s41598-024-79704-5</a></p><p dir="ltr">IV. Illueca-Fernandez E, <b>Chen K,</b> Seoane F, Abtahi F. HealthProcessAI: A Technical Framework and Proof-of-Concept for LLM-Enhanced Healthcare Process Mining. [Submitted]</p><p dir="ltr">V. <b>Chen K,</b> Abtahi F, Fernandez-Llatas C, Xu H, Seoane F. Longitudinal trajectories unravel the complex interplay of medication, cardiovascular events, chronic kidney disease, and mortality. Sci Rep. 15, 35577 (2025). <a href="https://doi.org/10.1038/s41598-025-23527-5" rel="noreferrer" target="_blank">https://doi.org/10.1038/s41598-025-23527-5</a></p><p dir="ltr">VI. <b>Chen K,</b> Xu H, Abtahi F, Fernandez-Llatas C, Seoane F, Carrero JJ. Mapping the Complexity of Care in Patients with Myocardial Infarction or Heart Failure Living with Chronic Kidney Disease. [Submitted]</p>
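<p dir="ltr">As a minimal illustration of what "discovering a process from an event log" can mean in this setting, the sketch below counts directly-follows transitions between kidney function stages per patient. It is not the thesis's eight-step framework and does not use PM4Py or bupaR; the event log rows, the name <code>EVENT_LOG</code>, and the helper <code>directly_follows</code> are hypothetical and were invented for this example.</p>

```python
from collections import Counter
from datetime import date

# Hypothetical toy event log: (patient id, event date, CKD stage label).
# These rows are invented for illustration; they are not data from the thesis.
EVENT_LOG = [
    ("p1", date(2020, 1, 10), "G3a"),
    ("p1", date(2020, 7, 2), "G3b"),
    ("p1", date(2021, 1, 15), "G3a"),
    ("p2", date(2020, 3, 5), "G3a"),
    ("p2", date(2020, 9, 20), "G3b"),
    ("p2", date(2021, 4, 1), "G4"),
    ("p3", date(2020, 2, 11), "G3b"),
    ("p3", date(2020, 8, 30), "G4"),
]


def directly_follows(event_log):
    """Count how often one activity is immediately followed by another within a case."""
    traces = {}
    # Sort by case id and timestamp so each trace is in chronological order.
    for case_id, _timestamp, activity in sorted(event_log, key=lambda e: (e[0], e[1])):
        traces.setdefault(case_id, []).append(activity)

    counts = Counter()
    for activities in traces.values():
        # Pair each event with its successor in the same patient's trace.
        counts.update(zip(activities, activities[1:]))
    return counts


dfg = directly_follows(EVENT_LOG)
for (source, target), n in sorted(dfg.items()):
    print(f"{source} -> {target}: {n}")
```

<p dir="ltr">In this toy log, the G3b to G3a edge is an example of a backward transition, the kind of bidirectional movement that a process map shows directly and that an analysis of time to a single first event would not display.</p>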

History

Defence date

2025-12-18

Department

  • Department of Clinical Science, Intervention and Technology

Publisher/Institution

Karolinska Institutet; Kungliga Tekniska Högskolan

Main supervisor

Fernando Seoane

Co-supervisors

Farhad Abtahi; Juan-Jesus Carrero; Carlos Fernandez-Llatas; Hong Xu

Publication year

2025

Thesis type

  • Doctoral thesis

ISBN

978-91-8017-917-1

Number of pages

38

Number of supporting papers

6

Language

  • eng

Author name in thesis

Chen, Kaile

Original department name

Department of Clinical Science, Intervention and Technology

Place of publication

Stockholm
