Development and validation of novel deep learning-based models for cancer histopathology image analysis
Histopathological assessment of resected tumour specimens remains the foundation of breast cancer diagnosis and plays a critical role in guiding clinical decision-making. Advances in computational pathology have shown the potential to improve routine diagnostics and enable more precise treatment planning for breast cancer patients. Deep learning, at the forefront of these advancements, not only replicates traditional pathological assessments but also opens avenues beyond routine settings, such as improved risk stratification and models that predict prognosis and response to treatment. These deep learning tasks are referred to as AI-based precision pathology. In this thesis, we developed and validated deep learning-based models for AI-based precision pathology tasks to improve breast cancer diagnosis using routinely stained resected tumour specimens.
In study I, we developed and prognostically validated a deep learning-based three-grade classification model (predGrade) using Haematoxylin and Eosin (H&E) stained whole slide images (WSIs). The model mimics Nottingham Histological Grading (NHG), the conventional histomorphology-based prognostic marker for invasive breast cancer patients. predGrade showed prognostic performance similar to that of conventional NHG. For clinical NHG, the unadjusted hazard ratio (HR) associated with recurrence-free survival (RFS) was estimated to be 2.59 (p-value = 0.004) for NHG 2 versus 1 and 3.58 (p-value < 0.001) for NHG 3 versus 1. For predGrade, the unadjusted HR associated with RFS in the independent external test set was estimated to be 2.52 (p-value = 0.030) for predGrade 2 versus 1 and 4.07 (p-value = 0.001) for predGrade 3 versus 1. The model showed the potential to reduce the known inter-observer and inter-lab variability in clinical NHG and to provide a more reproducible and robust clinical decision support solution for breast cancer histological grading, which could help avoid over- and under-treatment of breast cancer patients.
In study II, we prognostically validated Stratipath Breast, a CE-IVD-approved AI-based solution for prognostic risk stratification of breast cancer patients using H&E WSIs, at two independent hospital sites in Sweden (N=2719). In this retrospective validation study, we observed an HR associated with progression-free survival (PFS) of 2.20 (95% CI: 1.22-3.98, p-value = 0.009) between the Stratipath Breast low- and high-risk groups in the clinically relevant oestrogen receptor (ER)-positive/human epidermal growth factor receptor 2 (HER2)-negative/NHG 2 patient subgroup. Assignment of adjuvant chemotherapy to intermediate-risk/ER+/HER2- patients is ambiguous. Improved risk stratification of this patient subgroup using an H&E WSI-based AI solution can potentially reduce under- and over-treatment of such patients. Further, it provides a fast and cost-effective alternative to existing molecular profiling-based methods.
In study III, we proposed a methodology to spatially interpret deep learning-based weakly supervised models. In many precision pathology tasks, such as histological grade classification or prognostic risk score modelling, the label is available only at the slide level and not at the tile or pixel level. In such scenarios, there is a need to understand how local regions in the WSI are associated with the slide-level prediction. Here, we introduce the Wsi rEgion sElection approach (WEEP), which selects the tiles that are directly associated with the classification of the WSI. We demonstrated WEEP in a binary classification task (low vs high histological grade). Further, the methodology provides visual interpretability of the regions that drive the classification of the WSI. It is a simple methodology with applications in both research and diagnostics.
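The abstract does not spell out how WEEP selects tiles, so the following is only an illustrative sketch of one way a tile selection of this kind could work, not the method as published. It assumes that the slide-level score is a fixed percentile of the tile-level scores and that tiles are removed from the top of the ranking until the slide-level call flips; the removed tiles are then the ones that were needed for the positive call. The function names, the 75th percentile and the 0.5 threshold are all hypothetical.

```python
import math


def aggregate(scores, q=0.75):
    """Slide-level score as the q-th quantile (nearest rank) of tile scores.

    Percentile aggregation is an assumption made for this sketch.
    """
    ordered = sorted(scores)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def select_tiles(tile_scores, threshold=0.5, q=0.75):
    """Return indices of the highest-scoring tiles whose removal flips the slide call.

    Tiles are removed one at a time, highest score first, until the
    aggregated slide-level score drops below `threshold`. A slide that is
    already below the threshold yields an empty selection.
    """
    # Tile indices ranked from highest to lowest score.
    remaining = sorted(range(len(tile_scores)),
                       key=lambda i: tile_scores[i], reverse=True)
    selected = []
    while remaining and aggregate([tile_scores[i] for i in remaining], q) >= threshold:
        selected.append(remaining.pop(0))
    return selected


if __name__ == "__main__":
    scores = [0.9, 0.8, 0.7, 0.2, 0.1, 0.6, 0.3, 0.4]  # made-up tile scores
    print(select_tiles(scores))
```

Mapping the returned indices back to tile coordinates would give the kind of region overlay described above.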
In study IV, we developed a deep learning-based multi-stain prognostic risk score prediction model for breast cancer patients. H&E and immunohistochemistry (IHC) stains are central to routine biomarker assessment in breast cancer pathology. Compared with discrete, categorical protein-expression values, the spatial combination of histomorphological features across different stains can potentially capture more prognostic information. In this study, we used a WSI registration method to spatially align local regions across the different stains. We extracted tile-level features from the different stains using the foundation models UNI and CONCH and combined them. Further, we observed an improvement in prognostic risk prediction when multiple stains were combined compared with individual stains (C-index: 0.72 [95% CI: 0.65-0.79]) in an aggregated 5-fold cross-validation (CV) test set. The results show the potential of a multi-stain model to improve breast cancer patient risk stratification over a single-stain model.
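For readers unfamiliar with the C-index reported above, the following minimal sketch computes Harrell's concordance index for right-censored data. It is a generic textbook formulation, not the evaluation code used in the thesis; the function name and the toy data are hypothetical, and tied event times are simply treated as not comparable.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    A pair (i, j) is comparable when patient i had an observed event and
    a shorter follow-up time than patient j. The pair is concordant when
    the patient with the earlier event also has the higher predicted risk.
    Ties in predicted risk count as half concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored patients cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Made-up follow-up times, event indicators (1 = event) and risk scores.
    times = [2, 4, 6, 8]
    events = [1, 1, 0, 1]
    risks = [0.9, 0.4, 0.5, 0.1]
    print(concordance_index(times, events, risks))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of patients by risk.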
List of scientific papers
This thesis includes the following scientific papers: two articles published in peer-reviewed journals and two completed manuscripts available as preprints.
1. Sharma A, Weitz P, Wang Y, Liu B, Vallon-Christersson J, Hartman J, et al. Development and prognostic validation of a three-level NHG-like deep learning-based model for histological grading of breast cancer. Breast Cancer Res. 2024 Jan 29;26(1):17. https://doi.org/10.1186/s13058-024-01770-4
2. Sharma A, Lövgren SK, Eriksson KL, Wang Y, Robertson S, Hartman J, et al. Validation of an AI-based solution for breast cancer risk stratification using routine digital histopathology images. Breast Cancer Res. 2024 Aug 14;26(1):123. https://doi.org/10.1186/s13058-024-01879-6
3. Sharma A, Liu B, Rantalainen M. WEEP: A method for spatial interpretation of weakly supervised CNN models in computational pathology [Internet]. arXiv [eess.IV]. 2024 [cited 2024 Sep 23]. Available from: http://arxiv.org/abs/2403.15238 [Manuscript Preprint]
4. Sharma A, Gustafsson FK, Hartman J, Rantalainen M. Multi-stain modelling of histopathology slides for breast cancer prognosis prediction [Internet]. medRxiv. 2024 [cited 2024 Nov 20]. p. 2024.11.10.24317066. Available from: https://www.medrxiv.org/content/10.1101/2024.11.10.24317066v1.abstract [Manuscript Preprint]
History
Defence date
2025-01-22
Department
- Department of Medical Epidemiology and Biostatistics
Publisher/Institution
Karolinska Institutet
Main supervisor
Mattias Rantalainen
Co-supervisors
Johan Hartman; Bojing Liu; Pekka Ruusuvuori
Publication year
2024
Thesis type
- Doctoral thesis