Artificial intelligence for breast cancer precision pathology

thesis
posted on 2024-09-03, 00:46 authored by Yinxi Wang

Breast cancer is the most common cancer in women globally, yet its mortality rates have declined continuously. The improved prognosis can be partly attributed to effective treatments developed for subgroups of patients. However, it remains challenging to optimise the treatment plan for each individual patient. To improve disease outcomes and to reduce the burden of unnecessary treatment and adverse drug effects, this thesis aimed to develop artificial intelligence based tools that improve individualised medicine for breast cancer patients.

In study I, we developed a deep learning based model (DeepGrade) to stratify patients at intermediate risk. The model was optimised on haematoxylin and eosin (HE) stained whole slide images (WSIs) of grade 1 and grade 3 tumours and then applied to stratify grade 2 tumours into grade 1-like (DG2-low) and grade 3-like (DG2-high) subgroups. The prognostic value of DeepGrade was validated using recurrence free survival: comparing the two dichotomised groups gave an adjusted hazard ratio (HR) of 2.94 (95% confidence interval [CI]: 1.24-6.97, P = 0.015). The finding was confirmed in the external test cohort, with an adjusted HR of 1.91 (95% CI: 1.11-3.29, P = 0.019).
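
The relationship between a hazard ratio, its confidence interval and its P value can be illustrated with a short consistency check. The sketch below assumes a Wald test on the log hazard ratio with a normal approximation; the abstract does not state which test was used, so this is an illustration rather than the thesis's own procedure. The function name `wald_p_from_hr` is hypothetical.

```python
import math

def wald_p_from_hr(hr, ci_low, ci_high, z_crit=1.959964):
    """Recover the standard error, z statistic and two-sided P value
    from a hazard ratio and its 95% CI (Wald approximation, assumed)."""
    log_hr = math.log(hr)
    # A 95% CI on the log scale spans 2 * z_crit standard errors.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_hr / se
    # Two-sided tail probability of a standard normal variable.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

# Values quoted in the abstract for study I.
first_cohort = wald_p_from_hr(2.94, 1.24, 6.97)
external_cohort = wald_p_from_hr(1.91, 1.11, 3.29)

for name, (se, z, p) in [("first cohort", first_cohort),
                         ("external cohort", external_cohort)]:
    print(f"{name}: SE={se:.3f}, z={z:.2f}, P={p:.3f}")
```

Under this approximation both recovered P values land close to the reported 0.015 and 0.019; small differences are expected because the published HR and CI bounds are rounded.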

In study II, we investigated whether deep learning models can predict gene expression levels from tumour morphology. We optimised convolutional neural networks (CNNs) to predict mRNA expression for 17,695 genes from the HE stained WSIs in the training set. An initial evaluation on the validation set showed a significant correlation between RNA-seq measurements and model predictions for 52.75% of the genes. The models were further tested on the internal and external test sets. In addition, we compared the models' efficacy in predicting RNA-seq based proliferation scores. Lastly, the ability of the optimised CNNs to capture spatial variation in gene expression was evaluated and confirmed using spatial transcriptomics profiling.
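
A gene-wise evaluation of this kind can be sketched as follows. The abstract does not name the correlation measure, the significance test or any multiple-testing correction, so the Pearson correlation, the Fisher z approximation, the restriction to positive correlations and the 0.05 threshold used here are all assumptions, and the gene names and values are toy data.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def approx_p_value(r, n):
    """Two-sided P value via the Fisher z approximation (assumed test)."""
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))

def fraction_significant(measured, predicted, alpha=0.05):
    """Share of genes whose predictions correlate positively and
    significantly with the measurements (no multiplicity correction)."""
    hits = 0
    for gene, observed in measured.items():
        r = pearson_r(observed, predicted[gene])
        if r > 0 and approx_p_value(r, len(observed)) < alpha:
            hits += 1
    return hits / len(measured)

# Toy data: eight samples, two hypothetical genes.
measured = {"GENE_A": [1, 2, 3, 4, 5, 6, 7, 8],
            "GENE_B": [1, 2, 3, 4, 5, 6, 7, 8]}
predicted = {"GENE_A": [1.1, 1.9, 3.2, 3.8, 5.1, 6.2, 6.8, 8.1],
             "GENE_B": [3, 1, 4, 1, 5, 9, 2, 6]}
fraction = fraction_significant(measured, predicted)

# 52.75% of the 17,695 genes corresponds to roughly this many genes.
approx_gene_count = round(0.5275 * 17695)
print(fraction, approx_gene_count)
```

In a transcriptome-wide setting the per-gene P values would normally be adjusted for multiple testing before counting; that step is left out of this sketch.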

In study III, we investigated the relationship between intra-tumour gene expression heterogeneity and patient survival outcomes. Deep learning models optimised in study II were applied to generate spatial gene expression predictions for the PAM50 gene panel. A set of 11 texture based features and one slide average gene expression feature per gene were extracted and used as input to train a Cox proportional hazards regression model with elastic net regularisation to predict each patient's risk of recurrence. Through nested cross-validation, the model dichotomised the training cohort into low and high risk groups with an adjusted HR of 2.1 (95% CI: 1.30-3.30, P = 0.002). The model was further validated on two external cohorts.
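
For reference, an elastic net regularised Cox model is commonly written as a penalised partial likelihood. The formulation below uses one widespread parameterisation and is given as an assumption, since the abstract does not specify the exact form or the tuning values used. Here $x_i$ is the feature vector of patient $i$, $\delta_i$ the event indicator, $R(t_i)$ the set of patients still at risk at time $t_i$, $\lambda \ge 0$ the overall penalty strength and $\alpha \in [0, 1]$ the mix between the lasso and ridge penalties.

```latex
\ell(\beta) = \sum_{i:\,\delta_i = 1}
  \Big[ x_i^{\top}\beta - \log \sum_{j \in R(t_i)} \exp\!\big(x_j^{\top}\beta\big) \Big],
\qquad
\hat{\beta} = \arg\max_{\beta}\;
  \ell(\beta) - \lambda \Big( \alpha \lVert \beta \rVert_1
  + \tfrac{1-\alpha}{2} \lVert \beta \rVert_2^2 \Big)
```

With $\alpha = 1$ the penalty reduces to the lasso and with $\alpha = 0$ to ridge regression. In nested cross-validation, $\lambda$ and $\alpha$ would typically be tuned in the inner loop while the outer loop estimates performance.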

In study IV, we investigated the agreement between Stratipath Breast, the modified and commercialised version of the DeepGrade model developed in study I, and the Prosigna® test. Both tests aim to stratify patients into groups with distinct prognoses. Stratipath Breast outputs a risk score and a two-level risk stratification, whereas Prosigna® outputs a risk of recurrence score and a three-tier risk stratification. Comparing the patients assigned to the ‘low’ or ‘high’ risk groups, we found moderate overall agreement (76.09%) between the two tests. The risk scores from the two tests also showed a good correlation (Spearman's rho = 0.59, P = 1.16E-08). In addition, the risk score from each test correlated well with the Ki67 index. The comparison was repeated in the subgroup of patients with grade 2 tumours, where similar but slightly weaker correlations were found.
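
The two concordance measures mentioned above can be sketched in a few lines. This is a minimal illustration on toy data, assuming that patients in the Prosigna® intermediate tier are excluded before computing agreement (the abstract only says that ‘low’ and ‘high’ assignments were compared). All labels, scores and function names are hypothetical.

```python
import math

def percent_agreement(two_level, three_tier):
    """Share of patients with matching 'low'/'high' labels, skipping
    those that the three-tier test calls 'intermediate' (assumed rule)."""
    pairs = [(a, b) for a, b in zip(two_level, three_tier)
             if b != "intermediate"]
    return sum(a == b for a, b in pairs) / len(pairs)

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)

# Toy data for six hypothetical patients.
stratipath_group = ["low", "low", "high", "high", "high", "low"]
prosigna_group = ["low", "intermediate", "high", "low", "high", "low"]
stratipath_score = [0.10, 0.40, 0.35, 0.80, 0.70, 0.90]
prosigna_ror = [10, 35, 40, 70, 65, 90]

agreement = percent_agreement(stratipath_group, prosigna_group)
rho = spearman_rho(stratipath_score, prosigna_ror)
print(f"agreement={agreement:.2%}, rho={rho:.3f}")
```

Raw percent agreement does not correct for agreement expected by chance; a chance-corrected statistic such as Cohen's kappa is a common complement, although the abstract does not say whether one was used.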

List of scientific papers

I. Wang Y, Acs B, Robertson S, Liu B, Solorzano L, Wählby C, Hartman J, Rantalainen M. Improved breast cancer histological grading using deep learning. Annals of Oncology. 2022 Jan 1;33(1):89-98.
https://doi.org/10.1016/j.annonc.2021.09.007

II. Wang Y*, Kartasalo K*, Weitz P, Ács B, Valkonen M, Larsson C, Ruusuvuori P, Hartman J, Rantalainen M. Predicting Molecular Phenotypes from Histopathology Images: A Transcriptome-Wide Expression–Morphology Analysis in Breast Cancer. Cancer Research. 2021 Oct 1;81(19):5115-26. *Equal contribution.
https://doi.org/10.1158/0008-5472.CAN-21-0482

III. Wang Y, Ali MA, Humphreys K, Hartman J, Rantalainen M. Transcriptional intratumour heterogeneity predicted by deep learning in routine breast histopathology slides provides independent prognostic information. [Manuscript]

IV. Wang Y*, Robertson S*, Karlsson E, Rantalainen M, Hartman J. Evaluation of the concordance in breast cancer risk stratification between a commercialised deep learning tool and the Prosigna® test. *Equal contribution. [Manuscript]

History

Defence date

2022-12-02

Department

  • Department of Medical Epidemiology and Biostatistics

Publisher/Institution

Karolinska Institutet

Main supervisor

Rantalainen, Mattias

Co-supervisors

Hartman, Johan; Eklund, Martin; Lindberg, Johan

Publication year

2022

Thesis type

  • Doctoral thesis

ISBN

978-91-8016-845-8

Number of supporting papers

4

Language

  • eng

Original publication date

2022-11-09

Author name in thesis

Wang, Yinxi

Original department name

Department of Medical Epidemiology and Biostatistics

Place of publication

Stockholm
