Quantifying neurodegeneration from medical images with machine learning and graph theory
Neurodegeneration (or brain atrophy) is part of the pathological cascade of Alzheimer’s disease (AD) and is strongly associated with cognitive decline. In clinics, atrophy is measured through visual assessments of specific brain regions on medical images according to established rating scales. In this thesis, we developed a model based on recurrent convolutional neural networks (AVRA: Automatic visual ratings of atrophy) that could predict scores from magnetic resonance images (MRI) according to commonly used clinical rating scales, namely: Scheltens’ scale for medial temporal atrophy (MTA), Pasquier’s frontal subscale of global cortical atrophy (GCA-F), and Koedam’s posterior atrophy (PA) scale. AVRA was trained on over 2000 images rated by a single neuroradiologist and demonstrated inter-rater agreement levels on all three scales similar to what has been reported between two "human raters" in previous studies.
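To illustrate how agreement between a model and a human rater on an ordinal scale can be quantified, the sketch below computes Cohen's weighted kappa. This is an assumption made for illustration: the abstract does not state which agreement statistic was used. The function name `weighted_kappa` and the example ratings are hypothetical and are not data from the thesis.

```python
from collections import Counter


def weighted_kappa(ratings_a, ratings_b, num_categories, power=2):
    """Cohen's weighted kappa for two raters on an ordinal scale.

    Ratings are integers in 0..num_categories-1 (the MTA scale, for
    example, runs from 0 to 4). power=1 gives linear weights and
    power=2 gives quadratic weights.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("rating lists must be non-empty and of equal length")
    n = len(ratings_a)
    max_dist = (num_categories - 1) ** power

    # Observed disagreement: mean weighted distance between paired ratings.
    observed = sum(
        abs(a - b) ** power for a, b in zip(ratings_a, ratings_b)
    ) / (n * max_dist)

    # Expected disagreement if the raters were independent, computed
    # from each rater's marginal distribution of scores.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(
        count_a[i] * count_b[j] * abs(i - j) ** power
        for i in range(num_categories)
        for j in range(num_categories)
    ) / (n * n * max_dist)

    if expected == 0:
        raise ValueError("kappa is undefined when both raters use one category")
    return 1.0 - observed / expected


# Hypothetical ratings on a 0-4 scale, for illustration only.
human = [0, 1, 2, 2, 3, 4, 1, 0]
model = [0, 1, 2, 3, 3, 4, 1, 1]
print(weighted_kappa(human, model, num_categories=5))
```

A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance. Quadratic weights penalize a two-step disagreement more heavily than a one-step disagreement, which suits ordinal rating scales.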
We further applied different versions of AVRA, trained systematically on data with different levels of heterogeneity, to external data from multiple European memory clinics. We observed a general performance drop in the out-of-distribution (OOD) data compared to test sets sampled from the same cohort as the training data. By training AVRA on data from multiple sources, we showed that the performance in external cohorts generally increased. AVRA demonstrated a notably low agreement in one memory clinic, despite good quality images, which suggests that it may be challenging to assess how well a machine learning model generalizes to OOD data. For additional validation of our model, we compared AVRA’s MTA ratings to those of two external radiologists and to the volumes of the hippocampi and inferior lateral ventricles. The images came from a longitudinal cohort that comprised individuals with subjective cognitive decline (SCD) and mild cognitive impairment (MCI) followed up over six years. AVRA showed substantial agreement with one of the radiologists, and lower rating agreement with the other. The two radiologists also showed low agreement with each other. All sets of ratings were strongly associated with the subcortical volumes, suggesting that all three raters were reliable. We further observed that individuals with SCD and (probably) underlying AD pathology had a faster MTA progression than MCI patients with a non-AD biomarker profile.
Finally, we evaluated a method to quantify patterns of atrophy through the use of graph theory. We compared structural gray matter networks between groups of healthy controls and AD patients, constructed from different subsamples and with different network construction methods. Our experiments suggested that structural gray matter networks may not be very stable. Our networks required more than 150 subjects per group to show convergence in the included network properties, which is a greater sample size than that used in the majority of the studies applying these methods. The different graph construction methods did not yield consistent differences between the control and AD networks, which may explain why findings have been inconsistent across previous studies. To conclude, we demonstrated that a machine learning model can successfully learn to mimic a radiologist’s assessment of atrophy without intra-rater variability. The challenge going forward is to ensure model consistency across clinics, scanners and image quality, all of which are nuisances that humans are better at ignoring than deep learning models.
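The stability question raised for structural gray matter networks can be sketched in code. The example below assumes one common construction, which the abstract does not confirm as the method used in the thesis: regions are nodes, edges are the strongest between-region Pearson correlations of a regional measure across the subjects of a group, and the network is binarized at a fixed density. All function names and the synthetic data are hypothetical.

```python
import math
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def build_network(data, density):
    """Build a binary group-level network.

    data: one list of regional values per subject.
    density: fraction of all possible edges to keep (strongest first).
    Returns a dict mapping each region index to its set of neighbours.
    """
    num_regions = len(data[0])
    columns = [[subject[r] for subject in data] for r in range(num_regions)]
    pairs = list(combinations(range(num_regions), 2))
    pairs.sort(
        key=lambda p: abs(pearson(columns[p[0]], columns[p[1]])),
        reverse=True,
    )
    num_edges = round(density * len(pairs))
    adjacency = {r: set() for r in range(num_regions)}
    for i, j in pairs[:num_edges]:
        adjacency[i].add(j)
        adjacency[j].add(i)
    return adjacency


def mean_clustering(adjacency):
    """Average local clustering coefficient; low-degree nodes count as 0."""
    total = 0.0
    for neighbours in adjacency.values():
        k = len(neighbours)
        if k < 2:
            continue
        links = sum(
            1 for a, b in combinations(neighbours, 2) if b in adjacency[a]
        )
        total += 2.0 * links / (k * (k - 1))
    return total / len(adjacency)


def clustering_across_subsamples(data, sample_size, density, repeats, seed=0):
    """Recompute the network measure on random subsamples of the group."""
    rng = random.Random(seed)
    return [
        mean_clustering(build_network(rng.sample(data, sample_size), density))
        for _ in range(repeats)
    ]


# Synthetic regional values (60 subjects, 10 regions), illustration only.
rng = random.Random(1)
synthetic = [[rng.gauss(2.5, 0.3) for _ in range(10)] for _ in range(60)]
values = clustering_across_subsamples(
    synthetic, sample_size=30, density=0.3, repeats=5
)
print(values)
```

The spread of the returned values across subsamples gives a rough sense of how much a network property depends on which subjects happen to be included. Repeating the procedure with increasing `sample_size` is one way to look for the kind of convergence described above.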
List of scientific papers
I. Mårtensson G, Ferreira D, Cavallin L, Muehlboeck J-S, Wahlund L-O, Wang C, Westman E. AVRA: Automatic Visual Ratings of Atrophy from MRI images using Recurrent Convolutional Neural Networks. NeuroImage: Clinical. 2019. 23(March), p. 101872.
https://doi.org/10.1016/j.nicl.2019.101872
II. Mårtensson G, Ferreira D, Granberg T, Cavallin L, Oppedal K, Padovani A, Rektorova I, Bonanni L, Pardini M, Kramberger M, Taylor J-P, Hort J, Snædal J, Kulisevsky J, Blanc F, Antonini A, Mecocci P, Vellas B, Tsolaki M, Kłoszewska I, Soininen H, Lovestone S, Simmons A, Aarsland D, Westman E. The reliability of a deep learning model in clinical out-of-distribution MRI data: a multicohort study. [Submitted]
III. Mårtensson G, Håkansson C, Pereira JB, Palmqvist S, Hansson O, van Westen D†, Westman E†. Medial temporal atrophy in preclinical dementia: visual and automated assessment during six year follow-up. †Shared last author. [Submitted]
IV. Mårtensson G, Pereira JB, Mecocci P, Vellas B, Tsolaki M, Kłoszewska I, Soininen H, Lovestone S, Simmons A, Volpe G, Westman E. Stability of graph theoretical measures in structural brain networks in Alzheimer’s disease. Scientific Reports. 2018. 8(1), p. 11592.
https://doi.org/10.1038/s41598-018-29927-0
History
Defence date
2020-05-15
Department
- Department of Neurobiology, Care Sciences and Society
Publisher/Institution
Karolinska Institutet
Main supervisor
Westman, Eric
Co-supervisors
Pereira, Joana B.; Volpe, Giovanni
Publication year
2020
Thesis type
- Doctoral thesis
ISBN
978-91-7831-732-5
Number of supporting papers
4
Language
- eng