Machine learning methods for precision medicine
Author: Gonzalez Ginestet, Pablo
Date: 2022-08-19
Location: Lecture Hall Petrén, Nobels väg 12B, Karolinska Institutet, Solna
Department: Inst för medicinsk epidemiologi och biostatistik / Dept of Medical Epidemiology and Biostatistics
Abstract
In precision medicine, predicting the risk of an event during a specific period can help, for example, to identify patients who need early preventive treatment. Modern machine learning (ML) techniques are well suited to building such predictions. However, medical datasets often suffer from right-censoring of the outcome of interest, which poses an obstacle to the direct application of ML algorithms.
The aim of this thesis is to develop and advance methods for prediction under right-censoring, in some settings also with competing risks. Specifically, in Project I, we developed an approach that combines inverse probability of censoring weighting (IPCW) with bagging as a pre-processing step, enabling the application of any existing ML classification method in settings of right-censoring and competing risks, and we proposed a procedure to optimally combine a set of single IPCW bagged methods.
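As an illustration of the general idea behind IPCW bagging, the following is a minimal sketch and not the thesis implementation. It assumes a single event type (no competing risks), a fixed prediction horizon `tau`, and a Kaplan-Meier estimate of the censoring distribution; the function names `censoring_survival`, `ipcw_weights` and `ipcw_bagged_samples` are hypothetical. Subjects whose status at `tau` is known are weighted by the inverse probability of remaining uncensored, subjects censored before `tau` get weight zero, and bootstrap samples are drawn with probability proportional to these weights so that any unweighted classifier can be trained on each sample.

```python
import numpy as np

def censoring_survival(times, events):
    """Kaplan-Meier estimate of G(t) = P(C > t), treating censorings as the events."""
    uniq = np.unique(times)
    surv, s = [], 1.0
    for u in uniq:
        at_risk = np.sum(times >= u)
        censored_here = np.sum((times == u) & (events == 0))
        s *= 1.0 - censored_here / at_risk
        surv.append(s)
    surv = np.array(surv)

    def G(t, left=False):
        # left=True evaluates the left limit G(t-)
        side = "left" if left else "right"
        idx = np.searchsorted(uniq, t, side=side) - 1
        return np.where(idx >= 0, surv[np.clip(idx, 0, None)], 1.0)

    return G

def ipcw_weights(times, events, tau):
    """Return IPCW weights and binary labels 1(event by tau)."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events)
    G = censoring_survival(times, events)
    # Status at tau is known if the subject is followed past tau or had the event.
    known = (times > tau) | (events == 1)
    g = np.where(times > tau, G(tau), G(times, left=True))
    w = np.divide(1.0, g, out=np.zeros_like(g), where=known & (g > 0))
    y = ((times <= tau) & (events == 1)).astype(int)
    return w, y

def ipcw_bagged_samples(X, times, events, tau, n_bags=10, seed=0):
    """Yield bootstrap samples drawn with probability proportional to the IPCW weights."""
    rng = np.random.default_rng(seed)
    w, y = ipcw_weights(times, events, tau)
    n = len(y)
    for _ in range(n_bags):
        idx = rng.choice(n, size=n, replace=True, p=w / w.sum())
        yield X[idx], y[idx]
```

Under these assumptions, a classifier of choice would be fitted to each yielded sample and the predicted probabilities averaged across samples.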
In Project II, we extended Project I to optimally combine not only ML procedures for the same type of outcome but also procedures for different outcome types, such as a Cox regression model for survival outcomes and pseudo-observation-based regression for continuous outcomes.
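For readers unfamiliar with pseudo-observations, the sketch below shows the standard jackknife construction for the cumulative incidence at a fixed horizon `tau`. It is an illustrative simplification, not the thesis code: it assumes a single event type, so the cumulative incidence is one minus the Kaplan-Meier estimate, and the function names `km_survival_at` and `pseudo_observations` are hypothetical. Each subject's pseudo-observation is n times the full-sample estimate minus (n - 1) times the estimate with that subject left out, which turns a censored outcome into a continuous one that ordinary regression methods can use.

```python
import numpy as np

def km_survival_at(times, events, tau):
    """Kaplan-Meier estimate of S(tau) = P(T > tau)."""
    s = 1.0
    for u in np.unique(times[(events == 1) & (times <= tau)]):
        at_risk = np.sum(times >= u)
        d = np.sum((times == u) & (events == 1))
        s *= 1.0 - d / at_risk
    return s

def pseudo_observations(times, events, tau):
    """Jackknife pseudo-observations for the cumulative incidence 1 - S(tau)."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events)
    n = len(times)
    full = 1.0 - km_survival_at(times, events, tau)
    pseudo = np.empty(n)
    mask = np.ones(n, dtype=bool)
    for i in range(n):
        mask[i] = False  # leave subject i out
        loo = 1.0 - km_survival_at(times[mask], events[mask], tau)
        pseudo[i] = n * full - (n - 1) * loo
        mask[i] = True
    return pseudo
```

Without censoring, the pseudo-observations reduce to the observed event indicators at `tau`; with censoring they can fall outside the interval [0, 1].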
In Project III, we integrated pseudo-observations into a convolutional neural network to predict the cumulative incidence using images and structured clinical data. In Project IV, we applied the methods developed in Projects I and II to build a flexible model for predicting the risk of any cancer diagnosis among sarcoidosis patients, using a Swedish population-based register.
In the last project, Project V, we explored the utility of a dynamic prediction model, in a setting of complete data, as a decision support tool for public health in managing future pandemics. Specifically, we applied two state-of-the-art batch reinforcement learning algorithms to learn the best face-covering policy response at the national level, with the goal of reducing the spread of COVID-19.
List of papers:
I. Pablo Gonzalez Ginestet, Ales Kotalik, David Vock, Julian Wolfson and Erin Gabriel. Stacked inverse probability of censoring weighted bagging: a case study in the InfCareHIV Register. Journal of the Royal Statistical Society Series C. 2021,70:51-65.
II. Pablo Gonzalez Ginestet, Erin Gabriel and Michael Sachs. Survival stacking with multiple data types using pseudo-observation-based-AUC loss. Journal of Biopharmaceutical Statistics. 2022.
III. Pablo Gonzalez Ginestet, Philippe Weitz, Mattias Rantalainen and Erin Gabriel. A deep convolutional neural network approach for predicting cumulative incidence based on pseudo-observations. [Manuscript]
IV. Elizabeth Arkema, Pablo Gonzalez Ginestet, Erin Gabriel and Michael Sachs. Predicting risk of cancer among sarcoidosis patients: a nationwide, registerbased, cohort-study. [Manuscript]
V. Pablo Gonzalez Ginestet, Erin Gabriel, Ziad El-Khatib and Ujjwal Neogi. Batch deep reinforcement learning for policy responses to the COVID pandemic. [Manuscript]
Institution: Karolinska Institutet
Supervisor: Gabriel, Erin
Co-supervisor: Rantalainen, Mattias; Neogi, Ujjwal; Sjölander, Arvid
Issue date: 2022-06-17
Publication year: 2022
ISBN: 978-91-8016-633-1