Mathematical programming for optimal probability weighting
Although probability weighting is widely used in statistics to correct for unequal sampling, control for confounding, and handle missing data, it has two main limitations. First, statistical inferences may be inefficient in the presence of extreme probability weights. Second, probability weighting-based methods are highly sensitive to model misspecification. The aim of this Ph.D. thesis was to develop novel methods, based on mathematical programming techniques, for optimal probability weighting.
Specifically, in Paper I, we proposed a method that estimates optimal probability weights. These are obtained as the solution to a constrained optimization problem that minimizes the Euclidean distance from the target (original/design) weights among all sets of weights that satisfy a constraint on the precision of the resulting weighted estimator. In Paper II, we extended optimal probability weights to estimate the causal effect of a time-varying treatment on a survival outcome. Here the optimal probability weights were obtained as the solution to a constrained optimization problem that constrains the variance of the weights, rather than the standard error of the resulting weighted estimator as in Paper I. In Paper III, we proposed Kernel Optimal Weighting (KOW), which obtains weights that optimally balance time-dependent confounders while controlling for the precision of the resulting marginal structural model estimate by directly minimizing the error in estimation. This error is expressed as an operator derived from the g-computation formula, and KOW minimizes its operator norm with respect to a reproducing kernel Hilbert space by solving a quadratic optimization problem. KOW mitigates the effects of possible misspecification of the treatment model by directly balancing covariates, and it controls for precision by penalizing extreme weights. In Paper IV, we evaluated the effect of treatment switch on time to second-line HIV treatment failure using data from the Swedish InfCare HIV registry.
This Ph.D. thesis provides methods that will likely help to (1) extend the use of probability weighting in medicine, epidemiology, and economics, (2) extend knowledge on how mathematical programming and machine learning can be used to conduct robust analyses for improved decision-making, and (3) provide powerful and robust results to clinicians and policy-makers.
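As an illustration of the kind of problem described for Paper II, the following is a minimal sketch, not the method of the thesis. It assumes a simplified setting: find the weights closest in Euclidean distance to the target weights, keeping their sum fixed (an assumption made here) and bounding their variance. Under these assumptions the solution is available in closed form: the deviations of the target weights from their mean are shrunk by a common factor. The function name `shrink_weights`, the parameter `max_variance`, and the example numbers are hypothetical.

```python
import math


def shrink_weights(target, max_variance):
    """Return the weights closest (in Euclidean distance) to `target` that
    keep the same mean and have population variance <= `max_variance`.

    Simplified illustration only; the fixed-mean constraint is an assumption.
    """
    if max_variance < 0:
        raise ValueError("max_variance must be non-negative")
    n = len(target)
    mean = sum(target) / n
    deviations = [w - mean for w in target]
    variance = sum(d * d for d in deviations) / n
    if variance <= max_variance:
        # The target weights already satisfy the constraint.
        return list(target)
    # Projecting the deviations onto the variance ball amounts to
    # shrinking every deviation toward zero by the same factor.
    scale = math.sqrt(max_variance / variance)
    return [mean + scale * d for d in deviations]


# Hypothetical target weights with one extreme value.
target = [1.0, 1.2, 0.9, 8.0, 1.1]
weights = shrink_weights(target, max_variance=1.0)
print(weights)
```

In this sketch the extreme weight is pulled toward the mean while the total weight is preserved. The precision constraint of Paper I and the kernel-based objective of Paper III do not have such a simple closed form and would typically call for a numerical solver.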
List of scientific papers
I. Michele Santacatterina and Matteo Bottai. Optimal probability weights for inference with constrained precision. Journal of the American Statistical Association. 29 Sep 2017.
https://doi.org/10.1080/01621459.2017.1375932
II. Michele Santacatterina, Rino Bellocco, Anders Sönnerborg, Anna Mia Ekström, and Matteo Bottai. Optimal probability weights for estimating causal effects of time-varying treatments with marginal structural Cox models. [Submitted]
III. Michele Santacatterina and Nathan Kallus. Optimal balancing of time-dependent confounders for marginal structural models. [Manuscript]
IV. Amanda Häggblom, Michele Santacatterina, Ujjwal Neogi, Magnus Gisslen, Bo Hejdeman, Leo Flamholc, and Anders Sönnerborg. Effect of therapy switch on time to second-line antiretroviral treatment failure in HIV-infected patients. PLoS ONE. 2017; 12(7):e0180140.
https://doi.org/10.1371/journal.pone.0180140
History
Defence date
2018-04-16
Department
- Institute of Environmental Medicine
Publisher/Institution
Karolinska Institutet
Main supervisor
Bottai, Matteo
Co-supervisors
Bellocco, Rino; Ekström, Anna Mia; Sönnerborg, Anders
Publication year
2018
Thesis type
- Doctoral thesis
ISBN
978-91-7831-005-0
Number of supporting papers
4
Language
- eng