Karolinska Institutet
Mathematical programming for optimal probability weighting

thesis
posted on 2024-09-02, 21:26 authored by Michele Santacatterina

Although probability weighting is widely used in statistics to correct for unequal sampling, control for confounding, and handle missing data, it has two main limitations. First, statistical inferences may be inefficient in the presence of extreme probability weights. Second, probability weighting-based methods are highly sensitive to model misspecification. The aim of this Ph.D. thesis was to develop novel methods, based on mathematical programming techniques, for optimal probability weighting.

Specifically, in Paper I, we proposed a method that estimates optimal probability weights, which are obtained as the solution to a constrained optimization problem that minimizes the Euclidean distance from the target (original/design) weights among all sets of weights that satisfy a constraint on the precision of the resulting weighted estimator. In Paper II, we extended optimal probability weights to estimate the causal effect of a time-varying treatment on a survival outcome. Here the optimal probability weights were obtained as the solution to a constrained optimization problem that constrains the variance of the weights, rather than the standard error of the resulting weighted estimator as in Paper I. In Paper III, we proposed Kernel Optimal Weighting (KOW) to obtain weights that optimally balance time-dependent confounders while controlling for the precision of the resulting marginal structural model estimate by directly minimizing the error in estimation. This error is expressed as an operator derived from the g-computation formula, and KOW minimizes its operator norm with respect to a reproducing kernel Hilbert space by solving a quadratic optimization problem. KOW mitigates the effects of possible misspecification of the treatment model by directly balancing covariates, and controls for precision by penalizing extreme weights. In Paper IV, we evaluated the effect of treatment switch on time to second-line HIV treatment failure using data from the Swedish InfCare HIV registry.
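
To make the variance-constrained idea of Paper II concrete, the following is a minimal sketch of a simplified toy version of such a problem, not the formulation used in the thesis. It assumes (these are illustrative assumptions, not details from the abstract) that the constraint is an upper bound on the population variance of the weights and that the mean of the weights is held equal to the mean of the target weights. Under those assumptions the closest weights in Euclidean distance have a closed form: deviations from the mean are shrunk by a common factor. The function name `shrink_weights` and the parameter `max_variance` are hypothetical.

```python
import math
from statistics import fmean, pvariance


def shrink_weights(target, max_variance):
    """Return the weights closest (in Euclidean distance) to `target`
    that keep the same mean and have population variance at most
    `max_variance`. Toy illustration only; hypothetical helper."""
    if max_variance < 0:
        raise ValueError("max_variance must be non-negative")
    mean = fmean(target)
    var = pvariance(target, mu=mean)
    # If the target weights already satisfy the bound, they are optimal.
    if var <= max_variance:
        return list(target)
    # Otherwise scale every deviation from the mean by the same factor,
    # chosen so that the variance lands exactly on the bound.
    scale = math.sqrt(max_variance / var)
    return [mean + scale * (w - mean) for w in target]


# Hypothetical target weights with one extreme value.
target = [1.0, 1.0, 1.0, 1.0, 16.0]
weights = shrink_weights(target, max_variance=4.0)
print(weights)
```

In this example the target weights have mean 4 and variance 36, so the extreme weight of 16 is pulled down to about 8 and the remaining weights rise to about 3, which keeps the mean at 4 and brings the variance to the bound of 4. A constraint on the standard error of a weighted estimator, as in Paper I, generally has no such closed form and would require a numerical solver.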

This Ph.D. thesis provided methods that will likely help to (1) extend the use of probability weighting in medicine, epidemiology, and economics, (2) extend knowledge on how mathematical programming and machine learning could be used to conduct robust analyses for improved decision-making, and (3) provide powerful, strong, and robust results to clinicians and policy-makers.

List of scientific papers

I. Michele Santacatterina and Matteo Bottai. Optimal probability weights for inference with constrained precision. Journal of the American Statistical Association. 2017 Sep 29.
https://doi.org/10.1080/01621459.2017.1375932

II. Michele Santacatterina, Rino Bellocco, Anders Sönnerborg, Anna Mia Ekström, and Matteo Bottai. Optimal probability weights for estimating causal effects of time-varying treatments with marginal structural Cox models. [Submitted]

III. Michele Santacatterina and Nathan Kallus. Optimal balancing of time-dependent confounders for marginal structural models. [Manuscript]

IV. Amanda Häggblom, Michele Santacatterina, Ujjwal Neogi, Magnus Gisslen, Bo Hejdeman, Leo Flamholc, and Anders Sönnerborg. Effect of therapy switch on time to second-line antiretroviral treatment failure in HIV-infected patients. PLoS ONE. 2017; 12(7):e0180140.
https://doi.org/10.1371/journal.pone.0180140

History

Defence date

2018-04-16

Department

  • Institute of Environmental Medicine

Publisher/Institution

Karolinska Institutet

Main supervisor

Bottai, Matteo

Co-supervisors

Bellocco, Rino; Ekström, Anna Mia; Sönnerborg, Anders

Publication year

2018

Thesis type

  • Doctoral thesis

ISBN

978-91-7831-005-0

Number of supporting papers

4

Language

  • eng

Original publication date

2018-03-23

Author name in thesis

Santacatterina, Michele

Original department name

Institute of Environmental Medicine

Place of publication

Stockholm
