Abstract
Background: Doubly robust estimation produces an unbiased estimator for the average treatment effect unless both the propensity score (PS) and outcome models are incorrectly specified. Studies have shown that the doubly robust estimator is subject to more bias than the standard weighting estimator when both PS and outcome models are incorrectly specified. Method: We evaluated whether various machine learning methods can be used to estimate the conditional means of the potential outcomes, so as to make the doubly robust estimator more robust to various degrees of model misspecification in terms of bias and standard error. We considered four types of methods to predict the outcomes: least squares, tree-based methods, generalized additive models and shrinkage methods. We also considered an ensemble method called the Super Learner (SL), which is a linear combination of multiple learners. We conducted simulations over scenarios that varied in the complexity of the PS and outcome-generating models and in treatment prevalence. Results: The shrinkage methods performed well, yielding doubly robust estimates with low bias and mean squared error across the scenarios when the models were enriched by including all 2-way interactions of the covariates. The SL performed similarly to the best method in each scenario. Conclusions: Our findings indicate that machine learning methods such as the SL or the shrinkage methods using interaction models should be used for more accurate doubly robust estimators.
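
The abstract does not state the exact form of the estimator, so the following is only a minimal sketch of one common doubly robust estimator, the augmented inverse probability weighted (AIPW) estimator. It assumes a logistic PS model and least-squares outcome models fitted separately in each treatment arm. The function names (`fit_logistic`, `aipw_ate`, `simulate`), the toy data-generating process, and the PS clipping bounds are all hypothetical and are not taken from the paper.

```python
import numpy as np


def fit_logistic(X, t, n_iter=25):
    """Fit a logistic regression by Newton-Raphson.

    X must already contain an intercept column.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        grad = X.T @ (t - p)
        hess = X.T @ (X * w[:, None])
        beta += np.linalg.solve(hess, grad)
    return beta


def aipw_ate(X, t, y):
    """AIPW (doubly robust) estimate of the average treatment effect."""
    Xd = np.column_stack([np.ones(len(y)), X])

    # Propensity score model: P(T = 1 | X), clipped to avoid extreme weights
    # (the clipping bounds are an assumption made for this sketch).
    ps = 1.0 / (1.0 + np.exp(-Xd @ fit_logistic(Xd, t)))
    ps = np.clip(ps, 0.01, 0.99)

    # Outcome models: least squares fitted separately in each arm,
    # then used to predict both potential outcomes for everyone.
    b1 = np.linalg.lstsq(Xd[t == 1], y[t == 1], rcond=None)[0]
    b0 = np.linalg.lstsq(Xd[t == 0], y[t == 0], rcond=None)[0]
    m1, m0 = Xd @ b1, Xd @ b0

    # Outcome prediction plus an inverse-probability-weighted residual.
    dr1 = m1 + t * (y - m1) / ps
    dr0 = m0 + (1 - t) * (y - m0) / (1 - ps)
    return float(np.mean(dr1 - dr0))


def simulate(n=5000, seed=0):
    """Toy data with a true average treatment effect of 2.0."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 2))
    true_ps = 1.0 / (1.0 + np.exp(-(0.5 * X[:, 0] - 0.5 * X[:, 1])))
    t = rng.binomial(1, true_ps)
    y = 2.0 * t + X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)
    return X, t, y


X, t, y = simulate()
ate_hat = aipw_ate(X, t, y)
print(f"AIPW estimate of the ATE: {ate_hat:.3f} (true value in this toy setup: 2.0)")
```

In this sketch both working models are correctly specified, so the estimate should land close to the true effect. Replacing the least-squares step with a more flexible learner is where the machine learning methods compared in the abstract would plug in.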
Original language | English (US) |
---|---|
Pages (from-to) | 1120-1133 |
Number of pages | 14 |
Journal | Pharmacoepidemiology and Drug Safety |
Volume | 29 |
Issue number | 9 |
State | Published - Sep 1 2020 |
Keywords
- average causal effect
- covariate-balancing propensity score
- doubly robust estimation
- machine learning techniques
- maximum likelihood
- pharmacoepidemiology
- simulation
ASJC Scopus subject areas
- Epidemiology
- Pharmacology (medical)