Machine learning outcome regression improves doubly robust estimation of average causal effects

    Research output: Contribution to journal › Article › peer-review


    Background: Doubly robust estimation yields an unbiased estimator of the average treatment effect unless both the propensity score (PS) model and the outcome model are incorrectly specified. Studies have shown that, when both models are incorrectly specified, the doubly robust estimator is subject to more bias than the standard weighting estimator.

    Method: We evaluated whether machine learning methods used to estimate the conditional means of the potential outcomes can make the doubly robust estimator more robust to varying degrees of model misspecification, as measured by reductions in bias and standard error. We considered four types of outcome prediction methods: least squares, tree-based methods, generalized additive models, and shrinkage methods. We also considered an ensemble method, the Super Learner (SL), which is a linear combination of multiple learners. We conducted simulations across scenarios that varied the complexity of the PS and outcome-generating models and the treatment prevalence.

    Results: When the models were enriched with all 2-way interactions of the covariates, the shrinkage methods performed well, yielding doubly robust estimates that were robust in terms of bias and mean squared error across the scenarios. The SL performed similarly to the best method in each scenario.

    Conclusions: Our findings indicate that machine learning methods such as the SL, or shrinkage methods applied to interaction models, should be used to obtain more accurate doubly robust estimators.
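    The abstract does not show the estimator itself, so the following is only a minimal sketch of one standard doubly robust form (augmented inverse probability weighting), paired with a ridge (shrinkage) outcome regression on all 2-way covariate interactions. Everything in it is an assumption made for illustration rather than the authors' implementation: the helper names, the penalty `lam`, the PS clipping bounds, and the simulated data-generating model with a true effect of 2.0.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def with_intercept(X):
    # Prepend a column of ones to the design matrix.
    return np.column_stack([np.ones(len(X)), X])

def add_interactions(X):
    # Keep main effects and append every 2-way product of distinct columns.
    p = X.shape[1]
    pairs = [X[:, i] * X[:, j] for i in range(p) for j in range(i + 1, p)]
    return np.column_stack([X] + pairs) if pairs else X

def fit_ridge(X, y, lam=1.0):
    # Closed-form ridge regression; the intercept is left unpenalized.
    Z = with_intercept(X)
    penalty = lam * np.eye(Z.shape[1])
    penalty[0, 0] = 0.0
    return np.linalg.solve(Z.T @ Z + penalty, Z.T @ y)

def fit_logistic(X, a, n_iter=50, lam=1e-6):
    # Logistic regression by Newton-Raphson; tiny ridge term for stability.
    Z = with_intercept(X)
    beta = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        p = sigmoid(Z @ beta)
        grad = Z.T @ (a - p) - lam * beta
        hess = (Z * (p * (1.0 - p))[:, None]).T @ Z + lam * np.eye(Z.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def aipw_ate(X, a, y, lam=1.0):
    # Propensity score from a main-effects logistic model, clipped away
    # from 0 and 1 (the clipping bounds are an illustrative assumption).
    ps = np.clip(sigmoid(with_intercept(X) @ fit_logistic(X, a)), 0.01, 0.99)
    # Outcome regressions fitted separately in each treatment arm,
    # then used to predict both potential outcomes for everyone.
    XI = add_interactions(X)
    m1 = with_intercept(XI) @ fit_ridge(XI[a == 1], y[a == 1], lam)
    m0 = with_intercept(XI) @ fit_ridge(XI[a == 0], y[a == 0], lam)
    # Doubly robust (AIPW) score: outcome-model contrast plus
    # inverse-probability-weighted residual corrections.
    psi = m1 - m0 + a * (y - m1) / ps - (1 - a) * (y - m0) / (1.0 - ps)
    return psi.mean()

# Hypothetical simulated data: true average treatment effect is 2.0,
# and the outcome depends on an interaction between two covariates.
rng = np.random.default_rng(0)
n = 5000
X = rng.normal(size=(n, 3))
a = rng.binomial(1, sigmoid(0.5 * X[:, 0] - 0.5 * X[:, 1]))
y = 2.0 * a + X[:, 0] + X[:, 1] * X[:, 2] + rng.normal(size=n)

ate_hat = aipw_ate(X, a, y)
print(f"Estimated average treatment effect: {ate_hat:.3f}")
```

    In this sketch the interaction terms let the outcome model capture the `X[:, 1] * X[:, 2]` effect that a main-effects regression would miss, which is the kind of enrichment the Results describe. Swapping `fit_ridge` for another learner, or a weighted combination of several, would mimic the other methods compared in the study.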

    Original language: English (US)
    Pages (from-to): 1120-1133
    Number of pages: 14
    Journal: Pharmacoepidemiology and Drug Safety
    Issue number: 9
    State: Published - Sep 1 2020


    • average causal effect
    • covariate-balancing propensity score
    • doubly robust estimation
    • machine learning techniques
    • maximum likelihood
    • pharmacoepidemiology
    • simulation

    ASJC Scopus subject areas

    • Epidemiology
    • Pharmacology (medical)

