State of the Art Causal Inference in the Presence of Extraneous Covariates: A Simulation Study
Abstract
The central task of causal inference is to remove, via statistical adjustment, the confounding bias that would be present in naive unadjusted comparisons of outcomes across treatment groups. Statistical adjustment can roughly be broken down into two steps. In the first step, the researcher selects a set of variables to adjust for. In the second step, the researcher implements a causal inference algorithm to adjust for the selected variables and estimate the average treatment effect. In this paper, we use a simulation study to explore the operating characteristics and robustness of state-of-the-art methods for step two (statistical adjustment for selected variables) when step one (variable selection) is performed in a realistically sub-optimal manner. More specifically, we study the robustness of a cross-fit machine learning based causal effect estimator to the presence of extraneous variables in the adjustment set. The take-away for practitioners is that, even when machine learning methods are used for adjustment, it remains valuable to identify a small sufficient adjustment set using subject matter knowledge whenever possible.
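To make the two-step workflow concrete, the following is a minimal sketch of a cross-fit estimator applied to a small adjustment set and to one padded with extraneous covariates. It is an illustration under stated assumptions, not the estimator or simulation design evaluated in this paper: the doubly robust (AIPW) score, the plain logistic and linear nuisance models standing in for machine learning learners, the toy data-generating process, and all function names (`fit_logistic`, `fit_linear`, `cross_fit_aipw`) are hypothetical choices made for brevity.

```python
import numpy as np

def _with_intercept(X):
    return np.column_stack([np.ones(len(X)), X])

def fit_logistic(X, t, n_iter=25, ridge=1e-6):
    """Logistic regression by Newton's method (tiny ridge term for numerical stability)."""
    Xb = _with_intercept(X)
    beta = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ beta))
        grad = Xb.T @ (t - p) - ridge * beta
        hess = (Xb * (p * (1.0 - p))[:, None]).T @ Xb + ridge * np.eye(Xb.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def predict_logistic(beta, X):
    return 1.0 / (1.0 + np.exp(-_with_intercept(X) @ beta))

def fit_linear(X, y):
    """Ordinary least squares with an intercept."""
    coef, *_ = np.linalg.lstsq(_with_intercept(X), y, rcond=None)
    return coef

def predict_linear(coef, X):
    return _with_intercept(X) @ coef

def cross_fit_aipw(X, t, y, n_folds=5, seed=0, clip=0.01):
    """Cross-fit AIPW estimate of the average treatment effect and its standard error.

    Nuisance models are fit on the training folds and evaluated only on the
    held-out fold, so no observation is scored by a model that saw it.
    """
    n = len(y)
    folds = np.array_split(np.random.default_rng(seed).permutation(n), n_folds)
    psi = np.empty(n)
    for held_out in folds:
        train = np.ones(n, dtype=bool)
        train[held_out] = False
        X_tr, t_tr, y_tr = X[train], t[train], y[train]
        X_ho, t_ho, y_ho = X[held_out], t[held_out], y[held_out]
        # Propensity score, clipped away from 0 and 1
        e = np.clip(predict_logistic(fit_logistic(X_tr, t_tr), X_ho), clip, 1 - clip)
        # Arm-specific outcome regressions
        m1 = predict_linear(fit_linear(X_tr[t_tr == 1], y_tr[t_tr == 1]), X_ho)
        m0 = predict_linear(fit_linear(X_tr[t_tr == 0], y_tr[t_tr == 0]), X_ho)
        # Doubly robust score for each held-out observation
        psi[held_out] = (m1 - m0
                         + t_ho * (y_ho - m1) / e
                         - (1 - t_ho) * (y_ho - m0) / (1 - e))
    return psi.mean(), psi.std(ddof=1) / np.sqrt(n)

# Toy data (assumed): column 0 is the only confounder, columns 1..10 are pure noise.
rng = np.random.default_rng(0)
n, true_ate = 4000, 2.0
X = rng.normal(size=(n, 11))
t = (rng.random(n) < 1.0 / (1.0 + np.exp(-X[:, 0]))).astype(float)
y = true_ate * t + 1.5 * X[:, 0] + rng.normal(size=n)

naive = y[t == 1].mean() - y[t == 0].mean()        # unadjusted comparison
ate_small, se_small = cross_fit_aipw(X[:, :1], t, y)  # small sufficient adjustment set
ate_full, se_full = cross_fit_aipw(X, t, y)           # set padded with extraneous covariates

print(f"naive difference in means: {naive:.3f}")
print(f"small adjustment set:      {ate_small:.3f} (SE {se_small:.3f})")
print(f"with extraneous variables: {ate_full:.3f} (SE {se_full:.3f})")
```

In this toy setting the extraneous covariates are independent noise and the nuisance models are correctly specified, so the example shows only the mechanics of cross-fitting and of comparing adjustment sets. It is not meant to reproduce the operating characteristics studied in the simulations.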