Constructing Confidence Intervals for Average Treatment Effects from Multiple Datasets
Wang, Schröder, Frauen et al.
Constructing confidence intervals (CIs) for the average treatment effect (ATE) from patient records is crucial for assessing the effectiveness and safety of drugs. However, patient records typically come from different hospitals, which raises the question of how multiple observational datasets can be effectively combined for this purpose. In our paper, we propose a new method that estimates the ATE from multiple observational datasets and provides valid CIs. Our method makes few assumptions about the observational datasets and is thus widely applicable in medical practice. The key idea of our method is to leverage prediction-powered inference and thereby essentially "shrink" the CIs, so that we offer more precise uncertainty quantification than naïve approaches. We further prove the unbiasedness of our method and the validity of our CIs. We confirm our theoretical results through various numerical experiments. Finally, we provide an extension of our method for constructing CIs from combinations of experimental and observational datasets.
This paper proposes a novel method for constructing confidence intervals for the average treatment effect (ATE) from multiple observational datasets. The method makes few assumptions about the observational datasets and is therefore broadly applicable in medical practice. The core idea is to leverage prediction-powered inference (PPI) to "shrink" the confidence intervals, which yields more precise uncertainty quantification than naive approaches. The paper proves that the estimator is unbiased and that the confidence intervals are valid, and numerical experiments confirm the theoretical results. The method is also extended to combinations of experimental and observational datasets.
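To make the "shrinking" idea concrete, the generic prediction-powered estimator for a population mean from Angelopoulos et al. (2023) can be written as below. This is the textbook PPI form, not necessarily the exact estimator of the paper under discussion. The symbols are assumptions introduced here for illustration: f is any fixed prediction model, N is the number of units that only have predictions, and n is the number of units that also have an unbiased target Y.

```latex
\hat{\theta}^{\mathrm{PP}}
  = \underbrace{\frac{1}{N}\sum_{i=1}^{N} f(\tilde{X}_i)}_{\text{prediction mean on the large sample}}
  + \underbrace{\frac{1}{n}\sum_{j=1}^{n} \bigl(Y_j - f(X_j)\bigr)}_{\text{rectifier on the small sample}},
\qquad
\hat{\theta}^{\mathrm{PP}} \pm z_{1-\alpha/2}
  \sqrt{\frac{\hat{\sigma}_{f}^{2}}{N} + \frac{\hat{\sigma}_{Y-f}^{2}}{n}} .
```

The naive interval based on the small sample alone has half-width z_{1-α/2} · sqrt(σ̂_Y² / n). When N is much larger than n and the predictions track the targets well, the variance of Y − f is smaller than the variance of Y, so the prediction-powered interval is narrower. The rectifier term removes any bias in f, which is why validity does not depend on the predictions being correct.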
In the medical field, constructing confidence intervals for ATE from patient records is crucial for assessing drug efficacy and safety. However, patient records typically originate from different hospitals, making effective integration of multiple observational datasets a key challenge.
Medical Decision-Making Needs: Reliable confidence intervals are critical for clinical decision-making because they support evidence-based treatment selection.
Data Fragmentation: Electronic health records are typically distributed across different healthcare institutions and countries, so they must be combined before they can be analyzed jointly.
COVID-19 Case Study: During the pandemic, drug effects had to be assessed rapidly from multi-center data, as in studies on nirmatrelvir/ritonavir.
Given a small unbiased observational dataset D₁ (satisfying unconfoundedness) and a large observational dataset D₂ (allowing unobserved confounding), the goal is to estimate the target population's ATE, τ = E[Y¹(1) − Y¹(0)], and to construct valid confidence intervals for it.
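The setup above can be illustrated with a minimal Python sketch of a generic prediction-powered interval, compared against a naive interval that uses the small dataset alone. It is an illustration under stated assumptions, not the paper's exact estimator. It assumes that each unit in the small dataset already has an unbiased per-unit effect target (for instance a doubly robust pseudo-outcome) and that a possibly biased effect model fitted on the large dataset supplies the predictions. The function names `ppi_mean_ci`, `naive_mean_ci`, and `simulate`, and all simulated numbers, are hypothetical.

```python
import random
from statistics import NormalDist, mean, variance


def ppi_mean_ci(pred_large, pred_small, label_small, alpha=0.05):
    """Prediction-powered estimate and (1 - alpha) CI for a population mean.

    pred_large:  model predictions on the large dataset (stand-in for D2)
    pred_small:  model predictions on the small dataset (stand-in for D1)
    label_small: unbiased per-unit targets on the small dataset
    """
    # The rectifier measures how far the predictions are off on the small,
    # trustworthy sample; adding its mean removes the prediction bias.
    rectifier = [y - f for y, f in zip(label_small, pred_small)]
    estimate = mean(pred_large) + mean(rectifier)
    std_err = (variance(pred_large) / len(pred_large)
               + variance(rectifier) / len(rectifier)) ** 0.5
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return estimate, (estimate - z * std_err, estimate + z * std_err)


def naive_mean_ci(label_small, alpha=0.05):
    """Normal-approximation CI that uses the small dataset only."""
    estimate = mean(label_small)
    std_err = (variance(label_small) / len(label_small)) ** 0.5
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return estimate, (estimate - z * std_err, estimate + z * std_err)


def simulate(n_small=200, n_large=20_000, true_effect=2.0, bias=0.5, seed=0):
    """Hypothetical data: predictions are biased but track the targets."""
    rng = random.Random(seed)
    signal_small = [rng.gauss(0, 3) for _ in range(n_small)]
    # Unbiased but noisy per-unit targets on the small dataset.
    label_small = [true_effect + s + rng.gauss(0, 0.5) for s in signal_small]
    # Predictions share the per-unit signal but carry a constant bias,
    # mimicking a model distorted by unobserved confounding.
    pred_small = [true_effect + bias + s for s in signal_small]
    pred_large = [true_effect + bias + rng.gauss(0, 3) for _ in range(n_large)]
    return pred_large, pred_small, label_small


pred_large, pred_small, label_small = simulate()
ppi_est, ppi_ci = ppi_mean_ci(pred_large, pred_small, label_small)
naive_est, naive_ci = naive_mean_ci(label_small)
print(f"PPI:   {ppi_est:.3f}  CI = ({ppi_ci[0]:.3f}, {ppi_ci[1]:.3f})")
print(f"Naive: {naive_est:.3f}  CI = ({naive_ci[0]:.3f}, {naive_ci[1]:.3f})")
```

In this simulated setting the prediction-powered interval is narrower than the naive one because the rectifier has much lower variance than the raw targets. If the predictions were uninformative, the rectifier variance would not drop and the gain would vanish.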
Angelopoulos et al. (2023). Prediction-powered inference. Science.
van der Laan et al. (2024). Adaptive-TMLE for average treatment effect. arXiv.
Kallus et al. (2018). Removing hidden confounding by experimental grounding. NeurIPS.
Yang & Ding (2020). Combining multiple observational data sources. JASA.
Overall Assessment: This is a high-quality causal inference paper that successfully applies the prediction-powered inference framework to multi-dataset ATE estimation. The paper has solid theoretical foundations, well-designed experiments, and significant practical value in medical applications. Although the method depends on certain assumptions, such as unconfoundedness in the small dataset, its overall contribution is substantial and gives the causal inference field a new methodological tool.