Improving Precision of RCT-Based CATE Estimation using Data Borrowing with Double Calibration
Abstract
Understanding how treatment effects vary across patient characteristics is essential for personalized medicine, yet randomized controlled trials (RCTs) are often underpowered to detect heterogeneous treatment effects (HTEs). We propose a framework that improves the efficiency of conditional average treatment effect (CATE) estimation in RCTs by leveraging large observational studies (OS) while preserving the unbiasedness of RCT estimates. By framing CATE estimation as a supervised learning problem, we show that estimation variance is minimized when the counterfactual mean outcome (CMO) is used as the augmentation function. We derive finite-sample error bounds and establish conditions under which OS data improve CMO estimation, and thus CATE efficiency, even in the presence of confounding in the OS or outcome distribution shifts between populations. We introduce R-OSCAR (Robust Observational Studies for CMO-Augmented RCT), a two-stage estimator that calibrates OS outcome predictions to the RCT population and corrects residual biases through regularized regression. Simulations show that R-OSCAR can reduce the RCT sample size needed for HTE detection by up to 75% while maintaining robustness to model misspecification. Application to the Tennessee STAR study confirms these efficiency gains. Our framework offers a principled approach to integrating observational and experimental data using tools from statistical learning and transfer learning.
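To illustrate why an augmentation function can reduce variance without introducing bias, here is a minimal simulation sketch. It is not R-OSCAR and not the authors' code. It assumes a standard augmented transformed-outcome construction, Z = (A - e) / (e(1 - e)) * (Y - m(X)), with known randomization probability e, and it assumes that the variance-minimizing choice is m(X) = (1 - e) * mu1(X) + e * mu0(X); whether this matches the paper's exact CMO definition is an assumption. The data-generating model and all names (`pseudo_outcome`, `m_oracle`, and so on) are hypothetical.

```python
import numpy as np


def pseudo_outcome(y, a, e, m):
    """Augmented transformed outcome; E[Z | X] equals the CATE for any m(X)
    when treatment is randomized with known probability e."""
    return (a - e) / (e * (1.0 - e)) * (y - m)


# Hypothetical RCT: one covariate, linear arm means, unit-variance noise.
rng = np.random.default_rng(0)
n = 20_000
e = 0.5                                   # known randomization probability
x = rng.normal(size=n)
mu0 = 2.0 + 3.0 * x                       # control-arm mean outcome
tau = 1.0 + 0.5 * x                       # true CATE
mu1 = mu0 + tau                           # treated-arm mean outcome
a = rng.binomial(1, e, size=n)
y = np.where(a == 1, mu1, mu0) + rng.normal(size=n)

# Oracle augmentation (assumed form); in practice it must be estimated,
# which is where auxiliary data could help.
m_oracle = (1.0 - e) * mu1 + e * mu0

z_plain = pseudo_outcome(y, a, e, np.zeros(n))   # no augmentation
z_aug = pseudo_outcome(y, a, e, m_oracle)        # oracle augmentation

# Both pseudo-outcomes target tau(X), but their noise levels differ.
var_plain = np.var(z_plain - tau)
var_aug = np.var(z_aug - tau)

# Supervised-learning step: regress the pseudo-outcome on X.
slope_aug, intercept_aug = np.polyfit(x, z_aug, 1)

print(f"noise variance without augmentation: {var_plain:.2f}")
print(f"noise variance with oracle augmentation: {var_aug:.2f}")
print(f"fitted CATE: {intercept_aug:.2f} + {slope_aug:.2f} * x")
```

In this toy setup the oracle augmentation removes the part of the pseudo-outcome noise that comes from the baseline outcome level, so the regression recovers the CATE from far less noisy targets. A better estimate of m(X) therefore translates directly into a more precise CATE fit.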
Authors (6)
Amir Asiaee
Chiara Di Gravio
Cole Beck
Yuting Mei
Samhita Pal
Jared D. Huling
Quick Access
- Year of Publication: 2023
- Language: en
- Source Database: arXiv
- Access: Open Access ✓