IV Co-Scientist: Multi-Agent LLM Framework for Causal Instrumental Variable Discovery
Abstrak
In the presence of confounding between an endogenous variable and the outcome, instrumental variables (IVs) are used to isolate the causal effect of the endogenous variable. Identifying valid instruments requires interdisciplinary knowledge, creativity, and contextual understanding, making it a non-trivial task. In this paper, we investigate whether large language models (LLMs) can aid in this task. We perform a two-stage evaluation framework. First, we test whether LLMs can recover well-established instruments from the literature, assessing their ability to replicate standard reasoning. Second, we evaluate whether LLMs can identify and avoid instruments that have been empirically or theoretically discredited. Building on these results, we introduce IV Co-Scientist, a multi-agent system that proposes, critiques, and refines IVs for a given treatment-outcome pair. We also introduce a statistical test to contextualize consistency in the absence of ground truth. Our results show the potential of LLMs to discover valid instrumental variables from a large observational database.
Topik & Kata Kunci
Penulis (5)
Ivaxi Sheth
Zhijing Jin
Bryan Wilder
Dominik Janzing
Mario Fritz
Akses Cepat
- Tahun Terbit
- 2026
- Bahasa
- en
- Sumber Database
- arXiv
- Akses
- Open Access ✓