A large language model-based tool for identifying relationships to industry in research on the carcinogenicity of benzene, cobalt, and aspartame
Abstract
Background: Industry-funded research poses a threat to the validity of scientific inference on carcinogenic hazards. Scientists require tools to better identify and characterize industry-sponsored research across bodies of evidence to reduce the possible influence of industry bias in evidence synthesis reviews. We applied a novel large language model (LLM)-based tool named InfluenceMapper to demonstrate and evaluate its performance in identifying relationships to industry in research on the carcinogenicity of benzene, cobalt, and aspartame.
Methods: We analysed all epidemiological, animal cancer, and mechanistic studies included in systematic reviews on the carcinogenicity of the three agents by the IARC Monographs programme. The selected agents were recently evaluated by the Monographs programme and are of commercial interest to major industries. InfluenceMapper extracted disclosed entities in study publications and classified up to 40 possible disclosed relationship types between each entity and the study and between each entity and each author. A human reviewer classified entities as ‘industry or industry-funded’ and assessed relationships with industry for potential conflicts of interest. Positive predictive values described the extent of true positive relationships identified by InfluenceMapper compared to human assessment.
Results: Analyses included 2,046 studies across all three agents. We identified 320 disclosed industry or industry-funded entities from InfluenceMapper output that were involved in 770 distinct study-entity and author-entity relationships. For each agent, between 4% and 8% of studies disclosed funding by industry and 1–4% of studies had at least one author who disclosed receiving industry funding directly. Industry trade associations for all three agents funded 22 studies published in 16 journals over a 37-year span. Aside from funding, the most prevalent disclosed relationships with industry were receiving data, holding employment, paid consulting, and providing expert testimony. Positive predictive values were excellent (≥ 98%) for study-entity relationships but declined for relationships with individual authors.
Conclusions: LLM-based tools can significantly expedite and bolster the detection of disclosed conflicts of interest from industry-sponsored research in cancer prevention. Possible use cases include facilitating the assessment of bias from industry studies in evidence synthesis reviews and alerting scientists to the influence of industry on scientific inference. Persistent challenges in ascertaining conflicts of interest underscore the urgent need for standardized, transparent, and enforceable disclosures in biomedical journals.
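The positive predictive value mentioned in the abstract is, by its standard definition, the share of tool-identified relationships that the human assessment confirmed: PPV = TP / (TP + FP). The sketch below illustrates that arithmetic only. The function name and the counts are hypothetical and are not taken from the study or from InfluenceMapper.

```python
def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """Return TP / (TP + FP), the standard definition of PPV.

    true_positives: relationships flagged by the tool and confirmed by a human.
    false_positives: relationships flagged by the tool but rejected by a human.
    """
    flagged = true_positives + false_positives
    if flagged == 0:
        # PPV is undefined when the tool flagged nothing.
        raise ValueError("PPV is undefined when no relationships were flagged")
    return true_positives / flagged


# Hypothetical counts for illustration only (not data from the study).
ppv = positive_predictive_value(true_positives=98, false_positives=2)
print(f"PPV = {ppv:.1%}")
```

Because PPV only considers relationships the tool flagged, it says nothing about disclosed relationships the tool missed; that would require a separate sensitivity estimate.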
Authors (6)
Nathan L. DeBono
Vanessa Amar
Hardy Hardy
Mary K. Schubauer-Berigan
Derek Ruths
Nicholas B. King
Quick Access
- Publication year: 2025
- Source database: DOAJ
- DOI: 10.1186/s12940-025-01223-1
- Access: Open Access