arXiv Open Access 2025

A computationally frugal open-source foundation model for thoracic disease detection in lung cancer screening programs

Niccolò McConnell, Pardeep Vasudev, Daisuke Yamada, Daryl Cheng, Mehran Azimbagirad, and 11 others

Abstract

Low-dose computed tomography (LDCT) imaging employed in lung cancer screening (LCS) programs is increasing in uptake worldwide. LCS programs herald a generational opportunity to simultaneously detect cancer and non-cancer-related early-stage lung disease. Yet these efforts are hampered by a shortage of radiologists to interpret scans at scale. Here, we present TANGERINE, a computationally frugal, open-source vision foundation model for volumetric LDCT analysis. Designed for broad accessibility and rapid adaptation, TANGERINE can be fine-tuned off the shelf for a wide range of disease-specific tasks with limited computational resources and training data. Relative to models trained from scratch, TANGERINE demonstrates fast convergence during fine-tuning, thereby requiring significantly fewer GPU hours, and displays strong label efficiency, achieving comparable or superior performance with a fraction of fine-tuning data. Pretrained using self-supervised learning on over 98,000 thoracic LDCTs, including the UK's largest LCS initiative to date and 27 public datasets, TANGERINE achieves state-of-the-art performance across 14 disease classification tasks, including lung cancer and multiple respiratory diseases, while generalising robustly across diverse clinical centres. By extending a masked autoencoder framework to 3D imaging, TANGERINE offers a scalable solution for LDCT analysis, departing from recent closed, resource-intensive models by combining architectural simplicity, public availability, and modest computational requirements. Its accessible, open-source lightweight design lays the foundation for rapid integration into next-generation medical imaging tools that could transform LCS initiatives, allowing them to pivot from a singular focus on lung cancer detection to comprehensive respiratory disease management in high-risk populations.

Authors (16)

Niccolò McConnell
Pardeep Vasudev
Daisuke Yamada
Daryl Cheng
Mehran Azimbagirad
John McCabe
Shahab Aslani
Ahmed H. Shahin
Yukun Zhou
The SUMMIT Consortium
Andre Altmann
Yipeng Hu
Paul Taylor
Sam M. Janes
Daniel C. Alexander
Joseph Jacob

Citation Format

McConnell, N., Vasudev, P., Yamada, D., Cheng, D., Azimbagirad, M., McCabe, J. et al. (2025). A computationally frugal open-source foundation model for thoracic disease detection in lung cancer screening programs. https://arxiv.org/abs/2507.01881

Journal Information
Publication Year: 2025
Language: en
Source Database: arXiv
Access: Open Access