Semantic Scholar Open Access 2024 707 citations

A whole-slide foundation model for digital pathology from real-world data

Hanwen Xu, N. Usuyama, Jaspreet Bagga, Sheng Zhang, Rajesh Rao, +23 others

Abstract

Digital pathology poses unique computational challenges, as a standard gigapixel slide may comprise tens of thousands of image tiles [1–3]. Prior models have often resorted to subsampling a small portion of tiles for each slide, thus missing the important slide-level context [4]. Here we present Prov-GigaPath, a whole-slide pathology foundation model pretrained on 1.3 billion 256 × 256 pathology image tiles in 171,189 whole slides from Providence, a large US health network comprising 28 cancer centres. The slides originated from more than 30,000 patients covering 31 major tissue types. To pretrain Prov-GigaPath, we propose GigaPath, a novel vision transformer architecture for pretraining gigapixel pathology slides. To scale GigaPath for slide-level learning with tens of thousands of image tiles, GigaPath adapts the newly developed LongNet [5] method to digital pathology. To evaluate Prov-GigaPath, we construct a digital pathology benchmark comprising 9 cancer subtyping tasks and 17 pathomics tasks, using both Providence and TCGA data [6]. With large-scale pretraining and ultra-large-context modelling, Prov-GigaPath attains state-of-the-art performance on 25 out of 26 tasks, with significant improvement over the second-best method on 18 tasks. We further demonstrate the potential of Prov-GigaPath on vision–language pretraining for pathology [7,8] by incorporating the pathology reports. In sum, Prov-GigaPath is an open-weight foundation model that achieves state-of-the-art performance on various digital pathology tasks, demonstrating the importance of real-world data and whole-slide modelling.

Summary: Prov-GigaPath, a whole-slide pathology foundation model pretrained on a large dataset containing around 1.3 billion pathology images, attains state-of-the-art performance in cancer classification and pathomics tasks.
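To give a sense of the scale described in the abstract, the following is a minimal arithmetic sketch, not the authors' preprocessing pipeline. It counts the 256 × 256 tiles in a non-overlapping grid over a slide and derives the average number of tiles per slide from the totals quoted above. The function name `tiles_per_slide` and the example slide dimensions are assumptions made for illustration. The tile size and the two totals come from the abstract.

```python
import math

TILE_SIZE = 256  # tile edge in pixels, as stated in the abstract


def tiles_per_slide(width_px: int, height_px: int, tile_size: int = TILE_SIZE) -> int:
    """Count the tiles in a non-overlapping grid that covers the whole slide.

    Partial tiles at the right and bottom edges are counted as full tiles.
    """
    return math.ceil(width_px / tile_size) * math.ceil(height_px / tile_size)


# Hypothetical slide dimensions, chosen only for illustration.
example_tiles = tiles_per_slide(60_000, 40_000)

# Totals reported in the abstract ("1.3 billion" is itself a rounded figure).
total_tiles = 1_300_000_000
total_slides = 171_189
mean_tiles_per_slide = total_tiles / total_slides

print(f"Grid tiles in the example slide: {example_tiles}")
print(f"Mean tiles per slide in the pretraining set: {mean_tiles_per_slide:.0f}")
```

The hypothetical 60,000 × 40,000 pixel slide yields 36,895 grid positions, which matches the "tens of thousands" order of magnitude in the abstract. The reported totals work out to roughly 7,600 tiles per slide on average. The abstract does not explain the difference between those two figures. Dropping tiles that contain no tissue would be one plausible reason, but that is an assumption here.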

Authors (28)

Hanwen Xu
N. Usuyama
Jaspreet Bagga
Sheng Zhang
Rajesh Rao
Tristan Naumann
Cliff Wong
Zelalem Gero
Javier González
Yu Gu
Yanbo Xu
Mu-Hsin Wei
Wenhui Wang
Shuming Ma
Furu Wei
Jianwei Yang
Chun-yue Li
Jianfeng Gao
Jaylen Rosemon
Tucker Bower
Soohee Lee
R. Weerasinghe
Bill Wright
Ari Robicsek
B. Piening
Carlo Bifulco
Sheng Wang
Hoifung Poon

Citation Format

Xu, H., Usuyama, N., Bagga, J., Zhang, S., Rao, R., Naumann, T. et al. (2024). A whole-slide foundation model for digital pathology from real-world data. https://doi.org/10.1038/s41586-024-07441-w

Quick Access

View at source: doi.org/10.1038/s41586-024-07441-w

Journal Information

Publication year: 2024
Language: en
Total citations: 707
Source database: Semantic Scholar
DOI: 10.1038/s41586-024-07441-w
Access: Open Access ✓