Identifying unmeasured heterogeneity in microbiome data via quantile thresholding (QuanT)
Abstract
Background: Microbiome data, like other high-throughput data, suffer from technical heterogeneity stemming from differential experimental designs and processing. In addition to measured artifacts such as batch effects, there is heterogeneity due to unknown or unmeasured factors, which lead to spurious conclusions if unaccounted for. With the advent of large-scale multi-center microbiome studies and the increasing availability of public datasets, this issue becomes more pronounced. Current approaches for addressing unmeasured heterogeneity in high-throughput data were developed for microarray and/or RNA sequencing data. They cannot accommodate the unique characteristics of microbiome data such as sparsity and over-dispersion.

Results: Here, we introduce quantile thresholding (QuanT), a novel non-parametric approach for identifying unmeasured heterogeneity tailored to microbiome data. QuanT applies quantile regression across multiple quantile levels to threshold the microbiome abundance data and uncovers latent heterogeneity using thresholded binary residual matrices. We validated QuanT using both synthetic and real microbiome datasets, demonstrating its superiority in capturing and mitigating heterogeneity and improving the accuracy of downstream analyses, such as prediction analysis, differential abundance tests, and community-level diversity evaluations.

Conclusions: We present QuanT, a novel tool for comprehensive identification of unmeasured heterogeneity in microbiome data. QuanT’s distinct non-parametric method markedly enhances downstream analyses, serving as a valuable tool for data integration and comprehensive analysis in microbiome research.
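
The quantile-thresholding idea in the Results can be illustrated with a rough sketch. This is not the authors' implementation: the function name `quant_sketch`, its parameters, the restriction to a single known categorical covariate (for which a quantile regression fit reduces to per-group empirical quantiles), the within-group centering of the binary indicators, and the use of an SVD to summarize the stacked residuals are all assumptions made for illustration, since the abstract does not specify these details.

```python
import numpy as np

def quant_sketch(abund, group, taus=(0.5, 0.75, 0.9), n_factors=1):
    """Illustrative sketch only (hypothetical helper, not the QuanT package).

    abund : (samples x taxa) array of abundances
    group : integer label of one known covariate per sample
    taus  : quantile levels used for thresholding
    Returns an (samples x n_factors) array of candidate latent factors.
    """
    abund = np.asarray(abund, dtype=float)
    group = np.asarray(group)
    blocks = []
    for tau in taus:
        resid = np.zeros_like(abund)
        for g in np.unique(group):
            rows = group == g
            # With one categorical covariate, the fitted conditional quantile
            # is the empirical tau-quantile of each taxon within the group.
            q = np.quantile(abund[rows], tau, axis=0)
            # Threshold: 1 if the abundance exceeds the fitted quantile.
            exceed = (abund[rows] > q).astype(float)
            # Assumed residual: binary indicator minus its within-group mean.
            resid[rows] = exceed - exceed.mean(axis=0)
        blocks.append(resid)
    # Stack residual matrices from all quantile levels side by side.
    stacked = np.hstack(blocks)
    # Assumed summary step: leading left singular vectors as latent factors.
    u, _, _ = np.linalg.svd(stacked, full_matrices=False)
    return u[:, :n_factors]
```

Because the thresholded residuals are binary, zeros and heavy right tails in the counts affect the sketch only through whether a value exceeds its fitted quantile, which is the property the abstract highlights for sparse, over-dispersed data. With continuous covariates, the per-group quantile step would need to be replaced by an actual quantile regression fit.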
Topics & Keywords
Authors (6)
Jiuyao Lu
Glen A. Satten
Katie A. Meyer
Lenore J. Launer
Wodan Ling
Ni Zhao
Quick Access
- Publication Year: 2026
- Source Database: DOAJ
- DOI: 10.1186/s40168-025-02282-9
- Access: Open Access ✓