geneEX: An Integrated Phenotype‐Driven Algorithm for Rapid Identification of Causative Variants in Monogenic Disorders
Abstract
Background: Identifying pathogenic variants is a crucial step in diagnosing monogenic genetic disorders. The widespread adoption of Next‐Generation Sequencing (NGS) has significantly improved diagnostic efficiency. However, as clinical practice demands greater diagnostic accuracy, quickly and accurately pinpointing the pathogenic variant among numerous candidates remains a significant challenge, and the complexity of data analysis and interpretation continues to limit both the efficiency and accuracy of diagnosis.

Methods: We developed geneEX, a phenotype‐driven algorithm. It uses large language model technology to extract phenotypes from clinical information and a semantic vector representation model to obtain Human Phenotype Ontology (HPO) terms automatically, from which HPO‐associated genes are identified. It also supports semantic matching between patients' free‐text phenotypic descriptions and disease phenotypes, further improving the identification of pathogenic genes. The algorithm ranks candidate causative variants, enabling rapid and precise identification of potential pathogenic variants in rare genetic disorders.

Results: geneEX performs well in ranking pathogenic variants on both virtual and clinical datasets. Supplementary matching of free‐text phenotypes significantly improves the precision of candidate variant prioritization.

Conclusion: geneEX automates HPO acquisition through its independently developed phenotype extraction and standardization methods, enabling fully automated identification from clinical samples to pathogenic variants. By also matching free‐text phenotypic descriptions against disease phenotypes, it improves the accuracy of pathogenic gene identification. This approach improves the precision and efficiency of identifying pathogenic variants in rare genetic disorders and provides robust support for the diagnosis of monogenic diseases.
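
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea it describes: map free-text phenotype phrases to HPO terms by vector similarity, then rank candidate genes by their matched terms. It is not the geneEX implementation. A bag-of-words cosine similarity stands in for the semantic vector representation model, the HPO IDs and labels are a tiny hand-picked subset, and the gene names (`GENE_A` to `GENE_D`), the HPO-to-gene links, the threshold, and all function names are hypothetical placeholders.

```python
import math
import re
from collections import Counter

# Tiny HPO subset for illustration (IDs and labels are real HPO terms).
HPO_TERMS = {
    "HP:0001250": "seizure",
    "HP:0001263": "global developmental delay",
    "HP:0000252": "microcephaly",
    "HP:0001631": "atrial septal defect",
}

# Hypothetical HPO-to-gene links; placeholder gene names, not real annotations.
HPO_TO_GENES = {
    "HP:0001250": {"GENE_A", "GENE_B"},
    "HP:0001263": {"GENE_A"},
    "HP:0000252": {"GENE_A", "GENE_C"},
    "HP:0001631": {"GENE_D"},
}


def embed(text):
    """Bag-of-words vector; a stand-in for a real semantic embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm = math.sqrt(sum(c * c for c in u.values())) * math.sqrt(
        sum(c * c for c in v.values())
    )
    return dot / norm if norm else 0.0


def match_hpo(phrase, threshold=0.5):
    """Return (hpo_id, similarity) for the closest term, or None if too dissimilar."""
    query = embed(phrase)
    best_id, best_score = max(
        ((hpo_id, cosine(query, embed(label))) for hpo_id, label in HPO_TERMS.items()),
        key=lambda pair: pair[1],
    )
    return (best_id, best_score) if best_score >= threshold else None


def rank_genes(phrases, candidate_genes):
    """Score each candidate gene by the summed similarity of its matched HPO terms."""
    scores = {gene: 0.0 for gene in candidate_genes}
    for phrase in phrases:
        hit = match_hpo(phrase)
        if hit is None:
            continue
        hpo_id, similarity = hit
        for gene in HPO_TO_GENES[hpo_id]:
            if gene in scores:
                scores[gene] += similarity
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    phenotypes = ["recurrent seizure", "microcephaly", "developmental delay"]
    for gene, score in rank_genes(phenotypes, ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]):
        print(f"{gene}\t{score:.3f}")
```

In a realistic pipeline the `embed` function would be replaced by a trained sentence-embedding model and the gene links by curated HPO annotations; the ranking step would also need to account for variant-level evidence, which this sketch leaves out.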
Authors (13)
Junyu Zhang
Dongyun Liu
Mei Chen
Yunqian Fang
Kun Dai
Xiaoxi Zhu
Qingqing Xu
Meiling Hou
Li Wang
Jianfeng Wang
Jun Zhang
Bo Liang
Xiaoming Teng
Quick Access
- Year published: 2025
- Source database: DOAJ
- DOI: 10.1002/mgg3.70139
- Access: Open Access ✓