Semantic Scholar Open Access 2022 659 citations

A Review of Feature Selection Methods for Machine Learning-Based Disease Risk Prediction

N. Pudjihartono T. Fadason A. Kempa-Liehr J. O’Sullivan

Abstract

Machine learning has shown utility in detecting patterns within large, unstructured, and complex datasets. One of the promising applications of machine learning is in precision medicine, where disease risk is predicted using patient genetic data. However, creating an accurate prediction model based on genotype data remains challenging due to the so-called “curse of dimensionality” (i.e., a far larger number of features than samples). Therefore, the generalizability of machine learning models benefits from feature selection, which aims to extract only the most “informative” features and remove noisy, “non-informative,” irrelevant, and redundant features. In this article, we provide a general overview of the different feature selection methods, their advantages, disadvantages, and use cases, focusing on the detection of relevant features (i.e., SNPs) for disease risk prediction.
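As a rough illustration of the idea in the abstract, the sketch below ranks SNP features by a simple univariate score and keeps only the top few. It is a minimal example under stated assumptions, not a method taken from the review: the scoring rule (absolute difference in mean genotype between cases and controls), the function names `univariate_scores`, `select_top_k`, and `make_synthetic_genotypes`, and the synthetic data (genotypes coded 0/1/2, five hypothetical causal SNPs) are all chosen here for illustration only.

```python
import random
from statistics import mean


def univariate_scores(X, y):
    """Score each feature by the absolute difference in its mean value
    between cases (label 1) and controls (label 0)."""
    cases = [row for row, label in zip(X, y) if label == 1]
    controls = [row for row, label in zip(X, y) if label == 0]
    n_features = len(X[0])
    return [
        abs(mean(r[j] for r in cases) - mean(r[j] for r in controls))
        for j in range(n_features)
    ]


def select_top_k(X, y, k):
    """Return the (sorted) indices of the k highest-scoring features."""
    scores = univariate_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def make_synthetic_genotypes(n_samples=60, n_snps=500, n_causal=5, seed=0):
    """Hypothetical data: many more SNPs than samples, genotypes coded 0/1/2.
    Only the first n_causal SNPs influence the (assumed) disease label."""
    rng = random.Random(seed)
    X = [[rng.randint(0, 2) for _ in range(n_snps)] for _ in range(n_samples)]
    # Assumed labeling rule: high total dosage at the causal SNPs means "case".
    y = [1 if sum(row[:n_causal]) >= n_causal else 0 for row in X]
    return X, y


X, y = make_synthetic_genotypes()
selected = select_top_k(X, y, k=10)
print("features before selection:", len(X[0]))
print("features after selection: ", len(selected))
print("selected SNP indices:     ", selected)
```

With 500 features and only 60 samples, some non-causal SNPs can score highly by chance, which is one reason a filter like this is usually evaluated inside cross-validation rather than on the full dataset.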

Authors (4)

N. Pudjihartono

T. Fadason

A. Kempa-Liehr

J. O’Sullivan

Citation Format

Pudjihartono, N., Fadason, T., Kempa-Liehr, A., O’Sullivan, J. (2022). A Review of Feature Selection Methods for Machine Learning-Based Disease Risk Prediction. https://doi.org/10.3389/fbinf.2022.927312

Quick Access

View at Source: doi.org/10.3389/fbinf.2022.927312
Journal Information
Publication Year: 2022
Language: en
Total Citations: 659
Source Database: Semantic Scholar
DOI: 10.3389/fbinf.2022.927312
Access: Open Access