DOAJ Open Access 2022

NEW ORGANIZATION PROCESS OF FEATURE SELECTION BY FILTER WITH CORRELATION-BASED FEATURES SELECTION METHOD

Olga Solovei

Abstract

The subject of the article is feature selection techniques used at the data preprocessing step before building machine learning models. The paper focuses on the Filter technique when it uses Correlation-based Feature Selection (hereafter CFS) with the symmetrical uncertainty method (hereafter CFS-SU) or CFS with Pearson correlation (hereafter CFS-PearCorr). The goal of the work is to increase the efficiency of feature selection by Filter with CFS by proposing a new organization process of feature selection. The tasks solved in the article are: to review and analyze the existing organization process of feature selection by Filter with CFS; to identify the root causes of the performance degradation; to propose a new approach; and to evaluate the proposed approach. To carry out these tasks, the following methods were used: information theory, process theory, algorithm theory, statistics theory, sampling techniques, data modeling theory, and scientific experiments. Results. The obtained results show that: 1) the evaluation function for a chosen feature subset cannot be based only on the CFS merit, as this degrades the results of the learning algorithm; 2) the accuracies of the classification learning algorithms improved and the coefficients of determination of the regression learning algorithms increased when features were selected according to the proposed new organization process. Conclusions. The new organization process for feature selection proposed in this work combines filter and learning algorithm properties in the evaluation strategy, which helps to choose the optimal feature subset for a predefined learning algorithm. The computational complexity of the proposed approach does not depend on the dataset's dimensions, which makes it robust to different data varieties; it also eliminates the time needed to search for feature subsets, as subsets are selected randomly. In the conducted experiments, classification and regression learning algorithms trained on features selected according to the new process outperformed the same learning algorithms built without the new process at the data preprocessing step.
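For readers unfamiliar with the CFS merit mentioned above, the following is a minimal sketch of how it is commonly computed with Pearson correlation. The abstract does not give the formula, so the sketch assumes the standard CFS formulation, merit = k * r_cf / sqrt(k + k(k-1) * r_ff), where k is the subset size, r_cf is the mean absolute feature-target correlation, and r_ff is the mean absolute feature-feature correlation. The function name `cfs_merit` and the synthetic data are hypothetical, and this is not the new organization process proposed in the article.

```python
import numpy as np

def cfs_merit(X, y, subset):
    """CFS merit of a feature subset, using absolute Pearson correlation.

    Assumes the standard CFS formulation; not taken from the article itself.
    """
    k = len(subset)
    if k == 0:
        return 0.0
    # Mean absolute correlation between each selected feature and the target.
    r_cf = np.mean([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in subset])
    # Mean absolute correlation over all distinct pairs of selected features.
    if k == 1:
        r_ff = 0.0
    else:
        pairs = [abs(np.corrcoef(X[:, a], X[:, b])[0, 1])
                 for i, a in enumerate(subset) for b in subset[i + 1:]]
        r_ff = np.mean(pairs)
    return k * r_cf / np.sqrt(k + k * (k - 1) * r_ff)

# Hypothetical synthetic data: columns 0 and 1 are identical (redundant),
# column 2 carries independent information about the target.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = rng.normal(size=500)
X = np.column_stack([a, a, b])
y = a + b

merit_redundant = cfs_merit(X, y, [0, 1])      # two copies of the same feature
merit_complementary = cfs_merit(X, y, [0, 2])  # two complementary features
```

In this sketch the merit rewards subsets whose features correlate with the target but not with each other, so the complementary pair scores higher than the redundant pair. The merit is computed without reference to any particular learning algorithm, which is consistent with the abstract's point that it should not be the sole evaluation function.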

Citation Format

Solovei, O. (2022). NEW ORGANIZATION PROCESS OF FEATURE SELECTION BY FILTER WITH CORRELATION-BASED FEATURES SELECTION METHOD. https://doi.org/10.30837/ITSSI.2022.21.039

Journal Information
Publication Year: 2022
Source Database: DOAJ
DOI: 10.30837/ITSSI.2022.21.039
Access: Open Access