Results for "machine learning"

Showing 20 of ~2,963,490 results · from CrossRef, DOAJ

DOAJ Open Access 2026
PERMEPSY: a multicentre, randomized, double-blind proof-of-concept trial of personalized metacognitive training for adults with psychosis — a study protocol

Maria Lamarca et al.

Background: While psychological interventions are effective at improving symptoms of psychosis, accessible, cost- and time-efficient treatments remain limited. Personalized medicine has emerged as a promising approach, tailoring interventions to individual needs. Metacognitive Training (MCT), with its established efficacy and adaptable format, is well suited for personalization. The PERMEPSY project (Towards a Personalized Medicine Approach to Psychological Treatment for Psychosis) aims to deliver a tailored MCT intervention for individuals with psychosis. Methods: PERMEPSY is an international study funded by ERAPerMed (JTC2022) involving five clinical partners (Spain, Chile, France, Germany, Poland) and one technological partner (Spain). The project involves a proof-of-concept clinical trial recruiting 51 participants from each center, for a total of 255 adult participants with psychosis, in a prospective study (Registration: NCT06603922, 19-09-2024). The trial will test the efficacy of a machine learning (ML)-derived platform at predicting clinical and functional outcomes from baseline scores and compare a personalized MCT (P-MCT) to classical MCT based on the platform's predictions. Aims: PERMEPSY seeks to (1) develop and test the predictive power of an algorithm that could support decision-making, and (2) ascertain whether P-MCT is more effective than MCT at improving key symptoms and cognitive impairments associated with psychosis. Results: A harmonized retrospective database enabled the development of a predictive ML algorithm, integrated into an innovative platform. This platform provides clinicians with the information needed to deliver P-MCT.
Predictions include changes in positive symptoms (e.g., delusions), insight, self-esteem, and treatment adherence. Discussion: By integrating diverse data types and innovative technology, PERMEPSY addresses the need for personalized, effective treatment in psychosis, aiming to reduce individual and systemic burdens while supporting clinicians in their decision-making.

DOAJ Open Access 2025
Leveraging Artificial Intelligence to Predict Posterior Malleolus Fracture Extension in Tibial Shaft Fractures

Junaid Aamir MBChB MRes MRCS, Chijioke Orji MBBS, MRCS, Lyndon Mason BMBS MRCS FRCS

Research Type: Level 3 - Retrospective cohort study, case-control study, meta-analysis of Level 3 studies. Introduction/Purpose: Occult posterior malleolar fractures (PMFs) are commonly reported in tibial shaft fractures. Presurgical identification is necessary to avoid possible complications during surgical treatment of the tibial fracture (e.g., PMF displacement). The aim of this study was to determine the most relevant factors predicting PMFs by applying the AI "Minimum Redundancy Maximum Relevance" (mRMR) feature selection method to investigate the predictive power of various clinical and demographic factors. Methods: This was a historical cohort study employing the mRMR method to identify the most relevant features associated with occult PMF in tibial shaft fractures. The inclusion criteria were any patient who had sustained a diaphyseal tibial fracture, had undergone surgery during the study period, and had undergone a CT scan in addition to plain radiographs. The selected features were then used to train a machine learning model for identifying occult PMF. The model's performance was evaluated by measuring classification accuracy across different combinations of features. Results: Of 764 diaphyseal fractures identified, 442 met the inclusion criteria. A total of 107 patients had PMF extensions (24.21%). The analysis revealed that the most relevant features were tibia fracture type, fibular fracture morphology, tibia fracture level, fibular fracture level, and mechanism of injury. On further analysis, tibial spiral fracture was the most significant predictor, with a classification accuracy of 0.78; other factor clusters did not reach the same significance. Low-energy mechanisms and tibial fracture comminution were the best predictors of no occult PMF. Conclusion: This study demonstrates that AI can effectively be used to identify predictors of occult PMF in tibial fractures, with spiral tibia fractures being the most relevant predictor. The findings highlight the potential of AI-driven methods, such as mRMR, to enhance the accuracy of injury prediction, and they are in keeping with results from traditional statistics.
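The mRMR method named above can be sketched in a few lines: greedily pick the feature most correlated with the outcome while penalizing correlation with features already chosen. This is a minimal pure-Python illustration with hypothetical feature names and toy data, not the study's implementation; absolute Pearson correlation stands in for the mutual-information scores often used.

```python
def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy) if vx and vy else 0.0

def mrmr(features, label, k):
    """Greedily select k feature names maximizing relevance - mean redundancy."""
    selected = []
    remaining = list(features)
    while remaining and len(selected) < k:
        def score(name):
            relevance = abs(pearson(features[name], label))
            if not selected:
                return relevance
            redundancy = sum(abs(pearson(features[name], features[s]))
                             for s in selected) / len(selected)
            return relevance - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy data: "spiral_dup" duplicates "spiral_fracture", so despite its high
# relevance it is skipped in favor of the less redundant "low_energy".
features = {
    "spiral_fracture": [1, 0, 1, 1, 0, 1, 0, 1],
    "spiral_dup":      [1, 0, 1, 1, 0, 1, 0, 1],
    "low_energy":      [0, 1, 1, 0, 1, 0, 0, 1],
}
label = [1, 0, 1, 1, 0, 1, 0, 0]
picked = mrmr(features, label, 2)
```

The redundancy penalty is what separates mRMR from plain univariate ranking: a perfect copy of an already-selected feature scores worse than a weaker but independent one.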

Orthopedic surgery
DOAJ Open Access 2025
Enhanced Intrusion Detection in Drone Networks: A Cross-Layer Convolutional Attention Approach for Drone-to-Drone and Drone-to-Base Station Communications

Mohammad Aldossary, Ibrahim Alzamil, Jaber Almutairi

Internet of Drones (IoD) technology has driven the proliferation of drone networks, transforming surveillance, logistics, and disaster management. However, Distributed Denial of Service (DDoS) attacks, malware infections, and communication anomalies pose increasing cybersecurity risks to these networks, threatening operational safety and efficiency. Current Intrusion Detection Systems (IDSs) fail to handle the dynamic, high-dimensional nature of drone transmission data, resulting in inadequate real-time anomaly identification and mitigation. This study presents the Cross-Layer Convolutional Attention Network (CLCAN), a new IDS architecture for IoD networks. CLCAN accurately detects complex cyber threats using multi-scale convolutional processing, hierarchical contextual attention, and dynamic feature fusion. Preprocessing methods such as weighted differential scaling and gradient-based adaptive resampling improve data quality and reduce class imbalance. Contextual attribute transformation captures the nuanced network behaviors needed for anomaly identification. Evaluations on a real-world drone communication dataset demonstrate the necessity and effectiveness of the proposed technique. CLCAN outperforms CNN, LSTM, and XGBoost with 98.4% accuracy, 98.7% recall, and a 98.1% F1-score, and achieves a remarkable AUC of 0.991. CLCAN can process a dataset of over 118,000 balanced records in 85 s, compared to 180 s for comparable frameworks. This study pioneers a unified security solution for Drone-to-Drone (D2D) and Drone-to-Base Station (D2BS) communications, filling a crucial IoD security gap, and protects mission-critical drone operations from emerging cyber threats with a robust, efficient, and scalable IDS.

Motor vehicles. Aeronautics. Astronautics
DOAJ Open Access 2025
Tackling visual impairment: emerging avenues in ophthalmology

Fang Lin, Yuxing Su, Chenxi Zhao et al.

Visual impairment, stemming from genetic, degenerative, and traumatic causes, affects millions globally. Recent advancements in ophthalmology present novel strategies for managing and potentially reversing these conditions. Here, we explore 10 emerging avenues—including gene therapy, stem cell therapy, advanced imaging, novel therapeutics, nanotechnology, artificial intelligence (AI) and machine learning, teleophthalmology, optogenetics, bionics, and neuro-ophthalmology—all making strides to improve diagnosis, treatment, and vision restoration. Among these, gene therapy and stem cell therapy are revolutionizing the treatment of retinal degenerative diseases, while advanced imaging technologies enable early detection and personalized care. Therapeutic advancements like anti-vascular endothelial growth factor therapies and neuroprotective agents, along with nanotechnology, have improved clinical outcomes for multiple ocular conditions. AI, especially machine learning, is enhancing diagnostic accuracy, facilitating early detection, and enabling personalized treatment strategies, particularly when integrated with advanced imaging technologies. Teleophthalmology, further strengthened by AI, is expanding access to care, particularly in underserved regions, whereas emerging technologies like optogenetics, bionics, and neuro-ophthalmology offer new hope for patients with severe vision impairment. In light of ongoing research, we summarize the current clinical landscape and the potential advantages of these innovations to revolutionize the management of visual impairments. Additionally, we address the challenges and limitations associated with these emerging avenues in ophthalmology, providing insights into their future trajectories in clinical practice. Continued advancements in these fields promise to reshape the landscape of ophthalmic care, ultimately improving the quality of life for individuals with visual impairments.

Medicine (General)
DOAJ Open Access 2025
Prediction of nexus among ESG disclosure and firm Performance: Applicability, explainability and implications

Joel Victor Dossa, Chiagoziem C. Ukwuoma, Dara Thomas et al.

This study investigates the nexus between ESG disclosure and firm performance using advanced machine learning models (MLs) to capture complex, non-linear interactions. Analyzing data from Chinese A-share firms (2012–2022), it employs Explainable AI (XAI) tools such as SHAP, heat maps, and Williams plots to enhance model transparency and interpretability. Among several models, the Extra Trees model demonstrated the best predictive performance, revealing that ESG disclosure positively correlates with firm performance, with environmental disclosure exerting the strongest influence. Policymakers are urged to promote standardized, transparent ESG disclosures, particularly focusing on environmental practices while addressing greenwashing to enhance credibility. Investors can prioritize firms with strong environmental practices and use predictive models to refine decision-making. Corporate managers are encouraged to embed sustainability into long-term strategies and utilize ML techniques for improved governance. The study contributes by showcasing the utility of MLs in exploring ESG-performance relationships, offering actionable insights for stakeholders, and providing a foundation for future research. Researchers are encouraged to investigate non-linear ESG impacts across diverse contexts, using broader samples and incorporating market-based measures and ESG rating agencies to improve generalizability. This approach advances understanding of ESG's role in driving firm performance while addressing methodological gaps.

Environmental sciences, Technology
DOAJ Open Access 2025
Comprehensive framework of machine learning and deep learning architectures with metaheuristic optimization for high-fidelity prediction of nanofluid specific heat capacity

Priya Mathur, Dheeraj Kumar, Farhan Sheth et al.

Accurately predicting the specific heat capacity of nanofluids is critical for optimizing their performance in engineering and industrial applications. This study explores twelve machine learning and deep learning models using conventional and stacking ensemble techniques. In the stacking framework, a linear regression model is employed as a meta-learner to improve base model performance. Additionally, two nature-inspired metaheuristic optimization algorithms—Particle Swarm Optimization and Grey Wolf Optimization—were used to fine-tune the hyperparameters of machine learning models. This research is based on a comprehensive dataset of 1,269 experimental nanofluid samples, with key inputs including nanofluid type (hybrid and direct), temperature, and volume concentration. To improve model generalization, data augmentation strategies inspired by polynomial/Fourier expansions and autoencoder-based methods were implemented. The results demonstrate that the stacked multi-layer perceptron model, integrated with linear regression, achieved the highest predictive accuracy, recording an R² score of 0.99927, a mean squared error of 466.06, and a root mean squared error of 21.58. Among standalone machine learning models, CatBoost was the best performer (R² score: 0.99923, MSE: 487.71, RMSE: 22.08), ranking second overall. The impact of metaheuristic optimization was significant; Grey Wolf Optimization, for instance, reduced the LightGBM model’s mean squared error from 29386.43 to 6549.006. These findings underscore the efficacy of hybrid ML/DL frameworks, advanced data augmentation, and metaheuristic optimization in predictive modeling of nanofluid thermophysical properties, providing a robust foundation for future research in heat transfer applications.
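The stacking framework described above, with linear regression as the meta-learner, can be sketched compactly. The base models here are hypothetical fixed functions standing in for the paper's trained ML/DL models; the meta-learner solves ordinary least squares over their predictions via the 2x2 normal equations.

```python
def least_squares_2(p1, p2, y):
    """Solve for (w1, w2) minimizing ||w1*p1 + w2*p2 - y|| via normal equations."""
    a11 = sum(a * a for a in p1)
    a12 = sum(a * b for a, b in zip(p1, p2))
    a22 = sum(b * b for b in p2)
    b1 = sum(a * t for a, t in zip(p1, y))
    b2 = sum(b * t for b, t in zip(p2, y))
    det = a11 * a22 - a12 * a12
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)

# Toy training inputs and the base learners' predictions on them.
xs = [1.0, 2.0, 3.0, 4.0]
base1 = xs                           # hypothetical model 1: predicts x
base2 = [x * x for x in xs]          # hypothetical model 2: predicts x^2
y = [3 * x + 2 * x * x for x in xs]  # target the stack should recover

w1, w2 = least_squares_2(base1, base2, y)
stacked = [w1 * a + w2 * b for a, b in zip(base1, base2)]
```

In a real stacking ensemble the base predictions would be out-of-fold to avoid leaking training labels into the meta-learner; here the target lies exactly in the span of the base models, so the meta-weights recover the generating coefficients.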

Medicine, Science
DOAJ Open Access 2025
Predicting Choroidal Nevus Transformation to Melanoma Using Machine Learning

Prashant D. Tailor, MD, Piotr K. Kopinski, MD, PhD, Haley S. D’Souza, MD et al.

Purpose: To develop and validate machine learning (ML) models to predict choroidal nevus transformation to melanoma based on multimodal imaging at initial presentation. Design: Retrospective multicenter study. Participants: Patients diagnosed with choroidal nevus on the Ocular Oncology Service at Wills Eye Hospital (2007–2017) or Mayo Clinic Rochester (2015–2023). Methods: Multimodal imaging was obtained, including fundus photography, fundus autofluorescence, spectral domain OCT, and B-scan ultrasonography. Machine learning models (XGBoost, LGBM, Random Forest, Extra Tree) were created and optimized for area under the receiver operating characteristic curve (AUROC). The Wills Eye Hospital cohort was used for training and testing (80% training–20% testing) with fivefold cross-validation. The Mayo Clinic cohort provided external validation. Model performance was characterized by AUROC and area under the precision–recall curve (AUPRC). Models were interrogated using SHapley Additive exPlanations (SHAP) to identify the features most predictive of conversion from nevus to melanoma. Differences in AUROC and AUPRC between models were tested using 10,000 bootstrap samples with replacement. Main Outcome Measures: AUROC and AUPRC for each ML model. Results: There were 2870 nevi included in the study, with conversion to melanoma confirmed in 128 cases. The Simple AI Nevus Transformation System (SAINTS; XGBoost) was the top-performing model in the test cohort [pooled AUROC 0.864 (95% confidence interval (CI): 0.864–0.865), pooled AUPRC 0.244 (95% CI: 0.243–0.246)] and in the external validation cohort [pooled AUROC 0.931 (95% CI: 0.930–0.931), pooled AUPRC 0.533 (95% CI: 0.531–0.535)].
Other models also had good discriminative performance: LGBM (test set pooled AUROC 0.831, validation set pooled AUROC 0.815), Random Forest (test set pooled AUROC 0.812, validation set pooled AUROC 0.866), and Extra Tree (test set pooled AUROC 0.826, validation set pooled AUROC 0.915). A model including only nevi with at least 5 years of follow-up demonstrated the best performance in AUPRC [test: pooled 0.592 (95% CI: 0.590–0.594); validation: pooled 0.656 (95% CI: 0.655–0.657)]. The top 5 features in SAINTS by SHAP values were tumor thickness, largest tumor basal diameter, tumor shape, distance to optic nerve, and subretinal fluid extent. Conclusions: We demonstrate the accuracy and generalizability of an ML model for predicting choroidal nevus transformation to melanoma based on multimodal imaging. Financial Disclosures: Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.
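AUROC, the headline metric in this and several of the studies above, reduces to a simple pairwise probability: the chance that a randomly chosen positive case receives a higher score than a randomly chosen negative one (ties counting half). A minimal sketch with made-up labels and scores:

```python
def auroc(labels, scores):
    """Empirical AUROC: P(random positive outscores random negative)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    # Count pairwise wins; a tied pair contributes half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: 1 = converted to melanoma, 0 = stable; scores are model outputs.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
value = auroc(labels, scores)
```

This pairwise definition is why AUROC is insensitive to class imbalance, and why the study also reports AUPRC, which is not: with only 128 conversions among 2870 nevi, precision-recall behavior is far more informative about rare-event performance.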

DOAJ Open Access 2024
Evaluating early predictive performance of machine learning approaches for engineering change schedule – A case study using predictive process monitoring techniques

Ognjen Radišić-Aberger, Peter Burggräf, Fabian Steinberg et al.

By applying machine learning algorithms, predictive business process monitoring (PBPM) techniques provide an opportunity to counteract undesired process outcomes. An especially complex variant of business process is the engineering change (EC) process. Here, failing to adhere to planned implementation dates can have severe impacts on assembly lines, and it is paramount that potentially negative cases are identified as early as possible. Current PBPM research, however, has seldom investigated the predictive performance of machine learning approaches and their applicability at early process steps, let alone for the EC process. In our research, we show that, given adequate feature encoding, shallow learners can accurately predict schedule adherence right after process initialisation. Based on EC data from an automotive manufacturer, we provide a case-specific performance overview of algorithm-encoding combinations. For that, three algorithms (XGBoost, Random Forest, LSTM) were combined with four encoding techniques: the two common aggregation-based and index-based last-state encodings, and two new combinations of these, which we term advanced aggregation-based and complex aggregation-based encoding. The study indicates that XGBoost with index encoding achieves the best average predictive performance, whereas Random Forest with aggregation encoding offers better temporal stability due to reduced influence of dynamic features. Our research provides a case-based reasoning approach for deciding which algorithm-encoding combination and evaluation metrics to apply. In doing so, we provide a blueprint for an early warning and monitoring method within the EC process and other similarly complex processes.
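The two common encodings compared above can be illustrated on a toy EC trace. The activity names and the fixed activity alphabet are hypothetical; aggregation-based encoding turns a running trace into activity counts, while index-based last-state encoding keeps the identity of the most recent events.

```python
# Hypothetical activity alphabet for an engineering-change process.
ACTIVITIES = ["create_EC", "assess", "approve", "implement"]

def aggregation_encode(trace):
    """Aggregation-based encoding: frequency of each activity so far."""
    return [trace.count(a) for a in ACTIVITIES]

def last_state_encode(trace, k=2):
    """Index-based last-state encoding: activity indices of the last k
    events, left-padded with -1 when the trace is shorter than k."""
    tail = ([None] * k + trace)[-k:]
    return [ACTIVITIES.index(e) if e in ACTIVITIES else -1 for e in tail]

# A running trace after four events of one (made-up) EC case.
trace = ["create_EC", "assess", "assess", "approve"]
agg = aggregation_encode(trace)
last = last_state_encode(trace)
```

The trade-off the study measures falls out of the shapes: aggregation vectors have fixed length regardless of trace position (stable over time, but order-blind), while index encodings preserve ordering (more informative early, but sensitive to dynamic features).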

Marketing. Distribution of products, Management. Industrial management
DOAJ Open Access 2024
Heart disease prediction using autoencoder and DenseNet architecture

Norah Saleh Alghamdi, Mohammed Zakariah, Achyut Shankar et al.

Heart disease continues to be a prominent cause of death globally, emphasizing the critical requirement for precise prediction techniques and prompt therapies. This research presents a new method that utilizes the collective capabilities of autoencoder and DenseNet architectures to predict heart disease. Our study is based on the Heart Disease UCI Cleveland dataset, which includes 13 variables covering clinical and demographic parameters such as age, sex, cholesterol levels, and exercise-induced angina. The dataset presents issues due to its varied attribute types, including categorical and numerical variables. Our approach tackles these difficulties by utilizing a dense autoencoder model, which produced exceptional outcomes. The model attained a mean accuracy of 99.67% on the Heart Disease UCI Cleveland dataset. Further testing showed it was resilient, with a test accuracy of 99.99%. In addition, the model demonstrated outstanding macro precision, macro recall, and macro F1 scores of 99.98%, 99.97%, and 99.96%, respectively. Our results indicate that combining autoencoder and DenseNet designs shows potential for predicting cardiac disease, with substantial enhancements in accuracy and performance metrics compared to current approaches. This methodology can improve clinical decision-making and patient outcomes in cardiovascular care by accurately finding and defining complex patterns within the data. Notwithstanding these encouraging outcomes, our investigation has constraints: the specific attributes of the dataset utilized may limit the applicability of our findings. Subsequent studies could examine the suitability of our method for various datasets and analyze supplementary variables that may improve forecast precision. Furthermore, prospective validation studies are necessary to evaluate the strategy's practical effectiveness in clinical environments.

Electronic computers. Computer science
DOAJ Open Access 2024
Hyperspectral imaging for the detection of plant pathogens in seeds: recent developments and challenges

Luciellen da Costa Ferreira, Ian Carlos Bispo Carvalho, Lúcio André de Castro Jorge et al.

Food security, a critical concern amid global population growth, faces challenges in sustainable agricultural production due to significant yield losses caused by plant diseases, many of which are caused by seedborne plant pathogens. With the expansion of the international seed market and the global movement of this propagative plant material, and considering that about 90% of economically important crops are grown from seeds, seed pathology has emerged as an important discipline. Seed health testing is presently part of quality analysis and is carried out by seed enterprises and governmental institutions seeking to exclude new pathogens from a country or site. The development of seedborne pathogen detection methods has followed advances in plant pathogen detection and diagnosis, from cultivation on semi-selective media to antibody- and DNA-based techniques. Hyperspectral imaging (HSI) associated with artificial intelligence can be considered the new frontier for seedborne pathogen detection, with high accuracy in discriminating infected from healthy seeds. Development of the process consists of standardizing methods and protocols and validating spectral signatures for the presence and incidence of contaminated seeds. Concurrently, epidemiological studies correlating this information with disease outbreaks would help determine acceptable thresholds of seed contamination. Despite the high cost of equipment and the necessity for interdisciplinary collaboration, it is anticipated that seed health certification programs and seed suppliers will benefit from the adoption of HSI techniques in the near future.

DOAJ Open Access 2023
Employing EMG sensors in Bionic limbs based on a New Binary Trick Method

Mohammed Guhdar Mohammed, Belnd Saadi Salih, Vaman Muhammed Haji

Electromyography (EMG) sensors read the electrical signals generated by the muscles of human and animal bodies. This means it is possible to use the electricity generated by muscles to control actuators/servo motors for specific tasks, which could support a wide range of applications, especially for people with disabilities. One such application is bionic limbs based on servo motors. According to a K4D helpdesk report, an estimated 15.3% of the world's population has a moderate or severe disability, and this proportion is likely to increase to 18-20% in conflict-affected areas (Thompson, 2017). The goal of this study is to make bionic limbs affordable by minimizing cost while maintaining accuracy at an acceptable rate. To achieve this goal, the study proposes a new way of using EMG sensors in bionic limbs: decreasing the number of EMG sensors to reduce cost and power consumption. Decreasing the number of EMG sensors normally causes a loss of accuracy in controlling the actuators (servo motors), because each sensor is usually responsible for activating one servo motor; a conventional design needs at least six EMG sensors to control six servo motors. This study uses only three EMG sensors to control/activate six servo motors through the proposed binary trick, which processes all three EMG input signals at once and then decides which servo motor to activate using a supervised machine learning technique such as k-nearest neighbors (kNN).
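The binary-trick pipeline sketched in the abstract (read three EMG channels at once, classify the combined pattern with kNN, activate one of six servos) might look like this in pure Python. The activation patterns and servo assignments below are made-up calibration data; a real prosthesis would learn them from recorded sessions.

```python
from collections import Counter

def knn_predict(train, sample, k=3):
    """Return the majority servo label among the k nearest training patterns."""
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    nearest = sorted(train, key=lambda row: dist(row[0], sample))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# (emg1, emg2, emg3) activation pattern -> servo id. Three hypothetical
# calibration samples per class; only three of the six servo classes are
# shown, including one (servo 2) triggered by a COMBINED two-channel
# pattern -- the "binary trick" that lets 3 channels address 6 servos.
train = [
    ((0.9, 0.1, 0.1), 0), ((0.8, 0.2, 0.1), 0), ((0.9, 0.2, 0.2), 0),
    ((0.1, 0.9, 0.1), 1), ((0.2, 0.8, 0.2), 1), ((0.1, 0.8, 0.1), 1),
    ((0.9, 0.9, 0.1), 2), ((0.8, 0.8, 0.2), 2), ((0.9, 0.8, 0.1), 2),
]

servo = knn_predict(train, (0.85, 0.15, 0.1))  # classify a new reading
```

Treating each channel as roughly on/off, three channels give 2^3 = 8 distinct patterns, which is why three sensors suffice to address six servos once a classifier maps noisy readings onto the pattern set.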

DOAJ Open Access 2023
GaN JBS Diode Device Performance Prediction Method Based on Neural Network

Hao Ma, Xiaoling Duan, Shulong Wang et al.

GaN JBS diodes exhibit excellent performance in power electronics. However, device performance is affected by multiple parameters of the P+ region, and the traditional TCAD simulation method is complex and time-consuming. In this study, we used a neural network machine learning method to predict the performance of a GaN JBS diode. First, 3018 groups of sample data composed of device structure and performance parameters were obtained using TCAD tools. The data were then fed into the established neural network for training, after which the network could quickly predict device performance. The final prediction results show that the mean relative errors of the on-state resistance and reverse breakdown voltage are 0.048 and 0.028, respectively, indicating that the predicted values fit the simulated data closely. This method can quickly design GaN JBS diodes with target performance and accelerate research on GaN JBS diode performance prediction.

Mechanical engineering and machinery
DOAJ Open Access 2022
Machine learning and data-driven prediction of pore pressure from geophysical logs: A case study for the Mangahewa gas field, New Zealand

Ahmed E. Radwan, David A. Wood, Ahmed A. Radwan

Pore pressure is an essential parameter for establishing reservoir conditions, geological interpretation and drilling programs. Pore pressure prediction depends on information from various geophysical logs, seismic data, and direct down-hole pressure measurements. However, a level of uncertainty accompanies the prediction of pore pressure because insufficient information is usually recorded in many wells. Applying machine learning (ML) algorithms can reduce pore pressure prediction uncertainty in cases where available information is limited. In this research, several ML techniques are applied to predict pore pressure through the over-pressured Eocene reservoir section penetrated by four wells in the Mangahewa gas field, New Zealand. Their predictions substantially outperform those generated using a multiple linear regression (MLR) model. The geophysical logs used as input variables are sonic, temperature and density logs, and some direct pore pressure measurements were available at the reservoir level to calibrate the predictions. A total of 25,935 data records involving six well-log input variables were evaluated across the four wells. All ML methods achieved credible levels of pore pressure prediction performance. The most accurate models for predicting pore pressure in individual wells on a supervised basis are decision tree (DT), adaboost (ADA), random forest (RF) and transparent open box (TOB). The DT achieved root mean square error (RMSE) ranging from 0.25 psi to 14.71 psi for the four wells. The trained models were less accurate when deployed on a semi-supervised basis to predict pore pressure in the other wellbores. For two wells (Mangahewa-03 and Mangahewa-06), semi-supervised prediction achieved acceptable performance (RMSE of 130–140 psi), while for the other wells semi-supervised prediction performance was reduced to RMSE > 300 psi.
The results suggest that these models can be used to predict pore pressure in nearby locations, i.e. similar geology at corresponding depths within a field, but they become less reliable as the step-out distance increases and geological conditions change significantly. Compared with other approaches to pore pressure prediction, this study shows that applying several ML algorithms to a large number of data records can lead to more accurate results.

Engineering geology. Rock mechanics. Soil mechanics. Underground construction
DOAJ Open Access 2022
A Novel Approach for Multichannel Epileptic Seizure Classification Based on Internet of Things Framework Using Critical Spectral Verge Feature Derived from Flower Pollination Algorithm

Dhanalekshmi Prasad Yedurkar, Shilpa P. Metkar, Fadi Al-Turjman et al.

This paper proposes a novel approach for multichannel epileptic seizure classification that helps automatically locate seizure activity in the focal brain region. It presents an Internet of Things (IoT) framework based on a smart phone, utilizing a novel frequency-derived feature termed the multiresolution critical spectral verge (MCSV), optimized using a flower pollination algorithm (FPA), for epileptic seizure classification. A wireless sensor network (WSN) was utilized to record the electroencephalography (EEG) signals of epileptic patients. Next, the EEG signal was pre-processed utilizing a multiresolution-based adaptive filtering (MRAF) method. Then, the maximal frequency point at which the power spectral density (PSD) of each EEG segment was greater than the average spectral power of the corresponding frequency band was computed. This point was further optimized to extract a point termed the critical spectral verge (CSV), capturing the high-frequency oscillations that represent the actual seizure activity present in the EEG signal. Next, a support vector machine (SVM) classifier was used for channel-wise classification of the seizure and non-seizure regions using CSV as a feature. This process of classification using the CSV feature extracted from the MRAF output is referred to as the MCSV approach. As a final step, cloud-based services were employed to analyze the EEG information from the subject's smart phone. An exhaustive analysis was undertaken to assess the performance of the MCSV approach on two datasets. The presented approach showed improved performance, with a 93.83% average sensitivity, a 97.94% average specificity, a 97.38% average accuracy with the SVM classifier, and a 95.89% average detection rate, as compared with other state-of-the-art studies, including deep learning approaches.
Methods presented in the literature were unable to precisely localize the origin of seizure activity in the brain and reported low seizure detection rates. This work introduced an optimized CSV feature that was effectively used for multichannel seizure classification and localization of seizure origin. The proposed MCSV approach will help diagnose epileptic behavior from multichannel EEG signals, which will be extremely useful for neuro-experts analyzing seizure details from different regions of the brain.

Chemical technology
DOAJ Open Access 2021
Applications of Rough Sets in Big Data Analysis: An Overview

Pięta Piotr, Szmuc Tomasz

Big data, artificial intelligence and the Internet of Things (IoT) are still very popular areas in current research and industrial applications. Processing massive amounts of data generated by the IoT and stored in distributed space is not a straightforward task and may cause many problems. During the last few decades, scientists have proposed many interesting approaches to extract information and discover knowledge from data collected in database systems or other sources. We observe a permanent development of machine learning algorithms that support each phase of the data mining process, ensuring achievement of better results than before. Rough set theory (RST) delivers a formal insight into information, knowledge, data reduction, uncertainty, and missing values. This formalism, formulated in the 1980s and developed by several researchers, can serve as a theoretical basis and practical background for dealing with ambiguities, data reduction, building ontologies, etc. Moreover, as a mature theory, it has evolved into numerous extensions and has been transformed through various incarnations, which have enriched the expressiveness and applicability of the related tools. The main aim of this article is to present an overview of selected applications of RST in big data analysis and processing. Thousands of publications on rough sets have been published; therefore, we focus on papers from the last few years. The applications of RST are considered from two main perspectives: direct use of the RST concepts and tools, and jointly with other approaches, i.e., fuzzy sets, probabilistic concepts, and deep learning. The latter hybrid idea seems to be very promising for developing new methods and related tools as well as extensions of the application area.
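The core rough-set constructs mentioned above, lower and upper approximations of a target set under an indiscernibility relation, fit in a few lines. The decision table here is a hypothetical toy example, not drawn from the survey.

```python
def partition(objects, attrs):
    """Group object names that are indiscernible on the chosen attributes."""
    blocks = {}
    for name, row in objects.items():
        key = tuple(row[a] for a in attrs)
        blocks.setdefault(key, set()).add(name)
    return list(blocks.values())

def approximations(objects, attrs, target):
    """Rough-set lower/upper approximation of `target` w.r.t. `attrs`."""
    lower, upper = set(), set()
    for block in partition(objects, attrs):
        if block <= target:   # block certainly inside the target concept
            lower |= block
        if block & target:    # block possibly inside the target concept
            upper |= block
    return lower, upper

# Hypothetical decision table: observed symptoms per patient.
objects = {
    "p1": {"temp": "high", "ache": "yes"},
    "p2": {"temp": "high", "ache": "yes"},  # indiscernible from p1
    "p3": {"temp": "high", "ache": "no"},
    "p4": {"temp": "low",  "ache": "no"},
}
flu = {"p1", "p3"}  # target concept
lower, upper = approximations(objects, ["temp", "ache"], flu)
```

The boundary region `upper - lower` (here the indiscernible pair p1/p2, of which only p1 has flu) is exactly where the data cannot decide; quantifying and reducing that region is what the RST-based big data methods surveyed above build on.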

Mathematics, Electronic computers. Computer science

Page 13 of 148,175