Results for "machine learning"

Showing 20 of ~10330640 results · from CrossRef, DOAJ, Semantic Scholar

S2 Open Access 2014
Searching for exotic particles in high-energy physics with deep learning

P. Baldi, Peter Sadowski, D. Whiteson

Collisions at high-energy particle colliders are a traditionally fruitful source of exotic particle discoveries. Finding these rare particles requires solving difficult signal-versus-background classification problems, hence machine-learning approaches are often used. Standard approaches have relied on ‘shallow’ machine-learning models that have a limited capacity to learn complex nonlinear functions of the inputs, and rely on a painstaking search through manually constructed nonlinear features. Progress on this problem has slowed, as a variety of techniques have shown equivalent performance. Recent advances in the field of deep learning make it possible to learn more complex functions and better discriminate between signal and background classes. Here, using benchmark data sets, we show that deep-learning methods need no manually constructed inputs and yet improve the classification metric by as much as 8% over the best current approaches. This demonstrates that deep-learning approaches can improve the power of collider searches for exotic particles. High-energy particle colliders are important for finding new particles, but huge volumes of data must be searched through to locate them. Here, the authors show the use of deep-learning methods on benchmark data sets as an approach to improving such new particle searches.

1287 citations · en · Physics, Medicine
S2 Open Access 2018
Deep Learning in Microscopy Image Analysis: A Survey

Fuyong Xing, Yuanpu Xie, H. Su et al.

Computerized microscopy image analysis plays an important role in computer aided diagnosis and prognosis. Machine learning techniques have powered many aspects of medical investigation and clinical practice. Recently, deep learning is emerging as a leading machine learning tool in computer vision and has attracted considerable attention in biomedical image analysis. In this paper, we provide a snapshot of this fast-growing field, specifically for microscopy image analysis. We briefly introduce the popular deep neural networks and summarize current deep learning achievements in various tasks, such as detection, segmentation, and classification in microscopy image analysis. In particular, we explain the architectures and the principles of convolutional neural networks, fully convolutional networks, recurrent neural networks, stacked autoencoders, and deep belief networks, and interpret their formulations or modelings for specific tasks on various microscopy images. In addition, we discuss the open challenges and the potential trends of future research in microscopy image analysis using deep learning.

377 citations · en · Computer Science, Medicine
DOAJ Open Access 2026
Multi-Crop Yield Estimation and Spatial Analysis of Agro-Climatic Indices Based on High-Resolution Climate Simulations in Türkiye’s Lakes Region, a Typical Mediterranean Biogeography

Fuat Kaya, Sinan Demir, Mert Dedeoğlu et al.

Mediterranean biogeography is characterized as a global “hotspot” for climate change; understanding the impacts of these changes on local agricultural systems through high-resolution analyses has thus become a critical need. This study addresses this gap by evaluating the holistic effects of climate change on site-specific agriculture systems, focusing on the Eğirdir–Karacaören (EKB) and Beyşehir (BB) lake basins in the Lakes Region of Türkiye. This study employed machine learning modeling techniques to forecast changes in the yields of key crops, such as wheat, maize, apple, alfalfa, and sugar beet. Detailed spatial analyses of changes in agro-climatic conditions (heat stress, chilling requirement, frost days, and growing degree days for key crops) between the reference period (1995–2014) and two decadal periods projected for 2040–2049 and 2070–2079 were conducted under the Shared Socioeconomic Pathways (SSP3-7.0). Daily temperature, precipitation, relative humidity, and solar radiation data, derived from high-resolution climate simulations, were aggregated into annual summaries. These datasets were then spatially matched with district-level yield statistics obtained from the official data providers to construct crop-specific data matrices. For each crop, Random Forest (RF) regression models were fitted, and a Leave-One-Site-Out (LOSOCV) cross-validation method was used to evaluate model performance during the reference period. Yield prediction models were evaluated using the mean absolute error (MAE). The models achieved low MAE values for wheat (33.95 kg da⁻¹ in EKB and 75.04 kg da⁻¹ in BB), whereas the MAE values for maize and alfalfa were considerably higher, ranging from 658 to 986 kg da⁻¹. Projections for future periods indicate declines in relative yield across both basins. For 2070–2079, wheat and maize yields are projected to decrease by 10–20%, accompanied by wide uncertainty intervals.
Both basins are expected to experience a substantial increase in heat stress days (>35 °C), a reduction in frost days, and an overall acceleration of plant phenology. Results provided insights to inform region-specific, evidence-based adaptation options, such as selecting heat-tolerant varieties, optimizing planting calendars, and integrating precision agriculture practices to improve resource efficiency under changing climatic conditions. Overall, this study establishes a scientific basis for enhancing the resilience of agricultural systems to climate change in two lake basins within the Mediterranean biogeography.
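
The Leave-One-Site-Out evaluation with MAE described in the abstract above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the site labels and yield values are invented, the function name is hypothetical, and a simple mean-of-training-yields predictor stands in for the Random Forest regressor, whose configuration the abstract does not give.

```python
from statistics import mean

def leave_one_site_out_mae(records):
    """Hold out each site in turn and report the MAE on that site.

    records: list of (site, yield_value) pairs.
    Returns a dict mapping each held-out site to its mean absolute error.
    """
    sites = sorted({site for site, _ in records})
    scores = {}
    for held_out in sites:
        train = [y for site, y in records if site != held_out]
        test = [y for site, y in records if site == held_out]
        # Placeholder model (assumption): predict the mean training yield.
        prediction = mean(train)
        scores[held_out] = mean(abs(y - prediction) for y in test)
    return scores

# Hypothetical toy data: (site, yield in kg per decare).
toy = [
    ("A", 300.0), ("A", 320.0),
    ("B", 280.0), ("B", 260.0),
    ("C", 310.0), ("C", 330.0),
]
scores = leave_one_site_out_mae(toy)
print(scores)
```

Splitting by site rather than by row means no observations from the held-out site are seen during fitting, so each score reflects performance on a location the model has not encountered.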

DOAJ Open Access 2025
Adaptive neuro-fuzzy inference system for accurate power forecasting for on-grid photovoltaic systems: A case study in Sharjah, UAE

Tareq Salameh, Mena Maurice Farag, Abdul-Kadir Hamid et al.

This study addresses the fundamental challenge of accurately forecasting power generation from photovoltaic (PV) systems, which is crucial for effective grid integration and energy management. The intermittency and variability of solar power due to environmental factors pose significant difficulties in achieving reliable predictions. An adaptive neuro-fuzzy inference system (ANFIS) model is proposed for forecasting the performance of a 2.88 kW on-grid PV system in Sharjah, UAE. The model leverages extensive real-time data collected during the peak energy generation season to predict critical variables such as the maximum power point (MPP), voltage, and current. The ANFIS model achieves high prediction accuracy, with a Coefficient of Determination (R²) of 0.9967 for power generation, 0.9076 for voltage generation, and 0.9913 for current generation. These results highlight the model’s robustness in capturing the nonlinear dependencies between environmental factors and PV output. The study compares the ANFIS model with other established machine learning models, including Linear Regression, Decision Tree, Support Vector Machine (SVM), and Random Forest. The ANFIS model outperforms these models in terms of prediction accuracy, demonstrating its superior generalization capabilities. The findings underscore the potential of the ANFIS model for robust forecasting and effective PV performance management, providing a reliable tool for early fault detection and system assessment. Future work will focus on integrating fault detection capabilities and extending model validation across different seasons to ensure a comprehensive investigation of the system dynamics under fluctuating weather conditions.

Engineering (General). Civil engineering (General)
DOAJ Open Access 2025
Organ-system-based subclassification of preeclampsia using machine learning predicts pregnancy outcomes

Yanhong Xu, Yizheng Zu, Xiaosi Lu et al.

Background: Preeclampsia (PE) is a complex disorder with significant variability in organ involvement. It remains unclear whether machine learning can identify organ-system-based subclasses of PE. This study aimed to identify novel subclasses of PE using organ-system biomarkers to improve pregnancy outcome prediction.
Methods: We retrospectively analyzed clinical data from PE patients at Fujian Maternity and Child Health Hospital. K-means clustering was applied using organ system function indicators, with 10-fold cross-validation. Functional indicators and pregnancy outcomes were compared across subclasses. Heatmap and Sankey diagrams were used to reveal the distribution of patients across combined organ system clusters.
Results: The analysis included 7,531 PE patients treated between 2013 and 2023. 10-fold cross-validation confirmed clustering robustness with mean ARI of 0.8806 ± 0.0099 and NMI of 0.7800 ± 0.0123. Three heart function clusters were identified using five indicators, with H-Cluster 1 showing the poorest heart function and the highest complication rates. Five kidney clusters were determined from ten indicators. K-Cluster 1 and K-Cluster 5 showed distinct biomarker patterns but similar complication rates (P > 0.05). Liver function analysis using thirteen indicators revealed four clusters. L-Cluster 1 exhibited elevated liver enzymes and bilirubin with higher severe PE and intrahepatic cholestasis rates, whereas L-Cluster 3 had lower protein levels but higher anemia, fetal distress and hemorrhage incidence (P < 0.05). Five coagulation clusters were determined from nine indicators, showing significant differences in indicators and complication rates (P < 0.05). Heatmap and Sankey diagram analyses revealed significant overlap between high-risk clusters, with the most frequent combination being H-Cluster 1, K-Cluster 1, L-Cluster 1 and C-Cluster 5.
Conclusions: Machine learning identified distinct PE subclasses based on organ system dysfunction patterns, each demonstrating unique pregnancy outcomes. This suggests potential clinical utility of computational approaches for PE subclassification and generates hypotheses for further investigation of its biological mechanisms.

Gynecology and obstetrics
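
The preeclampsia abstract above reports clustering agreement as an adjusted Rand index (ARI). As a rough illustration of what that number measures, the sketch below implements the standard pair-counting ARI formula in plain Python. The function name and the toy label lists are hypothetical, and the abstract does not say how the authors computed ARI across folds, so this is not their procedure.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same items.

    Assumes both lists have the same length and at least two items.
    """
    n = len(labels_a)
    # Contingency counts: how many items share each (label_a, label_b) pair.
    pair_counts = Counter(zip(labels_a, labels_b))
    a_counts = Counter(labels_a)
    b_counts = Counter(labels_b)

    sum_pairs = sum(comb(c, 2) for c in pair_counts.values())
    sum_a = sum(comb(c, 2) for c in a_counts.values())
    sum_b = sum(comb(c, 2) for c in b_counts.values())

    expected = sum_a * sum_b / comb(n, 2)  # agreement expected by chance
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case: both partitions are trivial
    return (sum_pairs - expected) / (max_index - expected)

# Same grouping with swapped label names: perfect agreement.
same = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
# Groupings that cut across each other: worse than chance.
crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
print(same, crossed)
```

The index is invariant to how clusters are named, which is why it suits comparing cluster assignments obtained on different folds.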
DOAJ Open Access 2025
Stacked ensemble model for NBA game outcome prediction analysis

Guangsen He, Hyun Soo Choi

This research presents a stacked ensemble approach that employs artificial intelligence (AI) techniques to predict the outcomes of NBA games. Several machine learning algorithms were utilized, including Naïve Bayes, AdaBoost, Multilayer Perceptron (MLP), K-Nearest Neighbors (KNN), XGBoost, Decision Tree, and Logistic Regression. The best-performing models were selected to serve as the base learners in the ensemble architecture. To improve the model’s interpretability and transparency, SHAP was used to clarify its decision-making process. The model was trained and evaluated using publicly available NBA datasets from 2021–2022, 2022–2023, and 2023–2024. Experimental results indicate that the proposed ensemble approach is practical in predicting game outcomes. Furthermore, the SHAP analysis provides valuable insights into the underlying predictive mechanisms, offering actionable information for coaches and analysts.

Medicine, Science
DOAJ Open Access 2025
A Comparative Analysis of Hyper-Parameter Optimization Methods for Predicting Heart Failure Outcomes

Qisthi Alhazmi Hidayaturrohman, Eisuke Hanada

This study presents a comparative analysis of hyper-parameter optimization methods used in developing predictive models for patients at risk of heart failure readmission and mortality. We evaluated three optimization approaches—Grid Search (GS), Random Search (RS), and Bayesian Search (BS)—across three machine learning algorithms—Support Vector Machine (SVM), Random Forest (RF), and eXtreme Gradient Boosting (XGBoost). The models were built using real patient data from the Zigong Fourth People’s Hospital, which included 167 features from 2008 patients. The mean, MICE, kNN, and RF imputation techniques were implemented to handle missing values. Our initial results showed that SVM models outperformed the others, achieving an accuracy of up to 0.6294, sensitivity above 0.61, and an AUC score exceeding 0.66. However, after 10-fold cross-validation, the RF models demonstrated superior robustness, with an average AUC improvement of 0.03815, whereas the SVM models showed potential for overfitting, with a slight decline (−0.0074). The XGBoost models exhibited moderate improvement (+0.01683) post-validation. Bayesian Search had the best computational efficiency, consistently requiring less processing time than the Grid and Random Search methods. This study reveals that while model selection is crucial, an appropriate optimization method and imputation technique significantly impact model performance. These findings provide valuable insights for developing robust predictive models for healthcare applications, particularly for heart failure risk assessment.

Technology, Engineering (General). Civil engineering (General)
DOAJ Open Access 2025
Resolving chemical-motif similarity with enhanced atomic structure representations for accurately predicting descriptors at metallic interfaces

Cheng Cai, Tao Wang

Accurately predicting catalytic descriptors with machine learning (ML) methods is significant to achieving accelerated catalyst design, where a unique representation of the atomic structure of each system is the key to developing a universal, efficient, and accurate ML model that is capable of tackling diverse degrees of complexity in heterogeneous catalysis scenarios. Herein, we integrate equivariant message-passing-enhanced atomic structure representation to resolve chemical-motif similarity in highly complex catalytic systems. Our developed equivariant graph neural network (equivGNN) model achieves mean absolute errors <0.09 eV for different descriptors at metallic interfaces, including complex adsorbates with more diverse adsorption motifs on ordered catalyst surfaces, adsorption motifs on highly disordered surfaces of high-entropy alloys, and the complex structures of supported nanoparticles. The prediction accuracy and easy implementation attained by our model across various systems demonstrate its robustness and potentially broad applicability, laying a reasonable basis for achieving accelerated catalyst design.

DOAJ Open Access 2025
Controllers (Expert Accountants) and Technologies: Artificial Intelligence and Explainable Artificial Intelligence

Luana COSĂCESCU

The demands of controlling when meeting cutting-edge technology are quite high given its underlying principles, its prospective character, flexibility, but also the desire for transparency, ethics, and responsibility. Through controllers (expert accountants), in their roles as collaborators, reminders, relationship managers of top management, smart technologies will be truly put to good use as business intelligence tools, as trusted allies (digital assistants, AI copilots, AI generative chatbots, interactive dashboards with AI inserts). Of course, there will be obstacles, a certain amount of distrust related to the “black boxes” regarding creation, operation, possible reactions. Hence the multiplication of searches to find something safer, with fewer unknowns regarding the purpose, risk levels, possible discriminations. This is how we arrived at XAI — explainable artificial intelligence, but also at HITL — complex models in which human judgment is integrated. The two systems also have their limits (especially regarding the balance between accuracy and explainability), but it is certain that the degree of trust, openness, and understanding of users (towards algorithms, models, artificial intelligence in general) through these tools will further increase. Basically, both tools suggest the same thing: if employees are directly involved and helped to understand something from the arguments, from the behavior of machines (whether it is about machine learning models, neural networks, or deep learning), then there will be an interactive collaboration between specialists and machines that is particularly beneficial to each productive or functional segment, but also to the entire organization.

Economic history and conditions, Finance
DOAJ Open Access 2025
Empowering text classification with NLP and explainable AI for enhanced interpretability

Sumaya Mustafa, Mariwan Hama Saeed

Artificial intelligence (AI) models have demonstrated significant success in classifying various types of text. However, the complex nature of these models often complicates the interpretability of their classifications. To address these challenges and to enhance explainability, this study proposes a novel approach to text classification leveraging natural language processing (NLP) techniques and explainable AI (XAI) methods. Text preprocessing steps were essential for improving the quality of text analysis. This was achieved by eliminating elements that contribute minimal semantic value. To achieve robust performance and mitigate the risk of overfitting, repeated stratified K-Fold cross-validation was utilized. Furthermore, the synthetic minority oversampling technique (SMOTE) was employed to address dataset imbalance issues. In the classification phase, nine machine learning models and hybrid/multi-model approaches were employed. To validate the explainability of the classifications, the local interpretable model-agnostic explanations (LIME) framework was utilized. The study utilized two datasets containing texts from domains such as sports, medicine, entertainment, politics, technology, and business. Empirical evaluations demonstrated the effectiveness of the proposed approach. The proposed hybrid model achieved exceptional performance across key metrics, including accuracy, precision, recall, and F1-score. The proposed hybrid model achieved results of up to 99% accuracy. This work can be used for various text analysis applications.

Electrical engineering. Electronics. Nuclear engineering, Information technology
DOAJ Open Access 2025
Conditional VAE for personalized neurofeedback in cognitive training

Imad Eddine Tibermacine, Samuele Russo, Gianmarco Scarano et al.

Machine learning (ML) offers great potential in healthcare, especially in the analysis of complex physiological signals like electroencephalography (EEG). EEG recordings hold valuable insights into neurological function and can aid in diagnosing various conditions. In this work, we explore the use of a Conditional Variational Autoencoder (CVAE) that injects a binary health label (healthy or orthopedic impairment) into both the encoder input and the latent space coupled with the extracted features, leveraging the conditional input vector to learn representations specific to different health conditions. Our study involved using two public OpenNeuro datasets [1,2]. From the healthy dataset we randomly selected seven subjects to match the seven impaired participants; in both sets we retained the same 11 scalp channels (C3, Cz, C4, FC3, FCz, FC4, CP3, CPz, CP4, F3, F4). Six descriptors, namely Short-Time Fourier Transform (STFT), Hurst Exponent (HE), Detrended Fluctuation Analysis (DFA), Correlation Dimension (CD), Kolmogorov-Sinai Entropy (permutation entropy; KS-proxy), and the Largest Lyapunov Exponent (LLE), are extracted channel-wise and concatenated to form the input feature vector, which distills distinct characteristics from the EEG signals. We rigorously evaluated the performance of our CVAE model in combination with each feature extraction technique. The conditional supply of class labels to both encoder and decoder enabled the CVAE to achieve 93% accuracy on the unseen test split of the dataset with a precision of 93%, a recall of 93%, and an F1-score of 0.93, outperforming re-trained CNN baselines. These results highlight the promise of CVAEs and the significance of well-suited feature extraction for robust EEG classification. This work could contribute to the development of automated healthcare diagnostic tools.

Medicine, Science
DOAJ Open Access 2024
Large Language Models in Healthcare and Medical Domain: A Review

Zabir Al Nazi, Wei Peng

The deployment of large language models (LLMs) within the healthcare sector has sparked both enthusiasm and apprehension. These models exhibit the remarkable ability to provide proficient responses to free-text queries, demonstrating a nuanced understanding of professional medical knowledge. This comprehensive survey delves into the functionalities of existing LLMs designed for healthcare applications and elucidates the trajectory of their development, starting with traditional Pretrained Language Models (PLMs) and then moving to the present state of LLMs in the healthcare sector. First, we explore the potential of LLMs to amplify the efficiency and effectiveness of diverse healthcare applications, particularly focusing on clinical language understanding tasks. These tasks encompass a wide spectrum, ranging from named entity recognition and relation extraction to natural language inference, multimodal medical applications, document classification, and question-answering. Additionally, we conduct an extensive comparison of the most recent state-of-the-art LLMs in the healthcare domain, while also assessing the utilization of various open-source LLMs and highlighting their significance in healthcare applications. Furthermore, we present the essential performance metrics employed to evaluate LLMs in the biomedical domain, shedding light on their effectiveness and limitations. Finally, we summarize the prominent challenges and constraints faced by large language models in the healthcare sector by offering a holistic perspective on their potential benefits and shortcomings. This review provides a comprehensive exploration of the current landscape of LLMs in healthcare, addressing their role in transforming medical applications and the areas that warrant further research and development.

Information technology
DOAJ Open Access 2024
Protein feature engineering framework for AMPylation site prediction

Hardik Prabhu, Hrushikesh Bhosale, Aamod Sane et al.

AMPylation is a biologically significant yet understudied post-translational modification where an adenosine monophosphate (AMP) group is added primarily to Tyrosine and Threonine residues. While recent work has illuminated the prevalence and functional impacts of AMPylation, experimental identification of AMPylation sites remains challenging. Computational prediction techniques provide a faster alternative approach. The predictive performance of machine learning models is highly dependent on the features used to represent the raw amino acid sequences. In this work, we introduce a novel feature extraction pipeline to encode the key properties relevant to AMPylation site prediction. We utilize a recently published dataset of curated AMPylation sites to develop our feature generation framework. We demonstrate the utility of our extracted features by training various machine learning classifiers on various numerical representations of the raw sequences extracted with the help of our framework. Tenfold cross-validation is used to evaluate the model’s capability to distinguish between AMPylated and non-AMPylated sites. The top-performing set of features extracted achieved an MCC score of 0.58, accuracy of 0.8, AUC-ROC of 0.85 and F1 score of 0.73. Further, we elucidate the behaviour of the model on the set of features consisting of monogram and bigram counts for various representations using SHapley Additive exPlanations.

Medicine, Science
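
The AMPylation abstract above reports MCC, accuracy, and F1 together. The short Python sketch below shows how these three metrics follow from a binary confusion matrix, using their standard definitions. The function name and the counts are made up for illustration and are unrelated to the paper's data.

```python
from math import sqrt

def binary_metrics(tp, fp, fn, tn):
    """Accuracy, F1, and Matthews correlation coefficient from confusion counts.

    Assumes every denominator is non-zero (each class is both present
    and predicted at least once); no zero-division guard is included.
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    mcc = (tp * tn - fp * fn) / sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {"accuracy": accuracy, "f1": f1, "mcc": mcc}

# Hypothetical counts: 40 true positives, 10 false positives,
# 10 false negatives, 40 true negatives.
metrics = binary_metrics(tp=40, fp=10, fn=10, tn=40)
print(metrics)
```

With these balanced toy counts, accuracy and F1 coincide while MCC is lower, because MCC uses all four cells of the confusion matrix and is 0 for chance-level predictions. That is one reason it is often reported alongside accuracy.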

Page 38 of 516532