Extreme learning machine (ELM) has gained increasing interest from various research fields recently. In this review, we aim to report the current state of theoretical research and practical advances on this subject. We first give an overview of ELM from the theoretical perspective, including the interpolation theory, universal approximation capability, and generalization ability. Then we focus on the various improvements made to ELM that further improve its stability, sparsity, and accuracy under general or specific conditions. Apart from classification and regression, ELM has recently been extended to clustering, feature selection, representational learning, and many other learning tasks. These newly emerging algorithms greatly expand the applications of ELM. From the implementation aspect, hardware implementation and parallel computation techniques have substantially sped up the training of ELM, making it feasible for big data processing and real-time reasoning. Due to its remarkable efficiency, simplicity, and impressive generalization performance, ELM has been applied in a variety of domains, such as biomedical engineering, computer vision, system identification, and control and robotics. In this review, we try to provide a comprehensive view of these advances in ELM together with its future perspectives.
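To make the basic mechanism concrete, the following is a minimal NumPy sketch of ELM training and prediction: input weights and hidden biases are drawn at random, and the output weights are obtained in closed form via the Moore-Penrose pseudoinverse of the hidden-layer output matrix. The toy data, activation choice, and hidden-layer size are illustrative assumptions, not a specific variant discussed in the review.

```python
import numpy as np

rng = np.random.default_rng(0)

def elm_fit(X, T, n_hidden=100):
    """Basic ELM: random hidden layer, least-squares output weights."""
    W = rng.normal(size=(X.shape[1], n_hidden))   # random input weights (never trained)
    b = rng.normal(size=n_hidden)                 # random hidden biases
    H = np.tanh(X @ W + b)                        # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ T                  # output weights via Moore-Penrose pseudoinverse
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# toy regression example (hypothetical data)
X = rng.normal(size=(200, 5))
T = np.sin(X.sum(axis=1, keepdims=True))
W, b, beta = elm_fit(X, T)
pred = elm_predict(X, W, b, beta)
```

Because only the output weights are solved for, training reduces to one linear least-squares problem, which is the source of ELM's efficiency relative to iteratively trained networks.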
Machine learning has played an important role in the analysis of high-energy physics data for decades. The emergence of deep learning in 2012 allowed for machine learning tools which could adeptly handle higher-dimensional and more complex problems than previously feasible. This review is aimed at the reader who is familiar with high-energy physics but not machine learning. The connections between machine learning and high-energy physics data analysis are explored, followed by an introduction to the core concepts of neural networks, examples of the key results demonstrating the power of deep learning for analysis of LHC data, and discussion of future prospects and concerns.
Machine learning algorithms have been applied widely in the fields of natural science, social science, and engineering. It can be expected that machine learning approaches, especially deep learning algorithms, will help geoscientists to discover mineral deposits through the processing of various geoscience datasets. This study reviews the state-of-the-art application of deep learning algorithms for processing geochemical exploration data and mining geochemical patterns. Deep learning algorithms can deal with complex and nonlinear problems and can therefore enhance the identification of geochemical anomalies and the recognition of hidden patterns. Applied geochemistry needs more applications of machine learning and/or deep learning algorithms.
The accelerating global demand for critical minerals, driven by clean energy technologies and climate goals, presents urgent sustainability challenges in materials design. High-entropy alloys (HEAs), particularly FeNiCrCoAl, offer a promising alternative by enabling reduced reliance on critical elements such as Ni, Cr, and Co. This study introduces a data-driven framework that integrates molecular dynamics (MD) simulations with artificial intelligence (AI), specifically machine learning (ML), to predict and optimize the mechanical performance of FeNiCrCoAl HEAs. MD simulations generated over 1800 datasets capturing ultimate tensile strength (UTS) across diverse compositions and temperatures. These data were used to train Random Forest ML models, achieving high predictive accuracy (R² = 0.975, RMSE = 0.22). Explainable AI techniques revealed Ni as a key contributor to strength, enabling targeted reduction of Co, Cr, and Al. A novel composition was discovered that reduced critical element content by over 50%, achieving nearly double the UTS while retaining more than 90% of its tensile strength across the studied temperature range. This integrated MD-ML approach provides a scalable and sustainable pathway for alloy design, bridging atomic-scale simulation with predictive modeling to address global resource efficiency goals.
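As an illustration of the ML side of such a pipeline, the sketch below trains a Random Forest regressor on a synthetic stand-in for the MD-generated composition/temperature dataset and inspects feature influence via permutation importance. The data, column layout, and the use of permutation importance (rather than the study's specific explainable-AI technique) are assumptions for illustration only.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score, mean_squared_error
from sklearn.inspection import permutation_importance

# Hypothetical surrogate dataset: columns stand in for Fe/Ni/Cr/Co/Al fractions
# plus temperature; the target stands in for UTS from MD simulations.
rng = np.random.default_rng(42)
X = rng.random((1800, 6))
y = 2.0 * X[:, 1] + 0.5 * X[:, 5] + rng.normal(scale=0.1, size=1800)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
model = RandomForestRegressor(n_estimators=300, random_state=0).fit(X_tr, y_tr)

pred = model.predict(X_te)
print("R2:", r2_score(y_te, pred), "RMSE:", mean_squared_error(y_te, pred) ** 0.5)

# Permutation importance as a simple explainability proxy for "which elements
# and conditions drive the predicted strength".
imp = permutation_importance(model, X_te, y_te, n_repeats=10, random_state=0)
print(imp.importances_mean)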
The term machine learning refers to a set of topics dealing with the creation and evaluation of algorithms that facilitate pattern recognition, classification, and prediction based on models derived from existing data. Two facets of mechanization should be acknowledged when considering machine learning in broad terms. Firstly, it is intended that the classification and prediction tasks can be accomplished by a suitably programmed computing machine. That is, the product of machine learning is a classifier that can be feasibly used on available hardware. Secondly, it is intended that the creation of the classifier should itself be highly mechanized and should not involve too much human input. This second facet is inevitably vague, but the basic objective is that the use of automatic algorithm construction methods can minimize the possibility that human biases could affect the selection and performance of the algorithm. Both the creation of the algorithm and its operation to classify objects or predict events are to be based on concrete, observable data. The history of relations between biology and the field of machine learning is long and complex. An early machine learning technique, the perceptron [1], constituted an attempt to model actual neuronal behavior, and the field of artificial neural network (ANN) design emerged from this attempt. Early work on the analysis of translation initiation sequences [2] employed the perceptron to define criteria for start sites in Escherichia coli. Further artificial neural network architectures, such as adaptive resonance theory (ART) [3] and the neocognitron [4], were inspired by the organization of the visual nervous system. In the intervening years, the flexibility of machine learning techniques has grown along with mathematical frameworks for measuring their reliability, and it is natural to hope that machine learning methods will improve the efficiency of discovery and understanding given the mounting volume and complexity of biological data. This tutorial is structured in four main components. Firstly, a brief section reviews definitions and mathematical prerequisites. Secondly, the field of supervised learning is described. Thirdly, methods of unsupervised learning are reviewed. Finally, a section reviews methods and examples as implemented in the open-source data analysis and visualization language R (http://www.r-project.org).
We show that the support vector machine (SVM) classification algorithm, a recent development from the machine learning community, proves its potential for structure-activity relationship analysis. In a benchmark test, the SVM is compared to several machine learning techniques currently used in the field. The classification task involves predicting the inhibition of dihydrofolate reductase by pyrimidines, using data obtained from the UCI machine learning repository. Three artificial neural networks, a radial basis function network, and a C5.0 decision tree are all outperformed by the SVM. The SVM is significantly better than all of these, bar a manually capacity-controlled neural network, which takes considerably longer to train.
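A present-day equivalent of such a benchmark setup might look like the following scikit-learn sketch, which cross-validates an RBF-kernel SVM on synthetic stand-in data; the dataset, feature count, and hyperparameters are assumptions and do not reproduce the original pyrimidine experiment.

```python
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score
from sklearn.datasets import make_classification

# Hypothetical stand-in for the structure-activity data (inhibitor vs. non-inhibitor).
X, y = make_classification(n_samples=500, n_features=27, random_state=0)

# Scale features, then fit a soft-margin RBF SVM; evaluate with 5-fold CV.
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
scores = cross_val_score(svm, X, y, cv=5)
print("mean CV accuracy:", scores.mean())
```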
This study evaluates the performance and energy trade-offs of three popular data processing libraries (Pandas, PySpark, and Polars) applied to GreenNav, a CO₂ emission prediction pipeline for urban traffic. GreenNav is an eco-friendly navigation app designed to predict CO₂ emissions and determine low-carbon routes using a hybrid CNN-LSTM model integrated into a complete pipeline for the ingestion and processing of large, heterogeneous geospatial and road data. Our study quantifies the end-to-end execution time, cumulative CPU load, and maximum RAM consumption for each library when applied to the GreenNav pipeline; it then converts these metrics into energy consumption and CO₂ equivalents. Experiments conducted on datasets ranging from 100 MB to 8 GB demonstrate that Polars in lazy mode offers substantial gains, reducing the processing time by a factor of more than twenty, memory consumption by about two-thirds, and energy consumption by about 60%, while maintaining the predictive accuracy of the model (R² ≈ 0.91). These results clearly show that the careful selection of data processing libraries can reconcile high computing performance and environmental sustainability in large-scale machine learning applications.
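For readers unfamiliar with Polars' lazy mode, the pattern credited with these gains looks roughly like the sketch below: a query plan is built lazily and only executed, after optimization, on collect(). The tiny in-memory frame and column names are hypothetical and are not GreenNav's actual schema; a real pipeline would typically start from pl.scan_csv(...) or pl.scan_parquet(...) so that reading itself is deferred.

```python
import polars as pl

# Hypothetical road-segment data standing in for the geospatial inputs.
df = pl.DataFrame({
    "segment_id": [1, 1, 2],
    "length_km": [0.5, 1.2, 0.8],
    "speed_kmh": [30.0, 0.0, 50.0],
    "co2_g_per_km": [120.0, 110.0, 95.0],
})

lazy = (
    df.lazy()                                   # build a query plan; nothing runs yet
      .filter(pl.col("speed_kmh") > 0)
      .with_columns((pl.col("co2_g_per_km") * pl.col("length_km")).alias("co2_g"))
      .group_by("segment_id")
      .agg(pl.col("co2_g").sum())
)
print(lazy.collect())                           # plan is optimized, then executed once
```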
Kamal Hossain Nahin, Jamal Hossain Nirob, Akil Ahmad Taki, et al.
This paper introduces the design and exploration of a compact, high-performance multiple-input multiple-output (MIMO) antenna for 6G applications operating in the terahertz (THz) frequency range. Leveraging a meta-learner-based stacked generalization ensemble strategy, this study integrates classical machine learning techniques with an optimized multi-feature stacked ensemble to predict antenna properties with greater accuracy. Specifically, a neural network is applied as a base learner for predicting antenna parameters, resulting in increased predictive performance and achieving R², EVS, MSE, RMSE, and MAE values of 0.96, 0.998, 0.00842, 0.00453, and 0.00999, respectively. Utilizing regression-based machine learning, antenna parameters are optimized to attain dual-band resonance with bandwidths of 3.34 THz and 1 THz across the two bands, ensuring robust data throughput and communication stability. The antenna, designed with dimensions of 70 × 280 μm², demonstrates a maximum gain of 15.82 dB, excellent isolation exceeding −32.9 dB, and remarkable efficiency of 99.8%, underscoring its suitability for high-density, high-speed 6G environments. The design methodology integrates CST simulations and an RLC equivalent circuit model, substantiated by ADS simulations, with comparable reflection coefficients validating the accuracy of the models. With its compact footprint, broad bandwidth, and optimized isolation and efficiency, the proposed MIMO antenna is positioned as an ideal candidate for future 6G communication applications.
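A stacked generalization setup of the kind described can be sketched in scikit-learn as follows, with a neural network among the base learners and a simple meta-learner combining their out-of-fold predictions. The surrogate data, base-learner choices, and target quantity are illustrative assumptions rather than the paper's exact configuration.

```python
from sklearn.ensemble import StackingRegressor, RandomForestRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score
from sklearn.datasets import make_regression

# Hypothetical surrogate data: features would be geometric/material parameters,
# the target an antenna figure of merit (e.g. a reflection coefficient).
X, y = make_regression(n_samples=400, n_features=8, noise=0.1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Stacked generalization: base learners' out-of-fold predictions feed a meta-learner.
stack = StackingRegressor(
    estimators=[
        ("mlp", MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)),
        ("rf", RandomForestRegressor(n_estimators=200, random_state=0)),
    ],
    final_estimator=Ridge(),
)
stack.fit(X_tr, y_tr)
print("R2:", r2_score(y_te, stack.predict(X_te)))
```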
Causal machine learning is an approach that combines causal inference and machine learning to understand and utilize causal relationships in data. In current research and applications, traditional machine learning and deep learning models typically focus on prediction and pattern recognition. In contrast, causal machine learning goes a step further by revealing causal relationships between different variables. In this research, we explore Double Machine Learning, a framework that embraces causal machine learning. The core goal is to select, for a gesture identification problem, the independent variables that are causally related to the final gesture results. This selection allows us to classify and analyze gestures more efficiently, thereby improving models' performance and interpretability. Compared to commonly used feature selection methods such as Variance Threshold, Select From Model, Principal Component Analysis, Least Absolute Shrinkage and Selection Operator, Artificial Neural Network, and TabNet, Double Machine Learning methods focus on causal relationships between variables rather than correlations. Our research shows that variables selected using the Double Machine Learning method perform well under different classification models, with final results significantly better than those of traditional methods. This novel Double Machine Learning-based approach offers researchers a valuable perspective for feature selection and model construction. It enhances the model's ability to uncover causal relationships within complex data. Variables with causal significance can be more informative than those with only correlative significance, thus improving overall prediction performance and reliability.
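The core of the Double Machine Learning idea, the partialling-out (Neyman-orthogonal) estimator with cross-fitting, can be sketched as follows for scoring one candidate feature against the outcome while controlling for the remaining features. Variable names and data are hypothetical, and the nuisance models are arbitrary choices, so this is a sketch of the general technique rather than the paper's exact procedure.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n = 1000
X = rng.normal(size=(n, 10))                 # remaining features (controls)
d = X[:, 0] + rng.normal(size=n)             # candidate feature ("treatment")
y = 0.7 * d + X[:, 1] + rng.normal(size=n)   # outcome (hypothetical gesture score)

# Cross-fitted nuisance predictions: E[y | X] and E[d | X].
y_hat = cross_val_predict(RandomForestRegressor(random_state=0), X, y, cv=5)
d_hat = cross_val_predict(RandomForestRegressor(random_state=0), X, d, cv=5)

# Regress outcome residuals on treatment residuals to estimate the causal effect.
res_y, res_d = y - y_hat, d - d_hat
theta = (res_d @ res_y) / (res_d @ res_d)
print("estimated effect of d on y:", theta)
```

Repeating this per candidate feature and ranking features by the estimated effect (or its significance) yields a causality-oriented alternative to correlation-based feature selection.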
Mehmet Kivrak, Hatice Sevim Nalkiran, Oguzhan Kesen, et al.
Breast cancer is the most common malignancy in women, with the Luminal A subtype generally associated with favorable survival. However, age and menopausal status may influence tumor biology and prognosis. To improve prediction beyond conventional models, we analyzed transcriptomic and clinical data from the METABRIC cohort. Patients with Luminal A breast cancer were stratified into premenopausal, postmenopausal–nongeriatric, and geriatric (≥70 years) groups. Differentially expressed genes (DEGs) were identified, and Boruta feature selection revealed 27 clinical and genomic variables. Random Forest, Logistic Regression, Multilayer Perceptron, and ensemble XGBoost models were trained with stratified 5-fold cross-validation, using SMOTE to correct class imbalance. Principal component analysis showed distinct clustering across age groups, while DEG analysis revealed 41 genes associated with age and survival. Key predictors included clinical variables (age, tumor size, NPI, radiotherapy) and molecular markers (ATM, HERC2, AKT2, FOXO3, CYP3A43). Among ML models, XGBoost demonstrated the highest performance (accuracy 98%, sensitivity 98%, specificity 97%, F1-score 0.99, AUC 0.86), outperforming other algorithms. These findings indicate that age-related transcriptomic changes impact survival in Luminal A breast cancer and that an ML-based integrative approach combining clinical and molecular variables provides superior prognostic accuracy, supporting its potential for clinical application.
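A common way to wire SMOTE into stratified cross-validation without leaking oversampled points into validation folds is an imbalanced-learn pipeline, sketched below with an XGBoost classifier. The synthetic data and hyperparameters stand in for the 27 Boruta-selected variables and the study's tuned settings.

```python
from imblearn.pipeline import Pipeline
from imblearn.over_sampling import SMOTE
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.datasets import make_classification
from xgboost import XGBClassifier

# Hypothetical imbalanced stand-in for the clinical/genomic feature matrix.
X, y = make_classification(n_samples=600, n_features=27, weights=[0.8, 0.2], random_state=0)

# SMOTE is applied inside each training fold only, never before splitting.
pipe = Pipeline([
    ("smote", SMOTE(random_state=0)),
    ("xgb", XGBClassifier(n_estimators=300, max_depth=4, eval_metric="logloss")),
])
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(pipe, X, y, cv=cv, scoring="roc_auc")
print("mean CV AUC:", scores.mean())
```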
Membrane incompatibility poses significant health risks, including severe complications and potential fatality. Surface modification of membranes has emerged as a pivotal technology in the membrane industry, aiming to improve the hemocompatibility and performance of dialysis membranes by mitigating undesired membrane-protein interactions, which can lead to fouling and subsequent protein adsorption. Affinity energy, defined as the strength of interaction between membranes and human serum proteins, plays a crucial role in assessing membrane-protein interactions. These interactions may trigger adverse reactions that are potentially harmful to patients. Researchers often rely on trial-and-error approaches to enhance membrane hemocompatibility by reducing these interactions. This study focuses on developing machine learning algorithms that accurately and rapidly predict the affinity energy between novel chemical structures of membrane materials and human serum proteins, based on a molecular docking dataset. Various membrane materials with distinct characteristics, chemistry, and orientation are considered in conjunction with different proteins. A comparative analysis of linear regression, K-nearest neighbors regression, decision tree regression, random forest regression, XGBoost regression, lasso regression, and support vector regression is conducted to predict affinity energy. The dataset, comprising 916 records across the training and test segments, incorporates 12 parameters extracted from the data points and involves six different proteins. Results indicate that random forest (R² = 0.8987, MSE = 0.36, MAE = 0.45) and XGBoost (R² = 0.83, MSE = 0.49, MAE = 0.49) exhibit comparable predictive performance on the training dataset; however, random forest outperforms XGBoost on the testing dataset. Seven machine learning algorithms for predicting affinity energy are analyzed and compared, with random forest demonstrating superior predictive accuracy. The application of machine learning to predicting affinity energy holds significant promise for researchers and professionals in hemodialysis. These models, by enabling early interventions in hemodialysis membranes, could enhance patient safety and optimize the care of hemodialysis patients.
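The side-by-side comparison of the seven regressors can be organized as in the following sketch, which cross-validates each model on a synthetic surrogate of the docking dataset. The data, hyperparameters, and scoring choice are illustrative assumptions only.

```python
from sklearn.linear_model import LinearRegression, Lasso
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score
from sklearn.datasets import make_regression
from xgboost import XGBRegressor

# Hypothetical surrogate for the 916-record docking dataset with 12 descriptors.
X, y = make_regression(n_samples=916, n_features=12, noise=0.3, random_state=0)

models = {
    "linear": LinearRegression(),
    "knn": KNeighborsRegressor(),
    "decision_tree": DecisionTreeRegressor(random_state=0),
    "random_forest": RandomForestRegressor(n_estimators=300, random_state=0),
    "xgboost": XGBRegressor(n_estimators=300, random_state=0),
    "lasso": Lasso(alpha=0.01),
    "svr": SVR(),
}
for name, model in models.items():
    r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    print(f"{name}: mean CV R2 = {r2:.3f}")
```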
We study contrastive learning under the PAC learning framework. While a series of recent works have shown statistical results for learning under contrastive loss, based either on the VC dimension or on Rademacher complexity, their algorithms are either inherently inefficient or do not imply PAC guarantees. In this paper, we consider contrastive learning of the fundamental concept of linear representations. Surprisingly, even in such a basic setting, the existence of efficient PAC learners is largely open. We first show that the problem of contrastive PAC learning of linear representations is intractable to solve in general. We then show that it can be relaxed to a semi-definite program when the distance between contrastive samples is measured by the $\ell_2$-norm. We then establish generalization guarantees based on Rademacher complexity and connect them to PAC guarantees under certain contrastive large-margin conditions. To the best of our knowledge, this is the first efficient PAC learning algorithm for contrastive learning.
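For orientation, a generic triplet-style contrastive objective over linear representations under the $\ell_2$ distance can be written as below; this is a standard formulation given only for context and is not claimed to be the paper's exact setup.

```latex
\[
  \min_{W \in \mathbb{R}^{k \times d}} \;
  \frac{1}{n} \sum_{i=1}^{n}
  \max\!\Big(0,\; 1 + \lVert W x_i - W x_i^{+} \rVert_2 - \lVert W x_i - W x_i^{-} \rVert_2 \Big)
\]
```

Here each training triple $(x_i, x_i^{+}, x_i^{-})$ consists of an anchor, a positive (similar) sample, and a negative (dissimilar) sample, and the hinge term enforces a margin between the two distances under the learned linear map $W$.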