Meenakshi Sudarvizhi Seenipeyathevar, Prasath Palaniappan, Vijayakumar Arumugam
et al.
ABSTRACT This study presents an integrated experimental-machine learning framework for wear estimation in functionally graded composites made from recycled magnesium machining chips, using low-cost ceramic fibers as reinforcement with the radial modeling technique. The primary challenge addressed is the accurate prediction of wear behavior in spatially graded magnesium matrix composites while avoiding extensive experimental testing. Wear performance was experimentally assessed under applied loads of 4.4 to 39 N, sliding speeds of 0.45 to 4.5 m/s, and sliding distances of 500 to 4500 m. Results demonstrate a hardness increment of 26.26% in the outer region compared to the inner region, while wear resistance was enhanced by 19.8% in the outer zone due to the grading of ceramic fibers. A limited experimental dataset consisting of wear measurements from the inner, middle, and outer zones of the composite was used to develop and validate four machine-learning models for wear rate prediction. Tree-based ensemble methods significantly outperformed deep-learning strategies, with the LightGBM model providing the best prediction performance across all zones when optimized with a maximum tree depth of 5, 480 leaves, and a feature fraction of 0.05. Zone-specific XGBoost models were also developed, employing customized learning rates and minimum loss reduction parameters to further improve prediction accuracy. The proposed machine-learning framework thus provides a pathway for rapid and reliable wear rate estimation in ceramic fiber-reinforced magnesium composites, significantly lessening the experimental burden. The results highlight that recycled magnesium waste, combined with ceramic reinforcement, can be employed to produce sustainable and economically viable materials with improved wear resistance, particularly for automotive and industrial applications.
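The reported LightGBM configuration can be expressed as a small training script. The sketch below is a minimal illustration only: the feature names, the synthetic data, and the boosting-round count are assumptions, while the maximum tree depth of 5, 480 leaves, and feature fraction of 0.05 come from the abstract.

```python
# Minimal sketch of a LightGBM wear-rate regressor using the hyperparameters
# reported in the abstract (max depth 5, 480 leaves, feature fraction 0.05).
# Feature names, the synthetic data, and num_boost_round are illustrative assumptions.
import numpy as np
import lightgbm as lgb

rng = np.random.default_rng(0)
n = 200
X = np.column_stack([
    rng.uniform(4.4, 39.0, n),    # applied load (N)
    rng.uniform(0.45, 4.5, n),    # sliding speed (m/s)
    rng.uniform(500, 4500, n),    # sliding distance (m)
    rng.integers(0, 3, n),        # zone index: 0 = inner, 1 = middle, 2 = outer
])
y = rng.uniform(0.001, 0.01, n)   # placeholder wear rate

params = {
    "objective": "regression",
    "max_depth": 5,
    "num_leaves": 480,
    "feature_fraction": 0.05,
    "verbosity": -1,
}
model = lgb.train(params, lgb.Dataset(X, label=y), num_boost_round=200)
print(model.predict(X[:5]))
```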
Juan Carlos Ramírez-Vázquez, Guadalupe Esmeralda Rivera-García, Marco Antonio Gómez-Guzmán
et al.
A substantial number of students with hearing impairments are enrolled in higher education, motivating the development of inclusive assistive technologies that reduce communication barriers. This study developed and evaluated a prototype electronic glove that translates Mexican Sign Language (LSM) signs into Spanish text using machine learning. Eight participants (four deaf and four hearing with LSM proficiency) completed four sessions involving 12 signs; three sessions (S1–S3) were used for model development and one session (T) was held out for evaluation. Models were trained on S1–S3 and tested on T using a session-level split without window mixing across sessions; therefore, results represent a speaker-dependent, inter-session pilot assessment rather than a speaker-independent generalization test. The glove integrates flex sensors and an inertial measurement unit (IMU, MPU6050) connected to an ESP32-C3 SuperMini microcontroller. These components were selected for their low cost, availability, and ease of integration, making them suitable for the development of accessible wearable assistive technologies. Under this protocol, the system achieved a window-level overall test accuracy of 97.0% (95% CI computed at the window level: 96.00–97.00%), with higher performance for the dynamic subset (98.0%) than for the static subset (95.0%), and an algorithmic decision delay of 1.2 s. Usability and acceptance were evaluated using the System Usability Scale (SUS) and a Technology Acceptance Model (TAM)-based questionnaire. The mean SUS score was 50.6 ± 1.8 (marginal usability), while participants reported positive perceptions across TAM constructs. Overall, findings demonstrate technical feasibility under controlled inter-session conditions and provide a foundation for iterative user-centered refinement, followed by strict speaker-independent validation and classroom deployment studies in future work.
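Because the abstract emphasizes a session-level split (train on S1–S3, hold out T) with no window mixing across sessions, a brief sketch of that evaluation protocol may help. Everything below, including the window table layout, the feature columns, and the RandomForest classifier, is an assumed illustration rather than the authors' implementation.

```python
# Illustrative session-level split: all windows from sessions S1-S3 train the model,
# all windows from session T are held out; no window appears in both sets.
# The table layout, feature set, and classifier are assumptions for illustration.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n = 800
windows = pd.DataFrame(rng.normal(size=(n, 11)),
                       columns=[f"f{i}" for i in range(11)])   # flex/IMU summary features
windows["session"] = rng.choice(["S1", "S2", "S3", "T"], n)
windows["sign"] = rng.choice([f"sign_{k}" for k in range(12)], n)

train = windows[windows["session"].isin(["S1", "S2", "S3"])]
test = windows[windows["session"] == "T"]

feature_cols = [c for c in windows.columns if c not in ("session", "sign")]
clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(train[feature_cols], train["sign"])
print("window-level test accuracy:",
      accuracy_score(test["sign"], clf.predict(test[feature_cols])))
```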
Nicodemo Abate, Diego Ronchi, Sara Elettra Zaia
et al.
This study presents a multi-resolution and multi-temporal remote sensing approach to assess human-induced changes in cultural landscapes, with a focus on the archaeological site of Amrit (Syria) within the MapDam project. By integrating satellite archives (KH, Landsat series, NASADEM) with ancillary geospatial data (OpenStreetMap) and advanced analytical methods, four decades (1984–2024) of land-use/land-cover (LULC) change and shoreline dynamics were reconstructed. Machine learning classification (Random Forest) achieved high accuracy (Test Accuracy = 0.94; Kappa = 0.89), enabling robust LULC mapping, while predictive modelling of urban expansion, calibrated through a Gradient Boosting Machine, attained a Figure of Merit of 0.157, confirming strong predictive reliability. The results reveal path-dependent urban growth concentrated on low-slope terrains (≤5°) and consistent with proximity to infrastructure, alongside significant shoreline regression after 1974. A Business-as-Usual projection for 2024–2034 estimates 8.676 ha of new anthropisation, predominantly along accessible plains and peri-urban fringes. Beyond quantitative outcomes, this study demonstrates the replicability and scalability of open-source, data-driven workflows using Google Earth Engine and Python 3.14, making them applicable to other high-risk heritage contexts. This transparent methodology is particularly critical in conflict zones or in regions where cultural assets are neglected due to economic constraints, political agendas, or governance limitations, offering a powerful tool to document and safeguard endangered archaeological landscapes.
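The Figure of Merit quoted for the urban-expansion model has a standard definition in land-change validation: correctly predicted change (hits) divided by the sum of hits, misses, and false alarms. The helper below computes it from boolean observed-change and predicted-change arrays; the array names and the toy inputs are illustrative, not the project's code.

```python
# Figure of Merit (FoM) for land-change model validation:
# FoM = hits / (hits + misses + false_alarms)
# observed and predicted are boolean arrays marking cells that changed.
# Array names and the synthetic example are illustrative assumptions.
import numpy as np

def figure_of_merit(observed: np.ndarray, predicted: np.ndarray) -> float:
    hits = np.sum(observed & predicted)           # change correctly predicted
    misses = np.sum(observed & ~predicted)        # change missed by the model
    false_alarms = np.sum(~observed & predicted)  # change predicted but not observed
    return hits / (hits + misses + false_alarms)

# toy example
obs = np.array([1, 1, 0, 0, 1], dtype=bool)
pred = np.array([1, 0, 0, 1, 1], dtype=bool)
print(round(figure_of_merit(obs, pred), 3))  # 0.5
```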
Background: Heart failure (HF) represents the terminal phase of multiple cardiovascular conditions and is associated with significant morbidity and mortality. Arachidonic acid (AA), an essential fatty acid, plays a crucial role in modulating cardiovascular function in both normal and disease states. The purpose of this research was to examine how AA is related to HF, providing new perspectives for individualized treatment. Methods: Transcriptomic datasets were retrieved from the Gene Expression Omnibus (GEO) database. The raw data were consolidated to identify differentially expressed genes (DEGs) and subsequently subjected to bioinformatics analysis. Gene Ontology (GO) annotation and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses were performed. Signature genes were identified through Least Absolute Shrinkage and Selection Operator (LASSO) regression, Support Vector Machine-Recursive Feature Elimination (SVM-RFE), and Random Forest (RF) algorithms. Receiver Operating Characteristic (ROC) curves were generated for gene evaluation, and a nomogram was developed. Immune cell infiltration was analyzed using Single Sample Gene Set Enrichment Analysis (ssGSEA), and Gene Set Enrichment Analysis (GSEA) was conducted to identify important pathways. Drug sensitivity evaluation was also performed. Finally, the expression levels of the identified signature genes in HF samples were confirmed using qRT-PCR analysis. Results: Four characteristic genes demonstrated favorable performance in the ROC analysis. The comprehensive nomogram developed in this study exhibited enhanced clinical utility. In addition, notable variations in immune cell infiltration levels were detected, and GSEA highlighted key biological pathways. Conclusion: This investigation demonstrated a strong association between arachidonic acid-associated gene expression and heightened risk of HF, offering novel perspectives on the disease's underlying pathological processes and potential insights for personalized management of HF.
Diseases of the circulatory (Cardiovascular) system
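As a rough illustration of the LASSO step in the feature-selection pipeline described above, the sketch below selects candidate signature genes with an L1-penalized logistic regression and scores the model with ROC AUC. The expression matrix, labels, and penalty strength are assumptions; the study additionally combined LASSO with SVM-RFE and Random Forest.

```python
# Illustrative LASSO-style gene selection: L1-penalized logistic regression keeps
# a sparse set of genes, which are then scored with ROC AUC. Expression matrix,
# labels, and the regularization strength C are synthetic assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
X = rng.normal(size=(120, 500))            # 120 samples x 500 candidate genes
y = rng.integers(0, 2, 120)                # 0 = control, 1 = heart failure

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)

lasso = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
lasso.fit(X_tr, y_tr)

selected = np.flatnonzero(lasso.coef_[0])  # indices of genes retained by LASSO
print("genes retained:", selected.size)
print("test ROC AUC:", round(roc_auc_score(y_te, lasso.decision_function(X_te)), 3))
```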
The extraction of shape features from vector elements is essential in cartography and geographic information science, supporting a range of intelligent processing tasks. Traditional methods rely on different machine learning algorithms tailored to specific types of line and polygon elements, limiting their general applicability. This study introduces a novel approach called “Pre-Trained Shape Feature Representations from Transformers (PSRT)”, which utilizes transformer encoders designed with three self-supervised pre-training tasks: coordinate masking prediction, coordinate offset correction, and coordinate sequence rearrangement. This approach enables the extraction of general shape features applicable to both line and polygon elements, generating high-dimensional embedded feature vectors. These vectors facilitate downstream tasks like shape classification, pattern recognition, and cartographic generalization. Our experimental results show that PSRT can extract vector shape features effectively without needing labeled samples and is adaptable to various types of vector features. Compared to the methods without pre-training, PSRT enhances training efficiency by over five times and improves accuracy by 5–10% in tasks such as line element matching and polygon shape classification. This innovative approach offers a more unified, efficient solution for processing vector shape data across different applications.
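Of the three self-supervised pre-training tasks, coordinate masking prediction is the easiest to picture: a fraction of vertex coordinates in a line or polygon sequence is hidden, and the encoder is trained to reconstruct them. The helper below is only a schematic of that idea; the mask ratio, sentinel value, and array layout are assumptions, not the PSRT implementation.

```python
# Schematic of the coordinate-masking pre-training task: randomly hide a fraction
# of vertices in an (N, 2) coordinate sequence so an encoder can be trained to
# predict them. Mask ratio, sentinel value, and layout are illustrative assumptions.
import numpy as np

def mask_coordinates(coords: np.ndarray, mask_ratio: float = 0.15, mask_value: float = 0.0):
    """Return (masked_coords, mask) where mask marks the hidden vertices."""
    n = coords.shape[0]
    n_masked = max(1, int(round(mask_ratio * n)))
    idx = np.random.choice(n, size=n_masked, replace=False)
    mask = np.zeros(n, dtype=bool)
    mask[idx] = True
    masked = coords.copy()
    masked[mask] = mask_value            # replace hidden vertices with a sentinel
    return masked, mask

polyline = np.array([[0.0, 0.0], [1.0, 0.5], [2.0, 1.5], [3.0, 1.0], [4.0, 2.0]])
masked, mask = mask_coordinates(polyline)
# the pre-training loss would compare the encoder's predictions at the masked
# positions against the original vertices in `polyline`.
```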
Marcos Gabriel Mendes Lauande, Geraldo Braz Júnior, João Dallyson Sousa de Almeida
et al.
Penile cancer has an incidence strongly linked to sociocultural factors, being more common in underdeveloped countries like Brazil, where it represents approximately 2% of cancers affecting men. This dataset was created to address the scarcity of publicly available resources for classifying histopathological images in penile cancer research. The images were collected in 2021 from tissue samples obtained through biopsies of patients undergoing treatment for penile cancer. After staining with Hematoxylin and Eosin (H&E), the tissue samples were photographed using a Leica ICC50 HD camera attached to a bright-field microscope (Leica DM500). The dataset comprises 194 high-resolution images (2048 × 1536 pixels), categorized by magnification (40X and 100X) and pathological classification (Tumor or Non-Tumor). Metadata includes additional information such as histological grade and, for some images, HPV status. Although previous works have focused primarily on binary classification tasks, the dataset includes additional labels, such as histological grade and HPV (Human Papilloma Virus) presence, which provide opportunities for multi-label classification or other types of predictive modelling. These extended labels enhance the dataset’s versatility for more complex tasks in medical image analysis. The dataset holds significant reuse potential for machine learning tasks beyond binary classification, allowing researchers to explore additional layers of analysis, such as HPV detection and histological grading. It can also be used for model benchmarking and comparative studies in cancer research, contributing to developing new diagnostic tools. The dataset and metadata are available for further research and model development.
Computer applications to medicine. Medical informatics, Science (General)
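To indicate how the labels described above could be organized for either binary or extended tasks, the sketch below builds a small metadata table and filters it by magnification, class, and HPV annotation. The table layout, file names, and column names are hypothetical assumptions, not the dataset's actual structure.

```python
# Hypothetical metadata layout for the dataset described above: images grouped by
# magnification (40X/100X) and class (Tumor/Non-Tumor), with optional histological
# grade and HPV status. File names and columns are assumptions for illustration.
import pandas as pd

meta = pd.DataFrame({
    "filename": ["img_001.png", "img_002.png", "img_003.png", "img_004.png"],
    "magnification": ["40X", "100X", "100X", "40X"],
    "label": ["Tumor", "Non-Tumor", "Tumor", "Non-Tumor"],
    "grade": ["G2", None, "G3", None],
    "hpv": ["positive", None, "negative", None],
})

# binary task: Tumor vs Non-Tumor at a single magnification
binary = meta[meta["magnification"] == "100X"][["filename", "label"]]

# extended task: combine the pathological class with HPV status where annotated
multi = meta.dropna(subset=["hpv"])[["filename", "label", "hpv", "grade"]]
print(len(binary), "binary-task images;", len(multi), "images with HPV annotation")
```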
Rahman Shafique, Khadija Kanwal, Venkata Chunduri
et al.
Abstract Regular inspection of the health of railway tracks is crucial to maintaining reliable and safe train operations. Faults such as cracks, rail discontinuities, ballast issues, wheel burns, super-elevation errors, loose nuts and bolts, and misalignment develop on railways when pre-emptive inspection, maintenance, and timely detection are lacking, posing grave threats to the safe operation of railway transportation. In the past, manual inspection of rail tracks was performed from a rail cart, an approach that is both inefficient and prone to human bias and error. Several train accidents have been reported in Pakistan, so automating inspection is important to avoid such accidents and protect countless lives. This study aims to enhance railway track fault detection using an automatic fault detection technique based on acoustic analysis. Moreover, the proposed method enlarges the dataset using the CTGAN technique. Results show that acoustic data can effectively reveal railway track faults, and logistic regression performs the fault classification with an accuracy of 100%.
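A minimal sketch of the augmentation-then-classification idea, assuming a tabular acoustic-feature dataset: CTGAN (from the ctgan package) synthesizes additional samples, and a logistic regression model is trained on the enlarged set. The feature table, its columns, and the sample counts are assumptions, not the study's pipeline.

```python
# Illustrative CTGAN augmentation of tabular acoustic features followed by
# logistic-regression classification. The feature table, its columns, and the
# sample counts are assumptions; only the general CTGAN -> classifier flow is shown.
import numpy as np
import pandas as pd
from ctgan import CTGAN
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
features = pd.DataFrame(rng.normal(size=(300, 6)),
                        columns=[f"mfcc_{i}" for i in range(6)])   # assumed acoustic features
features["fault"] = rng.choice(["normal", "crack", "misalignment"], 300)

# enlarge the dataset with CTGAN-synthesized rows
synth = CTGAN(epochs=10)                  # small epoch count just for illustration
synth.fit(features, discrete_columns=["fault"])
augmented = pd.concat([features, synth.sample(300)], ignore_index=True)

X, y = augmented.drop(columns="fault"), augmented["fault"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0, stratify=y)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```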
The increasing impact of the greenhouse effect on ecosystems is prompting transportation agencies to seek methods for reducing CO2 emissions during pavement construction and maintenance. Additionally, the laboratory mix design process, which involves selecting aggregate gradation and binder content, is time-consuming and labor-intensive. To accelerate the traditional procedure, this study presents a mix design method that automatically determines gradation and binder content based on machine learning (ML) and a meta-heuristic algorithm. Specifically, ML approaches were employed to model the relationship between volumetric properties (mixture bulk specific gravity (Gmb) and air voids (VV)) and both mixture component properties and mixture proportions, based on a dataset of 660 mixture designs collected from the literature. By integrating the ML model predictions with a modified multi-objective grey wolf optimization (MOGWO) algorithm, an automatic asphalt mix design procedure was proposed to pursue three objectives: VV, cost, and CO2 emission. The results indicated that least squares support vector regression (LSSVR) and eXtreme gradient boosting (XGBoost) achieved the highest prediction accuracies (correlation coefficient: 0.92 for VV and 0.96 for Gmb). The MOGWO algorithm found 26 optimal mix designs for the VV vs. cost vs. CO2 emission case. Compared to the traditional laboratory design, the optimal mixture with a VV of 4% achieves a cost saving of 2.46% and a 4.03% reduction in carbon emissions. The volumetric properties of the mixtures produced by the approach also align closely with values measured in the laboratory.
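To illustrate the surrogate-modeling step described above, the sketch below fits an XGBoost regressor to predict air voids (VV) from mixture proportion and component properties and reports the correlation coefficient. The feature names, synthetic data, and hyperparameters are assumptions; the study also used LSSVR and coupled the predictions with MOGWO.

```python
# Illustrative surrogate model: XGBoost regression predicting air voids (VV) from
# mixture proportion and component properties, scored by correlation coefficient.
# Feature names, synthetic data, and hyperparameters are assumptions.
import numpy as np
from xgboost import XGBRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 660
X = np.column_stack([
    rng.uniform(4.0, 6.5, n),     # binder content (%)
    rng.uniform(30, 60, n),       # % passing 4.75 mm sieve
    rng.uniform(2, 8, n),         # % passing 0.075 mm sieve
    rng.uniform(2.4, 2.8, n),     # aggregate bulk specific gravity
])
vv = 10.0 - 1.2 * X[:, 0] + 0.02 * X[:, 1] + rng.normal(0, 0.3, n)   # placeholder VV (%)

X_tr, X_te, y_tr, y_te = train_test_split(X, vv, test_size=0.2, random_state=42)
model = XGBRegressor(n_estimators=300, max_depth=4, learning_rate=0.05)
model.fit(X_tr, y_tr)

r = np.corrcoef(y_te, model.predict(X_te))[0, 1]
print("correlation coefficient:", round(r, 3))
```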
Ahmed Medhat Zayed, Arne Janssens, Pavlos Mamouris
et al.
Abstract Background The integrity of clinical research and machine learning models in healthcare heavily relies on the quality of the underlying clinical laboratory data. However, the preprocessing of this data to ensure its reliability and accuracy remains a significant challenge due to variations in data recording and reporting standards. Methods We developed lab2clean, a novel algorithm aimed at automating and standardizing the cleaning of retrospective clinical laboratory results data. lab2clean was implemented as two R functions specifically designed to enhance data conformance and plausibility by standardizing result formats and validating result values. The functionality and performance of the algorithm were evaluated using two extensive electronic medical record (EMR) databases encompassing various clinical settings. Results lab2clean effectively reduced the variability of laboratory results and identified potentially erroneous records. Upon deployment, it demonstrated effective and fast standardization and validation of substantial volumes of laboratory data records. The evaluation highlighted significant improvements in the conformance and plausibility of lab results, confirming the algorithm's efficacy in handling large-scale datasets. Conclusions lab2clean addresses the challenge of preprocessing and cleaning clinical laboratory data, a critical step in ensuring high-quality data for research. It offers a straightforward, efficient tool for researchers, improving the quality of clinical laboratory data, a major portion of healthcare data, and thereby enhancing the reliability and reproducibility of clinical research outcomes and clinical machine learning models. Future developments aim to broaden its functionality and accessibility, solidifying its vital role in healthcare data management.
Computer applications to medicine. Medical informatics
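lab2clean itself is distributed as R functions, so the Python sketch below is not its API; it only illustrates the two concepts the abstract names, conformance (standardizing the format of result values) and plausibility (flagging values outside a reference range). All column names, patterns, and limits are assumptions.

```python
# Conceptual illustration (not the lab2clean R API) of the two cleaning steps the
# abstract describes: conformance (standardize result formats) and plausibility
# (flag implausible values). Column names, patterns, and limits are assumptions.
import pandas as pd

labs = pd.DataFrame({
    "test": ["glucose", "glucose", "glucose"],
    "result": ["5,4", " 6.1 ", "999"],     # mixed decimal separators, whitespace, outlier
})

# conformance: normalize decimal separators and whitespace, cast to numeric
labs["result_clean"] = (
    labs["result"].str.strip().str.replace(",", ".", regex=False).astype(float)
)

# plausibility: flag results outside an assumed reference interval (mmol/L)
plausible_range = (2.0, 30.0)
labs["plausible"] = labs["result_clean"].between(*plausible_range)
print(labs)
```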
Mafalda Reis Pereira, Filipe Neves dos Santos
et al.
Early diagnosis of plant diseases is needed to promote sustainable plant protection strategies. Applied predictive modeling over hyperspectral spectroscopy (HS) data can be an effective, fast, cost-effective approach for improving plant disease diagnosis. This study aimed to investigate the potential of HS point-of-measurement (POM) data for in-situ, non-destructive diagnosis of tomato bacterial speck, caused by Pseudomonas syringae pv. tomato (Pst), and bacterial spot, caused by Xanthomonas euvesicatoria (Xeu), on leaves (cv. cherry). Artificial bacterial infection was performed on tomato plants at the same phenological stage. A sensing system composed of a hyperspectral spectrometer, a transmission optical fiber bundle with a slitted probe, and a white light source was used for spectral data acquisition, allowing the assessment of 3478 spectral points. An applied predictive classification model was developed, consisting of a normalizing pre-processing strategy combined with Linear Discriminant Analysis (LDA) for reducing data dimensionality and a supervised machine learning algorithm (Support Vector Machine – SVM) for the classification task. The predictive model achieved classification accuracies of 100% and 74% for the Pst and Xeu test set assessments, respectively, before symptom appearance. Model predictions were consistent with host-pathogen interactions reported in the literature (e.g., changes in photosynthetic pigment levels, production of bacteria-specific molecules, and activation of plants' defense mechanisms). Furthermore, these results were consistent with visual phenotyping inspection and PCR results. The reported outcomes support the application of spectral point measurements acquired in-vivo for plant disease diagnosis, aiming for more precise and eco-friendly phytosanitary approaches.
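The classification model described above (normalization, LDA for dimensionality reduction, then an SVM classifier) maps naturally onto a scikit-learn pipeline. The sketch below assumes a generic spectra matrix and labels; the kernel choice and other hyperparameters are illustrative, not the study's settings.

```python
# Illustrative pipeline matching the described modeling strategy: normalization,
# Linear Discriminant Analysis for dimensionality reduction, and an SVM classifier.
# The spectra matrix, labels, and hyperparameters are synthetic assumptions.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import Normalizer
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(7)
spectra = rng.random((300, 600))                 # 300 spectral points x 600 bands
labels = rng.choice(["healthy", "Pst", "Xeu"], 300)

X_tr, X_te, y_tr, y_te = train_test_split(spectra, labels, test_size=0.25, random_state=7)

model = make_pipeline(
    Normalizer(),                                # spectrum-wise normalization
    LinearDiscriminantAnalysis(n_components=2),  # reduce to class-discriminant axes
    SVC(kernel="rbf"),                           # supervised classification
)
model.fit(X_tr, y_tr)
print("test accuracy:", accuracy_score(y_te, model.predict(X_te)))
```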
Intrusion Detection Systems are expected to detect and prevent malicious activities in a network, such as a smart grid. However, they are themselves prime targets of cyber-attacks. A number of approaches have been proposed to classify and detect these attacks, including supervised machine learning. However, these models require large labeled datasets for training and testing. Therefore, this paper compares the performance of supervised and unsupervised learning models in detecting cyber-attacks. The CICDDoS2019 benchmark dataset was used to train, test, and validate the models. The supervised models are Gaussian Naïve Bayes, Classification and Regression Decision Tree, Logistic Regression, C-Support Vector Machine, Light Gradient Boosting, and Alex Neural Network. The unsupervised models are Principal Component Analysis, K-means, and Variational Autoencoder. The performance comparison is made in terms of accuracy, probability of detection, probability of misdetection, probability of false alarm, processing time, prediction time, training time per sample, and memory size. The results show that the Alex Neural Network model outperforms the other supervised models, while the Variational Autoencoder model has the best results among the unsupervised models.
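The comparison metrics listed above (probability of detection, misdetection, and false alarm) follow directly from confusion-matrix counts. The helper below shows one common formulation; it is a generic illustration, not the paper's evaluation code.

```python
# Detection-oriented metrics from confusion-matrix counts, as commonly defined:
# probability of detection PD = TP/(TP+FN), misdetection PMD = FN/(TP+FN),
# false alarm PFA = FP/(FP+TN). Generic illustration, not the paper's code.
from sklearn.metrics import confusion_matrix

def detection_metrics(y_true, y_pred):
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "prob_detection": tp / (tp + fn),
        "prob_misdetection": fn / (tp + fn),
        "prob_false_alarm": fp / (fp + tn),
    }

# toy example: 0 = benign traffic, 1 = attack
print(detection_metrics([1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 1, 0]))
```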
Francesco Carnazza, Federico Carollo, Dominik Zietlow
et al.
Full information about a many-body quantum system is usually out of reach due to the exponential growth, with the size of the system, of the number of parameters needed to encode its state. Nonetheless, in order to understand the complex phenomenology that can be observed in these systems, it is often sufficient to consider dynamical or stationary properties of local observables or, at most, of few-body correlation functions. These quantities are typically studied by singling out a specific subsystem of interest and regarding the remainder of the many-body system as an effective bath. In the simplest scenario, the subsystem dynamics, which is in fact an open quantum dynamics, can be approximated through Markovian quantum master equations. Here, we formulate the problem of finding the generator of the subsystem dynamics as a variational problem, which we solve using the standard machine-learning toolbox for optimization. This dynamical or 'Lindblad' generator provides the relevant dynamical parameters for the subsystem of interest. Importantly, the algorithm we develop is constructed such that the learned generator implements a physically consistent open quantum time evolution. We exploit this to learn the generator of the dynamics of a subsystem of a many-body system subject to unitary quantum dynamics. We explore the capability of our method to recover the time evolution of a two-body subsystem and exploit the physical consistency of the generator to make predictions about the stationary state of the subsystem dynamics.
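For reference, the Markovian master equations mentioned here take the standard GKSL (Lindblad) form; the variational problem is to learn the Hamiltonian part and the jump operators and rates that generate the observed subsystem dynamics. The equation below is the textbook form, with symbols chosen here for illustration rather than taken from the paper.

```latex
% Standard GKSL (Lindblad) form of a Markovian quantum master equation for the
% reduced density matrix \rho of the subsystem; H, L_k, and \gamma_k are the
% Hamiltonian, jump operators, and rates that a variational method would learn.
\frac{d\rho}{dt} \;=\; -\,\frac{i}{\hbar}\,[H,\rho]
  \;+\; \sum_k \gamma_k \left( L_k \rho L_k^{\dagger}
  \;-\; \tfrac{1}{2}\,\{ L_k^{\dagger} L_k,\, \rho \} \right)
```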
Arfat Ahmad Khan, Muhammad Asif Nauman, Rab Nawaz Bashir
et al.
Accurate estimation of evapotranspiration for saline soils (ETs) is important, yet challenging, for the reclamation of saline soils through an effective leaching process. Estimating evapotranspiration (ET) with the FAO-56 Penman-Monteith standard method is complex, especially for saline soils. Moreover, existing studies focus on Internet of Things (IoT) and machine learning-enabled smart and precision irrigation water recommendation systems, along with ET estimation from limited parameters; ETs for saline soils, which are equally important for reclamation, have been ignored by the existing literature. This study proposes an IoT- and machine learning-based architecture for context-aware monthly ETs estimation to support saline soil reclamation through an effective leaching process. IoT-sensed crop field contexts, namely crop field temperature, soil salinity, and irrigation water salinity, are used as input features to Long Short-Term Memory (LSTM) and ensembled LSTM models for monthly ETs predictions. The performance of the proposed solution is assessed in terms of the accuracy of the machine learning models and by comparison against the FAO-56 PM standard method. The implementation of the proposed solution reveals that the ensembled LSTM-based approach for ETs is more accurate than the single LSTM model, with accuracies of 92% and 90% for the training and validation datasets, respectively. The predictions made by the ensembled LSTM are also more in line with the FAO-56 PM method, with a Pearson correlation of 0.916, than those of the LSTM model. Implementation in a real-time environment reveals that the proposed solution is more effective in reducing soil salinity than the traditional method.
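A minimal sketch of the modeling idea, assuming Keras: an LSTM maps monthly sequences of the three IoT-sensed context features (field temperature, soil salinity, irrigation water salinity) to a monthly ETs value. The network size, sequence length, and synthetic data are assumptions, and the ensembling step is omitted.

```python
# Minimal Keras LSTM mapping sequences of the three sensed context features
# (field temperature, soil salinity, irrigation water salinity) to a monthly ETs
# value. Architecture, sequence length, and the synthetic data are assumptions.
import numpy as np
import tensorflow as tf

timesteps, n_features = 12, 3                       # 12 monthly readings x 3 features
X = np.random.rand(500, timesteps, n_features).astype("float32")
y = np.random.rand(500, 1).astype("float32")        # placeholder monthly ETs

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(timesteps, n_features)),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(1),                       # regression output: ETs
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=10, batch_size=32, verbose=0)
print(model.predict(X[:3], verbose=0).ravel())
```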
The eastern Mediterranean surface circulation is highly energetic and composed of structures that interact stochastically. However, some of its main features are still debated, and the behavior of some fine-scale dynamics and their role in shaping the general circulation remain unknown. In this paper, we use an unsupervised neural network clustering method to analyze the long-term variability of the different mesoscale structures. We decompose 26 years of altimetric data into clusters reflecting different circulation patterns of weak and strong flows with either strain- or vortex-dominated velocities. The vortex-dominated cluster is more persistent in the western part of the basin, which is more active than the eastern part due to the strong flow along the coast interacting with the extended bathymetry and generating continuous instabilities. The cluster reflecting weak flow dominates the middle of the basin, including the Mid-Mediterranean Jet (MMJ) pathway. However, the temporal analysis shows a frequent and intermittent occurrence of strong flow in the middle of the basin, which could explain previous contradictory assessments of the MMJ's existence based on in-situ observations. Moreover, we show that the Levantine Sea is becoming increasingly energetic, as the activity of the main mesoscale features exhibits a positive trend.
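To make the decomposition into flow regimes concrete, the sketch below clusters altimetry-derived velocity statistics (speed, strain, vorticity) into a few groups. The paper uses an unsupervised neural network clustering method that is not specified here, so KMeans acts only as a stand-in, and the synthetic features are assumptions.

```python
# Conceptual stand-in for the clustering step: group altimetry-derived velocity
# statistics (speed, strain, vorticity) into circulation regimes. The paper's
# unsupervised neural-network method is not reproduced; KMeans is only a proxy,
# and the synthetic features are assumptions.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

rng = np.random.default_rng(3)
n = 5000                                            # grid cells x time samples
features = np.column_stack([
    rng.gamma(2.0, 0.1, n),                         # flow speed (m/s)
    rng.normal(0.0, 1e-5, n),                       # strain rate (1/s)
    rng.normal(0.0, 1e-5, n),                       # relative vorticity (1/s)
])

X = StandardScaler().fit_transform(features)
labels = KMeans(n_clusters=4, n_init=10, random_state=3).fit_predict(X)
# clusters would then be interpreted as weak/strong, strain- or vortex-dominated regimes
print(np.bincount(labels))
```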