Abhiraam Eranti, Yogesh Tewari, Rafael Palacios
et al.
Deep-learning methods have boosted the analytical power of Raman spectroscopy, yet they still require large, task-specific, labeled datasets and often fail to transfer across application domains. This study explores pre-trained encoders as a solution. Pre-trained encoders have significantly impacted Natural Language Processing and Computer Vision through their ability to learn transferable representations that apply across a variety of datasets, substantially reducing the time and data required to build capable models. This work puts forward a new approach that brings these benefits to Raman spectroscopy. The proposed approach, RSPTE (Raman Spectroscopy Pre-Trained Encoder), is designed to learn generalizable spectral representations without labels. RSPTE employs a novel domain adaptation strategy using an unsupervised Barlow Twins decorrelation objective to learn fundamental spectral patterns from multi-domain Raman spectroscopy datasets containing samples from medicine, biology, and mineralogy. Transferability is demonstrated by fine-tuning RSPTE into several models for different application domains: medicine (detection of melanoma and COVID), biology (pathogen identification), and agriculture. As an example, using only 20% of the dataset, models trained with RSPTE achieve accuracies ranging from 50% to 86% (depending on the dataset used), whereas without RSPTE the range is 9% to 57%. Using the full dataset, accuracies with RSPTE range from 81% to 97%, and without pre-training from 51% to 97%. Current methods and state-of-the-art models in Raman spectroscopy are compared with RSPTE for context, and RSPTE exhibits competitive results, especially when less data is available. These results provide evidence that the proposed RSPTE model can effectively learn and transfer generalizable spectral features across domains, achieving accurate results with less data in less time (both data-collection time and training time).
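The core of the Barlow Twins decorrelation objective mentioned above can be written compactly. The sketch below is a minimal, illustrative implementation of a Barlow Twins-style loss on two batches of spectral embeddings; the function and argument names are hypothetical and are not taken from RSPTE itself.

```python
import numpy as np

def barlow_twins_loss(z1, z2, lam=5e-3):
    """Barlow Twins loss: drive the cross-correlation matrix of two
    embedding views toward the identity (invariance on the diagonal,
    decorrelation off the diagonal). z1, z2: (batch, dim) arrays."""
    # Standardize each embedding dimension across the batch
    z1 = (z1 - z1.mean(axis=0)) / (z1.std(axis=0) + 1e-8)
    z2 = (z2 - z2.mean(axis=0)) / (z2.std(axis=0) + 1e-8)
    n = z1.shape[0]
    c = z1.T @ z2 / n                           # cross-correlation matrix (dim x dim)
    on_diag = np.sum((np.diag(c) - 1.0) ** 2)   # push diagonal toward 1
    off_diag = np.sum(c ** 2) - np.sum(np.diag(c) ** 2)  # push rest toward 0
    return on_diag + lam * off_diag
```

Driving the cross-correlation matrix toward the identity makes paired embeddings of the same spectrum agree while decorrelating the embedding dimensions, which is what allows the encoder to learn without labels.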
ABSTRACT Reinforcement learning (RL) is a promising machine-learning solution to traffic signal control problems, which have been extensively studied. However, previous studies proposing RL-based controllers have predominantly employed variants of non-linear, deep artificial neural network (ANN) function approximators (FAs), leaving a significant interpretability issue due to their black-box nature. In this work, the use of a linear FA for a value-based RL agent in traffic signal control problems is investigated along with the least-squares Q-learning method, abbreviated as LSTDQ. The interpretable linear FA was found to be adequate for the RL agent to learn an optimal policy. This leads to the proposal to replace a non-linear ANN FA with its linear counterpart, resolving the interpretability issue. Moreover, the LSTDQ learning method shows superior convergence behaviour compared to a gradient-descent method. In a low-intensity arrival pattern scenario, control by the RL agent cuts roughly half of the average delay resulting from pretimed control. Owing to the conciseness of the linear FA, a direct interpretation analysis of the converged linear-FA parameters is presented. Lastly, two online relearning tests of the agents under non-stationary arrivals are conducted to demonstrate the online performance of LSTDQ. In conclusion, the linear-FA specification and the LSTDQ method are together proposed for the interpretability of the resulting control algorithm, their superior convergence quality, and their lack of tunable hyperparameters.
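For context, LSTDQ estimates the linear-FA weights in closed form rather than by gradient descent, which is why it has no step-size hyperparameter. Below is a minimal sketch under stated assumptions (names hypothetical; the small ridge term is an illustrative regularizer, not part of the standard formulation):

```python
import numpy as np

def lstdq(phi, phi_next, rewards, gamma=0.95, ridge=1e-6):
    """Least-squares TD Q-learning (LSTDQ) weight estimate for a linear
    FA Q(s, a) = w . phi(s, a).
    phi:      (n, d) features of the visited (s, a) pairs
    phi_next: (n, d) features of the greedy next (s', a') pairs
    Solves A w = b in closed form; no learning rate is needed."""
    A = phi.T @ (phi - gamma * phi_next)
    b = phi.T @ rewards
    # Small ridge term keeps A invertible on short sample batches
    return np.linalg.solve(A + ridge * np.eye(A.shape[1]), b)
```

Because the solution is a single linear solve, the learned weights can be read off directly, which is the interpretability property the abstract emphasizes.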
In response to the issues of severe pitch oscillation and unstable roll attitude present in existing reinforcement learning-based aircraft cruise control methods during dynamic maneuvers, this paper proposes a precise control method for aircraft cruising based on proximal policy optimization (PPO) with nonlinear attitude constraints. This method first introduces a combination of long short-term memory (LSTM) and a fully connected layer (FC) to form the policy network of the PPO method, improving the algorithm’s learning efficiency for sequential data while avoiding feature compression. Secondly, it transforms cruise control into tracking target heading, altitude, and speed, achieving a mapping from motion states to optimal control actions within the policy network, and designs nonlinear constraints as the maximum reward intervals for pitch and roll to mitigate abnormal attitudes during maneuvers. Finally, a JSBSim simulation platform is established to train the network parameters, obtaining the optimal strategy for cruise control and achieving precise end-to-end control of the aircraft. Experimental results show that, compared to the cruise control method without dynamic constraints, the improved method reduces heading deviation by approximately 1.6° during ascent and 4.4° during descent, provides smoother pitch control, decreases steady-state altitude error by more than 1.5 m, and achieves higher accuracy in overlapping with the target trajectory during hexagonal trajectory tracking.
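The PPO method referred to above optimizes a clipped surrogate objective; the policy-network specifics (LSTM plus FC layers) and the nonlinear attitude constraints are separate design elements. A minimal sketch of the clipped objective, with hypothetical names, assuming probability ratios and advantage estimates have already been computed:

```python
import numpy as np

def ppo_clip_objective(ratio, advantage, eps=0.2):
    """PPO clipped surrogate objective for one batch of transitions.
    ratio:     pi_new(a|s) / pi_old(a|s) per transition
    advantage: estimated advantages per transition
    Clipping keeps the policy update inside a trust region."""
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    # Pessimistic bound: take the smaller of the two surrogates
    return np.mean(np.minimum(unclipped, clipped))
```

The nonlinear attitude constraints described in the abstract would enter through the reward shaping (maximum-reward intervals for pitch and roll), not through this objective itself.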
Objective: Despite extensive exploration of potential biomarkers of cardiovascular diseases (CVDs) derived from retinal images, it remains unclear how retinal images contribute to CVD risk profiling and how the results can inform lifestyle modifications. Therefore, we aimed to determine the performance of a cardiovascular risk prediction model based on retinal images, via explicit estimation of 10 traditional CVD risk factors, and to compare it with a model based on actual risk measurements. Design: A prospective cohort study design. Setting: The UK Biobank (UKBB), a prospective cohort study following the health conditions, including CVD outcomes, of adults recruited between 2006 and 2010. Participants: A subset of the UKBB containing 52 297 entries with retinal images and 5-year cumulative incidence of major adverse cardiovascular events (MACE) was used. The dataset was split 3:1:1 into a training set (n=31 403), validation set (n=10 420) and testing set (n=10 474). We developed a deep learning (DL) model to predict 5-year MACE using a two-stage DL neural network. Primary and secondary outcome measures: We computed accuracy and area under the receiver operating characteristic curve (AUC) and compared variations of the risk prediction model combining CVD risk factors and retinal images. Results: The first-stage DL model demonstrated that the 10 CVD risk factors can be estimated from a given retinal image with an accuracy ranging between 65.2% and 89.8% (overall AUC of 0.738, 95% CI: 0.710 to 0.766). In MACE prediction, our model outperformed the traditional score-based models, with an AUC 8.2% higher than the Systematic COronary Risk Evaluation (SCORE), 3.5% higher than SCORE 2 and 7.1% higher than the Framingham Risk Score (p value<0.05 for all three comparisons). Conclusions: Our algorithm estimates the 5-year risk of MACE from retinal images while explicitly presenting which risk factors should be checked and intervened on. This two-stage approach provides human-interpretable information between the stages, helping clinicians gain insights into the screening process alongside the DL model.
Nowadays, artificial intelligence (AI) is utilized in several domains of the healthcare sector. Despite its effectiveness in healthcare settings, its widespread adoption remains limited due to the transparency issue, which is considered a significant obstacle. To earn the trust of end users, it is necessary to explain the output of AI models. Explainable AI (XAI) has therefore emerged as a potential solution by providing transparent explanations of AI model output. The primary aim of this review paper is to survey articles on machine learning (ML) or deep learning (DL) based human disease diagnosis in which the model's decision-making process is explained by XAI techniques. To do this, two journal databases (Scopus and the IEEE Xplore Digital Library) were thoroughly searched using a set of predetermined relevant keywords. The PRISMA guidelines were followed to determine the papers for the final analysis, and studies that did not meet the requirements were eliminated. Finally, 90 Q1 journal articles were selected for in-depth analysis, covering several XAI techniques. The findings are then summarized, and responses to the proposed research questions are outlined. In addition, several challenges related to XAI in human disease diagnosis and future research directions in this sector are presented.
With the rapid development of deep learning, researchers are actively exploring its applications in the field of industrial anomaly detection. Deep learning methods differ significantly from traditional mathematical modeling approaches, eliminating the need for intricate mathematical derivations and offering greater flexibility. Deep learning technologies have demonstrated outstanding performance in anomaly detection problems and gained widespread recognition. However, when dealing with multivariate data anomaly detection problems, deep learning faces challenges such as large-scale data annotation and modeling relationships among complex data variables. To address these challenges, this study proposes an innovative and lightweight deep learning model: the Attention-Based Deep Convolutional Autoencoding Prediction Network (AT-DCAEP). The model consists of a characterization network based on convolutional autoencoders and a prediction network based on attention mechanisms. AT-DCAEP exhibits excellent performance in multivariate time series anomaly detection without the need to pre-label large-scale datasets, making it an efficient unsupervised anomaly detection method. We extensively tested the performance of AT-DCAEP on six publicly available datasets, and the results show that, compared to current state-of-the-art methods, AT-DCAEP demonstrates superior performance, achieving the optimal balance between anomaly detection performance and computational cost.
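The unsupervised detection principle described above, scoring by reconstruction error without labels, can be sketched generically. This is not the AT-DCAEP architecture itself, just an illustrative reconstruction-error scorer with a simple mean-plus-k-sigma threshold; all names are hypothetical:

```python
import numpy as np

def anomaly_scores(x, reconstruct):
    """Per-window anomaly score: mean squared reconstruction error of an
    autoencoder-like model on multivariate time-series windows.
    x: (n_windows, ...) array; reconstruct: callable returning x-hat."""
    err = (x - reconstruct(x)) ** 2
    return np.mean(err, axis=tuple(range(1, x.ndim)))

def flag_anomalies(scores, k=3.0):
    """Unsupervised threshold: mean + k * std of the scores.
    Windows the model reconstructs poorly are flagged as anomalous."""
    thr = scores.mean() + k * scores.std()
    return scores > thr
```

The key property mirrored here is that no labels are needed: the model is trained to reconstruct normal behaviour, and deviations surface as large reconstruction errors.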
Anna Teresa Seiche, Lucas Wittstruck, Thomas Jarmer
In order to meet the increasing demand for crops under challenging climate conditions, efficient and sustainable cultivation strategies are becoming essential in agriculture. Targeted herbicide use reduces environmental pollution and effectively controls weeds as a major cause of yield reduction. The key requirement is a reliable weed detection system that is accessible to a wide range of end users. This research paper introduces a self-built, low-cost, multispectral camera system and evaluates it against the high-end MicaSense Altum system. Pixel-based weed and crop classification was performed on UAV datasets collected with both sensors in maize using a U-Net. The training and testing data were generated via an index-based thresholding approach followed by annotation. As a result, the F1-score for the weed class reached 82% on the Altum system and 76% on the low-cost system, with recall values of 75% and 68%, respectively. Misclassifications occurred on the low-cost system images for small weeds and overlaps, with minor oversegmentation. However, with a precision of 90%, the results show great potential for application in automated weed control. The proposed system thereby enables sustainable precision farming for the general public. In future research, its spectral properties, as well as its use on different crops with real-time on-board processing, should be further investigated.
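The index-based thresholding step used above to generate training data can be illustrated with NDVI, a common vegetation index for multispectral imagery. The abstract does not specify which index was used, so the index choice and threshold below are assumptions for illustration only:

```python
import numpy as np

def index_threshold_mask(nir, red, thr=0.4):
    """Index-based thresholding as a starting point for annotation:
    pixels whose NDVI exceeds `thr` are treated as vegetation
    (crop or weed), the rest as background/soil."""
    ndvi = (nir - red) / (nir + red + 1e-8)  # epsilon avoids divide-by-zero
    return ndvi > thr
```

Such a mask separates vegetation from soil automatically; distinguishing crop from weed within the vegetated pixels is then done during manual annotation before training the U-Net.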
Abstract Background There is increasing evidence that myosteatosis, which is currently not assessed in clinical routine, plays an important role in risk estimation in individuals with impaired glucose metabolism, as it is associated with the progression of insulin resistance. With advances in artificial intelligence, automated and accurate algorithms have become feasible to fill this gap. Methods In this retrospective study, we developed and tested a fully automated deep learning model using data from two prospective cohort studies (German National Cohort [NAKO] and Cooperative Health Research in the Region of Augsburg [KORA]) to quantify myosteatosis on whole-body T1-weighted Dixon magnetic resonance imaging as (1) intramuscular adipose tissue (IMAT; the current standard) and (2) quantitative skeletal muscle (SM) fat fraction (SMFF). Subsequently, we investigated the two measures for their discrimination of and association with impaired glucose metabolism beyond baseline demographics (age, sex and body mass index [BMI]) and cardiometabolic risk factors (lipid panel, systolic blood pressure, smoking status and alcohol consumption) in asymptomatic individuals from the KORA study. Impaired glucose metabolism was defined as impaired fasting glucose or impaired glucose tolerance (140–200 mg/dL) or prevalent diabetes mellitus. Results Model performance was high, with Dice coefficients of ≥0.81 for IMAT and ≥0.91 for SM in the internal (NAKO) and external (KORA) testing sets. In the target population (380 KORA participants: mean age of 53.6 ± 9.2 years, BMI of 28.2 ± 4.9 kg/m2, 57.4% male), individuals with impaired glucose metabolism (n = 146; 38.4%) were older, more likely to be men, and showed a higher cardiometabolic risk profile, higher IMAT (4.5 ± 2.2% vs. 3.9 ± 1.7%) and higher SMFF (22.0 ± 4.7% vs. 18.9 ± 3.9%) compared to normoglycaemic controls (all P ≤ 0.005).
SMFF showed better discrimination for impaired glucose metabolism than IMAT (area under the receiver operating characteristic curve [AUC] 0.693 vs. 0.582, 95% confidence interval [CI] [0.06–0.16]; P < 0.001) but was not significantly different from BMI (AUC 0.733 vs. 0.693, 95% CI [−0.09 to 0.01]; P = 0.15). In univariable logistic regression, IMAT (odds ratio [OR] = 1.18, 95% CI [1.06–1.32]; P = 0.004) and SMFF (OR = 1.19, 95% CI [1.13–1.26]; P < 0.001) were associated with a higher risk of impaired glucose metabolism. This signal remained robust after multivariable adjustment for baseline demographics and cardiometabolic risk factors for SMFF (OR = 1.10, 95% CI [1.01–1.19]; P = 0.028) but not for IMAT (OR = 1.14, 95% CI [0.97–1.33]; P = 0.11). Conclusions Quantitative SMFF, but not IMAT, is an independent predictor of impaired glucose metabolism, and its discrimination is not significantly different from that of BMI, making it a promising alternative to the currently established approach. Automated methods such as the proposed model may provide a feasible option for opportunistic screening of myosteatosis and, thus, a low-cost personalized risk assessment solution.
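As a reading aid for the reported statistics: an odds ratio and its 95% CI are recovered from a fitted logistic-regression coefficient and its standard error as exp(beta) and exp(beta ± 1.96·SE). The helper below is purely illustrative and is not the study's code:

```python
import math

def odds_ratio_ci(beta, se, z=1.96):
    """Convert a logistic-regression coefficient (per unit of the
    predictor, e.g. per percentage point of SMFF) and its standard
    error into an odds ratio with an approximate 95% CI."""
    return (math.exp(beta),            # point estimate
            math.exp(beta - z * se),   # lower CI bound
            math.exp(beta + z * se))   # upper CI bound
```

A coefficient of 0 thus maps to an OR of 1 (no association), which is why CIs excluding 1 correspond to the significant P values reported above.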
Abstract The Deep Self-Attention Network (Transformer) is an encoder–decoder architectural model that excels in establishing long-distance dependencies and was first applied in natural language processing. Due to its complementary nature with the inductive bias of convolutional neural networks (CNNs), the Transformer has been gradually applied to medical image processing, including kidney image processing, and has become a hot research topic in recent years. To further explore new ideas and directions in the field of renal image processing, this paper outlines the characteristics of the Transformer network model, summarizes the applications of Transformer-based models in renal image segmentation, classification, detection, electronic medical records, and decision-making systems, compares them with CNN-based renal image processing algorithms, and analyzes the advantages and disadvantages of this technique in renal image processing. In addition, this paper gives an outlook on the development trend of the Transformer in renal image processing, providing a valuable reference for renal image analysis research.
Abstract The development of 7-Tesla (7T) magnetic resonance imaging systems has opened new avenues for exploring the advantages of diffusion imaging at higher field strengths, especially in neuroscience research. This review investigates whether 7T diffusion imaging offers significant benefits over lower field strengths by addressing the following: Technical challenges and corresponding strategies: Challenges include the shorter transverse/effective transverse relaxation times and the greater B0 and B1 inhomogeneities at 7T. Advanced techniques including high-performance gradient systems, parallel imaging, multi-shot acquisition, and parallel transmission can mitigate these issues. Comparison of 3-Tesla and 7T diffusion imaging: Technologies such as multiplexed sensitivity encoding and deep learning reconstruction (DLR) have been developed to mitigate artifacts and improve image quality. This comparative analysis demonstrates significant improvements in the signal-to-noise ratio and spatial resolution at 7T with a powerful gradient system, facilitating enhanced visualization of microstructural changes. Despite greater geometric distortions and signal inhomogeneity at 7T, the system shows clear advantages in high b-value imaging and high-resolution diffusion tensor imaging. Additionally, multiplexed sensitivity encoding significantly reduces image blurring and distortion, and DLR substantially improves the signal-to-noise ratio and image sharpness. 7T diffusion applications in structural analysis and disease characterization: This review discusses the potential applications of 7T diffusion imaging in structural analysis and disease characterization.
Emanuele Aiello, Mirko Agarla, Diego Valsesia
et al.
Large-scale self-supervised pretraining of deep learning models is known to be critical in several fields, such as language processing, where it has led to significant breakthroughs. Indeed, it is often more impactful than architectural design. However, the use of self-supervised pretraining lags behind in several domains, such as hyperspectral imaging, due to data scarcity. This paper addresses the challenge of data scarcity in the development of methods for spatial super-resolution of hyperspectral images (HSI-SR). We show that state-of-the-art HSI-SR methods are severely bottlenecked by the small paired datasets that are publicly available, which also leads to unreliable assessment of the architectural merits of the models. We propose to capitalize on the abundance of high-resolution (HR) RGB images to develop a self-supervised pretraining approach that significantly improves the quality of HSI-SR models. In particular, we leverage advances in spectral reconstruction methods to create a vast dataset with high spatial resolution and plausible spectra from RGB images, to be used for pretraining HSI-SR methods. Experimental results, conducted across multiple datasets, report large gains for state-of-the-art HSI-SR methods when pretrained according to the proposed procedure, and also highlight the unreliability of ranking methods when training on small datasets.
Information retrieval aims to find the most relevant data for a given query. The challenge is retrieving relevant data efficiently given the large search space, and existing solutions incur unnecessary processing costs. Additionally, identifying the main focus of the query is crucial for targeted retrieval, which current methods struggle to do effectively. To overcome these challenges, we propose a goal-question-indicator (GQI) approach for personalized learning inquiry (PLA). This approach allows efficient retrieval of variable-sized data with reduced processing requirements. We also present the Open Learning Analytics Platform's (Open-LAP) pointer motor segment, which helps end users specify goals, generates discussion topics, and provides self-characterizing pointers.