Parkinson's disease (PD) and Alzheimer's disease (AD) are the two most prevalent and incurable neurodegenerative diseases (NDs) worldwide, for which early diagnosis is critical to delay their progression. However, the high dimensionality of multi-metric data with diverse structural forms, the heterogeneity of neuroimaging and phenotypic data, and class imbalance collectively pose significant challenges to early ND diagnosis. To address these challenges, we propose a dynamically weighted dual graph attention network (DW-DGAT) that integrates: (1) a general-purpose data fusion strategy to merge three structural forms of multi-metric data; (2) a dual graph attention architecture based on brain regions and inter-sample relationships to extract both micro- and macro-level features; and (3) a class weight generation mechanism combined with two stable and effective loss functions to mitigate class imbalance. Rigorous experiments, based on the Parkinson's Progression Markers Initiative (PPMI) and Alzheimer's Disease Neuroimaging Initiative (ADNI) studies, demonstrate the state-of-the-art performance of our approach.
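As a rough illustration of how class weights can counteract class imbalance in a loss function, the sketch below computes static inverse-frequency weights and a weighted cross-entropy. This is an assumption-laden stand-in, not the DW-DGAT class weight generation mechanism, which the abstract describes as dynamic and does not specify; the function names and the toy cohort are hypothetical.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Hypothetical helper: rare classes receive larger weights.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * m) for c, m in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean negative log-likelihood.

    probs: list of dicts mapping class -> predicted probability.
    The sum is normalised by the total weight of the samples.
    """
    total = sum(-weights[y] * math.log(p[y]) for p, y in zip(probs, labels))
    norm = sum(weights[y] for y in labels)
    return total / norm


# Hypothetical imbalanced cohort: 2 healthy controls, 8 patients.
labels = ["HC"] * 2 + ["PD"] * 8
weights = inverse_frequency_weights(labels)

# A classifier that outputs 0.5 for every class has loss ln(2),
# regardless of how the classes are weighted.
uniform = [{"HC": 0.5, "PD": 0.5}] * len(labels)
loss = weighted_cross_entropy(uniform, labels, weights)
```

With these weights each class contributes equally to the total loss mass (2 x 2.5 = 8 x 0.625), which is the usual motivation for inverse-frequency weighting.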
Background: Thymidine kinase 2 (TK2) deficiency is an ultra-rare, severe mitochondrial myopathy caused by pathogenic variants in TK2 and characterized by a wide range of ages at onset. The infantile form, presenting before 2 years of age, is the most rapidly progressive and is associated with a high risk of early mortality. We describe the clinical outcomes of early nucleoside therapy in a series of children with infantile-onset TK2 deficiency. Methods: We retrospectively reviewed four children with genetically confirmed infantile-onset TK2 deficiency treated with oral deoxycytidine/deoxythymidine (dC/dT) through an Early Access Program at two centers. Dosing was escalated to 800 mg/kg/day as tolerated. Patients were followed at baseline, Month 1, and regular intervals thereafter. Outcomes included neurological examinations, eight motor milestones, and respiratory and feeding support. Safety laboratory results, neuroimaging, and biopsy findings were reviewed. Results: Treatment began at 19–24 months (median duration 26 months; range: 4–81). All presented within the first year with hypotonia, motor regression, and respiratory and/or bulbar involvement. Two required invasive ventilation and three required tube feeding before therapy. After dC/dT initiation, all improved with no further milestone loss. Three achieved independent ambulation and stair climbing; the fourth, at 4 months of therapy, has begun unassisted walking. Both tracheostomized patients were weaned from ventilation, and enteral feeding was discontinued in all three within 1–6 months. Only mild dose-related diarrhea occurred in one patient. Conclusion: Early nucleoside therapy halts disease progression and restores motor function in infantile-onset TK2 deficiency, the most severe form of the disease.
Sadia Afrin Rimi, Md. Jalal Uddin Chowdhury, Rifat Abdullah
et al.
The population of our agricultural nation, which is surrounded by lush greenery, is growing daily. As a result, the amount of arable land is decreasing as residential houses and industrial factories expand. A food crisis is becoming a major threat in the coming years because, on the one hand, the population is increasing and, on the other hand, food crop production is decreasing due to disease. Rice is one of the most significant cultivated crops since it provides food for more than half of the world's population. Bangladesh depends on rice (Oryza sativa) as a vital crop for its agriculture, but it faces a significant problem in the ongoing decline in rice yield brought on by common diseases. Early disease detection is the main difficulty in rice cultivation. In this paper, we present our own dataset, collected from fields in Bangladesh, and apply deep learning and transfer learning models to evaluate it. We describe the dataset in detail and give directions for further research that uses it to serve society. We applied a light CNN model and pre-trained InceptionNet-V2, EfficientNet-V2, and MobileNet-V2 models, of which the EfficientNet-V2 model achieved 91.5% performance. These results surpassed those of the other models and even exceeded approaches considered to be state of the art. This study demonstrates that diseases affecting rice leaves can be identified precisely and effectively using this unbiased dataset. Our analysis of the performance of the different models indicates that the proposed dataset is a significant resource for research aimed at reducing rice leaf disease.
Citrus, as one of the most economically important fruit crops globally, suffers severe yield depressions due to various diseases. Accurate disease detection and classification serve as critical prerequisites for implementing targeted control measures. Recent advancements in artificial intelligence, particularly deep learning-based computer vision algorithms, have substantially decreased time and labor requirements while maintaining the accuracy of detection and classification. Nevertheless, these methods predominantly rely on massive, high-quality annotated training examples to attain promising performance. By introducing two key designs: contrasting with cluster centroids and a multi-layer contrastive training (MCT) paradigm, this paper proposes a novel clustering-guided self-supervised multi-layer contrastive representation learning (CMCRL) algorithm. The proposed method demonstrates several advantages over existing counterparts: (1) optimizing with massive unannotated samples; (2) effective adaptation to the symptom similarity across distinct citrus diseases; (3) hierarchical feature representation learning. The proposed method achieves state-of-the-art performance on the public citrus image set CDD, outperforming existing methods by 4.5%-30.1% accuracy. Remarkably, our method narrows the performance gap with fully supervised counterparts (all samples are labeled). Beyond classification accuracy, our method shows great performance on other evaluation metrics (F1 score, precision, and recall), highlighting the robustness against the class imbalance challenge.
H. P. Khandagale, Sangram Patil, V. S. Gavali
et al.
Plant disease detection is a critical task in agriculture, directly impacting crop yield, food security, and sustainable farming practices. This study proposes FourCropNet, a novel deep learning model designed to detect diseases in multiple crops, including CottonLeaf, Grape, Soybean, and Corn. The model leverages an advanced architecture comprising residual blocks for efficient feature extraction, attention mechanisms to enhance focus on disease-relevant regions, and lightweight layers for computational efficiency. These components collectively enable FourCropNet to achieve superior performance across varying datasets and class complexities, from single-crop datasets to combined datasets with 15 classes. The proposed model was evaluated on diverse datasets, demonstrating high accuracy, specificity, sensitivity, and F1 scores. Notably, FourCropNet achieved the highest accuracy of 99.7% for Grape, 99.5% for Corn, and 95.3% for the combined dataset. Its scalability and ability to generalize across datasets underscore its robustness. Comparative analysis shows that FourCropNet consistently outperforms state-of-the-art models such as MobileNet, VGG16, and EfficientNet across various metrics. FourCropNet's innovative design and consistent performance make it a reliable solution for real-time disease detection in agriculture. This model has the potential to assist farmers in timely disease diagnosis, reducing economic losses and promoting sustainable agricultural practices.
Deep learning has enabled breakthroughs in automated diagnosis from medical imaging, with many successful applications in ophthalmology. However, standard medical image classification approaches only assess disease presence at the time of acquisition, neglecting the common clinical setting of longitudinal imaging. For slow, progressive eye diseases like age-related macular degeneration (AMD) and primary open-angle glaucoma (POAG), patients undergo repeated imaging over time to track disease progression, and forecasting the future risk of developing disease is critical to properly plan treatment. Our proposed Longitudinal Transformer for Survival Analysis (LTSA) enables dynamic disease prognosis from longitudinal medical imaging, modeling the time to disease from sequences of fundus photography images captured over long, irregular time periods. Using longitudinal imaging data from the Age-Related Eye Disease Study (AREDS) and Ocular Hypertension Treatment Study (OHTS), LTSA significantly outperformed a single-image baseline in 19/20 head-to-head comparisons on late AMD prognosis and 18/20 comparisons on POAG prognosis. A temporal attention analysis also suggested that, while the most recent image is typically the most influential, prior imaging still provides additional prognostic value.
The growing availability of well-organized Electronic Health Records (EHR) data has enabled the development of various machine learning models towards disease risk prediction. However, existing risk prediction methods overlook the heterogeneity of complex diseases, failing to model the potential disease subtypes regarding their corresponding patient visits and clinical concept subgroups. In this work, we introduce TACCO, a novel framework that jointly discovers clusters of clinical concepts and patient visits based on a hypergraph modeling of EHR data. Specifically, we develop a novel self-supervised co-clustering framework that can be guided by the risk prediction task of specific diseases. Furthermore, we enhance the hypergraph model of EHR data with textual embeddings and enforce the alignment between the clusters of clinical concepts and patient visits through a contrastive objective. Comprehensive experiments conducted on the public MIMIC-III dataset and Emory internal CRADLE dataset over the downstream clinical tasks of phenotype classification and cardiovascular risk prediction demonstrate an average 31.25% performance improvement compared to traditional ML baselines and a 5.26% improvement on top of the vanilla hypergraph model without our co-clustering mechanism. In-depth model analysis, clustering results analysis, and clinical case studies further validate the improved utilities and insightful interpretations delivered by TACCO. Code is available at https://github.com/PericlesHat/TACCO.
Eduardo C. Araujo, Claudia T. Codeço, Sandro Loch
et al.
The influence of climate on mosquito-borne diseases like dengue and chikungunya is well-established, but comprehensively tracking long-term spatial and temporal trends across large areas has been hindered by fragmented data and limited analysis tools. This study presents an unprecedented analysis, in terms of breadth, estimating the SIR transmission parameters from incidence data in all 5,570 municipalities in Brazil over 14 years (2010-2023) for both dengue and chikungunya. We describe the Episcanner computational pipeline, developed to estimate these parameters, producing a reusable dataset describing all dengue and chikungunya epidemics that have taken place in this period, in Brazil. The analysis reveals new insights into the climate-epidemic nexus: We identify distinct geographical and temporal patterns of arbovirus disease incidence across Brazil, highlighting how climatic factors like temperature and precipitation influence the timing and intensity of dengue and chikungunya epidemics. The innovative Episcanner tool empowers researchers and public health officials to explore these patterns in detail, facilitating targeted interventions and risk assessments. This research offers a new perspective on the long-term dynamics of climate-driven mosquito-borne diseases and their geographical specificities linked to the effects of global temperature fluctuations such as those captured by the ENSO index.
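For readers unfamiliar with the SIR transmission parameters mentioned above, the following minimal sketch integrates the classic SIR model forward in time and shows how the transmission rate beta and recovery rate gamma determine the basic reproduction number R0 = beta / gamma. It is not the Episcanner pipeline, which estimates these parameters from incidence data; the function names, parameter values, and the simple Euler integration scheme are assumptions made for illustration.

```python
def basic_reproduction_number(beta, gamma):
    """R0 for the classic SIR model."""
    return beta / gamma


def simulate_sir(beta, gamma, s0, i0, r0, days, dt=0.1):
    """Forward-Euler integration of dS/dt = -beta*S*I/N,
    dI/dt = beta*S*I/N - gamma*I, dR/dt = gamma*I.

    Returns the final (S, I, R) and the peak number infectious.
    """
    n = s0 + i0 + r0
    s, i, r = float(s0), float(i0), float(r0)
    peak = i
    for _ in range(round(days / dt)):
        new_infections = beta * s * i / n * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        peak = max(peak, i)
    return s, i, r, peak


# Hypothetical parameters: R0 = 2 produces an epidemic,
# R0 = 0.4 lets the single initial case die out.
s, i, r, peak = simulate_sir(0.5, 0.25, 999, 1, 0, days=200)
_, _, _, peak_subcritical = simulate_sir(0.1, 0.25, 999, 1, 0, days=200)
```

Estimation runs the other way: given an observed incidence curve, one searches for the beta and gamma whose simulated curve best matches it.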
Many people die from lung-related diseases every year. X-ray imaging is an effective way to test whether a person has a lung-related disease. This study concentrates on categorizing three distinct types of lung X-rays: those depicting healthy lungs, those showing lung opacities, and those indicative of viral pneumonia. Accurately diagnosing the disease at an early phase is critical. In this paper, five different pre-trained models are tested on the Lung X-ray Image Dataset. SqueezeNet, VGG11, ResNet18, DenseNet, and MobileNetV2 achieved accuracies of 0.64, 0.85, 0.87, 0.88, and 0.885, respectively. MobileNetV2, as the best-performing pre-trained model, is then further analyzed as the base model. Finally, our own model, MobileNet-Lung, based on MobileNetV2 with fine-tuning and an additional attention layer within the feature layers, was developed to tackle the lung disease classification task and achieved an accuracy of 0.933. This result is a significant improvement over all five pre-trained models.
Agnieszka M. Frydrych, Richard Parsons, Omar Kujan
Objectives: Malnutrition is common among patients with head and neck cancer (HNC) and associated with poorer outcomes. Oral nutritional supplements (ONS) are often prescribed, with concerns raised about their cariogenicity. This study examined ONS use and caries experience in patients with HNC 12 months post‐diagnosis. Methods: Fifty‐four patients with HNC referred for pre‐radiotherapy dental assessment were recruited. Data collected included: age, gender, residential postcode, smoking, alcohol use, HNC characteristics, dental history, oral hygiene habits, dietary advice and ONS use. Data were collected at diagnosis, during radiotherapy, and at 6 weeks and 3, 6 and 12 months post‐treatment completion. Results: Fifty‐one subjects completed the study. 76.5% of the participants used ONS for an average of 13.8 weeks. Caries developed in 22.9% of ONS users and 11.1% of non‐users (p = 0.6585). The mean overall duration of ONS use was 18.7 weeks for the caries group and 8.5 weeks for the caries‐free group (p = 0.1507). A lack of collaboration and a disconnect were noted between the dietary advice given by dieticians and that given by dentists. Conclusions: ONS use is common among patients with HNC. Larger studies are needed to establish the reasons for caries development and impacts of ONS use on oral health. The importance of multidisciplinary management of malnutrition is highlighted.
Constantino Álvarez Casado, Manuel Lage Cañellas, Matteo Pedone
et al.
Respiratory diseases remain a leading cause of mortality worldwide, highlighting the need for faster and more accurate diagnostic tools. This work presents a novel approach leveraging digital stethoscope technology for automatic respiratory disease classification and biometric analysis. Our approach has the potential to significantly enhance traditional auscultation practices. By leveraging one of the largest publicly available medical databases of respiratory sounds, we train machine learning models to classify various respiratory health conditions. Our method differs from conventional methods by using Empirical Mode Decomposition (EMD) and spectral analysis techniques to isolate clinically relevant biosignals embedded within acoustic data captured by digital stethoscopes. This approach focuses on information closely tied to cardiovascular and respiratory patterns within the acoustic data. Spectral analysis and filtering techniques isolate Intrinsic Mode Functions (IMFs) strongly correlated with these physiological phenomena. These biosignals undergo a comprehensive feature extraction process for predictive modeling. These features then serve as input to train several machine learning models for both classification and regression tasks. Our approach achieves high accuracy in both binary classification (89% balanced accuracy for healthy vs. diseased) and multi-class classification (72% balanced accuracy for specific diseases like pneumonia and COPD). For the first time, this work introduces regression models capable of estimating age and body mass index (BMI) based solely on acoustic data, as well as a model for sex classification. Our findings underscore the potential of intelligent digital stethoscopes to significantly enhance assistive and remote diagnostic capabilities, contributing to advancements in digital health, telehealth, and remote patient monitoring.
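The balanced accuracy figures quoted above average the per-class recall, so a majority class cannot dominate the score. The sketch below shows the computation on a hypothetical, made-up set of labels; it is not the authors' evaluation code, and the function names are illustrative.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        hits[truth] += int(truth == pred)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def plain_accuracy(y_true, y_pred):
    """Fraction of samples classified correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical imbalanced test set and a classifier that always
# predicts the majority class.
y_true = ["diseased"] * 8 + ["healthy"] * 2
y_pred = ["diseased"] * 10

acc = plain_accuracy(y_true, y_pred)      # recall is ignored per class
bal = balanced_accuracy(y_true, y_pred)   # (8/8 + 0/2) / 2
```

Here the always-"diseased" classifier scores 0.8 on plain accuracy but only 0.5 on balanced accuracy, which is why the latter is the more informative metric for imbalanced respiratory datasets.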
Peter J. Diggle, Claudio Fronterre, Katherine Gass
et al.
Current WHO guidelines set prevalence thresholds below which a Neglected Tropical Disease can be considered to have been eliminated as a public health problem, and specify how surveys to assess whether elimination has been achieved should be designed and analysed, based on classical survey sampling methods. In this paper we describe an alternative approach based on geospatial statistical modelling. We first show the gains in efficiency that can be obtained by exploiting any spatial correlation in the underlying prevalence surface. We then suggest that the current guidelines' implicit use of a significance testing argument is not appropriate; instead, we argue for a predictive inferential framework, leading to design criteria based on controlling the rates at which areas whose true prevalence lies above and below the elimination threshold are incorrectly classified. We describe how this approach naturally accommodates context-specific information in the form of georeferenced covariates that have been shown to be predictive of disease prevalence. Finally, we give a progress report of an ongoing collaboration with the Guyana Ministry of Health Neglected Tropical Disease program on the design of an IDA (Ivermectin, Diethylcarbamazine and Albendazole) Impact Survey (IIS) of lymphatic filariasis to be conducted in Guyana in early 2023.
Onintze Zaballa, Aritz Pérez, Elisa Gómez-Inhiesto
et al.
Electronic health records contain valuable information for monitoring patients' health trajectories over time. Disease progression models have been developed to understand the underlying patterns and dynamics of diseases using these data as sequences. However, analyzing temporal data from EHRs is challenging due to the variability and irregularities present in medical records. We propose a Markovian generative model of treatments developed to (i) model the irregular time intervals between medical events; (ii) classify treatments into subtypes based on the patient sequence of medical events and the time intervals between them; and (iii) segment treatments into subsequences of disease progression patterns. We assume that sequences have an associated structure of latent variables: a latent class representing the different subtypes of treatments; and a set of latent stages indicating the phase of progression of the treatments. We use the Expectation-Maximization algorithm to learn the model, which is efficiently solved with a dynamic programming-based method. Various parametric models have been employed to model the time intervals between medical events during the learning process, including the geometric, exponential, and Weibull distributions. The results demonstrate the effectiveness of our model in recovering the underlying model from data and accurately modeling the irregular time intervals between medical actions.
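To make the time-interval modelling concrete, the sketch below fits the simplest of the distributions named above, the exponential, to a set of inter-event gaps by maximum likelihood. The optional weights mimic how an EM M-step would weight each observation by its posterior responsibility, but this is only an assumed illustration and not the authors' learning procedure; the function names and the toy data are hypothetical.

```python
import math


def fit_exponential(intervals, weights=None):
    """(Weighted) maximum likelihood rate of an exponential distribution.

    With unit weights this is n / sum(t); with weights w it is
    sum(w) / sum(w * t), the form a mixture-model M-step would use.
    """
    if weights is None:
        weights = [1.0] * len(intervals)
    return sum(weights) / sum(w * t for w, t in zip(weights, intervals))


def exponential_log_likelihood(intervals, rate):
    """Log-likelihood of the intervals under Exp(rate)."""
    return sum(math.log(rate) - rate * t for t in intervals)


# Hypothetical gaps (in days) between consecutive medical events.
gaps = [2.0, 5.0, 1.0, 4.0]
rate = fit_exponential(gaps)  # 4 events / 12 days
best = exponential_log_likelihood(gaps, rate)
```

The geometric and Weibull cases follow the same pattern, although the Weibull shape parameter has no closed-form estimate and needs a numerical solver.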
The World Health Organization added Disease X to their shortlist of blueprint priority diseases to represent a hypothetical, unknown pathogen that could cause a future epidemic. During different virus outbreaks of the past, such as COVID-19, Influenza, Lyme Disease, and Zika virus, researchers from various disciplines utilized Google Trends to mine multimodal components of web behavior to study, investigate, and analyze the global awareness, preparedness, and response associated with these respective virus outbreaks. As the world prepares for Disease X, a dataset on web behavior related to Disease X would be crucial to contribute towards the timely advancement of research in this field. Furthermore, none of the prior works in this field have focused on the development of a dataset to compile relevant web behavior data, which would help to prepare for Disease X. To address these research challenges, this work presents a dataset of web behavior related to Disease X, which emerged from different geographic regions of the world, between February 2018 and August 2023. Specifically, this dataset presents the search interests related to Disease X from 94 geographic regions. The dataset was developed by collecting data using Google Trends. The relevant search interests for all these regions for each month in this time range are available in this dataset. This paper also discusses the compliance of this dataset with the FAIR principles of scientific data management. Finally, an analysis of this dataset is presented to uphold the applicability, relevance, and usefulness of this dataset for the investigation of different research questions in the interrelated fields of Big Data, Data Mining, Healthcare, Epidemiology, and Data Analysis with a specific focus on Disease X.
Sarah Mittenentzwei, Veronika Weiß, Stefanie Schreiber
et al.
While narrative visualization has been used successfully in various applications to communicate scientific data in the format of a story to a general audience, the same has not been true for medical data. There are only a few exceptions that present tabular medical data to non-experts. However, a key component of medical visualization is the interactive analysis of 3D data, such as 3D models of anatomical structures, which were rarely included in narrative visualizations so far. In this design study, we investigate how neurological disease data can be communicated through narrative visualization techniques to a general audience in an understandable way. We designed a narrative visualization explaining cerebral small vessel disease. Learning about its avoidable risk factors serves to motivate the audience watching the resulting visual data story. Using this example, we discuss the adaption of basic narrative components. This includes the conflict and characters of a story, as well as the story's structure and content to address and communicate specific characteristics of medical data. Furthermore, we explore the extent to which complex medical relationships need to be simplified to be understandable to a general audience without distorting the underlying data and evidence. In particular, the data needs to be preprocessed for non-experts and appropriate forms of interaction must be found. We explore approaches to make the data more personally relatable, such as including a fictional patient. We evaluated our approach in a user study with 40 participants in a web-based implementation of the designed story. We found that the combination of a carefully thought-out storyline with a clear key message, appealing visualizations combined with easy-to-use interactions, and credible references are crucial for creating a narrative visualization about a neurological disease that engages an audience.
In recent decades, the area under cultivation of maize products has increased because of its essential role in the food cycle for humans, livestock, and poultry. Moreover, plant diseases impact food safety and can significantly reduce both the quality and quantity of agricultural products. There are many challenges to accurate and timely diagnosis of disease. This research presents a novel scheme based on a deep neural network to overcome these challenges. Due to the limited amount of data, the transfer learning technique is employed with the help of two well-known architectures. An effective new model is built by combining pre-trained MobileNetV2 and Inception networks, chosen for their effective performance on object detection problems. The convolution layers of MobileNetV2 and the Inception modules are arranged in parallel as the earlier layers to extract crucial features. In addition, the class imbalance problem has been addressed by an augmentation strategy. The proposed scheme outperforms other state-of-the-art models published in recent years. The accuracy of the model reaches approximately 97%. In summary, experimental results support the method's validity and significant performance in diagnosing disease in plant leaves.
Infectious diseases usually originate from a specific location within a city. Because of the heterogeneous distribution of population and public facilities, and the structural heterogeneity of the human mobility network embedded in space, infectious diseases that break out at different locations pose different transmission risks and control difficulties. This study aims to investigate the impact of initial outbreak locations on the risk of spatiotemporal transmission and reveal the driving force behind high-risk outbreak locations. First, integrating mobile phone location data, we built a SLIR (susceptible-latent-infectious-removed)-based meta-population model to simulate the spreading process of an infectious disease (i.e., COVID-19) across fine-grained intra-urban regions (i.e., 649 communities of Shenzhen City, China). Based on the simulation model, we evaluated the transmission risk caused by different initial outbreak locations by proposing three indexes: the number of infected cases (CaseNum), the number of affected regions (RegionNum), and the spatial diffusion range (SpatialRange). Finally, we investigated the contribution of different influential factors to the transmission risk via machine learning models. Results indicate that different initial outbreak locations cause similar CaseNum but different RegionNum and SpatialRange. To prevent an epidemic from spreading quickly to more regions, it is necessary to prevent outbreaks in locations with high population-mobility flow density. To prevent an epidemic from spreading over a larger spatial range, remote regions whose residents have long daily trip distances need attention. These findings can help in understanding the transmission risk and driving force of initial outbreak locations within cities and in making precise prevention and control strategies in advance.
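A toy version of an SLIR meta-population simulation helps show how indexes such as CaseNum and RegionNum can be read off a simulated outbreak. The sketch below is deterministic, uses three hypothetical regions, and couples them through an assumed row-stochastic contact matrix; it is not the authors' model, whose mobility coupling comes from mobile phone data, and all parameter values, names, and the one-case threshold for an "affected" region are assumptions. SpatialRange is omitted because the abstract does not define it.

```python
def simulate_slir(pop, mobility, seed_region, beta=0.4, sigma=0.2,
                  gamma=0.1, days=60, threshold=1.0):
    """Deterministic daily-step SLIR model over coupled regions.

    mobility[j][m] is the (assumed) share of region j's contacts that
    happen with region m; each row sums to 1.
    Returns (case_num, region_num): cumulative infections and the
    number of regions with at least `threshold` cumulative infections.
    """
    k = len(pop)
    S = [float(p) for p in pop]
    L, I, R = [0.0] * k, [0.0] * k, [0.0] * k
    S[seed_region] -= 1.0
    I[seed_region] += 1.0
    for _ in range(days):
        prevalence = [I[j] / pop[j] for j in range(k)]
        new_latent = []
        for j in range(k):
            force = beta * sum(mobility[j][m] * prevalence[m]
                               for m in range(k))
            new_latent.append(min(S[j], force * S[j]))
        for j in range(k):
            new_infectious = sigma * L[j]   # latent -> infectious
            new_removed = gamma * I[j]      # infectious -> removed
            S[j] -= new_latent[j]
            L[j] += new_latent[j] - new_infectious
            I[j] += new_infectious - new_removed
            R[j] += new_removed
    cases = [L[j] + I[j] + R[j] for j in range(k)]
    return sum(cases), sum(1 for c in cases if c >= threshold)


pop = [1000, 1000, 1000]
connected_net = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]
isolated_net = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

connected = simulate_slir(pop, connected_net, seed_region=0)
isolated = simulate_slir(pop, isolated_net, seed_region=0)
```

Comparing the two networks is only a sanity check: without mobility the outbreak stays in the seed region, while even weak coupling lets it reach every region. The study's comparison across seed locations would instead vary `seed_region` on a fixed, data-derived mobility matrix.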
Cécile Proust-Lima, Tiphaine Saulnier, Viviane Philipps
et al.
Neurodegenerative diseases are characterized by numerous markers of progression and clinical endpoints. For instance, Multiple System Atrophy (MSA), a rare neurodegenerative synucleinopathy, is characterized by various combinations of progressive autonomic failure and motor dysfunction, and a very poor prognosis. Describing the progression of such complex and multi-dimensional diseases is particularly difficult. One has to simultaneously account for the assessment of multivariate markers over time, the occurrence of clinical endpoints, and a highly suspected heterogeneity between patients. Yet, such a description is crucial for understanding the natural history of the disease, staging patients diagnosed with the disease, unravelling subphenotypes, and predicting the prognosis. Through the example of MSA progression, we show how a latent class approach modeling multiple repeated markers and clinical endpoints can help describe complex disease progression and identify subphenotypes for exploring new pathological hypotheses. The proposed joint latent class model includes class-specific multivariate mixed models to handle multivariate repeated biomarkers possibly summarized into latent dimensions and class-and-cause-specific proportional hazard models to handle time-to-event data. A maximum likelihood estimation procedure, validated through simulations, is available in the lcmm R package. In the French MSA cohort, comprising data on 598 patients followed for up to 13 years, five subphenotypes of MSA were identified that differ by the sequence and shape of biomarker degradation and the associated risk of death. In posterior analyses, the five subphenotypes were used to explore the association between clinical progression and external imaging and fluid biomarkers, while properly accounting for the uncertainty in subphenotype membership.