TernausNet: U-Net with VGG11 Encoder Pre-Trained on ImageNet for Image Segmentation
V. Iglovikov, Alexey A. Shvets
Pixel-wise image segmentation is a demanding task in computer vision. Classical U-Net architectures, composed of encoders and decoders, are very popular for segmentation of medical images, satellite images, etc. Typically, a neural network initialized with weights from a network pre-trained on a large dataset like ImageNet shows better performance than one trained from scratch on a small dataset. In some practical applications, particularly in medicine and traffic safety, the accuracy of the models is of utmost importance. In this paper, we demonstrate how a U-Net-type architecture can be improved by the use of a pre-trained encoder. We compare three weight initialization schemes: LeCun uniform, the encoder with weights from VGG11, and the full network trained on the Carvana dataset. This network architecture was part of the winning solution (1st out of 735) in the Kaggle Carvana Image Masking Challenge. Our code and corresponding pre-trained weights are publicly available at this https URL.
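The reason a pre-trained VGG11 encoder slots into U-Net is the skip connection: each decoder stage upsamples its features and concatenates the same-resolution encoder features. A minimal NumPy sketch of that concatenation step (shapes and the `decoder_block` name are illustrative assumptions, not the paper's code):

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def decoder_block(decoder_feat, encoder_feat):
    """Upsample the decoder features and concatenate the matching
    encoder features along the channel axis (the U-Net skip connection)."""
    up = upsample2x(decoder_feat)
    assert up.shape[1:] == encoder_feat.shape[1:]
    return np.concatenate([encoder_feat, up], axis=0)

enc = np.zeros((64, 32, 32))   # encoder features (e.g. from a VGG11 stage)
dec = np.zeros((128, 16, 16))  # coarser decoder features
out = decoder_block(dec, enc)
print(out.shape)  # (192, 32, 32)
```

A real implementation would follow the concatenation with convolutions; this only shows why encoder and decoder resolutions must pair up.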
666 citations
Computer Science
Acquisition of Balinese Imagined Spelling using Electroencephalogram (BISE) Dataset (Mendeley Data)
I Made Agus Wirawan, Ketut Paramarta
One of the main goals of today's technology is to create a connected environment between humans and technological devices to perform daily physical activities. However, users with speech disorders cannot use such applications. Loss of verbal communication can be caused by injuries and neurodegenerative diseases that affect motor production, speech articulation, and language comprehension. To overcome this problem, Brain-Computer Interfaces (BCI) use EEG signals as assistive technology to provide a new communication channel for individuals who cannot communicate due to loss of motor control. Of the several BCI studies that use EEG signals, none has examined Balinese characters. As a first step, this study examines the acquisition of EEG signal data for Balinese character recognition. Obtaining EEG signal data for Balinese character spelling imagination involved several stages: preparation of research documents, preparation of stimulus media, submission of ethical permits, determination of participants, recording, data presentation, and publication of the datasets. The resulting datasets consist of raw data and analyzed data for 18 Balinese characters and 6 vowel characters, in both spelled and imagined conditions.
Computer applications to medicine. Medical informatics, Science (General)
Prediction of Spontaneous Breathing Trial Outcome in Critically Ill-Ventilated Patients Using Deep Learning: Development and Verification Study
Hui-Chiao Yang, Angelica Te-Hui Hao, Shih-Chia Liu
et al.
Background: Long-term ventilator-dependent patients often face problems such as decreased quality of life, increased mortality, and increased medical costs. Respiratory therapists must perform complex and time-consuming ventilator weaning assessments, which typically take 48-72 hours. Traditional weaning assessments rely on manual evaluation and are susceptible to subjectivity, human error, and low efficiency.
Objective: This study aims to develop an artificial intelligence–based prediction model to predict whether a patient can successfully pass a spontaneous breathing trial (SBT) using the patient’s clinical data collected before SBT initiation. Instead of comparing different SBT strategies or analyzing their impact on extubation success, this study focused on establishing a data-driven approach under a fixed SBT strategy to provide an objective and efficient assessment tool. Through this model, we aim to enhance the accuracy and efficiency of ventilator weaning assessments, reduce unnecessary SBT attempts, optimize intensive care unit resource usage, and ultimately improve the quality of care for ventilator-dependent patients.
Methods: This study used a retrospective cohort design and developed a novel deep learning architecture, hybrid CNN-MLP (convolutional neural network–multilayer perceptron), for analysis. Unlike the traditional CNN-MLP classification method, hybrid CNN-MLP performs feature learning and fusion by interleaving CNN and MLP layers so that data features can be extracted and integrated at different levels, thereby improving the flexibility and prediction accuracy of the model. The study participants were patients aged 20 years or older hospitalized in the intensive care unit of a medical center in central Taiwan between January 1, 2016, and December 31, 2022. A total of 3686 patients were included in the study, and 6536 pre-SBT clinical records were collected before each SBT of these patients, of which 3268 passed the SBT and 3268 failed.
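The interleaving idea can be sketched in a few lines: alternate a convolutional feature-extraction step with a fully connected mixing step, rather than stacking all conv layers first. This toy NumPy version uses made-up layer sizes and random weights purely for illustration; it is not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d(x, k):
    """'valid' 1-D convolution of a feature vector with kernel k."""
    return np.convolve(x, k, mode="valid")

def dense(x, w, b):
    """Fully connected layer with ReLU activation."""
    return np.maximum(w @ x + b, 0.0)

# Interleave conv and dense stages: features are extracted (conv)
# and re-mixed (dense) at alternating depths.
x = rng.normal(size=32)                 # one pre-SBT clinical record (toy)
x = conv1d(x, rng.normal(size=3))       # -> 30 features
x = dense(x, rng.normal(size=(16, 30)), np.zeros(16))
x = conv1d(x, rng.normal(size=3))       # -> 14 features
x = dense(x, rng.normal(size=(2, 14)), np.zeros(2))  # pass / fail logits
print(x.shape)  # (2,)
```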
Results: The model performed well in predicting SBT outcomes. The training dataset’s precision is 99.3% (2443/2460 records), recall is 93.5% (2443/2614 records), specificity is 99.3% (2597/2614 records), and F1-score is 0.963. In the test dataset, the model maintains accuracy with a precision of 89.2% (561/629 records), a recall of 85.8% (561/654 records), a specificity of 89.6% (586/654 records), and an F1-score of 0.875. These results confirm the reliability of the model and its potential for clinical application.
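The reported precision, recall, and F1 figures are mutually consistent and can be re-derived from the raw counts:

```python
def f1(tp, fp, fn):
    """F1-score from confusion counts: harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Training set: 2443 TP out of 2460 predicted / 2614 actual positives
print(round(f1(2443, 2460 - 2443, 2614 - 2443), 3))  # 0.963
# Test set: 561 TP out of 629 predicted / 654 actual positives
print(round(f1(561, 629 - 561, 654 - 561), 3))  # 0.875
```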
Conclusions: This study successfully developed a deep learning–based SBT prediction model that can be used as an objective and efficient ventilator weaning assessment tool. The model's performance shows that it can be integrated into clinical workflow, improve the quality of patient care, and reduce ventilator dependence, which is an important step in improving the effectiveness of respiratory therapy.
Computer applications to medicine. Medical informatics
Structure-Accurate Medical Image Translation via Dynamic Frequency Balance and Knowledge Guidance
Jiahua Xu, Dawei Zhou, Lei Hu
et al.
Multimodal medical images play a crucial role in the precise and comprehensive clinical diagnosis. Diffusion model is a powerful strategy to synthesize the required medical images. However, existing approaches still suffer from the problem of anatomical structure distortion due to the overfitting of high-frequency information and the weakening of low-frequency information. Thus, we propose a novel method based on dynamic frequency balance and knowledge guidance. Specifically, we first extract the low-frequency and high-frequency components by decomposing the critical features of the model using wavelet transform. Then, a dynamic frequency balance module is designed to adaptively adjust frequency for enhancing global low-frequency features and effective high-frequency details as well as suppressing high-frequency noise. To further overcome the challenges posed by the large differences between different medical modalities, we construct a knowledge-guided mechanism that fuses the prior clinical knowledge from a visual language model with visual features, to facilitate the generation of accurate anatomical structures. Experimental evaluations on multiple datasets show the proposed method achieves significant improvements in qualitative and quantitative assessments, verifying its effectiveness and superiority.
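The low/high-frequency split described above can be illustrated with the simplest wavelet, a one-level Haar transform; re-weighting the high-frequency band before inverting mimics the "suppress high-frequency noise / enhance details" adjustment. This is a 1-D toy, not the paper's dynamic frequency balance module:

```python
import numpy as np

def haar_split(x):
    """One-level Haar decomposition: low-frequency averages and
    high-frequency details of a 1-D signal of even length."""
    low = (x[0::2] + x[1::2]) / np.sqrt(2)
    high = (x[0::2] - x[1::2]) / np.sqrt(2)
    return low, high

def haar_merge(low, high, gain=1.0):
    """Inverse transform, optionally re-weighting the high-frequency band
    (gain < 1 suppresses high-frequency noise, gain > 1 sharpens)."""
    high = gain * high
    x = np.empty(2 * low.size)
    x[0::2] = (low + high) / np.sqrt(2)
    x[1::2] = (low - high) / np.sqrt(2)
    return x

sig = np.array([1.0, 2.0, 3.0, 4.0])
lo, hi = haar_split(sig)
print(np.allclose(haar_merge(lo, hi), sig))  # True (perfect reconstruction)
```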
Evaluating the Explainability of Vision Transformers in Medical Imaging
Leili Barekatain, Ben Glocker
Understanding model decisions is crucial in medical imaging, where interpretability directly impacts clinical trust and adoption. Vision Transformers (ViTs) have demonstrated state-of-the-art performance in diagnostic imaging; however, their complex attention mechanisms pose challenges to explainability. This study evaluates the explainability of different Vision Transformer architectures and pre-training strategies - ViT, DeiT, DINO, and Swin Transformer - using Gradient Attention Rollout and Grad-CAM. We conduct both quantitative and qualitative analyses on two medical imaging tasks: peripheral blood cell classification and breast ultrasound image classification. Our findings indicate that DINO combined with Grad-CAM offers the most faithful and localized explanations across datasets. Grad-CAM consistently produces class-discriminative and spatially precise heatmaps, while Gradient Attention Rollout yields more scattered activations. Even in misclassification cases, DINO with Grad-CAM highlights clinically relevant morphological features that appear to have misled the model. By improving model transparency, this research supports the reliable and explainable integration of ViTs into critical medical diagnostic workflows.
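Grad-CAM itself is compact: channel weights come from spatially averaged gradients, and the heatmap is the ReLU of the weighted channel sum. A NumPy sketch, with random tensors standing in for real network activations:

```python
import numpy as np

def grad_cam(feature_maps, gradients):
    """Grad-CAM heatmap: weight each feature map by its spatially averaged
    gradient, sum over channels, and keep only positive evidence (ReLU)."""
    weights = gradients.mean(axis=(1, 2))            # one weight per channel
    cam = np.tensordot(weights, feature_maps, axes=1)
    return np.maximum(cam, 0.0)

feats = np.random.default_rng(0).normal(size=(8, 14, 14))   # (C, H, W)
grads = np.random.default_rng(1).normal(size=(8, 14, 14))
cam = grad_cam(feats, grads)
print(cam.shape, bool((cam >= 0).all()))  # (14, 14) True
```

In practice the heatmap is then upsampled to the input resolution and overlaid on the image.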
A Survey on Heterogeneous Computing Using SmartNICs and Emerging Data Processing Units
Nathan Tibbetts, Sifat Ibtisum, Satish Puri
The emergence of new, off-path smart network cards (SmartNICs), known generally as Data Processing Units (DPU), has opened a wide range of research opportunities. Of particular interest is the use of these and related devices in tandem with their host's CPU, creating a heterogeneous computing system with new properties and strengths to be explored, capable of accelerating a wide variety of workloads. This survey begins by providing the motivation and relevant background information for this new field, including its origins, a few current hardware offerings, major programming languages and frameworks for using them, and associated challenges. We then review and categorize a number of recent works in the field, covering a wide variety of studies, benchmarks, and application areas, such as data center infrastructure, commercial uses, and AI and ML acceleration. We conclude with a few observations.
Data center TCP dataset
Jan Fesl, Tereza Čapková, Michal Konopa
et al.
In this paper, we introduce a unique dataset covering thousands of network flow measurements performed over TCP in a data center environment. The TCP protocol is widely used for reliable data transfer and has many different versions, which differ in how they handle link congestion through the congestion control algorithm (CCA). Our dataset represents a unique, comprehensive comparison of 17 currently used versions of TCP with different CCAs. Each TCP flow was measured exactly 50 times to mitigate measurement instability. The comparison of the various TCP versions is based on 18 quantitative attributes representing the parameters of a TCP transmission. Our dataset is suitable for testing and comparing different versions of TCP, creating new CCAs based on machine learning models, or creating and testing machine learning models for identifying and optimizing currently existing versions of TCP.
Computer applications to medicine. Medical informatics, Science (General)
An extensible and unifying approach to retrospective clinical data modeling: the BrainTeaser Ontology
Guglielmo Faggioli, Laura Menotti, Stefano Marchesin
et al.
Automatic disease progression prediction models require large amounts of training data, which are seldom available, especially when it comes to rare diseases. A possible solution is to integrate data from different medical centres. Nevertheless, various centres often follow diverse data collection procedures and assign different semantics to collected data. Ontologies, used as schemas for interoperable knowledge bases, represent a state-of-the-art solution to harmonize the semantics and foster data integration from various sources. This work presents the BrainTeaser Ontology (BTO), an ontology that models the clinical data associated with two brain-related rare diseases (ALS and MS) in a comprehensive and modular manner. BTO assists in organizing and standardizing the data collected during patient follow-up. It was created by harmonizing schemas currently used by multiple medical centres into a common ontology, following a bottom-up approach. As a result, BTO effectively addresses the practical data collection needs of various real-world situations and promotes data portability and interoperability. BTO captures various clinical occurrences, such as disease onset, symptoms, diagnostic and therapeutic procedures, and relapses, using an event-based approach. Developed in collaboration with medical partners and domain experts, BTO offers a holistic view of ALS and MS for supporting the representation of retrospective and prospective data. Furthermore, BTO adheres to Open Science and FAIR (Findable, Accessible, Interoperable, and Reusable) principles, making it a reliable framework for developing predictive tools to aid in medical decision-making and patient care. Although BTO is designed for ALS and MS, its modular structure makes it easily extendable to other brain-related diseases, showcasing its potential for broader applicability. Database URL: https://zenodo.org/records/7886998
Computer applications to medicine. Medical informatics
Google Trends Assessment of Keywords Related to Smoking and Smoking Cessation During the COVID-19 Pandemic in 4 European Countries: Retrospective Analysis
Tobias Jagomast, Jule Finck, Imke Tangemann-Münstedt
et al.
Background: Smoking is a modifiable risk factor for SARS-CoV-2 infection. Evidence of smoking behavior during the pandemic is ambiguous. Most investigations report an increase in smoking. In this context, Google Trends data monitor real-time public information–seeking behavior and are therefore useful to characterize smoking-related interest over the trajectory of the pandemic.
Objective: This study aimed to use Google Trends data to evaluate the effect of the pandemic on public interest in smoking-related topics with a focus on lockdowns, vaccination campaigns, and incidence.
Methods: The weekly relative search volume was retrieved from Google Trends for England, Germany, Italy, and Spain from December 31, 2017, to April 18, 2021. Data were collected for keywords concerning consumption, cessation, and treatment. The relative search volume before and during the pandemic was compared, and general trends were evaluated using the Wilcoxon rank-sum test. Short-term changes, and thereby temporal clusters linked to lockdowns or vaccination campaigns, were addressed by the flexible spatial scan statistic proposed by Takahashi and colleagues. Subsequently, the numbers of clusters after the onset of the pandemic were compared by chi-square test.
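The Wilcoxon rank-sum comparison used here is equivalent to the Mann-Whitney U statistic, which can be computed directly by pairwise counting (the numbers below are toy values, not the study's data):

```python
def rank_sum_u(a, b):
    """Mann-Whitney U statistic (equivalent to the Wilcoxon rank-sum test):
    count, over all pairs, how often a value from sample a exceeds one
    from sample b; ties count one half."""
    u = 0.0
    for x in a:
        for y in b:
            u += 1.0 if x > y else (0.5 if x == y else 0.0)
    return u

# Toy pre- vs during-pandemic relative search volumes
pre = [80, 75, 90, 85]
during = [60, 65, 70, 55]
print(rank_sum_u(pre, during))  # 16.0: every 'pre' week outranks every 'during' week
```

In an actual analysis one would convert U to a p-value (e.g. via the normal approximation or an exact table) rather than reading the statistic directly.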
Results: Country-wise, minor differences were observed, while 3 overarching trends prevailed. First, regarding cessation, the statistical comparison revealed a significant decline in interest for 58% (7/12) of related keywords, and fewer clusters were present during the pandemic. Second, concerning consumption, significantly reduced relative search volume was observed for 58% (7/12) of keywords, while treatment-related keywords exhibited heterogeneous trends. Third, substantial clusters of increased interest were sparsely linked to lockdowns, vaccination campaigns, or incidence.
Conclusions: This study reports a substantial decline in overall relative search volume and clusters for cessation interest. These results underline the importance of intensifying cessation aid during times of crisis. Lockdowns, vaccination, and incidence had less impact on information-seeking behavior. Other public measures that positively affect smoking behavior remain to be determined.
Public aspects of medicine, Computer applications to medicine. Medical informatics
Random Token Fusion for Multi-View Medical Diagnosis
Jingyu Guo, Christos Matsoukas, Fredrik Strand
et al.
In multi-view medical diagnosis, deep learning-based models often fuse information from different imaging perspectives to improve diagnostic performance. However, existing approaches are prone to overfitting and rely heavily on view-specific features, which can lead to trivial solutions. In this work, we introduce Random Token Fusion (RTF), a novel technique designed to enhance multi-view medical image analysis using vision transformers. By integrating randomness into the feature fusion process during training, RTF addresses the issue of overfitting and enhances the robustness and accuracy of diagnostic models without incurring any additional cost at inference. We validate our approach on standard mammography and chest X-ray benchmark datasets. Through extensive experiments, we demonstrate that RTF consistently improves the performance of existing fusion methods, paving the way for a new generation of multi-view medical foundation models.
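One way to picture "randomness in the fusion process" is per-position random selection between the two views' token sequences, so neither view's features can dominate during training. This simplified sketch is an assumption for illustration, not necessarily RTF's exact mixing rule:

```python
import numpy as np

def random_token_fusion(tokens_a, tokens_b, rng):
    """For each token position, randomly take the token from view A or
    view B (a coin flip per position)."""
    assert tokens_a.shape == tokens_b.shape
    pick_a = rng.random(tokens_a.shape[0]) < 0.5
    return np.where(pick_a[:, None], tokens_a, tokens_b)

rng = np.random.default_rng(0)
a = np.zeros((196, 64))   # view 1: 196 tokens, 64-dim each
b = np.ones((196, 64))    # view 2
fused = random_token_fusion(a, b, rng)
print(fused.shape)  # (196, 64)
```

Because the randomness is only applied during training, inference can use a deterministic fusion at no extra cost, matching the abstract's claim.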
Navigating Data Scarcity using Foundation Models: A Benchmark of Few-Shot and Zero-Shot Learning Approaches in Medical Imaging
Stefano Woerner, Christian F. Baumgartner
Data scarcity is a major limiting factor for applying modern machine learning techniques to clinical tasks. Although sufficient data exists for some well-studied medical tasks, there remains a long tail of clinically relevant tasks with poor data availability. Recently, numerous foundation models have demonstrated high suitability for few-shot learning (FSL) and zero-shot learning (ZSL), potentially making them more accessible to practitioners. However, it remains unclear which foundation model performs best on FSL medical image analysis tasks and what the optimal methods are for learning from limited data. We conducted a comprehensive benchmark study of ZSL and FSL using 16 pretrained foundation models on 19 diverse medical imaging datasets. Our results indicate that BiomedCLIP, a model pretrained exclusively on medical data, performs best on average for very small training set sizes, while very large CLIP models pretrained on LAION-2B perform best with slightly more training samples. However, simply fine-tuning a ResNet-18 pretrained on ImageNet performs similarly with more than five training examples per class. Our findings also highlight the need for further research on foundation models specifically tailored for medical applications and the collection of more datasets to train these models.
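A standard FSL baseline of the kind such benchmarks evaluate is the nearest-centroid ("prototype") classifier over frozen foundation-model embeddings; a self-contained toy version with 2-D embeddings standing in for real model features:

```python
import numpy as np

def prototype_classify(support_x, support_y, query_x):
    """Few-shot nearest-centroid classifier: average the embeddings of each
    class's support examples, then assign every query to the class with
    the closest centroid (squared Euclidean distance)."""
    classes = np.unique(support_y)
    protos = np.stack([support_x[support_y == c].mean(axis=0) for c in classes])
    d = ((query_x[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    return classes[d.argmin(axis=1)]

# Two toy classes in a 2-D embedding space, 3 shots each
sx = np.array([[0.0, 0], [0.1, 0], [-0.1, 0], [5.0, 5], [5.1, 5], [4.9, 5]])
sy = np.array([0, 0, 0, 1, 1, 1])
print(prototype_classify(sx, sy, np.array([[0.2, 0.1], [4.8, 5.2]])))  # [0 1]
```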
IMIL: Interactive Medical Image Learning Framework
Adrit Rao, Andrea Fisher, Ken Chang
et al.
Data augmentations are widely used in training medical image deep learning models to increase the diversity and size of sparse datasets. However, commonly used augmentation techniques can result in loss of clinically relevant information from medical images, leading to incorrect predictions at inference time. We propose the Interactive Medical Image Learning (IMIL) framework, a novel approach for improving the training of medical image analysis algorithms that enables clinician-guided intermediate training data augmentations on misprediction outliers, focusing the algorithm on relevant visual information. To prevent the model from using irrelevant features during training, IMIL "blacks out" clinician-designated irrelevant regions and replaces the original images with the augmented samples. This ensures that for originally mispredicted samples, the algorithm subsequently attends only to relevant regions and correctly correlates them with the respective diagnosis. We validate the efficacy of IMIL using radiology residents and compare its performance to state-of-the-art data augmentations. A 4.2% improvement in accuracy over ResNet-50 was observed when using IMIL on only 4% of the training set. Our study demonstrates the utility of clinician-guided interactive training to achieve meaningful data augmentations for medical image analysis algorithms.
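The blackout operation itself reduces to zeroing a clinician-designated box in the image array. A minimal sketch, where the `(y0, y1, x0, x1)` box convention is an assumption for illustration:

```python
import numpy as np

def blackout(image, box):
    """IMIL-style 'blackout' augmentation: zero a clinician-designated
    irrelevant region given as (y0, y1, x0, x1), leaving the original
    image untouched."""
    out = image.copy()
    y0, y1, x0, x1 = box
    out[y0:y1, x0:x1] = 0
    return out

img = np.ones((8, 8))
aug = blackout(img, (0, 4, 0, 4))
print(int(aug.sum()))  # 48: a 4x4 corner of the 8x8 image is zeroed
```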
Processes for Designing Innovative Biomedical Hardware to Use in Space and on Earth
Kimia Seyedmadani, Keith A. Tucker, Baraquiel Reyna
et al.
The new era of space exploration is increasing the number and diversity of astronauts in low Earth orbit and beyond. The influx of such a diverse crew population will also increase the need for medical technologies to ensure safe and productive missions. Such a need represents a unique opportunity to innovate and develop diagnostic and treatment tools to meet future needs. Historically, terrestrial regulatory oversight of biomedical design processes was considered separate from spaceflight regulatory processes because it did not address spaceflight constraints. These constraints challenge the creative development of unique solutions for use in space. Translating between healthcare innovation in spaceflight and healthcare on Earth, and vice versa, requires understanding the commonalities, unique needs, and constraints. This manuscript provides a framework for comparing Earth-space design processes and a perspective on best practices to improve healthcare equity and health outcomes.
Computer applications to medicine. Medical informatics, Medical technology
Science Education in Medical School: The Oldenburg Data Analysis Project as an Implementation Example [Lessons Learned]
Antje Timmer, Johanna Neuser, Verena Uslar
et al.
Introduction: According to the Master Plan 2020, science education will play a critical role in future medical curricula. Science modules have already been implemented at many locations. Other medical faculties will follow in the next few years, as legislation is expected to make the recommendations of the national competence-based learning objectives curriculum for medicine (NKLM) mandatory. This article aims to present an implementation example from epidemiology and biometry as a contribution to the didactic discussions within the data sciences in medicine. Project description: We report on our experiences with a data analysis project for second-year medical students, which has been compulsory at the Faculty of Medicine and Health Sciences since 2019. The project is intended to train the scientific skills required from the subjects of epidemiology and biometry for student research projects. Emphasis is placed on responsible data handling, transparency, and reproducibility. For example, writing a statistical analysis plan is required prior to data access. Improved standardization of materials, optional use of the English language, and digital support will be implemented to help manage the project as student numbers increase. Discussion: Five years of experience have been very positive, although a formal evaluation of the learning success is still pending. Current challenges concern staffing, the additional time and supervision required by students who do statistical programming in R, and improved integration into the medical curriculum.
Computer applications to medicine. Medical informatics, Infectious and parasitic diseases
Enhancing and Adapting in the Clinic: Source-free Unsupervised Domain Adaptation for Medical Image Enhancement
Heng Li, Ziqin Lin, Zhongxi Qiu
et al.
Medical imaging provides many valuable clues involving anatomical structure and pathological characteristics. However, image degradation is a common issue in clinical practice, which can adversely impact observation and diagnosis by physicians and algorithms. Although extensive enhancement models have been developed, these models require thorough pre-training before deployment and fail to take advantage of the potential value of inference data after deployment. In this paper, we propose an algorithm for source-free unsupervised domain adaptive medical image enhancement (SAME), which adapts and optimizes enhancement models using test data during the inference phase. A structure-preserving enhancement network is first constructed to learn a robust source model from synthesized training data. Then a teacher-student model is initialized with the source model and conducts source-free unsupervised domain adaptation (SFUDA) by knowledge distillation with the test data. Additionally, a pseudo-label picker is developed to boost the knowledge distillation of enhancement tasks. Experiments were conducted on ten datasets from three medical image modalities to validate the advantage of the proposed algorithm, and setting analyses and ablation studies were also carried out to interpret the effectiveness of SAME. The remarkable enhancement performance and benefits for downstream tasks demonstrate the potential and generalizability of SAME. The code is available at https://github.com/liamheng/Annotation-free-Medical-Image-Enhancement.
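Teacher-student SFUDA pipelines of this kind typically update the teacher as an exponential moving average (EMA) of the student's weights; the abstract does not state SAME's exact update rule, so treat this as a generic sketch:

```python
def ema_update(teacher, student, momentum=0.99):
    """Mean-teacher update common in SFUDA / knowledge-distillation
    pipelines: the teacher's weights track an exponential moving average
    of the student's, changing slowly and stabilizing the pseudo-labels."""
    return [momentum * t + (1 - momentum) * s for t, s in zip(teacher, student)]

teacher = [0.0, 1.0]   # toy scalar 'weights'
student = [1.0, 1.0]
print(ema_update(teacher, student))  # first weight drifts slowly toward the student
```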
Materials Informatics: An Algorithmic Design Rule
Bhupesh Bishnoi
Materials informatics, data-enabled investigation, is a "fourth paradigm" in materials science research after the conventional empirical approach, theoretical science, and computational research. Materials informatics has two essential ingredients: fingerprinting material properties and the theory of statistical inference and learning. We have investigated open questions in organic semiconductors through the materials informatics approach. By applying diverse neural network topologies, logical axioms, and inference from information science, we have developed data-driven procedures for novel organic semiconductor discovery for the semiconductor industry and knowledge extraction for the materials science community. We have reviewed and compared various algorithms for neural network design topology on the materials informatics dataset.
cond-mat.mtrl-sci, cond-mat.stat-mech
Improving Robustness and Reliability in Medical Image Classification with Latent-Guided Diffusion and Nested-Ensembles
Xing Shen, Hengguan Huang, Brennan Nichyporuk
et al.
Once deployed, medical image analysis methods are often faced with unexpected image corruptions and noise perturbations. These unknown covariate shifts present significant challenges to deep learning based methods trained on "clean" images. This often results in unreliable predictions and poorly calibrated confidence, hence hindering clinical applicability. While recent methods have been developed to address specific issues such as confidence calibration or adversarial robustness, no single framework effectively tackles all these challenges simultaneously. To bridge this gap, we propose LaDiNE, a novel ensemble learning method combining the robustness of Vision Transformers with diffusion-based generative models for improved reliability in medical image classification. Specifically, transformer encoder blocks are used as hierarchical feature extractors that learn invariant features from images for each ensemble member, resulting in features that are robust to input perturbations. In addition, diffusion models are used as flexible density estimators to estimate member densities conditioned on the invariant features, leading to improved modeling of complex data distributions while retaining properly calibrated confidence. Extensive experiments on tuberculosis chest X-rays and melanoma skin cancer datasets demonstrate that LaDiNE achieves superior performance compared to a wide range of state-of-the-art methods by simultaneously improving prediction accuracy and confidence calibration under unseen noise, adversarial perturbations, and resolution degradation.
Cheap Lunch for Medical Image Segmentation by Fine-tuning SAM on Few Exemplars
Weijia Feng, Lingting Zhu, Lequan Yu
The Segment Anything Model (SAM) has demonstrated remarkable capabilities of scaled-up segmentation models, enabling zero-shot generalization across a variety of domains. By leveraging large-scale foundational models as pre-trained models, it is a natural progression to fine-tune SAM for specific domains to further enhance performances. However, the adoption of foundational models in the medical domain presents a challenge due to the difficulty and expense of labeling sufficient data for adaptation within hospital systems. In this paper, we introduce an efficient and practical approach for fine-tuning SAM using a limited number of exemplars, making it suitable for such scenarios. Our approach combines two established techniques from the literature: an exemplar-guided synthesis module and the widely recognized Low-Rank Adaptation (LoRA) fine-tuning strategy, serving as data-level and model-level attempts respectively. Interestingly, our empirical findings suggest that SAM can be effectively aligned within the medical domain even with few labeled data. We validate our approach through experiments on brain tumor segmentation (BraTS) and multi-organ CT segmentation (Synapse). The comprehensive results underscore the feasibility and effectiveness of such an approach, paving the way for the practical application of SAM in the medical domain.
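LoRA's core trick is learning only a low-rank update B·A on top of a frozen weight W, so the number of trainable parameters scales with the rank rather than the full dimension; with B initialized to zero, fine-tuning starts exactly at the pre-trained model. A NumPy sketch:

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=1.0):
    """LoRA: the frozen weight W is augmented with a trainable low-rank
    update B @ A, giving an effective weight W + alpha * B @ A."""
    return (W + alpha * B @ A) @ x

rng = np.random.default_rng(0)
d, r = 64, 4                      # full dimension and (much smaller) rank
W = rng.normal(size=(d, d))       # frozen pretrained weight
A = rng.normal(size=(r, d))       # trainable down-projection
B = np.zeros((d, r))              # trainable up-projection, zero-initialized
x = rng.normal(size=d)
# With B = 0 the update vanishes, so the model is initially unchanged:
print(bool(np.allclose(lora_forward(x, W, A, B), W @ x)))  # True
```

Training then updates only A and B (here 2·d·r = 512 values instead of d² = 4096), which is why LoRA suits the few-exemplar setting the abstract describes.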
Gossip, sabotage, and friendship network dataset
Meltem Yucel, Gustav R. Sjobeck, Rebecca Glass
et al.
This article describes the data reported in the paper “Being in the know: Social network analysis of gossip and friendship on college campuses” (Yucel et al., 2021). Data were collected from members of men's and women's collegiate crew teams at a small liberal arts college. Participants (N = 44) reported how often they gossip about members of the team (positively, negatively), whom they have hooked up with on the team, whom they consider friends on the team, whether they have sabotaged or been sabotaged by any teammates, and their well-being and feelings of loneliness. This data brief provides detailed information about data preparation and participants' responses to all survey items.
Computer applications to medicine. Medical informatics, Science (General)
Patterns of Otorhinolaryngological Manifestations of COVID-19: A Longitudinal Questionnaire-Based Prospective Study in a Tertiary Hospital in Saudi Arabia
Danah Alrusayyis, Hussain Aljubran, Askar Alshaibani
et al.
Objective: Many studies investigated the manifestations of COVID-19, yet few described the pattern and severity of otorhinolaryngological symptoms. We aim to describe the picture of COVID-19-associated otorhinolaryngological manifestations and recovery to inform individualized treatment, onward referral, and prevention of complications. Design: Prospective longitudinal questionnaire-based study. Setting: The online questionnaire was completed 3 times through remote interviews over a period of 1 month, from June 2020 to July 2020. Participants: Patients with COVID-19 confirmed by RT-PCR who were clinically stable. Main Outcome Measures: Date of diagnosis, sociodemographic data, and the presence of predictive factors, such as nasal and paranasal disease, anosmia, and dysgeusia. Validated tools were used, including the Sino-Nasal Outcome Test (SNOT-22), a smell test (medical academy screening tool), the Voice Handicap Index (VHI), and the Reflux Symptom Index (RSI). Results: The questionnaire was sent to 363 patients, and the response rate was 70.8% (n = 257). The mean age was 34.58 years (SD 11.22), and 60.7% of participants were male. The most common symptom at the time of enrollment was fever (48.6%), whilst the commonest severe symptom was cough (57%). After 1 month, only 11 participants had persistent severe symptoms, especially sleep and psychological symptoms (73%), and the majority were female (63.6%). All of them had at least 1 comorbidity. There was a significant difference between the mean age of participants with severe symptoms (mean 27.45, SD 8.39) and without severe symptoms (mean 34.90, SD 2.53; t(255) = 2.17, P = .031). Conclusion: COVID-19 has a wide-ranging spectrum of presentations, with otorhinolaryngological symptoms being among the commonest and most serious. Studying these symptoms is vital to advance management options.
Computer applications to medicine. Medical informatics, Public aspects of medicine