Results for "Computer applications to medicine. Medical informatics"

DOAJ Open Access 2025

An effective multi-step feature selection framework for clinical outcome prediction using electronic medical records

Hongnian Wang, Mingyang Zhang, Liyi Mai et al.

Background: Identifying key variables is essential for developing clinical outcome prediction models based on high-dimensional electronic medical records (EMR). However, despite the abundance of feature selection (FS) methods available, challenges remain in choosing the most appropriate method, deciding how many top-ranked variables to include, and ensuring these selections are meaningful from a medical perspective. Methods: We developed a practical multi-step FS framework that integrates data-driven statistical inference with a knowledge verification strategy. This framework was validated using two distinct EMR datasets targeting different clinical outcomes. The first cohort, sourced from the Medical Information Mart for Intensive Care III (MIMIC-III), focused on predicting acute kidney injury (AKI) in ICU patients. The second cohort, drawn from the MIMIC-IV Emergency Department (MIMIC-IV-ED), aimed to estimate in-hospital mortality (IHM) for patients transferred from the ED to the ICU. We employed various machine learning (ML) methods and conducted a comparative analysis considering accuracy, stability, similarity, and interpretability. The effectiveness of our FS framework was evaluated using discrimination and calibration metrics, with SHAP applied to enhance the interpretability of model decisions. Results: Cohort 1 comprised 48,780 ICU encounters, of which 8,883 (18.21%) developed AKI. Cohort 2 included 29,197 transfers from the ED to the ICU, with 3,219 (11.03%) resulting in IHM. Among the ten ML methods evaluated, the tree-based ensemble method achieved the highest accuracy. As the number of top-ranking features increased, the models’ accuracy began to stabilize, while feature subset stability (considering sample variations) and inter-method feature similarity reached optimal levels, confirming the validity of the FS framework. The integration of interpretative methods and expert knowledge in the final step further improved feature interpretability. The FS framework effectively reduced the number of features (e.g., from 380 to 35 for Cohort 1, and from 273 to 54 for Cohort 2) without significantly affecting prediction performance (DeLong test, p > 0.05). Conclusion: The multi-step FS method developed in this study successfully reduces the dimensionality of features in EMR while preserving the accuracy of clinical outcome prediction. Furthermore, it improves the interpretability of risk factors by incorporating expert knowledge validation.
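The abstract does not say how feature subset stability or inter-method similarity was measured. As an illustration only, the sketch below assumes a mean pairwise Jaccard index over top-k feature subsets; the function names and feature names are hypothetical and not taken from the paper.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index between two feature subsets (1.0 if both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def mean_pairwise_jaccard(subsets):
    """Average Jaccard index over all pairs of feature subsets."""
    pairs = list(combinations(subsets, 2))
    if not pairs:
        raise ValueError("need at least two feature subsets")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical top-3 subsets selected on three resamples of the same cohort.
runs = [
    ["creatinine", "urine_output", "age"],
    ["creatinine", "urine_output", "lactate"],
    ["creatinine", "age", "lactate"],
]
stability = mean_pairwise_jaccard(runs)  # each pair shares 2 of 4 features -> 0.5
```

The same function can compare subsets chosen by different FS methods on one sample, which gives a rough inter-method similarity score under the same assumption.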

Computer applications to medicine. Medical informatics


DOAJ Open Access 2025

Knowledge, Readiness, Willingness-to-Use, and Willingness-to-Pay for Telehealth in Nonlife-Threatening Emergency Department Visits

Vahé Heboyan, Phillip Coule, Davide Mariotti et al.

Background: The emergency department (ED) provides a significant portion of health care services in the United States, and its utilization has increased over the past decade. ED overcrowding remains a considerable challenge to many EDs. The objectives of this study were (1) to evaluate the knowledge of telehealth and readiness to use it among patients who visit EDs in a nonurgent triage category and (2) to estimate their willingness-to-use and willingness-to-pay for telehealth consultations. Methods: A structured questionnaire was administered using a tablet to adult patients who visited the ED of a large medical center and who were triaged into a nonurgent category. Respondents were asked about their sociodemographic and ED visit characteristics and health and telehealth utilization history. Then, we presented them with a hypothetical scenario for visiting a board-certified ED doctor through telehealth instead of in-person visits, and, using a double-bound dichotomous choice iterative bidding algorithm, we solicited their willingness-to-pay for such a telehealth visit. Results: A total of 171 patients agreed to participate in the study. More than half of the respondents (n = 107; 62.6%) said they have health insurance. Almost half of the respondents (n = 71; 41.5%) reported the main reason for going to the ED was an ongoing condition or concern. More than two-thirds of the respondents identified themselves as being very proficient with using a smartphone or tablet (n = 116; 67.8%), and only a few (n = 21; 12.3%) reported not having any internet-capable device. Most respondents (n = 148; 86.5%) had never heard about telehealth. However, after a brief description of telehealth, we found that approximately two-thirds of the patients would be willing to use or consider using telehealth (n = 107; 62.6%), and one-third (n = 64; 37.4%) would not be interested. We did not observe any statistically significant differences in willingness-to-use. 
However, we observed statistically significant differences in the willingness-to-pay $50 by gender (p < 0.01), by currently having a regular doctor/clinic (p < 0.05), and by health insurance status. Conclusions: Hospitals should consider investigating telehealth services that can be provided to their communities as an option instead of visiting their EDs. While technology does not seem to be a barrier to telehealth, more educational initiatives to inform the public about telehealth are desirable. A targeted advertisement campaign to recommend telehealth for nonlife-threatening ED visits could be developed once more user characteristics are collected.
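For readers unfamiliar with double-bounded dichotomous choice elicitation, here is a minimal sketch of its usual logic: a "yes" to the first bid triggers a higher follow-up bid, a "no" triggers a lower one, and the two answers bracket the respondent's willingness-to-pay. Only the $50 amount appears in the abstract; the $25 and $75 follow-up bids and the function name are assumptions made for illustration, and the study's actual bidding design may differ.

```python
import math


def wtp_interval(initial_bid, lower_bid, higher_bid, accepts):
    """Bracket willingness-to-pay from two yes/no answers.

    accepts: callable taking a bid amount and returning True for "yes".
    Returns (low, high), meaning low <= WTP < high.
    """
    if not lower_bid < initial_bid < higher_bid:
        raise ValueError("bids must satisfy lower < initial < higher")
    if accepts(initial_bid):
        if accepts(higher_bid):
            return (higher_bid, math.inf)   # yes, yes
        return (initial_bid, higher_bid)    # yes, no
    if accepts(lower_bid):
        return (lower_bid, initial_bid)     # no, yes
    return (0.0, lower_bid)                 # no, no


# Hypothetical respondent whose true WTP is $60.
interval = wtp_interval(50, 25, 75, lambda bid: bid <= 60)  # (50, 75)
```

The resulting intervals are what an interval-censored regression would typically be fitted to when estimating mean willingness-to-pay.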

Computer applications to medicine. Medical informatics


DOAJ Open Access 2025

Improved Depression Symptoms Among Low-Income Mexican Americans Participating in a Community-Clinical Diabetes Intervention

MinJae Lee, Joseph H. Conroy, Zihan Yang et al.

Background: Adults with type 2 diabetes mellitus and comorbid depression face complex disease management. Salud y Vida, a diabetes management intervention for Mexican Americans in the Rio Grande Valley of Texas, may mitigate depression through social support and community-clinic referrals. Methods: In a cohort study, 292 Salud y Vida participants completed the patient health questionnaire (PHQ-9) at baseline, month 6, and month 12. A PHQ-9 score of ≥5 indicated mild depression and activated a mental health referral. Using SAS 9.4 with significance set at <.05, we conducted multivariable longitudinal negative binomial regression to assess changes in depression level. Results: The proportion of participants with a PHQ-9 ≥5 decreased from 36% at baseline to 18% in month 6 (adjusted risk ratio = 0.51, 95% CI = 0.41-0.62; P < .001). Among those with clinical depression at baseline (n = 121), mean PHQ-9 scores dropped 45% by month 6 (9.60-5.28; adjusted rate ratio = 0.55; 95% CI = 0.47, 0.65, P < .001) and an additional 22% by month 12 (5.28-4.10; adjusted rate ratio = 0.78; 95% CI = 0.66, 0.91; P < .002). Conclusion: Salud y Vida participation is correlated with significant depression symptom improvements in Mexican American adults with diabetes and comorbid depression, demonstrating that chronic care management interventions can address multiple chronic conditions.
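The percentage drops quoted above follow directly from the reported mean PHQ-9 scores. The short check below reproduces that arithmetic; it uses only the means from the abstract and does not reproduce the adjusted regression model.

```python
def relative_drop(before, after):
    """Fractional decrease from `before` to `after`."""
    return (before - after) / before


drop_month_6 = relative_drop(9.60, 5.28)    # about 0.45
drop_month_12 = relative_drop(5.28, 4.10)   # about 0.22

# Crude ratios of the means, for comparison with the adjusted rate ratios.
ratio_month_6 = 5.28 / 9.60                 # about 0.55
ratio_month_12 = 4.10 / 5.28                # about 0.78
```

The crude ratios round to the same values as the reported adjusted rate ratios (0.55 and 0.78), although the adjusted figures come from the multivariable model and not from this division.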

Computer applications to medicine. Medical informatics, Public aspects of medicine


DOAJ Open Access 2025

DconnLoop: a deep learning model for predicting chromatin loops based on multi-source data integration

Junfeng Wang, Kuikui Cheng, Chaokun Yan et al.

Background: Chromatin loops are critical for the three-dimensional organization of the genome and gene regulation. Accurate identification of chromatin loops is essential for understanding the regulatory mechanisms in disease. However, current mainstream detection methods rely primarily on single-source data, such as Hi-C, which limits these methods’ ability to capture the diverse features of chromatin loop structures. In contrast, multi-source data integration and deep learning approaches, though not yet widely applied, hold significant potential. Results: In this study, we developed a method called DconnLoop to integrate Hi-C, ChIP-seq, and ATAC-seq data to predict chromatin loops. This method achieves feature extraction and fusion of multi-source data by integrating residual mechanisms, directional connectivity excitation modules, and interactive feature space decoders. Finally, we apply density estimation and density clustering to the genome-wide prediction results to identify more representative loops. The code is available from https://github.com/kuikui-C/DconnLoop. Conclusions: The results demonstrate that DconnLoop outperforms existing methods in both precision and recall. In various experiments, including Aggregate Peak Analysis and peak enrichment comparisons, DconnLoop consistently shows advantages. Extensive ablation studies and validation across different sequencing depths further confirm DconnLoop’s robustness and generalizability.

Computer applications to medicine. Medical informatics, Biology (General)


arXiv Open Access 2025

Journal Publications in Medicine: Ranking vs. Interdisciplinarity

Anbang Du, Michael Head, Markus Brede

Interdisciplinary research (IDR) is critical for innovation and addressing complex societal issues. We characterise the interdisciplinary knowledge structure of PubMed research articles in medicine as correlation networks of medical concepts and compare the interdisciplinarity of articles between high-ranking (impactful) and less high-ranking (less impactful) medical journals. We found that impactful medical journals tend to publish research that is less interdisciplinary than the research published in less impactful journals. Because it bridges distant knowledge clusters in the networks, cancer-related research can be seen as one of the main drivers of interdisciplinarity in medical science. Using signed difference networks, we also investigate the clustering of deviations between high and low impact journal correlation networks. We generally find a mild tendency for strong link differences to be adjacent. Furthermore, we find topic clusters of deviations that shift over time. In contrast, topic clusters in the original networks are static over time and can be seen as the core knowledge structure in medicine. Overall, journals and policymakers should encourage initiatives to accommodate interdisciplinarity within the existing infrastructures to maximise the potential patient benefits from IDR.

en cs.SI, physics.soc-ph


arXiv Open Access 2025

Domain-Specific Machine Translation to Translate Medicine Brochures in English to Sorani Kurdish

Mariam Shamal, Hossein Hassani

Access to Kurdish medicine brochures is limited, depriving Kurdish-speaking communities of critical health information. To address this problem, we developed a specialized Machine Translation (MT) model to translate English medicine brochures into Sorani Kurdish using a parallel corpus of 22,940 aligned sentence pairs from 319 brochures, sourced from two pharmaceutical companies in the Kurdistan Region of Iraq (KRI). We trained a Statistical Machine Translation (SMT) model using the Moses toolkit, conducting seven experiments that resulted in BLEU scores ranging from 22.65 to 48.93. We translated three new brochures to improve the evaluation process and encountered unknown words. We addressed unknown words through post-processing with a medical dictionary, resulting in BLEU scores of 56.87, 31.05, and 40.01. Human evaluation by native Kurdish-speaking pharmacists, physicians, and medicine users showed that 50% of professionals found the translations consistent, while 83.3% rated them accurate. Among users, 66.7% considered the translations clear and felt confident using the medications.

en cs.CL


DOAJ Open Access 2024

An ML-based decision support system for reliable diagnosis of ovarian cancer by leveraging explainable AI

Asif Newaz, Abdullah Taharat, Md Sakibul Islam et al.

Ovarian cancer (OC) is one of the most prevalent types of cancer in women. Early and accurate diagnosis is crucial for the survival of the patients. However, the majority of women are diagnosed in advanced stages due to the lack of effective biomarkers and accurate screening tools. While previous studies sought a common biomarker, our study suggests different biomarkers for the premenopausal and postmenopausal populations. This can provide a new perspective in the search for novel predictors for the effective diagnosis of OC. A genetic algorithm was used to identify the most significant biomarkers. The XGBoost classifier was then trained on the selected features, and high ROC-AUC scores of 0.864 and 0.911 were obtained for the premenopausal and postmenopausal populations, respectively. Lack of explainability is one major limitation of current AI systems. The stochastic nature of ML algorithms raises concerns about the reliability of the system, as it is difficult to interpret the reasons behind the decisions. To increase the trustworthiness and accountability of the diagnostic system, and to provide transparency and explanations behind the predictions, explainable AI has been incorporated into the ML framework. SHAP is employed to quantify the contributions of the selected biomarkers and determine the most discriminative features. Merging SHAP with the ML models enables clinicians to investigate individual decisions made by the model and gain insights into the factors leading to each prediction. Thus, a hybrid decision support system has been established that can eliminate the bottlenecks caused by the black-box nature of ML algorithms, providing a safe and trustworthy AI tool. The diagnostic accuracy obtained from the proposed system outperforms the existing methods, as well as the state-of-the-art ROMA algorithm, by a substantial margin, which signifies its potential to be an effective tool in the differential diagnosis of OC.

Computer applications to medicine. Medical informatics


DOAJ Open Access 2024

Impact of an Online Discussion Forum on Self-Guided Internet-Delivered Cognitive Behavioral Therapy for Public Safety Personnel: Randomized Trial

Hugh C McCall, Heather D Hadjistavropoulos

Background: Internet-delivered cognitive behavioral therapy (ICBT) is an effective and accessible treatment for various mental health concerns. ICBT has shown promising treatment outcomes among public safety personnel (PSP), who experience high rates of mental health problems and face barriers to accessing other mental health services. Client engagement and clinical outcomes are better in ICBT with therapist guidance, but ICBT is easier to implement on a large scale when it is self-guided. Therefore, it is important to identify strategies to improve outcomes and engagement in self-guided ICBT and other self-guided digital mental health interventions. One such strategy is the use of online discussion forums to provide ICBT clients with opportunities for mutual social support. Self-guided interventions accompanied by online discussion forums have shown excellent treatment outcomes, but there is a need for research experimentally testing the impact of online discussion forums in ICBT. Objective: We aimed to evaluate a transdiagnostic, self-guided ICBT intervention tailored specifically for PSP (which had not previously been assessed), assess the impact of adding a therapist-moderated online discussion forum on outcomes, and analyze participants’ feedback to inform future research and implementation efforts. Methods: In this randomized trial, we randomly assigned participating PSP (N=107) to access an 8-week transdiagnostic, self-guided ICBT course with or without a built-in online discussion forum. Enrollment and participation were entirely web-based. We assessed changes in depression, anxiety, and posttraumatic stress as well as several secondary outcome measures (eg, treatment engagement and satisfaction) using questionnaires at the pre-enrollment, 8-week postenrollment, and 20-week postenrollment time points. Mixed methods analyses included multilevel modeling and qualitative content analysis. Results: Participants engaged minimally with the forum, creating 9 posts. There were no differences in treatment outcomes between participants who were randomly assigned to access the forum (56/107, 52.3%) and those who were not (51/107, 47.7%). Across conditions, participants who reported clinically significant symptoms during enrollment showed large and statistically significant reductions in symptoms (P<.05 and d>0.97 in all cases). Participants also showed good treatment engagement and satisfaction, with 43% (46/107) of participants fully completing the intervention during the course of the study and 96% (79/82) indicating that the intervention was worth their time. Conclusions: Previous research has shown excellent clinical outcomes for self-guided ICBT accompanied by discussion forums and good engagement with those forums. Although clinical outcomes in our study were excellent across conditions, engagement with the forum was poor, in contrast to previous research. We discuss several possible interpretations of this finding (eg, related to the population under study or the design of the forum). Our findings highlight a need for more research evaluating the impact of online discussion forums and other strategies for improving outcomes and engagement in self-guided ICBT and other digital mental health interventions. Trial Registration: ClinicalTrials.gov NCT05145582; https://clinicaltrials.gov/study/NCT05145582

Computer applications to medicine. Medical informatics, Public aspects of medicine


arXiv Open Access 2024

Large Language Models in Biomedical and Health Informatics: A Review with Bibliometric Analysis

Huizi Yu, Lizhou Fan, Lingyao Li et al.

Large Language Models (LLMs) have rapidly become important tools in Biomedical and Health Informatics (BHI), enabling new ways to analyze data, treat patients, and conduct research. This study aims to provide a comprehensive overview of LLM applications in BHI, highlighting their transformative potential and addressing the associated ethical and practical challenges. We reviewed 1,698 research articles from January 2022 to December 2023, categorizing them by research themes and diagnostic categories. Additionally, we conducted network analysis to map scholarly collaborations and research dynamics. Our findings reveal a substantial increase in the potential applications of LLMs to a variety of BHI tasks, including clinical decision support, patient interaction, and medical document analysis. Notably, LLMs are expected to be instrumental in enhancing the accuracy of diagnostic tools and patient care protocols. The network analysis highlights dense and dynamically evolving collaborations across institutions, underscoring the interdisciplinary nature of LLM research in BHI. A significant trend was the application of LLMs in managing specific disease categories such as mental health and neurological disorders, demonstrating their potential to influence personalized medicine and public health strategies. LLMs hold promising potential to further transform biomedical research and healthcare delivery. While promising, the ethical implications and challenges of model validation call for rigorous scrutiny to optimize their benefits in clinical settings. This survey serves as a resource for stakeholders in healthcare, including researchers, clinicians, and policymakers, to understand the current state and future potential of LLMs in BHI.

en cs.DL, cs.AI


arXiv Open Access 2024

Segment Anything Model for Medical Image Segmentation: Current Applications and Future Directions

Yichi Zhang, Zhenrong Shen, Rushi Jiao

Due to the inherent flexibility of prompting, foundation models have emerged as the predominant force in the fields of natural language processing and computer vision. The recent introduction of the Segment Anything Model (SAM) signifies a noteworthy expansion of the prompt-driven paradigm into the domain of image segmentation, thereby introducing a plethora of previously unexplored capabilities. However, the viability of its application to medical image segmentation remains uncertain, given the substantial distinctions between natural and medical images. In this work, we provide a comprehensive overview of recent endeavors aimed at extending the efficacy of SAM to medical image segmentation tasks, encompassing both empirical benchmarking and methodological adaptations. Additionally, we explore potential avenues for future research directions in SAM's role within medical image segmentation. While direct application of SAM to medical image segmentation does not yield satisfactory performance on multi-modal and multi-target medical datasets so far, numerous insights gleaned from these efforts serve as valuable guidance for shaping the trajectory of foundational models in the realm of medical image analysis. To support ongoing research endeavors, we maintain an active repository that contains an up-to-date paper list and a succinct summary of open-source projects at https://github.com/YichiZhang98/SAM4MIS.

en eess.IV, cs.CV


arXiv Open Access 2024

AliFuse: Aligning and Fusing Multi-modal Medical Data for Computer-Aided Diagnosis

Qiuhui Chen, Yi Hong

Medical data collected for diagnostic decisions are typically multimodal, providing comprehensive information on a subject. While computer-aided diagnosis systems can benefit from multimodal inputs, effectively fusing such data remains a challenging task and a key focus in medical research. In this paper, we propose a transformer-based framework, called Alifuse, for aligning and fusing multimodal medical data. Specifically, we convert medical images and both unstructured and structured clinical records into vision and language tokens, employing intramodal and intermodal attention mechanisms to learn unified representations of all imaging and non-imaging data for classification. Additionally, we integrate restoration modeling with contrastive learning frameworks, jointly learning the high-level semantic alignment between images and texts and the low-level understanding of one modality with the help of another. We apply Alifuse to classify Alzheimer's disease, achieving state-of-the-art performance on five public datasets and outperforming eight baselines.

en cs.CV


arXiv Open Access 2023

Nanorobotics in Medicine: A Systematic Review of Advances, Challenges, and Future Prospects

Shishir Rajendran, Prathic Sundararajan, Ashi Awasthi et al.

Nanorobotics offers an emerging frontier in biomedicine, holding the potential to revolutionize diagnostic and therapeutic applications through its unique capabilities in manipulating biological systems at the nanoscale. Following PRISMA guidelines, a comprehensive literature search was conducted using IEEE Xplore and PubMed databases, resulting in the identification and analysis of a total of 414 papers. The studies were filtered to include only those that addressed both nanorobotics and direct medical applications. Our analysis traces the technology's evolution, highlighting its growing prominence in medicine as evidenced by the increasing number of publications over time. Applications ranged from targeted drug delivery and single-cell manipulation to minimally invasive surgery and biosensing. Despite the promise, limitations such as biocompatibility, precise control, and ethical concerns were also identified. This review aims to offer a thorough overview of the state of nanorobotics in medicine, drawing attention to current challenges and opportunities, and providing directions for future research in this rapidly advancing field.

en cs.RO, q-bio.TO


arXiv Open Access 2023

Can GPT-4V(ision) Serve Medical Applications? Case Studies on GPT-4V for Multimodal Medical Diagnosis

Chaoyi Wu, Jiayu Lei, Qiaoyu Zheng et al.

Driven by large foundation models, the development of artificial intelligence has witnessed tremendous progress lately, leading to a surge of general interest from the public. In this study, we aim to assess the performance of OpenAI's newest model, GPT-4V(ision), specifically in the realm of multimodal medical diagnosis. Our evaluation encompasses 17 human body systems, including Central Nervous System, Head and Neck, Cardiac, Chest, Hematology, Hepatobiliary, Gastrointestinal, Urogenital, Gynecology, Obstetrics, Breast, Musculoskeletal, Spine, Vascular, Oncology, Trauma, and Pediatrics, with images taken from 8 modalities used in daily clinical routine, e.g., X-ray, Computed Tomography (CT), Magnetic Resonance Imaging (MRI), Positron Emission Tomography (PET), Digital Subtraction Angiography (DSA), Mammography, Ultrasound, and Pathology. We probe GPT-4V's ability on multiple clinical tasks with or without patient history provided, including imaging modality and anatomy recognition, disease diagnosis, report generation, and disease localisation. Our observations show that, while GPT-4V demonstrates proficiency in distinguishing between medical image modalities and anatomy, it faces significant challenges in disease diagnosis and generating comprehensive reports. These findings underscore that, while large multimodal models have made significant advancements in computer vision and natural language processing, they remain far from being able to effectively support real-world medical applications and clinical decision-making. All images used in this report can be found at https://github.com/chaoyi-wu/GPT-4V_Medical_Evaluation.

en cs.CV, cs.CL


DOAJ Open Access 2022

Complete genome sequence data of tropical thermophilic bacterium Parageobacillus caldoxylosilyticus ER4B

Xin Jie Ching, Nazalan Najimudin, Yoke Kqueen Cheah et al.

Parageobacillus caldoxylosilyticus, previously identified as Geobacillus caldoxylosilyticus, is a thermophilic Gram-positive bacterium that can easily withstand growth temperatures ranging from 40 °C to 70 °C. Here, we present the first complete genome sequence of Parageobacillus caldoxylosilyticus ER4B, which was isolated from an empty oil palm fruit bunch compost in Malaysia. Whole genome sequencing was performed using the PacBio RSII platform. The genome size of strain ER4B was around 3.9 Mbp, with a GC content of 44.31%. The genome consists of two contigs: the larger contig (3,909,276 bp) represents the chromosome, while the smaller one (54,250 bp) represents the plasmid. A total of 4,164 genes were successfully predicted, including 3,972 protein-coding sequences, 26 rRNAs, 91 tRNAs, 74 miscRNAs, and 1 tmRNA. The genome sequence data of strain ER4B reported here may contribute to the current molecular information on the species. It may also facilitate the discovery of molecular traits related to thermal stress, thus expanding our understanding of acclimation or adaptation to extreme temperatures in bacteria.

Computer applications to medicine. Medical informatics, Science (General)


DOAJ Open Access 2022

Using machine learning to determine the correlation between physiological and environmental parameters and the induction of acute mountain sickness

Chih-Yuan Wei, Ping-Nan Chen, Shih-Sung Lin et al.

Background: Recent studies on acute mountain sickness (AMS) have used fixed-location and fixed-time measurements of environmental and physiological variables to determine the influence of AMS-associated factors on the human body. This study aims to measure, in real time, the environmental conditions and physiological variables of participants in high-altitude regions and to develop an AMS risk evaluation model that forecasts the prospective development of AMS so that its onset can be prevented. Results: Thirty-two participants were recruited, namely 25 men and 7 women, and they hiked from the Cuifeng Mountain Forest Park parking lot (altitude: 2300 m) to Wuling (altitude: 3275 m). Regression and classification machine learning analyses were performed on physiological and environmental data and Lake Louise Acute Mountain Sickness Scores (LLS) to establish an algorithm for AMS risk analysis. The individual R² coefficients of determination between the LLS and the measured altitude, ambient temperature, atmospheric pressure, relative humidity, climbing speed, heart rate, blood oxygen saturation (SpO2), and heart rate variability (HRV) were 0.1, 0.23, 0, 0.24, 0, 0.24, 0.27, and 0.35, respectively; incorporating all of these variables, the R² coefficient was 0.62. The bagged trees classifier achieved favorable classification results, yielding a model sensitivity, specificity, accuracy, and area under the receiver operating characteristic curve of 0.999, 0.994, 0.998, and 1, respectively. Conclusion: The experimental results indicate that machine learning multivariate analysis achieves higher AMS prediction accuracy than analyses that use single variables. The developed AMS evaluation model can serve as a reference for the future development of wearable devices capable of providing timely warnings of AMS risks to hikers.
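The sensitivity, specificity, and accuracy figures quoted above are standard functions of a binary confusion matrix. The sketch below shows those definitions; the counts in the example are invented for illustration and are not the study's data.

```python
def classification_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # share of true AMS cases that were flagged
    specificity = tn / (tn + fp)   # share of non-AMS cases correctly cleared
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy


# Hypothetical counts: 100 positive and 100 negative samples.
sens, spec, acc = classification_metrics(tp=90, fp=20, tn=80, fn=10)
# sens = 0.9, spec = 0.8, acc = 0.85
```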

Computer applications to medicine. Medical informatics, Biology (General)


arXiv Open Access 2022

A Comparative Study of Confidence Calibration in Deep Learning: From Computer Vision to Medical Imaging

Riqiang Gao, Thomas Li, Yucheng Tang et al.

Although deep learning prediction models have been successful in the discrimination of different classes, they can often suffer from poor calibration across challenging domains including healthcare. Moreover, the long-tail distribution poses great challenges in deep learning classification problems including clinical disease prediction. There are approaches proposed recently to calibrate deep prediction in computer vision, but there are no studies found to demonstrate how the representative models work in different challenging contexts. In this paper, we bridge the confidence calibration from computer vision to medical imaging with a comparative study of four high-impact calibration models. Our studies are conducted in different contexts (natural image classification and lung cancer risk estimation) including in balanced vs. imbalanced training sets and in computer vision vs. medical imaging. Our results support key findings: (1) We achieve new conclusions which are not studied under different learning contexts, e.g., combining two calibration models that both mitigate the overconfident prediction can lead to under-confident prediction, and simpler calibration models from the computer vision domain tend to be more generalizable to medical imaging. (2) We highlight the gap between general computer vision tasks and medical imaging prediction, e.g., calibration methods ideal for general computer vision tasks may in fact damage the calibration of medical imaging prediction. (3) We also reinforce previous conclusions in natural image classification settings. We believe that this study has merits to guide readers to choose calibration models and understand gaps between general computer vision and medical imaging domains.
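The abstract does not name the calibration metric it uses. One common choice in this literature is the expected calibration error (ECE), which bins predictions by confidence and averages the gap between confidence and accuracy in each bin. The sketch below assumes equal-width bins and is an illustration, not the paper's evaluation code.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE with equal-width confidence bins over [0, 1].

    confidences: predicted probability of the chosen class, one per sample.
    correct: booleans, True where the prediction was right.
    """
    n = len(confidences)
    if n == 0 or n != len(correct):
        raise ValueError("inputs must be non-empty and of equal length")
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[idx].append((conf, ok))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += len(members) / n * abs(avg_conf - accuracy)
    return ece


# Overconfident toy model: 95% confidence but only half the answers are right.
ece = expected_calibration_error([0.95] * 4, [True, True, False, False])  # about 0.45
```

A perfectly calibrated model gives an ECE of 0. Overconfident and under-confident predictions both raise it, so the metric alone does not tell which of the two the abstract's combined-methods finding refers to.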

en cs.CV


arXiv Open Access 2022

Approximate Computing and the Efficient Machine Learning Expedition

Jörg Henkel, Hai Li, Anand Raghunathan et al.

Approximate computing (AxC) has long been accepted as a design alternative for efficient system implementation at the cost of relaxed accuracy requirements. Although AxC has been studied in various application domains, it has thrived over the past decade when applied to Machine Learning (ML). The inherently approximate nature of ML models, together with the increased computational overheads of ML applications (which were effectively mitigated by corresponding approximations), led to a perfect match and a fruitful synergy. AxC for AI/ML has moved beyond academic prototypes. In this work, we shed light on the synergistic nature of AxC and ML and elucidate the impact of AxC on designing efficient ML systems. To that end, we present an overview and taxonomy of AxC for ML and use two descriptive application scenarios to demonstrate how AxC boosts the efficiency of ML systems.

en cs.AR, cs.LG

arXiv Open Access 2022

Virtual vs. Reality: External Validation of COVID-19 Classifiers using XCAT Phantoms for Chest Computed Tomography

Fakrul Islam Tushar, Ehsan Abadi, Saman Sotoudeh-Paima et al.

Research studies of artificial intelligence models in medical imaging have been hampered by poor generalization. This problem has been especially concerning over the last year with numerous applications of deep learning for COVID-19 diagnosis. Virtual imaging trials (VITs) could provide a solution for objective evaluation of these models. In this work, using VITs, we created the CVIT-COVID dataset, comprising 180 virtually imaged computed tomography (CT) images from simulated COVID-19 and normal phantom models under different COVID-19 morphology and imaging properties. We evaluated the performance of an open-source deep-learning model from the University of Waterloo trained with multi-institutional data and an in-house model trained with the open clinical dataset called MosMed. We further validated the models' performance against open clinical data of 305 CT images to compare performance on virtual vs. real clinical data. The open-source model was published with nearly perfect performance on the original Waterloo dataset but showed a consistent performance drop in external testing on another clinical dataset (AUC=0.77) and our simulated CVIT-COVID dataset (AUC=0.55). The in-house model achieved an AUC of 0.87 on the internal test set (MosMed test set). However, performance dropped to AUCs of 0.65 and 0.69 when evaluated on the clinical dataset and our simulated CVIT-COVID dataset, respectively. The VIT framework offered control over imaging conditions, allowing us to show that there was no change in performance as CT exposure was changed from 28.5 to 57 mAs. The VIT framework also provided voxel-level ground truth, revealing that the performance of the in-house model was much higher at AUC=0.87 for diffuse COVID-19 infection size >2.65% lung volume versus AUC=0.52 for focal disease with <2.65% volume. The virtual imaging framework enabled these uniquely rigorous analyses of model performance.

en eess.IV, cs.CV

arXiv Open Access 2022

Industry applications of neutral-atom quantum computing solving independent set problems

Jonathan Wurtz, Pedro L. S. Lopes, Christoph Gorgulla et al.

Architectures for quantum computing based on neutral atoms have risen to prominence as candidates for both near- and long-term applications. These devices are particularly well suited to solving independent set problems, as the combinatorial constraints can be naturally encoded in the low-energy Hilbert space due to the Rydberg blockade mechanism. Here, we approach this connection with a focus on a particular device architecture and explore the ubiquity and utility of independent set problems by providing examples of real-world applications. After a pedagogical introduction to the relevant basic graph theory concepts, we briefly discuss how to encode independent set problems in Rydberg Hamiltonians. We then outline the major classes of independent set problems and include associated example applications with industry and social relevance. We identify a wide range of sectors that could benefit from efficient solutions of independent set problems, from telecommunications and logistics to finance and strategic planning, and present some general strategies for efficient problem encoding and implementation on neutral-atom platforms.
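For readers unfamiliar with the underlying graph problem, the following is a minimal classical sketch of what an independent set is and how a maximum one can be found by exhaustive search on a tiny graph. It is not the Rydberg-Hamiltonian encoding the abstract refers to; the function names and the example graph are illustrative assumptions.

```python
from itertools import combinations


def is_independent_set(edges, nodes):
    """True if no edge has both endpoints inside `nodes`."""
    chosen = set(nodes)
    return not any(u in chosen and v in chosen for u, v in edges)


def maximum_independent_set(n, edges):
    """Exhaustive search over vertices 0..n-1, largest subsets first.

    Runs in exponential time, so it is only suitable for very small graphs.
    """
    for size in range(n, 0, -1):
        for subset in combinations(range(n), size):
            if is_independent_set(edges, subset):
                return set(subset)
    return set()


# A 5-cycle: every vertex conflicts with its two neighbours.
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
best = maximum_independent_set(5, cycle_edges)
```

On the 5-cycle, no three vertices can be chosen without two of them being adjacent, so the largest independent set has two vertices.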

en quant-ph

arXiv Open Access 2022

The Functional Machine Calculus

Willem Heijltjes

This paper presents the Functional Machine Calculus (FMC) as a simple model of higher-order computation with "reader/writer" effects: higher-order mutable store, input/output, and probabilistic and non-deterministic computation. The FMC derives from the lambda-calculus by taking the standard operational perspective of a call-by-name stack machine as primary, and introducing two natural generalizations. One, "locations", introduces multiple stacks, which each may represent an effect and so enable effect operators to be encoded into the abstraction and application constructs of the calculus. The second, "sequencing", is known from kappa-calculus and concatenative programming languages, and introduces the imperative notions of "skip" and "sequence". This enables the encoding of reduction strategies, including call-by-value lambda-calculus and monadic constructs. The encoding of effects into generalized abstraction and application means that standard results from the lambda-calculus may carry over to effects. The main result is confluence, which is possible because encoded effects reduce algebraically rather than operationally. Reduction generates the familiar algebraic laws for state, and unlike in the monadic setting, reader/writer effects combine seamlessly. A system of simple types confers termination of the machine.

en cs.PL

Results for "Computer applications to medicine. Medical informatics"