Results for "Arctic medicine. Tropical medicine"

Showing 20 of ~4482773 results · from arXiv, DOAJ, CrossRef, Semantic Scholar

DOAJ Open Access 2026
Evaluating seasonal malaria chemoprevention coverage using the lot quality assurance sampling method in Niger

Almoustapha Mahamane Wazodan, Ibrahim Alkassoum, Mahamane Moustapha Lamine et al.

Background: Rapid data collection is paramount for monitoring and evaluating public health activities. Several techniques are used, including the Lot Quality Assurance Sampling (LQAS) method, which originated in the industrial sector and has many applications in the public health sector. The objective of this study was to demonstrate the application of the LQAS method to evaluate the coverage of seasonal malaria chemoprevention by classifying geographical areas according to their performance and whether the objectives were achieved. Methodology: The data were collected by interviewing the mother or caregiver and by looking for evidence (completed SMC cards) of the administration of medicines. The sample included 50 children aged 3 to 59 months per health district, chosen in five localities (village/district) drawn at random, at a rate of 10 children per locality. The decision value (D) is the value beyond which the batch is considered inadequate. When the lower decision threshold is 75% and the upper threshold is 90%, for a sample of 50 children, the decision value is 7. It is therefore not acceptable that, with a sample size of 50 children, more than 7 children did not receive treatment during the campaign. Results: Applying LQAS rules to the data, we found that 100% (19/19) of districts were performing well in the first and fourth cycles of the SMC campaign, as reported by mothers and caregivers. In cycle 1, coverage was 99.47% (945/950), and the administration rate of the second and third doses was 98.4% (935/950) and 98% (931/950), respectively. In cycle 4, 99.3% (943/950) of children had received their first dose of treatment, and for the second and third doses, the administration rate was 99.4% (944/950) and 99.2% (942/950), respectively. Based on documented evidence (SMC card or blister), 63% (12/19) of districts were performing well in cycle 1, and 63% (12/19) were performing well in cycle 4 of the SMC campaign. Conclusion: This study demonstrates the feasibility of applying the LQAS method in the evaluation of chemoprevention coverage for seasonal malaria. The results suggest that all the districts are performing well for both rounds according to mothers’ reports, but according to the evidence, a certain number of districts are non-performing. This classification of zones helps guide interventions and inform appropriate strategies for each zone.
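
The decision rule quoted in this abstract (sample of 50, decision value 7, thresholds 75% and 90%) can be illustrated with a short sketch. It assumes a plain binomial model, that is, simple random sampling within a district, which is a simplification of the five-locality design described above. The function names are hypothetical and are not taken from the paper.

```python
from math import comb


def binom_cdf(k, n, p):
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def classify_lot(untreated, decision_value=7):
    """A lot (district) is inadequate when untreated children exceed the decision value."""
    return "adequate" if untreated <= decision_value else "inadequate"


def misclassification_risks(n=50, d=7, lower=0.75, upper=0.90):
    """Risks of the rule under the assumed binomial model.

    alpha: chance of flagging a district whose true coverage equals `upper`.
    beta:  chance of passing a district whose true coverage equals `lower`.
    """
    alpha = 1 - binom_cdf(d, n, 1 - upper)
    beta = binom_cdf(d, n, 1 - lower)
    return alpha, beta


if __name__ == "__main__":
    alpha, beta = misclassification_risks()
    print(f"alpha={alpha:.3f}, beta={beta:.3f}")
    print(classify_lot(5), classify_lot(8))
```

Under these assumptions both error rates come out in the low percentages to low teens, which is the usual trade-off when choosing a decision value for a fixed sample size.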

Arctic medicine. Tropical medicine
arXiv Open Access 2025
Using Statistical Precision Medicine to Identify Optimal Treatments in a Heart Failure Setting

Arti Virkud, Jessie K. Edwards, Michele Jonsson Funk et al.

Identifying optimal medical treatments to improve survival has long been a critical goal of pharmacoepidemiology. Traditionally, we use an average treatment effect measure to compare outcomes between treatment plans. However, new methods leveraging advantages of machine learning combined with the foundational tenets of causal inference are offering an alternative to the average treatment effect. Here, we use three unique, precision medicine algorithms (random forests, residual weighted learning, efficient augmentation relaxed learning) to identify optimal treatment rules where patients receive the optimal treatment as indicated by their clinical history. First, we present a simple hypothetical example and a real-world application among heart failure patients using Medicare claims data. We next demonstrate how the optimal treatment rule improves the absolute risk in a hypothetical, three-modifier setting. Finally, we identify an optimal treatment rule that optimizes the time to outcome in a real-world heart failure setting. In both examples, we compare the average time to death under the optimized, tailored treatment rule with the average time to death under a universal treatment rule to show the benefit of precision medicine methods. The improvement under the optimal treatment rule in the real-world setting is greatest (additional ~9 days under the tailored rule) for survival time free of heart failure readmission.

en stat.AP
arXiv Open Access 2025
Comparisons between a Large Language Model-based Real-Time Compound Diagnostic Medical AI Interface and Physicians for Common Internal Medicine Cases using Simulated Patients

Hyungjun Park, Chang-Yun Woo, Seungjo Lim et al.

Objective: To develop an LLM-based real-time compound diagnostic medical AI interface and to perform a clinical trial comparing this interface with physicians for common internal medicine cases based on the United States Medical License Exam (USMLE) Step 2 Clinical Skill (CS) style exams. Methods: A nonrandomized clinical trial was conducted on August 20, 2024. We recruited one general physician, two internal medicine residents (2nd and 3rd year), and five simulated patients. The clinical vignettes were adapted from the USMLE Step 2 CS style exams. We developed 10 representative internal medicine cases based on actual patients and included information available on initial diagnostic evaluation. The primary outcome was the accuracy of the first differential diagnosis. Repeatability was evaluated based on the proportion of agreement. Results: The accuracy of the physicians' first differential diagnosis ranged from 50% to 70%, whereas the real-time compound diagnostic medical AI interface achieved an accuracy of 80%. The proportion of agreement for the first differential diagnosis was 0.7. The accuracy of the first and second differential diagnoses ranged from 70% to 90% for physicians, whereas the AI interface achieved an accuracy rate of 100%. The average time for the AI interface (557 sec) was 44.6% shorter than that of the physicians (1006 sec). The AI interface ($0.08) also reduced costs by 98.1% compared to the physicians' average ($4.2). Patient satisfaction scores ranged from 4.2 to 4.3 for care by physicians and were 3.9 for the AI interface. Conclusion: An LLM-based real-time compound diagnostic medical AI interface demonstrated diagnostic accuracy and patient satisfaction comparable to those of a physician, while requiring less time and lower costs. These findings suggest that AI interfaces may have the potential to assist primary care consultations for common internal medicine cases.
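
The time and cost reductions reported in this abstract follow from a simple relative-difference calculation. The sketch below reproduces the arithmetic from the quoted figures only. The helper name is hypothetical.

```python
def percent_reduction(baseline, new):
    """Relative reduction of `new` versus `baseline`, in percent."""
    return (baseline - new) / baseline * 100


# Figures quoted in the abstract: seconds per case and US dollars per case.
time_saving = percent_reduction(1006, 557)
cost_saving = percent_reduction(4.2, 0.08)

if __name__ == "__main__":
    print(f"time: {time_saving:.1f}% shorter, cost: {cost_saving:.1f}% lower")
```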

en cs.AI, cs.CL
arXiv Open Access 2025
A Survey of LLM-based Agents in Medicine: How far are we from Baymax?

Wenxuan Wang, Zizhan Ma, Zheng Wang et al.

Large Language Models (LLMs) are transforming healthcare through the development of LLM-based agents that can understand, reason about, and assist with medical tasks. This survey provides a comprehensive review of LLM-based agents in medicine, examining their architectures, applications, and challenges. We analyze the key components of medical agent systems, including system profiles, clinical planning mechanisms, medical reasoning frameworks, and external capacity enhancement. The survey covers major application scenarios such as clinical decision support, medical documentation, training simulations, and healthcare service optimization. We discuss evaluation frameworks and metrics used to assess these agents' performance in healthcare settings. While LLM-based agents show promise in enhancing healthcare delivery, several challenges remain, including hallucination management, multimodal integration, implementation barriers, and ethical considerations. The survey concludes by highlighting future research directions, including advances in medical reasoning inspired by recent developments in LLM architectures, integration with physical systems, and improvements in training simulations. This work provides researchers and practitioners with a structured overview of the current state and future prospects of LLM-based agents in medicine.

en cs.CL, cs.AI
arXiv Open Access 2025
Evaluating LLMs in Medicine: A Call for Rigor, Transparency

Mahmoud Alwakeel, Aditya Nagori, Vijay Krishnamoorthy et al.

Objectives: To evaluate the current limitations of large language models (LLMs) in medical question answering, focusing on the quality of datasets used for their evaluation. Materials and Methods: Widely-used benchmark datasets, including MedQA, MedMCQA, PubMedQA, and MMLU, were reviewed for their rigor, transparency, and relevance to clinical scenarios. Alternatives, such as challenge questions in medical journals, were also analyzed to identify their potential as unbiased evaluation tools. Results: Most existing datasets lack clinical realism, transparency, and robust validation processes. Publicly available challenge questions offer some benefits but are limited by their small size, narrow scope, and exposure to LLM training. These gaps highlight the need for secure, comprehensive, and representative datasets. Conclusion: A standardized framework is critical for evaluating LLMs in medicine. Collaborative efforts among institutions and policymakers are needed to ensure datasets and methodologies are rigorous, unbiased, and reflective of clinical complexities.

en cs.CL
arXiv Open Access 2024
Benchmarking Retrieval-Augmented Generation for Medicine

Guangzhi Xiong, Qiao Jin, Zhiyong Lu et al.

While large language models (LLMs) have achieved state-of-the-art performance on a wide range of medical question answering (QA) tasks, they still face challenges with hallucinations and outdated knowledge. Retrieval-augmented generation (RAG) is a promising solution and has been widely adopted. However, a RAG system can involve multiple flexible components, and there is a lack of best practices regarding the optimal RAG setting for various medical purposes. To systematically evaluate such systems, we propose the Medical Information Retrieval-Augmented Generation Evaluation (MIRAGE), a first-of-its-kind benchmark including 7,663 questions from five medical QA datasets. Using MIRAGE, we conducted large-scale experiments with over 1.8 trillion prompt tokens on 41 combinations of different corpora, retrievers, and backbone LLMs through the MedRAG toolkit introduced in this work. Overall, MedRAG improves the accuracy of six different LLMs by up to 18% over chain-of-thought prompting, elevating the performance of GPT-3.5 and Mixtral to GPT-4-level. Our results show that the combination of various medical corpora and retrievers achieves the best performance. In addition, we discovered a log-linear scaling property and the "lost-in-the-middle" effects in medical RAG. We believe our comprehensive evaluations can serve as practical guidelines for implementing RAG systems for medicine.

en cs.CL, cs.AI
arXiv Open Access 2024
AI-driven Alternative Medicine: A Novel Approach to Drug Discovery and Repurposing

Oleksandr Bilokon, Nataliya Bilokon, Paul Bilokon

AIAltMed is a platform designed for drug discovery and repurposing. It uses Tanimoto similarity to identify non-medicinal compounds that are structurally similar to known medicinal ones. This preprint introduces AIAltMed, discusses the concept of 'AI-driven alternative medicine,' evaluates Tanimoto similarity's advantages and limitations, and details the system's architecture. Furthermore, it explores the benefits of extending the system to include PubChem and outlines a corresponding implementation strategy.
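
As a rough illustration of the Tanimoto similarity mentioned in this abstract, the sketch below treats each compound as the set of "on" bits in a structural fingerprint. That representation, the compound names, and the bit values are all assumptions made for the example; the abstract does not say how AIAltMed encodes compounds.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity of two fingerprints given as sets of on-bits."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention assumed here for two empty fingerprints
    return len(a & b) / len(union)


def rank_by_similarity(query, candidates):
    """Sort hypothetical candidate compounds by similarity to the query, best first."""
    scored = [(name, tanimoto(query, bits)) for name, bits in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    known_drug = {1, 4, 7, 9, 12}  # made-up fingerprint bits
    library = {
        "compound_a": {1, 4, 7, 9, 13},
        "compound_b": {2, 3, 5},
        "compound_c": {1, 4, 7, 9, 12},
    }
    for name, score in rank_by_similarity(known_drug, library):
        print(name, round(score, 3))
```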

en q-bio.BM
arXiv Open Access 2024
A Preliminary Study of o1 in Medicine: Are We Closer to an AI Doctor?

Yunfei Xie, Juncheng Wu, Haoqin Tu et al.

Large language models (LLMs) have exhibited remarkable capabilities across various domains and tasks, pushing the boundaries of our knowledge in learning and cognition. The latest model, OpenAI's o1, stands out as the first LLM with an internalized chain-of-thought technique using reinforcement learning strategies. While it has demonstrated surprisingly strong capabilities on various general language tasks, its performance in specialized fields such as medicine remains unknown. To this end, this report provides a comprehensive exploration of o1 on different medical scenarios, examining 3 key aspects: understanding, reasoning, and multilinguality. Specifically, our evaluation encompasses 6 tasks using data from 37 medical datasets, including two newly constructed and more challenging question-answering (QA) tasks based on professional medical quizzes from the New England Journal of Medicine (NEJM) and The Lancet. These datasets offer greater clinical relevance compared to standard medical QA benchmarks such as MedQA, translating more effectively into real-world clinical utility. Our analysis of o1 suggests that the enhanced reasoning ability of LLMs may (significantly) benefit their capability to understand various medical instructions and reason through complex clinical scenarios. Notably, o1 surpasses the previous GPT-4 in accuracy by an average of 6.2% and 6.6% across 19 datasets and two newly created complex QA scenarios. But meanwhile, we identify several weaknesses in both the model capability and the existing evaluation protocols, including hallucination, inconsistent multilingual ability, and discrepant metrics for evaluation. We release our raw data and model outputs at https://ucsc-vlaa.github.io/o1_medicine/ for future research.

en cs.CL, cs.AI
arXiv Open Access 2023
TCM-GPT: Efficient Pre-training of Large Language Models for Domain Adaptation in Traditional Chinese Medicine

Guoxing Yang, Jianyu Shi, Zan Wang et al.

Pre-training and fine-tuning have emerged as a promising paradigm across various natural language processing (NLP) tasks. The effectiveness of pretrained large language models (LLMs) has witnessed further enhancement, holding potential for applications in the field of medicine, particularly in the context of Traditional Chinese Medicine (TCM). However, the application of these general models to specific domains often yields suboptimal results, primarily due to challenges like lack of domain knowledge, unique objectives, and computational efficiency. Furthermore, their effectiveness in specialized domains, such as Traditional Chinese Medicine, requires comprehensive evaluation. To address the above issues, we propose a novel domain-specific TCMDA (TCM Domain Adaptation) approach: efficient pre-training with a domain-specific corpus. Specifically, we first construct a large TCM-specific corpus, TCM-Corpus-1B, by identifying domain keywords and retrieving from a general corpus. Then, our TCMDA leverages LoRA, which freezes the pretrained model's weights and uses rank decomposition matrices to efficiently train specific dense layers for pre-training and fine-tuning, efficiently aligning the model with TCM-related tasks; the resulting model is TCM-GPT-7B. We further conducted extensive experiments on two TCM tasks, including TCM examination and TCM diagnosis. TCM-GPT-7B achieved the best performance across both datasets, outperforming other models by relative increments of 17% and 12% in accuracy, respectively. To the best of our knowledge, our study represents the pioneering validation of domain adaptation of a large language model with 7 billion parameters in the TCM domain. We will release both TCM-Corpus-1B and the TCM-GPT-7B model once accepted to facilitate interdisciplinary development in TCM and NLP, serving as the foundation for further study.

en cs.CL, cs.AI
arXiv Open Access 2022
Precision Medicine for the Population-The Hope and Hype of Public Health Genomics

JunBo Wu, Nathaniel Comfort

Public health is the most recent of the biomedical sciences to be seduced by the trendy moniker "precision." Advocates for "precision public health" (PPH) call for a data-driven, computational approach to public health, leveraging swaths of genomic "big data" to inform public health decision-making. Yet, like precision medicine, PPH oversells the value of genomic data to determine health outcomes, but on a population-level. A large historical literature has shown that over-emphasizing heredity tends to disproportionately harm underserved minorities and disadvantaged communities. By comparing and contrasting PPH with an earlier attempt at using big data and genetics, in the Progressive era (1890-1920), we highlight some potential risks of a genotype-driven preventive public health. We conclude by suggesting that such risks may be avoided by prioritizing data integration across many levels of analysis, from the molecular to the social.

en cs.CY
arXiv Open Access 2021
Machine Learning and Deep Learning Methods for Building Intelligent Systems in Medicine and Drug Discovery: A Comprehensive Survey

G Jignesh Chowdary, Suganya G, Premalatha M et al.

With the advancements in computer technology, there is a rapid development of intelligent systems to understand the complex relationships in data to make predictions and classifications. Artificial Intelligence-based frameworks are rapidly revolutionizing the healthcare industry. These intelligent systems are built with robust machine learning and deep learning models for early diagnosis of diseases and demonstrate a promising supplementary diagnostic method for frontline clinical doctors and surgeons. Machine Learning and Deep Learning based systems can streamline and simplify the steps involved in diagnosis of diseases from clinical and image-based data, thus providing significant clinician support and workflow optimization. They mimic human cognition and are even capable of diagnosing diseases that cannot be diagnosed with human intelligence. This paper surveys machine learning and deep learning applications across 16 medical specialties, namely Dental medicine, Haematology, Surgery, Cardiology, Pulmonology, Orthopedics, Radiology, Oncology, General medicine, Psychiatry, Endocrinology, Neurology, Dermatology, Hepatology, Nephrology, Ophthalmology, and Drug discovery. Along with the survey, we discuss the advancements of medical practices with these systems and also the impact of these systems on medical professionals.

en cs.LG, cs.AI
arXiv Open Access 2020
The implications of outcome truncation in reproductive medicine RCTs: a simulation platform for trialists and simulation study

Jack Wilkinson, Jonathan Huang, Antonia Marsden et al.

Randomised controlled trials in reproductive medicine are often subject to outcome truncation, where study outcomes are only defined in a subset of participants. Examples include birthweight (measurable only in the subgroup of participants who give birth) and miscarriage (which can only occur in participants who become pregnant). These are typically analysed by making a comparison between treatment arms within the subgroup (comparing birthweights in the subgroup who gave birth, or miscarriages in the subgroup who became pregnant). However, this approach does not represent a randomised comparison when treatment influences the probability of being observed (i.e. survival). The practical implications of this for reproductive trials are unclear. We developed a simulation platform to investigate the implications of outcome truncation for reproductive medicine trials. We used this to perform a simulation study, in which we considered the bias, Type 1 error, coverage, and precision of standard statistical analyses for truncated continuous and binary outcomes. Increasing treatment effect on the intermediate variable, strength of confounding between the intermediate and outcome variables, and interactions between treatment and confounder were found to adversely affect performance. However, within parameter ranges we would consider to be more realistic, the adverse effects were generally not drastic. For binary outcomes, the study highlighted that outcome truncation may lead to none of the participants in a study arm experiencing the outcome event. This was found to have severe consequences for inferences, and this may have implications for meta-analysis.

en stat.AP
arXiv Open Access 2020
Technical Background for "A Precision Medicine Approach to Develop and Internally Validate Optimal Exercise and Weight Loss Treatments for Overweight and Obese Adults with Knee Osteoarthritis"

Xiaotong Jiang, Amanda E. Nelson, Rebecca J. Cleveland et al.

We provide additional statistical background for the methodology developed in the clinical analysis of knee osteoarthritis in "A Precision Medicine Approach to Develop and Internally Validate Optimal Exercise and Weight Loss Treatments for Overweight and Obese Adults with Knee Osteoarthritis" (Jiang et al. 2020). Jiang et al. 2020 proposed a pipeline to learn optimal treatment rules with precision medicine models and compared them with zero-order models with a Z-test. The model performance was based on value functions, a scalar that predicts the future reward of each decision rule. The jackknife (i.e., leave-one-out cross validation) method was applied to estimate the value function and its variance of several outcomes in IDEA. IDEA is a randomized clinical trial studying three interventions (exercise (E), dietary weight loss (D), and D+E) on overweight and obese participants with knee osteoarthritis. In this report, we expand the discussion and justification with additional statistical background. We elaborate more on the background of precision medicine, the derivation of the jackknife estimator of value function and its estimated variance, the consistency property of jackknife estimator, as well as additional simulation results that reflect more of the performance of jackknife estimators. We recommend reading Jiang et al. 2020 for clinical application and interpretation of the optimal ITR of knee osteoarthritis as well as the overall understanding of the pipeline and recommend using this article to understand the underlying statistical derivation and methodology.
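
The jackknife idea referred to in this abstract can be sketched generically. The example below is the textbook leave-one-out jackknife applied to a plain estimator (the sample mean by default), with hypothetical per-participant rewards. It is not the value-function estimator derived by Jiang et al., whose details are not given in the abstract.

```python
from statistics import mean


def jackknife(values, estimator=mean):
    """Leave-one-out jackknife: return (mean of LOO estimates, variance estimate)."""
    n = len(values)
    if n < 2:
        raise ValueError("jackknife needs at least two observations")
    # Recompute the estimator n times, each time leaving one observation out.
    loo = [estimator(values[:i] + values[i + 1:]) for i in range(n)]
    loo_mean = mean(loo)
    # Standard jackknife variance: (n - 1) / n times the sum of squared deviations.
    variance = (n - 1) / n * sum((t - loo_mean) ** 2 for t in loo)
    return loo_mean, variance


if __name__ == "__main__":
    rewards = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical per-participant rewards
    estimate, variance = jackknife(rewards)
    print(estimate, variance)
```

For the sample mean, the jackknife variance coincides with the familiar sample variance divided by n, which makes this a convenient sanity check before applying the same routine to a more complex estimator.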

en stat.ML, cs.LG
