Background: Retrieval-augmented generation (RAG) can enable large language models (LLMs) to generate more accurate, professional, and timely responses without fine-tuning. However, because traditional Chinese medicine (TCM) clinical diagnosis and treatment involve complex reasoning processes and substantial individual differences, traditional RAG methods often perform poorly in this domain. Objective: To address the limitations of conventional RAG approaches in TCM applications, this study aims to develop an improved RAG framework tailored to the characteristics of TCM reasoning. Methods: We developed TCM-DiffRAG, a RAG framework that integrates knowledge graphs (KGs) with chains of thought (CoT). TCM-DiffRAG was evaluated on three distinctive TCM test datasets. Results: TCM-DiffRAG achieved significant performance improvements over native LLMs. For example, the qwen-plus model achieved scores of 0.927, 0.361, and 0.038, which rose to 0.952, 0.788, and 0.356 with TCM-DiffRAG. The improvements were even more pronounced for non-Chinese LLMs. Additionally, TCM-DiffRAG outperformed directly supervised fine-tuned (SFT) LLMs and other benchmark RAG methods. Conclusions: TCM-DiffRAG shows that integrating structured TCM knowledge graphs with CoT-based reasoning substantially improves performance in individualized diagnostic tasks. The joint use of universal and personalized knowledge graphs enables effective alignment between general knowledge and clinical reasoning. These results highlight the potential of reasoning-aware RAG frameworks for advancing LLM applications in traditional Chinese medicine.
Retrieval-augmented generation (RAG) promises grounded question answering, yet domain settings with multiple heterogeneous knowledge bases (KBs) remain challenging. In Chinese Tibetan medicine, encyclopedia entries are often dense and easy to match, which can dominate retrieval even when classics or clinical papers provide more authoritative evidence. We study a practical setting with three KBs (encyclopedia, classics, and clinical papers) and a 500-query benchmark (cutoff $K{=}5$) covering both single-KB and cross-KB questions. We propose two complementary methods to improve traceability, reduce hallucinations, and enable cross-KB verification. First, DAKS performs KB routing and budgeted retrieval to mitigate density-driven bias and to prioritize authoritative sources when appropriate. Second, we use an alignment graph to guide evidence fusion and coverage-aware packing, improving cross-KB evidence coverage without relying on naive concatenation. All answers are generated by a lightweight generator, \textsc{openPangu-Embedded-7B}. Experiments show consistent gains in routing quality and cross-KB evidence coverage, with the full system achieving the best CrossEv@5 while maintaining strong faithfulness and citation correctness.
Three-dimensional (3D) printed preoperative planning models play a critical role in the success of many medical procedures. However, many of these models do not portray the patient's complete anatomy because they are monolithic and static. Dynamic 3D-printed models can better equip physicians by providing a more anatomically accurate model, thanks to their movement capabilities and the ability to remove and replace printed anatomies based on planning stages. A dynamic 3D-printed preoperative planning model can move in ways similar to the anatomy it represents, or reveal additional issues that may arise during the use of a movement mechanism. Dynamic models are constructed in a similar manner to their static counterparts; however, in the digital post-processing phase, additional care is needed to ensure the dynamic functionality of the model. Here, we discuss the process of creating a dynamic 3D-printed model and its benefits and uses in modern medicine.
Artificial intelligence technology plays a crucial role in recommending prescriptions for traditional Chinese medicine (TCM). Previous studies have made significant progress by focusing on the symptom-herb relationship in prescriptions. However, several limitations hinder model performance: (i) Insufficient attention to patient-personalized information such as age, BMI, and medical history, which hampers accurate identification of syndrome and reduces efficacy. (ii) The typical long-tailed distribution of herb data introduces training biases and affects generalization ability. (iii) The oversight of the 'monarch, minister, assistant and envoy' compatibility among herbs increases the risk of toxicity or side effects, opposing the 'treatment based on syndrome differentiation' principle in clinical TCM. Therefore, we propose a novel hierarchical structure-enhanced personalized recommendation model for TCM formulas based on knowledge graph diffusion guidance, namely TCM-HEDPR. Specifically, we pre-train symptom representations using patient-personalized prompt sequences and apply prompt-oriented contrastive learning for data augmentation. Furthermore, we employ a KG-guided homogeneous graph diffusion method integrated with a self-attention mechanism to globally capture the non-linear symptom-herb relationship. Lastly, we design a heterogeneous graph hierarchical network to integrate herbal dispensing relationships with implicit syndromes, guiding the prescription generation process at a fine-grained level and mitigating the long-tailed herb data distribution problem. Extensive experiments on two public datasets and one clinical dataset demonstrate the effectiveness of TCM-HEDPR. In addition, we incorporate insights from modern medicine and network pharmacology to evaluate the recommended prescriptions comprehensively. It can provide a new paradigm for the recommendation of modern TCM.
In precision medicine, quantitative multi-omic features, topological context, and textual biological knowledge play vital roles in identifying disease-critical signaling pathways and targets. Existing pipelines capture only part of these: numerical omics approaches ignore topological context, text-centric LLMs lack quantitatively grounded reasoning, and graph-only models underuse node semantics and the generalization of LLMs, which limits mechanistic interpretability. Although Process Reward Models (PRMs) aim to guide reasoning in LLMs, they remain limited by unreliable intermediate evaluation, vulnerability to reward hacking, and computational cost. These gaps motivate integrating quantitative multi-omic signals, topological structure with node annotations, and literature-scale text via LLMs, using subgraph reasoning as the principal bridge linking numeric evidence, topological knowledge, and language context. Therefore, we propose GALAX (Graph Augmented LAnguage model with eXplainability), a framework that integrates pretrained Graph Neural Networks (GNNs) into Large Language Models (LLMs) via reinforcement learning guided by a Graph Process Reward Model (GPRM). The GPRM generates disease-relevant subgraphs in a step-wise manner, initiated by an LLM and iteratively evaluated by a pretrained GNN and a schema-based rule check, enabling process-level supervision without explicit labels. As an application, we also introduce Target-QA, a benchmark combining CRISPR-identified targets, multi-omic profiles, and biomedical graph knowledge across diverse cancer cell lines, which enables GNN pretraining for supervising step-wise graph construction and supports long-context reasoning over text-numeric graphs (TNGs). Together, these provide a scalable and biologically grounded framework for explainable, reinforcement-guided subgraph reasoning toward reliable and interpretable target discovery in precision medicine.
Youngseung Jeon, Christopher Hwang, Ziwen Li
et al.
While drug discovery is vital for human health, the process remains inefficient. Medicinal chemists must navigate a vast protein space to identify target proteins that meet three criteria: physical and functional interactions, therapeutic impact, and docking potential. Prior approaches have provided fragmented support for each criterion, limiting the generation of promising hypotheses for wet-lab experiments. We present HAPPIER, an AI-powered tool that supports hypothesis generation with integrated multi-criteria support for target identification. HAPPIER enables medicinal chemists to 1) efficiently explore and verify proteins in a single integrated graph component showing multi-criteria satisfaction and 2) validate AI suggestions with domain knowledge. These capabilities facilitate iterative cycles of divergent and convergent thinking, essential for hypothesis generation. We evaluated HAPPIER with ten medicinal chemists, finding that it increased the number of high-confidence hypotheses and support for the iterative cycle, and further demonstrated the relationship between engaging in such cycles and confidence in outputs.
Michael S. Yao, Osbert Bastani, Alma Andersson
et al.
The goal of personalized medicine is to discover a treatment regimen that optimizes a patient's clinical outcome based on their personal genetic and environmental factors. However, candidate treatments cannot be arbitrarily administered to the patient to assess their efficacy; instead, we often have access to an in silico surrogate model that approximates the true fitness of a proposed treatment. Unfortunately, such surrogate models have been shown to fail to generalize to previously unseen patient-treatment combinations. We hypothesize that domain-specific prior knowledge (such as medical textbooks and biomedical knowledge graphs) can provide a meaningful alternative signal of the fitness of proposed treatments. To this end, we introduce LLM-based Entropy-guided Optimization with kNowledgeable priors (LEON), a mathematically principled approach to leverage large language models (LLMs) as black-box optimizers without any task-specific fine-tuning, taking advantage of their ability to contextualize unstructured domain knowledge to propose personalized treatment plans in natural language. In practice, we implement LEON via 'optimization by prompting,' which uses LLMs as stochastic engines for proposing treatment designs. Experiments on real-world optimization tasks show that LEON outperforms both traditional and LLM-based methods in proposing individualized treatments for patients.
Daniel Mas Montserrat, Ray Verma, Míriam Barrabés
et al.
Large-scale genomic workflows used in precision medicine can process datasets spanning tens to hundreds of gigabytes per sample, leading to high memory spikes, intensive disk I/O, and task failures due to out-of-memory errors. Simple static resource allocation methods struggle to handle the variability in per-chromosome RAM demands, resulting in poor resource utilization and long runtimes. In this work, we propose multiple mechanisms for adaptive, RAM-efficient parallelization of chromosome-level bioinformatics workflows. First, we develop a symbolic regression model that estimates per-chromosome memory consumption for a given task and introduces an interpolating bias to conservatively minimize over-allocation. Second, we present a dynamic scheduler that adaptively predicts RAM usage with a polynomial regression model, treating task packing as a Knapsack problem to optimally batch jobs based on predicted memory requirements. Additionally, we present a static scheduler that optimizes chromosome processing order to minimize peak memory while preserving throughput. Our proposed methods, evaluated on simulations and real-world genomic pipelines, provide new mechanisms to reduce memory overruns and balance load across threads. We thereby achieve faster end-to-end execution, showcasing the potential to optimize large-scale genomic workflows.
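The knapsack view of task packing described above can be illustrated with a minimal sketch. It is not the authors' scheduler: the function names `pack_batch` and `schedule`, the integer-gigabyte granularity, and the choice to maximize used RAM per batch are all assumptions made for illustration, and the per-chromosome RAM predictions are taken as given rather than produced by a regression model.

```python
from typing import Dict, List, Tuple


def pack_batch(pred_ram_gb: Dict[str, int], budget_gb: int) -> List[str]:
    """Choose tasks whose predicted RAM fills the budget as fully as possible.

    This is a 0/1 knapsack where each task's value equals its weight
    (predicted RAM in whole GB, assumed positive). Hypothetical helper.
    """
    # best[c] = (RAM used, tasks chosen) for a capacity of c GB
    best: List[Tuple[int, Tuple[str, ...]]] = [(0, ())] * (budget_gb + 1)
    for task in sorted(pred_ram_gb):
        weight = pred_ram_gb[task]
        # Iterate capacities downward so each task is used at most once.
        for cap in range(budget_gb, weight - 1, -1):
            candidate = best[cap - weight][0] + weight
            if candidate > best[cap][0]:
                best[cap] = (candidate, best[cap - weight][1] + (task,))
    return list(best[budget_gb][1])


def schedule(pred_ram_gb: Dict[str, int], budget_gb: int) -> List[List[str]]:
    """Greedily emit knapsack-packed batches until no task is pending."""
    pending = dict(pred_ram_gb)
    batches: List[List[str]] = []
    while pending:
        batch = pack_batch(pending, budget_gb)
        if not batch:
            raise ValueError("a pending task does not fit in the RAM budget")
        batches.append(batch)
        for task in batch:
            del pending[task]
    return batches


# Illustrative (made-up) predictions in GB, not values from the paper.
predicted = {"chr1": 10, "chr2": 9, "chr3": 6, "chr4": 5}
batches = schedule(predicted, budget_gb=16)
print(batches)
```

Packing batch by batch is a greedy use of an exact knapsack solver; it keeps each batch under the budget but does not by itself guarantee the minimum number of batches.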
Medicinal plants are a key component of traditional and modern medicines, especially in Ayurveda, an ancient Indian medical system. Collecting and extracting the right plant is a crucial step in producing these medicines because some plants are visually similar. Distinguishing medicinal from nonmedicinal plants currently requires human expert intervention. Computer vision methods offer an efficient way to identify plants accurately and to reduce the need for a human expert in the collection process. In this paper, we propose a model that addresses these issues. The proposed model is a custom convolutional neural network (CNN) architecture with 6 convolution layers, max-pooling layers, and dense layers. The model was tested on three datasets: the Indian Medicinal Leaves Image Dataset, the MED117 Medicinal Plant Leaf Dataset, and a dataset curated by the authors. The proposed model achieved respective accuracies of 99.5%, 98.4%, and 99.7% using various optimizers including Adam, RMSprop, and SGD with momentum.
Cristina Cravana, Pietro Medica, Esterina Fazio
et al.
The hypothalamic-pituitary-adrenal (HPA) axis is a neuroendocrine system involved in the coping response to stressful challenges during exercise stimuli. Exercise represents a significant disruptor of homeostasis, inducing an ACTH-cortisol co-secretion that depends on the characteristics of the exercise in sport horses. On this basis, the aim of this study is to evaluate the changes in circulating adrenocorticotropin (ACTH) and cortisol in Standardbred trotters after training and racing sessions, considering differences in age and sex. In particular, the aim is to determine to what extent ACTH and cortisol levels increase during maximum effort in competition conditions (racing), and to compare the effects of two exercise conditions of different intensity, training and racing sessions, on ACTH and cortisol responses. Ten clinically healthy Standardbreds, three females and seven males, were enrolled and subjected to two exercise conditions: a non-competitive session (training) and then a competitive event (racing). Four of them were 2-year-olds and a further six were 3-year-olds. Training and racing had significant effects on both ACTH (p < 0.01) and cortisol (p < 0.01) values. After racing, compared to the training session, horses showed greater ACTH concentrations at rest (p < 0.001), at 5 min (p < 0.01) and at 30 min (p < 0.001), and lower cortisol concentrations only at rest (p < 0.01); 2- and 3-year-old horses showed greater ACTH concentrations at 5 and 30 min (p < 0.01) post-racing; males showed greater ACTH concentrations at 5 min and 30 min (p < 0.01) post-racing. The different stimuli of the two contexts, and the differences in exercise intensity between training and a competitive event, may have affected the direction of the HPA axis response, both as an ability to adapt to physical stress of different intensity and as a preparatory activity for coping with stimuli.
In conclusion, training and racing events induced a different HPA axis response in which both emotional experience and physical maturity could induce a significant adaptive response. As ACTH and cortisol concentrations in adult equids are extremely heterogeneous, further investigation is required to explore how different variables can influence the hormonal dynamics and their role as expressions of adaptive strategies to stress in horses.
Belinda Claire Kiam, Aline Gaelle Bouopda-Tuedom, Jean Arthur Mbida Mbida
et al.
Background: Assessing vector bionomics and their role in transmission is crucial to improving vector control strategies. Several entomological studies have been conducted to describe malaria transmission in different eco-epidemiological settings in Cameroon; however, data gaps persist, particularly in the highland areas. This study aimed to characterize malaria vectors in three localities along an altitudinal gradient in the western region: Santchou (700 m), Dschang (1400 m) and Penka Michel (1500 m). Methods: Human landing catches were conducted from May to June 2023 in 17 villages (including 10 health zones in Dschang, 4 in Santchou and 3 in Penka Michel) from 6:00 p.m. to 9:00 a.m. Mosquitoes were sorted into genera and all Anopheles species were identified using morphological taxonomic keys and species-specific polymerase chain reaction (PCR). Entomological indicators, including species composition, abundance, biting behaviour, infection rate and entomological inoculation rate (EIR), were assessed. Genomic DNA from the head and thorax was extracted and tested for Plasmodium infection by real-time PCR. Results: A total of 2835 Anopheles mosquitoes were identified, including Anopheles gambiae sensu lato (s.l.) (82.88%), Anopheles funestus s.l. (15.92%), Anopheles nili (0.09%) and Anopheles ziemanni (1.11%), with An. gambiae s.l. being the most prevalent at all sites. Anopheles gambiae s.l. had a significantly higher human-biting rate at Penka Michel (45.25 bites/human/night [b/h/n]) compared to Santchou (3.1 b/h/n) and Dschang (0.41 b/h/n) (p-value < 0.001). It was also the main malaria vector, with an EIR 13 times higher in Penka Michel than Santchou (1.11 vs. 0.08 infective bites/human/night). The data suggest a very focal distribution of infective An. gambiae s.l. mosquitoes.
Plasmodium falciparum was the dominant malaria parasite (67% in Santchou, 62% in Penka Michel), but Plasmodium malariae (33% in Santchou, 31% in Penka Michel) and Plasmodium ovale (1.21% only in Penka Michel) infections were also detected. Conclusion: The study highlights a difference in mosquito composition and host-seeking behaviour across altitudes, emphasizing the need for continued surveillance to monitor vector populations. To combat the persistence of malaria in Cameroon, it is crucial to implement additional tools like larviciding, integrated and environmental management, particularly against outdoor-biting mosquitoes, to prevent potential malaria outbreaks in these highland areas.
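The relationship between the reported biting rates and inoculation rates can be made concrete with a short sketch. It assumes the standard definition of the EIR as the human-biting rate multiplied by the proportion of infective mosquitoes; the abstract does not state which formula the authors used, and the function names below are hypothetical. The infection proportions computed here are back-calculated from the reported figures and are not values reported by the study.

```python
def entomological_inoculation_rate(hbr: float, infection_rate: float) -> float:
    """EIR (infective bites/human/night) = biting rate x infective proportion.

    Standard textbook definition, assumed here for illustration.
    """
    return hbr * infection_rate


def implied_infection_rate(eir: float, hbr: float) -> float:
    """Invert the definition to recover the infective proportion."""
    return eir / hbr


# Figures for An. gambiae s.l. as reported in the abstract.
reported = {
    "Penka Michel": {"hbr": 45.25, "eir": 1.11},
    "Santchou": {"hbr": 3.1, "eir": 0.08},
}

# Back-calculated (not reported) infective proportions per site.
implied = {
    site: implied_infection_rate(v["eir"], v["hbr"])
    for site, v in reported.items()
}

for site, rate in implied.items():
    print(f"{site}: implied infective proportion = {rate:.4f}")
```

Under this definition the two sites have similar implied infective proportions, so the gap in EIR would be driven mainly by the difference in biting rate.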
Recent unprecedented advances in Large Language Models (LLMs) have propelled the medical community toward advanced medical-domain models. However, because few medical datasets have been collected, only a few comprehensive benchmarks are available to gauge progress in this area. In this paper, we introduce TCMD, a new medical question-answering (QA) dataset that contains a large amount of manual instruction for solving Traditional Chinese Medicine (TCM) examination tasks. Specifically, TCMD collects a large number of questions across diverse domains, annotated with their medical subjects, and thus supports a comprehensive assessment of the capability of LLMs in the TCM domain. We conduct an extensive evaluation of various general LLMs and medical-domain-specific LLMs. Moreover, we analyze the robustness of current LLMs in solving TCM QA tasks by introducing randomness. The inconsistency of the experimental results also reveals the shortcomings of current LLMs in solving QA tasks. We expect that our dataset can further facilitate the development of LLMs in the TCM area.
Anh Le, Amirreza Hashemi, Mark P. Ottensmeyer
et al.
The design of nuclear imaging scanners is crucial for optimizing detection and imaging processes. While advancements have been made in simple, symmetrical modalities, current research is progressing towards more intricate structures. However, the adoption of computer-aided design (CAD) tools for modeling and simulation is still limited. This paper introduces FreeCAD and the GDML Workbench as essential tools for designing and testing complex geometries in nuclear imaging modalities. FreeCAD is a parametric 3D CAD modeler, and GDML is an XML-based language for describing complex geometries in simulations. Their integration streamlines the design and simulation of nuclear medicine scanners, including PET and SPECT scanners. The paper demonstrates their application in creating calibration phantoms and conducting simulations with Geant4, showcasing their precision and versatility in generating sophisticated components for nuclear imaging. The integration of these tools is expected to streamline design processes, enhance efficiency, and facilitate widespread application in the nuclear imaging field.
Recent advancements in Vision Language Models (VLMs) have demonstrated remarkable promise in generating visually grounded responses. However, their application in the medical domain is hindered by unique challenges. For instance, most VLMs rely on a single method of visual grounding, whereas complex medical tasks demand more versatile approaches. Additionally, while most VLMs process only 2D images, a large portion of medical images are 3D. The lack of medical data further compounds these obstacles. To address these challenges, we present VividMed, a vision language model with versatile visual grounding for medicine. Our model supports generating both semantic segmentation masks and instance-level bounding boxes, and accommodates various imaging modalities, including both 2D and 3D data. We design a three-stage training procedure and an automatic data synthesis pipeline based on open datasets and models. Besides visual grounding tasks, VividMed also excels in other common downstream tasks, including Visual Question Answering (VQA) and report generation. Ablation studies empirically show that the integration of visual grounding ability leads to improved performance on these tasks. Our code is publicly available at https://github.com/function2-llx/MMMM.
Michael Vollenweider, Manuel Schürch, Chiara Rohrer
et al.
Precision medicine has the potential to tailor treatment decisions to individual patients using machine learning (ML) and artificial intelligence (AI), but it faces significant challenges due to complex biases in clinical observational data and the high-dimensional nature of biological data. This study models various types of treatment assignment biases using mutual information and investigates their impact on ML models for counterfactual prediction and biomarker identification. Unlike traditional counterfactual benchmarks that rely on fixed treatment policies, our work focuses on modeling different characteristics of the underlying observational treatment policy in distinct clinical settings. We validate our approach through experiments on toy datasets, semi-synthetic data from The Cancer Genome Atlas (TCGA), and real-world biological outcomes from drug and CRISPR screens. By incorporating empirical biological mechanisms, we create a more realistic benchmark that reflects the complexities of real-world data. Our analysis reveals that different biases lead to varying model performance, with some biases, especially those unrelated to outcome mechanisms, having minimal effect on prediction accuracy. This highlights the crucial need to account for specific biases in clinical observational data in counterfactual ML model development, ultimately enhancing the personalization of treatment decisions in precision medicine.
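As a rough illustration of how mutual information can quantify treatment assignment bias, the sketch below computes the plug-in (empirical) mutual information between a discrete patient covariate and the assigned treatment. The abstract does not specify the authors' estimator or variables, so the function name, the discrete setting, and the toy data are assumptions made purely for illustration.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def mutual_information(xs: Sequence[Hashable], ts: Sequence[Hashable]) -> float:
    """Plug-in estimate of I(X; T) in bits for paired discrete samples."""
    if len(xs) != len(ts) or not xs:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ts))
    count_x = Counter(xs)
    count_t = Counter(ts)
    mi = 0.0
    for (x, t), c in joint.items():
        # p(x,t) / (p(x) p(t)) simplifies to c * n / (count_x * count_t)
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_t[t]))
    return mi


# Toy data (made up): a binary biomarker and a binary treatment.
biomarker = [0, 0, 1, 1]
unbiased_policy = [0, 1, 0, 1]   # treatment independent of the biomarker
biased_policy = [0, 0, 1, 1]     # treatment fully determined by the biomarker

mi_unbiased = mutual_information(biomarker, unbiased_policy)
mi_biased = mutual_information(biomarker, biased_policy)
print(mi_unbiased, mi_biased)
```

A value near zero indicates assignment that looks random with respect to the covariate, while larger values indicate that the covariate carries information about who gets treated.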
Clara Agustí, Xavier Manteca, Daniel García-Párraga
et al.
Society is showing a growing concern about the welfare of cetaceans in captivity as well as cetaceans in the wild threatened by anthropogenic disturbances. The study of the physiological stress response is increasingly being used to address cetacean conservation and welfare issues. Within it, a newly described technique of extracting cortisol from epidermal desquamation may serve as a non-invasive, more integrated measure of a cetacean’s stress response and welfare. However, confounding factors are common when measuring glucocorticoid hormones. In this study, we validated a steroid hormone extraction protocol and the use of a commercial enzyme immunoassay (EIA) test to measure cortisol concentrations in common bottlenose dolphin (Tursiops truncatus) and beluga (Delphinapterus leucas) epidermal samples. Moreover, we examined the effect of sample mass and body location on cortisol concentrations. Validation tests (i.e., assay specificity, accuracy, precision, and sensitivity) suggested that the method was suitable for the quantification of cortisol concentrations. Cortisol was extracted from small samples (0.01 g), but the amount of cortisol detected and the variability between duplicate extractions increased as the sample mass decreased. In common bottlenose dolphins, epidermal cortisol concentrations did not vary significantly across body locations, while there was a significant effect of the individual. Overall, we present a contribution towards advancing and standardizing epidermis hormone assessments in cetaceans.
David Castejón, José Segura, Karen Paola Cruz-Díaz
et al.
For the first time, High-Resolution Magic Angle Spinning Nuclear Magnetic Resonance spectroscopy (NMR-HRMAS) was applied to directly identify specific metabolites in a Spanish pressed-curd cheese made from raw ewe’s milk by enzymatic coagulation (Protected Geographical Indication: Castellano) and manufactured by two procedures (traditional/artisanal vs. industrial), as well as in the raw ewe’s milk itself. The NMR parameters were optimized to study the complex matrices of this type of cheese. In addition, conventional overcrowded ¹H-NMR-HRMAS spectra were selectively simplified by a Carr–Purcell–Meiboom–Gill (CPMG) sequence or a stimulated echo pulse sequence with bipolar gradients (DIFF), thus modulating spin–spin relaxation times and diffusion of molecular components, respectively. ¹H-NMR-HRMAS spectroscopy provided important information about cheese metabolites, which can be associated with different manufacturing processes (industrial vs. traditional) and ripening times (from 2 to 90 days). These results support this spectroscopy as a useful technique to monitor the ripening process, from raw milk to commercial ripened cheese, using a minimal intact sample and avoiding time-consuming sample pretreatments.