Does gravity care about electric charge? Precision tests of the weak equivalence principle achieve remarkable sensitivity but deliberately minimize electric charge on test masses, leaving this fundamental question experimentally open. We present a minimalist framework coupling electromagnetism to linearized gravity through conservation of a complex charge-mass current, predicting charge-dependent violations $Δa/g = κ(q/m)$. Remarkably, this prediction occupies unexplored experimental territory precisely because precision gravity tests avoid charge variation. We identify this as a significant gap and propose a modified torsion balance experiment where $q/m$ is treated as a controlled variable. Such an experiment could test whether gravitational acceleration depends on electric charge, probing physics in genuinely new parameter space. This work exemplifies how theoretical minimalism can reveal overlooked opportunities in fundamental physics.
John Wahhab, Iswarya Vimalan Jeya, Jackson R. Huttner
Introduction: Dislocations of the pisiform bone are rare, and literature on this injury is sparse. The uncommon nature of this condition, as well as limited data, makes recognition and diagnosis difficult, increasing the chances these injuries may be overlooked. Missing this diagnosis can lead to pain, reduced joint function, and nerve damage. Case Report: We present a case of pediatric pisiform dislocation and discuss the diagnosis and treatment in an emergency department setting. Conclusion: Prompt diagnosis and treatment of pisiform dislocations are vital to ensure favorable outcomes.
Medical emergencies. Critical care. Intensive care. First aid
Sergey K. Aityan, Abdolreza Mosaddegh, Rolando Herrero
et al.
Medical decision-support and advising systems are critical for emergency physicians to quickly and accurately assess patients' conditions and make diagnoses. Artificial Intelligence (AI) has emerged as a transformative force in healthcare in recent years, and Large Language Models (LLMs) have been employed in various fields of medical decision-support systems. We studied the responses of a group of different LLMs to real cases in emergency medicine. The results of our study on the five most renowned LLMs showed significant differences in the capabilities of Large Language Models for diagnosing acute diseases in medical emergencies, with accuracy ranging between 58% and 65%. This accuracy significantly exceeds the reported accuracy of human doctors. We built a super-learner, MEDAS (Medical Emergency Diagnostic Advising System), from five major LLMs (Gemini, Llama, Grok, GPT, and Claude). The super-learner produces higher diagnostic accuracy, 70%, even with a quite basic meta-learner. However, at least one of the integrated LLMs in the same super-learner produces the correct diagnosis in 85% of cases. The super-learner integrates a cluster of LLMs using a meta-learner capable of learning the different capabilities of each LLM, so that the diagnostic accuracy of the model draws on the collective capabilities of all LLMs in the cluster. The results of our study showed that the aggregated diagnostic accuracy provided by a meta-learning approach exceeds that of any individual LLM, suggesting that the super-learner can take advantage of the combined knowledge of the medical datasets used to train the group of LLMs.
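The abstract above does not specify how the meta-learner combines the individual LLM outputs. As a rough illustration only, the sketch below assumes one of the simplest possible choices: each model receives a log-odds weight from its accuracy on a labeled validation set, and the final diagnosis is the label with the highest total weight. The function names, the model keys, and the weighting rule are all hypothetical and are not taken from MEDAS.

```python
import math
from collections import defaultdict


def fit_weights(predictions, truths, eps=1e-6):
    """Estimate one log-odds weight per model from validation accuracy.

    predictions: dict mapping a (hypothetical) model name to its list of
                 predicted labels, aligned with `truths`.
    truths:      list of reference labels.
    """
    weights = {}
    for model, preds in predictions.items():
        correct = sum(p == t for p, t in zip(preds, truths))
        # Clamp accuracy away from 0 and 1 so the log-odds stays finite.
        acc = min(max(correct / len(truths), eps), 1 - eps)
        weights[model] = math.log(acc / (1 - acc))
    return weights


def weighted_vote(case_predictions, weights):
    """Return the label with the largest summed weight for a single case.

    case_predictions: dict mapping model name to that model's label.
    Models without a fitted weight contribute nothing to the vote.
    """
    scores = defaultdict(float)
    for model, label in case_predictions.items():
        scores[label] += weights.get(model, 0.0)
    return max(scores, key=scores.get)
```

With weights of this kind, a single historically reliable model can outvote several less reliable ones, which is one simple way an ensemble might exceed the accuracy of its best member; a trained meta-learner as described in the abstract could of course use a richer rule.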
Emergency departments worldwide face rising patient volumes, workforce shortages, and variability in triage decisions that threaten the delivery of timely and accurate care. Current triage methods rely primarily on vital signs, routine laboratory values, and clinicians' judgment, which, while effective, often miss emerging biological signals that could improve risk prediction for infection typing or antibiotic administration in acute conditions. To address this challenge, we introduce TriAgent, a large language model (LLM)-based multi-agent framework that couples automated biomarker discovery with deep research for literature-grounded validation and novelty assessment. TriAgent employs a supervisor research agent to generate research topics and delegate targeted queries to specialized sub-agents for evidence retrieval from various data sources. Findings are synthesized to classify biomarkers as either grounded in existing knowledge or flagged as novel candidates, offering transparent justification and highlighting unexplored pathways in acute care risk stratification. Unlike prior frameworks limited to existing routine clinical biomarkers, TriAgent aims to deliver an end-to-end framework, from data analysis to literature grounding, to improve transparency and explainability and to expand the frontier of potentially actionable clinical biomarkers. Given a user's clinical query and quantitative triage data, TriAgent achieved a topic adherence F1 score of 55.7 +/- 5.0%, surpassing the CoT-ReAct agent by over 10%, and a faithfulness score of 0.42 +/- 0.39, exceeding all baselines by more than 50%. Across experiments, TriAgent consistently outperformed state-of-the-art LLM-based agentic frameworks in biomarker justification and literature-grounded novelty assessment. We share our repo: https://github.com/CellFace/TriAgent.
Serving as an emerging and powerful tool, Large Language Model (LLM)-driven Human Digital Twins are showing great potential in healthcare system research. However, their actual ability to simulate complex human psychological traits, such as distrust in the healthcare system, remains unclear. This research gap particularly impacts health professionals' trust in and usage of LLM-based Artificial Intelligence (AI) systems in assisting their routine work. In this study, based on the Twin-2K-500 dataset, we systematically evaluated the simulation results of the LLM-driven human digital twin using the Health Care System Distrust Scale (HCSDS) with an established human-subject sample, analyzing item-level distributions, summary statistics, and demographic subgroup patterns. Results showed that the simulated responses by the digital twin were significantly more centralized with lower variance and had fewer selections of extreme options (all p<0.001). While the digital twin broadly reproduces human results in major demographic patterns, such as age and gender, it exhibits relatively low sensitivity in capturing minor differences in education levels. The LLM-based digital twin simulation has the potential to simulate population trends, but it also presents challenges in making detailed, specific distinctions in subgroups of human beings. This study suggests that the current LLM-driven Digital Twins have limitations in modeling complex human attitudes, which require careful calibration and validation before applying them in inferential analyses or policy simulations in health systems engineering. Future studies are necessary to examine the emotional reasoning mechanism of LLMs before their use, particularly for studies that involve simulations sensitive to social topics, such as human-automation trust.
Generative AI systems are increasingly used by patients seeking everyday health guidance, yet their appropriateness in chronic care contexts remains unclear. Focusing on Type 2 Diabetes Mellitus (T2DM), this paper presents a mixed-methods investigation into how AI-generated health information is interpreted by patients and evaluated by physicians in China. Drawing on formative patient grounding and a dimension-based physician evaluation, we examine AI responses along five quality dimensions: Accuracy, Safety, Clarity, Integrity, and Action Orientation. Our findings reveal that while current systems perform well in factual explanation and general lifestyle guidance, they frequently break down in safety signaling, contextual judgment, and responsibility boundaries, particularly when fluent responses invite overtrust. By treating quality dimensions as an interpretive lens rather than a fixed framework, this work highlights the need for intelligent user interfaces that actively mediate AI outputs in chronic disease management, supporting calibrated trust and responsible boundary-setting in long-term care.
Managing patients with respiratory failure increasingly involves noninvasive respiratory support (NIRS) strategies to support respiration, often preventing the need for invasive mechanical ventilation. However, despite the rapidly expanding use of NIRS, there remains a significant challenge to its optimal use across all medical circumstances. NIRS lacks a unified ontological structure, complicating guidance on NIRS modalities across healthcare systems. This study introduced the NIRS ontology to support knowledge representation in acute care settings by providing a unified framework that enhances data clarity and interoperability, laying the groundwork for future clinical decision-making. We developed the NIRS ontology using the Web Ontology Language (OWL) and Protégé to organize clinical concepts and relationships. To enable rule-based clinical reasoning beyond hierarchical structures, we added Semantic Web Rule Language (SWRL) rules. We evaluated logical reasoning by adding a sample of 6 patient scenarios and used SPARQL queries to retrieve and test targeted inferences. The ontology has 145 classes, 11 object properties, and 18 data properties across 949 axioms that establish concept relationships. To standardize clinical concepts, we added 392 annotations, including descriptive definitions based on controlled vocabularies. SPARQL query evaluations across clinical scenarios confirmed the ontology's ability to support rule-based reasoning and therapy recommendations, providing a foundation for consistent documentation practices, integration into clinical data models, and advanced analysis of NIRS outcomes. In conclusion, we unified NIRS concepts into an ontological framework and demonstrated its applicability through the evaluation of patient scenarios and alignment with standardized vocabularies.
The science and clinical practice of medical physics has been integral to the advancement of radiology and radiation therapy for over a century. In parallel, advances in surgery - including intraoperative imaging, registration, and other technologies within the expertise of medical physicists - have advanced primarily in connection to other disciplines, such as biomedical engineering and computer science, and via somewhat distinct translational paths. This review article briefly traces the parallel and convergent evolution of such scientific, engineering, and clinical domains with an eye to a potentially broader, more impactful role of medical physics in research and clinical practice of surgery. A review of image-guided surgery technologies is offered, including intraoperative imaging, tracking / navigation, image registration, visualization, and surgical robotics across a spectrum of surgical applications. Trends and drivers for research and innovation are traced, including federal funding and academic-industry partnership, and some of the major challenges to achieving major clinical impact are described. Opportunities for medical physicists to expand expertise and contribute to the advancement of surgery in the decade ahead are outlined, including research and innovation, data science approaches, improving efficiency through operations research and optimization, improving patient safety, and bringing rigorous quality assurance to technologies and processes in the circle of care for surgery. Challenges abound but appear tractable, including domain knowledge, professional qualifications, and the need for investment and clinical partnership.
Accurate medical diagnosis often involves progressive visual focusing and iterative reasoning, characteristics commonly observed in clinical workflows. While recent vision-language models demonstrate promising chain-of-thought (CoT) reasoning capabilities via reinforcement learning with verifiable rewards (RLVR), their purely on-policy learning paradigm tends to reinforce superficially coherent but clinically inaccurate reasoning paths. We propose MedEyes, a novel reinforcement learning framework that dynamically models clinician-style diagnostic reasoning by progressively attending to and interpreting relevant medical image regions. By incorporating off-policy expert guidance, MedEyes converts expert visual search trajectories into structured external behavioral signals, guiding the model toward clinically aligned visual reasoning. We design the Gaze-guided Reasoning Navigator (GRN) to emulate the diagnostic process through a dual-mode exploration strategy, scanning for systematic abnormality localization and drilling for detailed regional analysis. To balance expert imitation and autonomous discovery, we introduce the Confidence Value Sampler (CVS), which employs nucleus sampling and adaptive termination to create diverse yet credible exploration paths. Finally, the dual-stream GRPO optimization framework decouples on-policy and off-policy learning signals, mitigating reward assimilation and entropy collapse. Experiments demonstrate that MedEyes achieves an average performance improvement of +8.5pp across multiple medical VQA benchmarks, validating MedEyes's potential in building trustworthy medical AI systems. Code is available at https://github.com/zhcz328/MedEyes.
Angela Mastrianni, Mary Suhyun Kim, Travis M. Sullivan
et al.
AI-enabled decision-support systems aim to help medical providers rapidly make decisions with limited information during medical emergencies. A critical challenge in developing these systems is supporting providers in interpreting the system output to make optimal treatment decisions. In this study, we designed and evaluated an AI-enabled decision-support system to aid providers in treating patients with traumatic injuries. We first conducted user research with physicians to identify and design information types and AI outputs for a decision-support display. We then conducted an online experiment with 35 medical providers from six health systems to evaluate two human-AI interaction strategies: (1) AI information synthesis and (2) AI information and recommendations. We found that providers were more likely to make correct decisions when AI information and recommendations were provided compared to receiving no AI support. We also identified two socio-technical barriers to providing AI recommendations during time-critical medical events: (1) an accuracy-time trade-off in providing recommendations and (2) polarizing perceptions of recommendations between providers. We discuss three implications for developing AI-enabled decision support used in time-critical events, contributing to the limited research on human-AI interaction in this context.
Dear Editor,
The article by Bollano E et al., “Surgical treatment of uncomplicated Pilonidal Sinus with a simple closed technique”, has caught our attention. In this letter to the editor, we would like to critically discuss some of the authors' statements. Firstly, the stated pathophysiology of PSD is an outdated theory. Furthermore, primary midline closure, as postulated by the authors, is not the surgical procedure of first choice, as several large reviews have shown.
The letter discusses the rationale behind adopting the simple closed technique, highlighting its efficacy and potential advantages. By presenting data from our experiences in Albania, we aim to contribute valuable insights to the global discourse on pilonidal sinus treatment.
This letter is a noteworthy addition to AJTES, offering fresh insights into the treatment landscape of pilonidal sinus. We trust the editorial team will find the content aligned with the journal's objectives and scope.
Your consideration of this submission is highly appreciated, and we look forward to the possibility of contributing to the journal's ongoing dialogue on innovative surgical approaches.
Surgery, Medical emergencies. Critical care. Intensive care. First aid
Ahmed Hasanin, Filippo Sanfilippo, Martin W Dünser
et al.
Acute circulatory shock is a life-threatening emergency requiring an efficient and timely management plan, which varies according to shock etiology and pathophysiology. Specific guidelines have been developed for each type of shock; however, there is a need for a clear timeline to promptly implement initial life-saving interventions during the early phase of shock recognition and management. A simple, easily memorable bundle of interventions could facilitate standardized management with clear targets and a specified timeline. The authors propose the “MINUTES” acronym, which summarizes essential interventions that should be performed within the first 30 min following shock recognition. All the interventions in the MINUTES bundle are suitable for any patient with undifferentiated shock. In addition to the acronym, we suggest a timeline for each step, balancing the feasibility and urgency of each intervention. The MINUTES acronym includes seven sequential steps which should be performed in the first 30 min following shock recognition: Maintain “ABCs”, INfuse vasopressors and/or fluids (to support hemodynamics/perfusion) and INvestigate with simple blood tests, Ultrasound to detect the type of shock, Treat the underlying Etiology, and Stabilize organ perfusion.
Medical emergencies. Critical care. Intensive care. First aid
Segment Anything Model (SAM) has gained significant attention because of its ability to segment various objects in images given a prompt. The recently developed SAM 2 has extended this ability to video inputs. This opens an opportunity to apply SAM to 3D images, one of the fundamental tasks in the medical imaging field. In this paper, we extensively evaluate SAM 2's ability to segment both 2D and 3D medical images by first collecting 21 medical imaging datasets, including surgical videos, common 3D modalities such as computed tomography (CT), magnetic resonance imaging (MRI), and positron emission tomography (PET), as well as 2D modalities such as X-ray and ultrasound. Two evaluation settings of SAM 2 are considered: (1) multi-frame 3D segmentation, where prompts are provided to one or multiple slice(s) selected from the volume, and (2) single-frame 2D segmentation, where prompts are provided to each slice. The former only applies to videos and 3D modalities, while the latter applies to all datasets. Our results show that SAM 2 performs similarly to SAM under single-frame 2D segmentation, and has variable performance under multi-frame 3D segmentation depending on the choice of slices to annotate, the direction of the propagation, the predictions utilized during the propagation, etc. We believe our work enhances the understanding of SAM 2's behavior in the medical field and provides directions for future work in adapting SAM 2 to this domain. Our code is available at: https://github.com/mazurowski-lab/segment-anything2-medical-evaluation.
Angela Andreella, Gaia Bertarelli, Federico Caldura
et al.
This study investigates how to define and measure inclusivity in Italy's early childhood education and care (ECEC) services, bringing to light the gap between legislative principles and local/regional applications. The Italian legislative decree n. 65/2017 prescribes inclusivity in ECEC, defined as being open to all children and indicating it as a top priority. To delve into this concept, we propose a two-step model. First, a latent trait model estimates an inclusivity index as a latent variable. Then, a mixed quantile model examines the distribution of this novel latent inclusivity index across Italian regions. Our findings reveal a substantial variation in inclusivity across Italy. In addition, a proper indicator based on the latent inclusivity index defined in the first step is provided at the NUTS-3 level using the empirical best predictor approach. From our analysis, public facilities demonstrate a higher level of inclusivity compared to their private counterparts. Despite these challenges, we are compelled to identify positive scenarios that can serve as models for regions facing more critical situations. Besides its methodological advancement, this paper provides policymakers and stakeholders with an evident call to action, offering valuable insights into the inclusivity landscape of Italian ECEC services. It underscores the urgent need to standardize the accessibility characteristics of ECEC services throughout Italy to ensure equitable access for all children.
Data augmentation is one of the most effective techniques to improve the generalization performance of deep neural networks. Yet, despite often facing limited data availability in medical image analysis, it is frequently underutilized. This appears to be due to a gap in our collective understanding of the efficacy of different augmentation techniques across medical imaging tasks and modalities. One domain where this is especially true is breast ultrasound images. This work addresses this issue by analyzing the effectiveness of different augmentation techniques for the classification of breast lesions in ultrasound images. We assess the generalizability of our findings across several datasets, demonstrate that certain augmentations are far more effective than others, and show that their usage leads to significant performance gains.
Shortcomings of current models of moderation have driven policy makers, scholars, and technologists to speculate about alternative models of content moderation. While alternative models provide hope for the future of online spaces, they can fail without proper scaffolding. Community moderators are routinely confronted with similar issues and have therefore found creative ways to navigate these challenges. Learning more about the decisions these moderators make, the challenges they face, and where they are successful can provide valuable insight into how to ensure alternative moderation models are successful. In this study, I perform a collaborative ethnography with moderators of r/AskHistorians, a community that uses an alternative moderation model, highlighting the importance of accounting for power in moderation. Drawing from Black feminist theory, I call this "intersectional moderation." I focus on three controversies emblematic of r/AskHistorians' alternative model of moderation: a disagreement over a moderation decision; a collaboration to fight racism on Reddit; and a period of intense turmoil and its impact on policy. Through this evidence I show how volunteer moderators navigated multiple layers of power through care work. To ensure the successful implementation of intersectional moderation, I argue that designers should support decision-making processes and policy makers should account for the impact of the sociotechnical systems in which moderators work.