Avicenna is known worldwide as one of the most famous physicians and philosophers. After the publication of the book Al-Mā’a, authored by Abu Muḥammad al-ʾAzdī (d. 456 AH), it became clear that al-ʾAzdī was one of Avicenna’s students. In his book, ʾAzdī mentions Avicenna nearly a hundred times as his teacher in medicine. This study examines how Avicenna is portrayed from ʾAzdī’s point of view. This research employs a content analysis method with a historical-descriptive and analytical approach, utilizing library resources for data collection. Data collection focuses primarily on the book Al-Mā’a (three volumes). The study was carried out in multiple stages: identifying keywords; searching the text of Kitab al-Mā’a, with historical books and Islamic encyclopedias as supplementary sources; searching Persian and Latin electronic databases; and categorizing, sorting, and analyzing the content. There are numerous quotations from Avicenna in ʾAzdī’s Al-Mā’a. The book yields new data that deserve more attention. ʾAzdī has interpreted the medical approach of his master. His work contains unique medical insights attributed to Avicenna that do not appear in Avicenna’s extant works, such as the Canon of Medicine. By migrating to Andalusia and staying in Balansīya (now Valencia), ʾAzdī introduced Avicenna, his Canon of Medicine, and his other works there.
Medicine, History of medicine. Medical expeditions
Large language models hold promise for addressing medical challenges, such as medical diagnosis reasoning, research knowledge acquisition, clinical decision-making, and consumer health inquiry support. However, they often generate hallucinations due to limited medical knowledge. Incorporating external knowledge is therefore critical, which necessitates multi-source knowledge acquisition. We address this challenge by framing it as a source planning problem, which is to formulate context-appropriate queries tailored to the attributes of diverse sources. Existing approaches either overlook source planning or fail to achieve it effectively due to misalignment between the model's expectation of the sources and their actual content. To bridge this gap, we present MedOmniKB, a repository comprising multigenre and multi-structured medical knowledge sources. Leveraging these sources, we propose the Source Planning Optimisation method, which enhances multi-source utilisation. Our approach involves enabling an expert model to explore and evaluate potential plans while training a smaller model to learn source alignment. Experimental results demonstrate that our method substantially improves multi-source planning performance, enabling the optimised small model to achieve state-of-the-art results in leveraging diverse medical knowledge sources.
Accurate medical diagnosis often involves progressive visual focusing and iterative reasoning, characteristics commonly observed in clinical workflows. While recent vision-language models demonstrate promising chain-of-thought (CoT) reasoning capabilities via reinforcement learning with verifiable rewards (RLVR), their purely on-policy learning paradigm tends to reinforce superficially coherent but clinically inaccurate reasoning paths. We propose MedEyes, a novel reinforcement learning framework that dynamically models clinician-style diagnostic reasoning by progressively attending to and interpreting relevant medical image regions. By incorporating off-policy expert guidance, MedEyes converts expert visual search trajectories into structured external behavioral signals, guiding the model toward clinically aligned visual reasoning. We design the Gaze-guided Reasoning Navigator (GRN) to emulate the diagnostic process through a dual-mode exploration strategy, scanning for systematic abnormality localization and drilling for detailed regional analysis. To balance expert imitation and autonomous discovery, we introduce the Confidence Value Sampler (CVS), which employs nucleus sampling and adaptive termination to create diverse yet credible exploration paths. Finally, the dual-stream GRPO optimization framework decouples on-policy and off-policy learning signals, mitigating reward assimilation and entropy collapse. Experiments demonstrate that MedEyes achieves an average performance improvement of +8.5pp across multiple medical VQA benchmarks, validating MedEyes's potential in building trustworthy medical AI systems. Code is available at https://github.com/zhcz328/MedEyes.
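The abstract above mentions nucleus sampling as one ingredient of the Confidence Value Sampler. As a rough illustration only, the following is a minimal sketch of generic nucleus (top-p) sampling over a list of candidate probabilities. It is not the MedEyes implementation; the function names `nucleus_indices` and `nucleus_sample`, the `top_p` default, and the idea of treating candidates as image regions are assumptions made for this example.

```python
import random


def nucleus_indices(probs, top_p=0.9):
    """Return the smallest set of indices, taken in order of decreasing
    probability, whose cumulative probability reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    nucleus, cumulative = [], 0.0
    for i in order:
        nucleus.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return nucleus


def nucleus_sample(probs, top_p=0.9, rng=None):
    """Sample one index from the nucleus, weighted by the original
    probabilities (random.choices renormalizes the weights)."""
    rng = rng or random.Random()
    nucleus = nucleus_indices(probs, top_p)
    weights = [probs[i] for i in nucleus]
    return rng.choices(nucleus, weights=weights, k=1)[0]


if __name__ == "__main__":
    # Hypothetical scores for four candidate regions.
    region_probs = [0.5, 0.3, 0.15, 0.05]
    print(nucleus_indices(region_probs, top_p=0.75))
    print(nucleus_sample(region_probs, top_p=0.75, rng=random.Random(0)))
```

Truncating the low-probability tail is what keeps sampled paths credible, while sampling within the nucleus rather than always taking the top candidate keeps them diverse.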
Deep learning has achieved significant breakthroughs in medical imaging, but these advancements are often dependent on large, well-annotated datasets. However, obtaining such datasets poses a significant challenge, as it requires time-consuming and labor-intensive annotations from medical experts. Consequently, there is growing interest in learning paradigms such as incomplete, inexact, and absent supervision, which are designed to operate under limited, inexact, or missing labels. This survey categorizes and reviews the evolving research in these areas, analyzing around 600 notable contributions since 2018. It covers tasks such as image classification, segmentation, and detection across various medical application areas, including but not limited to brain, chest, and cardiac imaging. We attempt to establish the relationships among existing research studies in related areas. We provide formal definitions of different learning paradigms and offer a comprehensive summary and interpretation of various learning mechanisms and strategies, aiding readers in better understanding the current research landscape and ideas. We also discuss potential future research challenges.
Daniel Wolf, Heiko Hillenhagen, Billurvan Taskin
et al.
Clinical decision-making relies heavily on understanding the relative positions of anatomical structures and anomalies. Therefore, for Vision-Language Models (VLMs) to be applicable in clinical practice, the ability to accurately determine relative positions on medical images is a fundamental prerequisite. Despite its importance, this capability remains highly underexplored. To address this gap, we evaluate this ability in state-of-the-art VLMs (GPT-4o, Llama3.2, Pixtral, and JanusPro) and find that all models fail at this fundamental task. Inspired by successful approaches in computer vision, we investigate whether visual prompts, such as alphanumeric or colored markers placed on anatomical structures, can enhance performance. While these markers provide moderate improvements, results remain significantly lower on medical images than those observed on natural images. Our evaluations suggest that, in medical imaging, VLMs rely more on prior anatomical knowledge than on actual image content when answering relative position questions, often leading to incorrect conclusions. To facilitate further research in this area, we introduce the MIRP (Medical Imaging Relative Positioning) benchmark dataset, designed to systematically evaluate the capability to identify relative positions in medical images.
This study evaluates lightweight medical abstract classification methods to establish how well they can perform under a limited budget. On the public medical abstracts corpus, we fine-tune BERT base and DistilBERT with three objectives: cross entropy (CE), class-weighted CE, and focal loss, under identical tokenization, sequence length, optimizer, and schedule. DistilBERT with plain CE gives the strongest raw argmax trade-off, while post hoc operating point selection (validation-calibrated, classwise thresholds) substantially improves deployed performance; under this tuned regime, focal loss benefits most. We report Accuracy, Macro F1, and Weighted F1, release evaluation artifacts, and include confusion analyses to clarify error structure. The practical takeaway is to start with a compact encoder and CE, then add lightweight calibration or thresholding when deployment requires higher macro balance.
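To make the three objectives and the thresholding step concrete, here is a minimal single-example sketch in plain Python. The loss definitions follow the standard textbook forms. The decision rule in `predict_with_thresholds` (pick the class with the largest margin over its own threshold) is an assumption for illustration, since the abstract does not say how the classwise thresholds are applied, and all function names and numbers are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target, class_weights=None):
    """Plain CE, or class-weighted CE when class_weights is given."""
    p = softmax(logits)[target]
    w = 1.0 if class_weights is None else class_weights[target]
    return -w * math.log(p)


def focal_loss(logits, target, gamma=2.0):
    """Focal loss: CE scaled by (1 - p)^gamma, which down-weights
    examples the model already classifies confidently."""
    p = softmax(logits)[target]
    return -((1.0 - p) ** gamma) * math.log(p)


def predict_with_thresholds(probs, thresholds):
    """Assumed rule: choose the class whose probability exceeds its
    per-class threshold by the largest margin."""
    margins = [p - t for p, t in zip(probs, thresholds)]
    return max(range(len(probs)), key=lambda k: margins[k])


if __name__ == "__main__":
    logits = [2.0, 0.5, -1.0]
    print(cross_entropy(logits, target=0))
    print(cross_entropy(logits, target=1, class_weights=[1.0, 3.0, 3.0]))
    print(focal_loss(logits, target=0))
    # A lower threshold for the rarer class 1 flips the argmax decision.
    print(predict_with_thresholds([0.6, 0.4], thresholds=[0.5, 0.2]))
```

In practice the thresholds would be tuned on a validation split, which is what makes the step post hoc: the trained model is left unchanged and only the decision rule moves.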
According to ancient Iranian mythology, the world is depicted as a battleground between the forces of good and evil. Ahuramazda is revered as the ultimate source of goodness, while Ahriman is portrayed as the creator of all things malevolent, such as darkness, ignorance, pain, and disease. These mythological concepts have been passed down through references in the Avesta texts and Pahlavi literature, providing insights into the beliefs of ancient Iranians. In a scholarly exploration of these themes, the research delves into the textual evidence concerning illness, treatment, and the origins of these concepts as presented in the Bundahishn, a reliable source for understanding ancient Iranian beliefs. Ancient Iranian cosmology posits that afflictions and maladies stem from the destructive influence of Ahriman and his cohorts, who seek to wreak havoc upon creation. Conversely, the forces of Ahuramazda strive to counteract this demonic evil by imparting medical knowledge to humanity and teaching healing practices. Within this dualistic worldview, pain and disease are attributed to demonic origins, while medicine and treatment are associated with Ahuramazda. The ancient Iranians viewed the pursuit of medical knowledge and the practice of pharmacy as integral components of the eternal battle between good and evil. In this framework, safeguarding health and administering treatment are essential in the ongoing struggle against evil forces. This holistic perspective underscores the interconnectedness of physical well-being with spiritual beliefs, emphasizing the role of individuals in preserving their health as an act of resistance against demonic influences.
Medicine, History of medicine. Medical expeditions
Jamil Zaghir, Marco Naguib, Mina Bjelogrlic
et al.
Prompt engineering is crucial for harnessing the potential of large language models (LLMs), especially in the medical domain, where specialized terminology and phrasing are used. However, the efficacy of prompt engineering in the medical domain remains to be explored. In this work, we review 114 recent studies (2022-2024) applying prompt engineering in medicine, covering prompt learning (PL), prompt tuning (PT), and prompt design (PD). PD is the most prevalent (78 articles). In 12 papers, the terms PD, PL, and PT were used interchangeably. ChatGPT is the most commonly used LLM, with seven papers using it to process sensitive clinical data. Chain-of-Thought emerges as the most common prompt engineering technique. While PL and PT articles typically provide a baseline for evaluating prompt-based approaches, 64% of PD studies lack non-prompt-related baselines. We provide tables and figures summarizing existing work, as well as reporting recommendations to guide future research contributions.
Recently, the research community of computerized medical imaging has started to discuss and address potential fairness issues that may emerge when developing and deploying AI systems for medical image analysis. This chapter covers some of the pressing challenges encountered when doing research in this area, and it is intended to raise questions and provide food for thought for those aiming to enter this research field. The chapter first discusses various sources of bias, including data collection, model training, and clinical deployment, and their impact on the fairness of machine learning algorithms in medical image computing. We then turn to discussing open challenges that we believe require attention from researchers and practitioners, as well as potential pitfalls of naive application of common methods in the field. We cover a variety of topics including the impact of biased metrics when auditing for fairness, the leveling down effect, task difficulty variations among subgroups, discovering biases in unseen populations, and explaining biases beyond standard demographic attributes.
Data augmentation is one of the most effective techniques to improve the generalization performance of deep neural networks. Yet, despite often facing limited data availability in medical image analysis, it is frequently underutilized. This appears to be due to a gap in our collective understanding of the efficacy of different augmentation techniques across medical imaging tasks and modalities. One domain where this is especially true is breast ultrasound images. This work addresses this issue by analyzing the effectiveness of different augmentation techniques for the classification of breast lesions in ultrasound images. We assess the generalizability of our findings across several datasets, demonstrate that certain augmentations are far more effective than others, and show that their usage leads to significant performance gains.
Large Vision-Language Models (LVLMs) are capable of handling diverse data types such as imaging, text, and physiological signals, and can be applied in various fields. In the medical field, LVLMs have a high potential to offer substantial assistance for diagnosis and treatment. Before that, it is crucial to develop benchmarks to evaluate LVLMs' effectiveness in various medical applications. Current benchmarks are often built upon specific academic literature, mainly focusing on a single domain, and lacking varying perceptual granularities. Thus, they face specific challenges, including limited clinical relevance, incomplete evaluations, and insufficient guidance for interactive LVLMs. To address these limitations, we developed the GMAI-MMBench, the most comprehensive general medical AI benchmark with well-categorized data structure and multi-perceptual granularity to date. It is constructed from 284 datasets across 38 medical image modalities, 18 clinical-related tasks, 18 departments, and 4 perceptual granularities in a Visual Question Answering (VQA) format. Additionally, we implemented a lexical tree structure that allows users to customize evaluation tasks, accommodating various assessment needs and substantially supporting medical AI research and applications. We evaluated 50 LVLMs, and the results show that even the advanced GPT-4o only achieves an accuracy of 53.96%, indicating significant room for improvement. Moreover, we identified five key insufficiencies in current cutting-edge LVLMs that need to be addressed to advance the development of better medical applications. We believe that GMAI-MMBench will stimulate the community to build the next generation of LVLMs toward GMAI.
Deep Learning is advancing medical imaging Research and Development (R&D), leading to the frequent clinical use of Artificial Intelligence/Machine Learning (AI/ML)-based medical devices. However, to advance AI R&D, two challenges arise: 1) significant data imbalance, with most data from Europe/America and under 10% from Asia, despite its 60% global population share; and 2) hefty time and investment needed to curate proprietary datasets for commercial use. In response, we established the first commercial medical imaging platform, encompassing steps like: 1) data collection, 2) data selection, 3) annotation, and 4) pre-processing. Moreover, we focus on harnessing under-represented data from Japan and broader Asia, including Computed Tomography, Magnetic Resonance Imaging, and Whole Slide Imaging scans. Using the collected data, we are preparing/providing ready-to-use datasets for medical AI R&D by 1) offering these datasets to AI firms, biopharma, and medical device makers and 2) using them as training/test data to develop tailored AI solutions for such entities. We also aim to merge Blockchain for data security and plan to synthesize rare disease data via generative AI. DataHub Website: https://medical-datahub.ai/
Evi M. C. Huijben, Sina Amirrajab, Josien P. W. Pluim
Out-of-distribution (OOD) detection is crucial for the safety and reliability of artificial intelligence algorithms, especially in the medical domain. In the context of the Medical OOD (MOOD) detection challenge 2023, we propose a pipeline that combines a histogram-based method and a diffusion-based method. The histogram-based method is designed to accurately detect homogeneous anomalies in the toy examples of the challenge, such as blobs with constant intensity values. The diffusion-based method is based on one of the latest methods for unsupervised anomaly detection, called DDPM-OOD. We explore this method and propose extensive post-processing steps for pixel-level and sample-level anomaly detection on brain MRI and abdominal CT data provided by the challenge. Our results show that the proposed DDPM method is sensitive to blur and bias field samples, but faces challenges with anatomical deformation, black slice, and swapped patches. These findings suggest that further research is needed to improve the performance of DDPM for OOD detection in medical images.
Rapid integration of large language models (LLMs) in health care is sparking global discussion about their potential to revolutionize health care quality and accessibility. At a time when improving health care quality and access remains a critical concern for countries worldwide, the ability of these models to pass medical examinations is often cited as a reason to use them for medical training and diagnosis. However, the impact of their inevitable use as a self-diagnostic tool and their role in spreading healthcare misinformation has not been evaluated. This study aims to assess the effectiveness of LLMs, particularly ChatGPT, from the perspective of an individual self-diagnosing to better understand the clarity, correctness, and robustness of the models. We propose the comprehensive testing methodology evaluation of LLM prompts (EvalPrompt). This evaluation methodology uses multiple-choice medical licensing examination questions to evaluate LLM responses. We use open-ended questions to mimic real-world self-diagnosis use cases, and perform sentence dropout to mimic realistic self-diagnosis with missing information. Human evaluators then assess the responses returned by ChatGPT for both experiments for clarity, correctness, and robustness. The results highlight the modest capabilities of LLMs, as their responses are often unclear and inaccurate. As a result, medical advice by LLMs should be cautiously approached. However, evidence suggests that LLMs are steadily improving and could potentially play a role in healthcare systems in the future. To address the issue of medical misinformation, there is a pressing need for the development of a comprehensive self-diagnosis dataset. This dataset could enhance the reliability of LLMs in medical applications by featuring more realistic prompt styles with minimal information across a broader range of medical fields.
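The sentence dropout step described above can be sketched as follows. This is an illustrative assumption rather than the EvalPrompt code: the regex-based sentence splitter, the `drop_prob` parameter, and the choice to always keep the final sentence (assumed here to carry the question itself) are all hypothetical.

```python
import random
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def sentence_dropout(text, drop_prob=0.3, rng=None, keep_last=True):
    """Randomly drop sentences to simulate a self-diagnosis prompt with
    missing information. The last sentence is optionally always kept."""
    rng = rng or random.Random()
    sentences = split_sentences(text)
    kept = []
    for idx, sentence in enumerate(sentences):
        is_last = idx == len(sentences) - 1
        if (keep_last and is_last) or rng.random() >= drop_prob:
            kept.append(sentence)
    return " ".join(kept)


if __name__ == "__main__":
    # Made-up vignette, used only to exercise the function.
    vignette = (
        "A 45-year-old man has chest pain. "
        "The pain radiates to the left arm. "
        "What is the most likely diagnosis?"
    )
    print(sentence_dropout(vignette, drop_prob=0.5, rng=random.Random(1)))
```

Comparing answers on the full and the degraded prompt gives a simple robustness check for a given question.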
Phraseological units reflect the distinctive worldview of each people, its psychology and way of thinking. They typically give expression to an author's language, lending it uniqueness and even charm. The aim of this article is to analyze the structure and semantics of phraseological units containing somatisms (body-part terms) in the literary discourse of O. Kobylianska and to compare these units with their standard-language counterparts. The relevance of the article stems from the need for further in-depth study of Olha Kobylianska's idiostyle in order to form a cognitive-pragmatic conception of the writer's literary discourse. The novelty of the study lies in the fact that the structure and semantics of phraseological units with a somatism as the key word in O. Kobylianska's literary discourse have not yet been the subject of in-depth research. Research methods. The article uses the general scientific methods of analysis and synthesis as its main methods, together with linguistic-descriptive and structural-comparative-historical methods. Conclusions. In O. Kobylianska's literary discourse, the most frequent phraseological units are those with the somatisms hand and its parts (fist, elbow), leg, and parts of the face or head (face, mouth, tongue, eyes, teeth, hair). In the Bukovinian writer's literary works we record two groups of phraseological units with a somatism as the key word. The first group comprises those that conform to the standard language and are recorded in the Dictionary of Phraseological Units of the Ukrainian Language (права рука 'right hand', дерти носа 'to put on airs', мати в руках 'to have in one's hands'). The second group comprises phraseological units that have undergone various modifications, mainly through the author's creativity, for example: changes of case or word order, or replacement of one verb with another of similar meaning, including dialectisms, which gives the language a distinctive colour. The Bukovinian writer (кулаки гризти 'to gnaw one's fists', отворяти уста 'to open one's lips', кров у лице бухнула 'blood rushed to the face')
History of medicine. Medical expeditions, Social Sciences
Recently, medical report generation, which aims to automatically generate a long and coherent descriptive paragraph for a given medical image, has received growing research interest. Unlike general image captioning tasks, medical report generation is more challenging for data-driven neural models. This is mainly due to 1) serious data bias: normal visual regions dominate the dataset over abnormal visual regions, and 2) the very long output sequence. To alleviate these two problems, we propose an AlignTransformer framework, which includes the Align Hierarchical Attention (AHA) and the Multi-Grained Transformer (MGT) modules: 1) the AHA module first predicts the disease tags from the input image and then learns multi-grained visual features by hierarchically aligning the visual regions and disease tags. The acquired disease-grounded visual features can better represent the abnormal regions of the input image, which could alleviate the data bias problem; 2) the MGT module effectively uses the multi-grained features and the Transformer framework to generate the long medical report. Experiments on the public IU-Xray and MIMIC-CXR datasets show that AlignTransformer achieves results competitive with state-of-the-art methods on both datasets. Moreover, a human evaluation conducted by professional radiologists further supports the effectiveness of our approach.
Viet-Khoa Vo-Ho, Kashu Yamazaki, Hieu Hoang
et al.
Deep learning methods have been successful in solving machine learning tasks and have made breakthroughs in many sectors owing to their ability to automatically extract features from unstructured data. However, their performance relies on manual trial-and-error processes for selecting an appropriate network architecture, training hyperparameters, and pre-/post-processing procedures. Although network architecture has been shown to play a critical role in learning feature representations from data and in final performance, searching for the best network architecture is computationally intensive and relies heavily on researchers' experience. Automated machine learning (AutoML) and its advanced techniques, such as Neural Architecture Search (NAS), have been promoted to address these limitations. Beyond general computer vision tasks, NAS has also motivated applications in multiple areas, including medical imaging. In medical imaging, NAS has made significant progress in improving the accuracy of image classification, segmentation, reconstruction, and more. However, NAS requires large annotated datasets, considerable computational resources, and pre-defined tasks. To address such limitations, meta-learning has been adopted in few-shot learning and multi-task scenarios. In this book chapter, we first present a brief review of NAS by discussing well-known approaches to search space, search strategy, and evaluation strategy. We then introduce various NAS approaches in medical imaging for applications such as classification, segmentation, detection, and reconstruction. Meta-learning in NAS for few-shot learning and multiple tasks is then explained. Finally, we describe several open problems in NAS.
The study of G.N. Roerich's scientific heritage is only beginning. An important basis of Roerich's many-sided scientific activities was his research during expeditions in Asia. The longest, most dangerous, and most laborious of these was the Central Asiatic expedition of his father, N.K. Roerich. The goal of this article is to examine G.N. Roerich's activities at every stage of the Central Asiatic expedition, as well as his works publishing the results of the expedition's research. G.N. Roerich presented the basic results in his monograph Trails to Inmost Asia: Five Years of Exploration with the Roerich Central Asian Expedition, published in English in the USA in 1931. Roerich's description of North and Central Tibet is unique because the theocratic state in Tibet and the nomad tribes that Roerich observed no longer exist. Roerich's field investigations continued the historical tradition of Russian expeditions in Central Asia. They extended scientific knowledge of insufficiently known regions of Asia.
Segmenting medical images accurately and reliably is important for disease diagnosis and treatment. It is a challenging task because of the wide variety of objects' sizes, shapes, and scanning modalities. Recently, many convolutional neural networks (CNN) have been designed for segmentation tasks and achieved great success. Few studies, however, have fully considered the sizes of objects, and thus most demonstrate poor performance for small objects segmentation. This can have a significant impact on the early detection of diseases. This paper proposes a Context Axial Reserve Attention Network (CaraNet) to improve the segmentation performance on small objects compared with several recent state-of-the-art models. We test our CaraNet on brain tumor (BraTS 2018) and polyp (Kvasir-SEG, CVC-ColonDB, CVC-ClinicDB, CVC-300, and ETIS-LaribPolypDB) segmentation datasets. Our CaraNet achieves the top-rank mean Dice segmentation accuracy, and results show a distinct advantage of CaraNet in the segmentation of small medical objects.
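The mean Dice score reported above is the standard overlap metric for segmentation. The following minimal sketch computes it on flat binary masks, under stated assumptions: returning 1.0 when both masks are empty is a convention chosen for this example, and the function names and toy masks are hypothetical and unrelated to the CaraNet evaluation code.

```python
def dice_coefficient(pred, truth):
    """Dice = 2 * |P ∩ T| / (|P| + |T|) for flat binary masks (0/1)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


def mean_dice(pairs):
    """Average Dice over an iterable of (pred, truth) mask pairs."""
    scores = [dice_coefficient(p, t) for p, t in pairs]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Missing a single pixel costs far more on a small object.
    small_truth = [1, 1, 0, 0, 0, 0, 0, 0]
    small_pred = [1, 0, 0, 0, 0, 0, 0, 0]
    large_truth = [1, 1, 1, 1, 1, 1, 0, 0]
    large_pred = [1, 1, 1, 1, 1, 0, 0, 0]
    print(dice_coefficient(small_pred, small_truth))
    print(dice_coefficient(large_pred, large_truth))
    print(mean_dice([(small_pred, small_truth), (large_pred, large_truth)]))
```

The toy masks show why small objects are hard to score well on: the same one-pixel error removes a much larger share of the overlap.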
Chao-Chun Hsu, Shantanu Karnwal, Sendhil Mullainathan
et al.
Machine learning models depend on the quality of input data. As electronic health records are widely adopted, the amount of data in health care is growing, along with complaints about the quality of medical notes. We use two prediction tasks, readmission prediction and in-hospital mortality prediction, to characterize the value of information in medical notes. We show that, as a whole, medical notes provide additional predictive power over structured information only in readmission prediction. We further propose a probing framework to select parts of notes that enable more accurate predictions than using all notes, even though the selected information leads to a distribution shift from the training data ("all notes"). Finally, we demonstrate that models trained on the selected valuable information achieve even better predictive performance, with only 6.8% of all the tokens for readmission prediction.