Songhua Ma,1,2 Qing Zhou,3 Huiqun Wu4
1Department of Physiology, Medical School of Nantong University, Nantong, People’s Republic of China; 2Nantong University Xining College, Nantong, People’s Republic of China; 3Education and Training Department, Affiliated Hospital of Nantong University, Nantong, People’s Republic of China; 4Department of Medical Informatics, Medical School of Nantong University, Nantong, People’s Republic of China
Correspondence: Songhua Ma, Email songhuama@ntu.edu.cn
Objective: This study explores the application of artificial intelligence in medical education by comparing research hotspots and evolutionary trends between China and the international community, and proposes informed educational practices and policy recommendations.
Methods: Literature was retrieved from the core collections of CNKI and Web of Science for 2014–2024, limited to article and review publications. After applying a unified Boolean search strategy and deduplication, the data were analyzed using CiteSpace 6.4.R1 to examine publication trends, collaboration networks, keyword co-occurrence, clustering, burst detection, and co-citation patterns.
Results: A total of 379 Chinese and 552 English records were included. Publications surged after 2018 and peaked during 2023–2024. International hotspots centered on machine learning, deep learning, and large language models for simulation-based training and clinical reasoning, whereas Chinese studies focused on “New Medical Sciences”, VR/AR, and medical imaging. Generative artificial intelligence and multimodal large models emerged as a new frontier of artificial intelligence research in global medical education during 2023–2024.
Conclusion: By comparing the two databases, this study reveals the hotspots and differences in research on artificial intelligence and medical education between China and the international research community. It not only compensates for the time lag in existing research, but also identifies three AI-driven trends in the development of medical education: generative AI, personalized learning, and immersive experience. Technology-driven and scenario-driven orientations form a complementary pattern. We recommend integrating AI literacy and ethics into curricula, establishing generative-AI teaching and assessment guidelines, and building cross-institutional, yearly knowledge-map monitoring for sustainable innovation in medical education.
Keywords: artificial intelligence, AI, medical education, literature visualization, generative AI
Objective: To systematically characterize the developmental trajectory and interdisciplinary integration of intelligent diagnosis in traditional Chinese medicine (TCM) through quantitative topic evolution analysis, addressing the fragmentation of existing research and clarifying the field’s long-term research structure and evolutionary patterns. Methods: A topic evolution analysis was performed on Chinese-language literature pertaining to intelligent diagnosis in TCM. Publications were retrieved from the China National Knowledge Infrastructure (CNKI), Wanfang Data, and the China Science and Technology Journal Database (VIP), covering the period from database inception to July 3, 2025. A hybrid segmentation approach, based on cumulative publication growth trends and inflection point detection, was applied to divide the research timeline into distinct stages. Subsequently, the latent Dirichlet allocation (LDA) model was used to extract research topics, followed by alignment and evolutionary analysis of topics across stages. Results: A total of 3,919 publications published between 2003 and 2025 were included, and the research trajectory was divided into five stages based on data-driven breakpoint detection. The field exhibited a clear evolutionary shift from early rule-based systems and tongue-pulse image and signal analysis (2006–2010), to machine-learning-based syndrome and prescription modeling (2011–2015), followed by deep-learning-driven pattern recognition and formula association (2016–2020). Since 2021, research has increasingly emphasized knowledge-graph construction, multimodal integration, and intelligent clinical decision-support systems, with recent studies (2024–2025) showing the emergence of large language models and agent-based diagnostic frameworks.
Topic evolution analysis further revealed sustained cross-stage continuity in syndrome modeling and prescription association analysis, alongside the progressive consolidation of integrated intelligent diagnostic platforms. Conclusion: By identifying key technological transitions and persistent core research themes, our findings offer a structured reference framework for the design of intelligent diagnostic systems, the construction of knowledge-driven clinical decision-support tools, and the alignment of AI models with TCM diagnostic logic. Importantly, the stage-based evolutionary insights derived from this analysis can inform future methodological choices, improve model interpretability and clinical applicability, and support the translation of intelligent TCM diagnosis from experimental research to real-world clinical practice.
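The hybrid segmentation described above can be sketched in a simple form: treat the years where the publication growth rate changes most sharply as stage boundaries. The following is an illustrative reconstruction, not the paper's exact algorithm; the function name and the second-difference heuristic are assumptions.

```python
def segment_timeline(counts_by_year, n_stages=5):
    """Split a publication timeline into stages at the largest growth-rate changes.

    counts_by_year: {year: publication count}. A sketch of inflection-point
    segmentation; the original study's procedure may differ in detail.
    """
    years = sorted(counts_by_year)
    # first difference: year-over-year growth in publication counts
    growth = [counts_by_year[years[i + 1]] - counts_by_year[years[i]]
              for i in range(len(years) - 1)]
    # second difference: change in growth rate; large values mark inflections
    accel = [abs(growth[i + 1] - growth[i]) for i in range(len(growth) - 1)]
    # the (n_stages - 1) strongest inflection years become stage breakpoints
    idx = sorted(range(len(accel)), key=lambda i: accel[i], reverse=True)[:n_stages - 1]
    breaks = sorted(years[i + 1] for i in idx)
    stages, start = [], years[0]
    for b in breaks:
        stages.append((start, b - 1))
        start = b
    stages.append((start, years[-1]))
    return stages
```

Each returned pair is an inclusive (start_year, end_year) stage; LDA topic extraction would then be run separately within each stage.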
Large Language Models achieve remarkable performance but incur substantial computational costs unsuitable for resource-constrained deployments. This paper presents the first comprehensive task-specific efficiency analysis comparing 16 language models across five diverse NLP tasks. We introduce the Performance-Efficiency Ratio (PER), a novel metric integrating accuracy, throughput, memory, and latency through geometric mean normalization. Our systematic evaluation reveals that small models (0.5--3B parameters) achieve superior PER scores across all five tasks. These findings establish quantitative foundations for deploying small models in production environments prioritizing inference efficiency over marginal accuracy gains.
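The abstract does not give the PER formula beyond "geometric mean normalization", so the following is a hypothetical sketch: normalize each factor against a reference model, invert the lower-is-better factors (memory, latency), and take the geometric mean. The function name, the reference-normalization scheme, and all values are assumptions.

```python
import math

def per(accuracy, throughput, memory_gb, latency_ms, ref):
    """Hypothetical Performance-Efficiency Ratio sketch.

    Higher-is-better factors (accuracy, throughput) enter as model/reference;
    lower-is-better factors (memory, latency) enter as reference/model, so a
    cheaper model scores above 1 on those factors. The result is the
    geometric mean of the four normalized factors.
    """
    factors = [
        accuracy / ref["accuracy"],
        throughput / ref["throughput"],
        ref["memory_gb"] / memory_gb,
        ref["latency_ms"] / latency_ms,
    ]
    return math.prod(factors) ** (1 / len(factors))
```

Under this sketch, a small model with slightly lower accuracy but 4x throughput, 4x less memory, and 5x lower latency scores well above the reference, which is the trade-off the abstract reports.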
Preface 1. Introduction: the problem of language in cross-cultural studies Part I. Between the Nation and the Individual: 2. Translating national character: Lu Xun and Arthur Smith 3. The discourse of individualism Part II. Translingual Modes of Representation: 4. Homo Economicus and the question of novelistic realism 5. Narratives of desire: negotiating the real and the fantastic 6. The deixis of writing in the first person Part III. Nation Building and Culture Building: 7. Literary criticism as a discourse of legitimation 8. The making of the Compendium of Modern Chinese Literature 9. Rethinking culture and national essence Appendixes Notes Index.
As an important creator of modern drama, Cao Yu produced in his first work, Thunderstorm, a milestone marking the rapid maturation of modern Chinese drama. His effort to bond Chinese reality with Western dramatic practice to create a full-length drama has garnered widespread attention from scholars and has been the subject of in-depth research. Numerous studies have taken the character of Fan Yi as an analytical focus, but many of her distinctive qualities have not been sufficiently interpreted through the lens of drama theory. This gap can be addressed through Nietzsche’s Dionysian spirit as a theory of artistic origin, thereby broadening the scope of research on Chinese drama. Starting from Nietzsche’s theory of the Dionysian spirit, this research employs comparative-literature methods and documentary analysis to conduct a cross-cultural study of Fan Yi. The article finds that Fan Yi’s madness has ancient origins, able to dissolve social bonds while returning to authentic human existence. In addition, the way Cao Yu expresses Fan Yi’s passion and constructs the plot was also influenced by Nietzsche’s theory. The paper argues that the success of Fan Yi had a major influence on the forging of modern Chinese drama and its engagement with modern Western drama, providing inspiration for contemporary theatre.
The built environment plays a crucial role in shaping residents’ quality of life and emotional well-being. In the context of growing efforts to promote livable and walkable cities, a key question emerges: how can emerging technologies—particularly virtual reality (VR)—be leveraged to evaluate and enhance urban environments through the lens of pedestrian emotional perception? This study systematically reviewed the literature published between 2015 and 2024 in the China National Knowledge Infrastructure (CNKI) and Web of Science (WOS) databases, ultimately identifying 37 Chinese-language and 113 English-language journal articles. Using bibliometric analysis and CiteSpace, the research mapped publication trends, research hotspots, and disciplinary networks across linguistic contexts. Results reveal that Chinese-language studies often emphasize embodied cognition and electroencephalogram (EEG) monitoring, while English-language studies focus more on VR applications in stress recovery and health assessment. Based on this synthesis, this study proposes a “sensory–cognitive–affective” framework and a set of spatial intervention strategies, offering a novel perspective for emotion-driven urban design. The findings highlight a paradigm shift from engineering-oriented planning to human-centered approaches, with VR technologies serving as a critical enabling tool. This review contributes both conceptual and methodological foundations for future research at the intersection of immersive technologies, built environment studies, and urban emotional well-being.
To interact effectively with humans in the real world, agents must understand language that describes the dynamics of the environment--that is, how the environment behaves--rather than just task instructions specifying "what to do". Understanding such dynamics-descriptive language is important for both human-agent interaction and agent behavior. Recent work addresses this problem with a model-based approach: language is incorporated into a world model, which is then used to learn a behavior policy. However, existing methods either do not demonstrate policy generalization to unseen games or rely on limiting assumptions, for instance that the latency induced by inference-time planning is tolerable for the target task or that expert demonstrations are available. Expanding on this line of research, we focus on improving policy generalization from a language-conditioned world model while dropping these assumptions. We propose a model-based reinforcement learning approach in which a language-conditioned world model is trained through interaction with the environment, and a policy is learned from this model--without planning or expert demonstrations. Our method, the Language-aware Encoder for Dreamer World Model (LED-WM), is built on top of DreamerV3. LED-WM features an observation encoder that uses an attention mechanism to explicitly ground language descriptions to entities in the observation. We show that policies trained with LED-WM generalize more effectively to unseen games described by novel dynamics and language than other baselines in several settings in two environments: MESSENGER and MESSENGER-WM. To highlight how the policy can leverage the trained world model before real-world deployment, we demonstrate that the policy can be improved through fine-tuning on synthetic test trajectories generated by the world model.
The growing demand for automated writing assistance in diverse academic domains highlights the need for robust Chinese Grammatical Error Correction (CGEC) systems that can adapt across disciplines. However, existing CGEC research largely lacks dedicated benchmarks for multi-disciplinary academic writing, overlooking continual learning (CL) as a promising solution to handle domain-specific linguistic variation and prevent catastrophic forgetting. To fill this crucial gap, we introduce CL$^2$GEC, the first Continual Learning benchmark for Chinese Literature Grammatical Error Correction, designed to evaluate adaptive CGEC across multiple academic fields. Our benchmark includes 10,000 human-annotated sentences spanning 10 disciplines, each exhibiting distinct linguistic styles and error patterns. CL$^2$GEC focuses on evaluating grammatical error correction in a continual learning setting, simulating sequential exposure to diverse academic disciplines to reflect real-world editorial dynamics. We evaluate large language models under sequential tuning, parameter-efficient adaptation, and four representative CL algorithms, using both standard GEC metrics and continual learning metrics adapted to task-level variation. Experimental results reveal that regularization-based methods mitigate forgetting more effectively than replay-based or naive sequential approaches. Our benchmark provides a rigorous foundation for future research in adaptive grammatical error correction across diverse academic domains.
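The continual learning metrics mentioned above, beyond standard GEC scores, commonly include final average accuracy, backward transfer, and forgetting. A standard formulation (not necessarily the benchmark's exact definitions) can be computed from the matrix of per-discipline scores recorded after each training stage:

```python
def cl_metrics(R):
    """Standard continual-learning metrics from a T x T score matrix.

    R[i][j]: score on task j after training through task i (entries with
    j > i are unused). Returns (final average accuracy, backward transfer,
    average forgetting); this is a common formulation, offered as a sketch.
    """
    T = len(R)
    # final average accuracy over all tasks after the last training stage
    acc = sum(R[T - 1][j] for j in range(T)) / T
    # backward transfer: average change on earlier tasks (negative = forgetting)
    bwt = sum(R[T - 1][j] - R[j][j] for j in range(T - 1)) / (T - 1)
    # forgetting: drop from each earlier task's best score to its final score
    forgetting = sum(
        max(R[i][j] for i in range(j, T - 1)) - R[T - 1][j]
        for j in range(T - 1)
    ) / (T - 1)
    return acc, bwt, forgetting
```

Regularization-based methods mitigating forgetting, as the abstract reports, would show a smaller forgetting value and a less negative backward transfer than naive sequential tuning.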
As large language models (LLMs) evolve into tool-using agents, the ability to browse the web in real-time has become a critical yardstick for measuring their reasoning and retrieval competence. Existing benchmarks such as BrowseComp concentrate on English and overlook the linguistic, infrastructural, and censorship-related complexities of other major information ecosystems -- most notably Chinese. To address this gap, we introduce BrowseComp-ZH, a high-difficulty benchmark purpose-built to comprehensively evaluate LLM agents on the Chinese web. BrowseComp-ZH consists of 289 multi-hop questions spanning 11 diverse domains. Each question is reverse-engineered from a short, objective, and easily verifiable answer (e.g., a date, number, or proper noun). A two-stage quality control protocol is applied to strive for high question difficulty and answer uniqueness. We benchmark over 20 state-of-the-art language models and agentic search systems on our proposed BrowseComp-ZH. Despite their strong conversational and retrieval capabilities, most models struggle severely: a large number achieve accuracy rates below 10%, and only a handful exceed 20%. Even the best-performing system, OpenAI's DeepResearch, reaches just 42.9%. These results demonstrate the considerable difficulty of BrowseComp-ZH, where success demands not only effective retrieval strategies, but also sophisticated reasoning and information reconciliation -- capabilities that current models still struggle to master. Our dataset, construction guidelines, and benchmark results have been publicly released at https://github.com/PALIN2018/BrowseComp-ZH.
Assessing language proficiency is essential for education, as it enables instruction tailored to learners' needs. This paper investigates the use of Large Language Models (LLMs) for automatically classifying German texts according to the Common European Framework of Reference for Languages (CEFR) into different proficiency levels. To support robust training and evaluation, we construct a diverse dataset by combining multiple existing CEFR-annotated corpora with synthetic data. We then evaluate prompt-engineering strategies, fine-tuning of a LLaMA-3-8B-Instruct model, and a probing-based approach that utilizes the internal neural state of the LLM for classification. Our results show a consistent performance improvement over prior methods, highlighting the potential of LLMs for reliable and scalable CEFR classification.
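The probing-based approach trains a separate, lightweight classifier on the LLM's internal activations rather than its generated text. As a minimal stand-in (the paper's actual probe architecture is not specified here), a nearest-centroid classifier over hidden-state vectors illustrates the idea; the function name and the toy vectors are invented for illustration:

```python
from collections import defaultdict

def train_probe(hidden_states, labels):
    """Nearest-centroid probe over hidden-state vectors.

    hidden_states: list of activation vectors (list[float]) extracted from
    the model for each text; labels: the CEFR level of each text. Returns a
    predict function mapping a new vector to the nearest class centroid.
    """
    sums, counts = defaultdict(lambda: None), defaultdict(int)
    for v, y in zip(hidden_states, labels):
        sums[y] = v if sums[y] is None else [a + b for a, b in zip(sums[y], v)]
        counts[y] += 1
    centroids = {y: [a / counts[y] for a in s] for y, s in sums.items()}

    def predict(v):
        # assign the label whose centroid is closest in squared Euclidean distance
        return min(centroids,
                   key=lambda y: sum((a - b) ** 2 for a, b in zip(centroids[y], v)))
    return predict
```

In practice a linear (logistic) probe is more common than centroids; the point of either is that proficiency level is often linearly recoverable from internal states even when prompting underperforms.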
Section identification is an important task for library science, especially knowledge management. Identifying the sections of a paper would help filter noise in entity and relation extraction. In this research, we studied the paper section identification problem in the context of Chinese medical literature analysis, where the subjects, methods, and results are more valuable from a physician's perspective. Based on previous studies on English literature section identification, we experiment with effective features to use with classic machine learning algorithms to tackle the problem. It is found that Conditional Random Fields, which consider sentence interdependency, are more effective at combining different feature sets, such as bag-of-words, part-of-speech, and headings, for Chinese literature section identification. Moreover, we find that classic machine learning algorithms are more effective than generic deep learning models for this problem. Based on these observations, we design a novel deep learning model, the Structural Bidirectional Long Short-Term Memory (SLSTM) model, which models word and sentence interdependency together with the contextual information. Experiments on a human-curated asthma literature dataset show that our approach outperforms the traditional machine learning methods and other deep learning methods and achieves close to 90% precision and recall in the task. The model shows good potential for use in other text mining tasks. The research has significant methodological and practical implications.
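The advantage of modeling sentence interdependency can be illustrated with a toy Viterbi decoder: per-sentence classifier scores are combined with transition scores that encode the typical section order, so an isolated misclassification is smoothed by its neighbors. This is a generic sequence-decoding sketch, not the paper's CRF or SLSTM; the labels, scores, and transition values below are invented for illustration:

```python
def viterbi(emissions, transitions, labels):
    """Decode the best label sequence for a paper's sentences.

    emissions[t][l]: per-sentence score for label l (e.g. from an
    independent classifier); transitions[(a, b)]: score for label a
    followed by label b, encoding expected section order. Unlisted
    transitions get a mild penalty of -1.0.
    """
    best = [{l: (emissions[0][l], [l]) for l in labels}]
    for t in range(1, len(emissions)):
        row = {}
        for l in labels:
            prev = max(labels,
                       key=lambda p: best[-1][p][0] + transitions.get((p, l), -1.0))
            score = best[-1][prev][0] + transitions.get((prev, l), -1.0) + emissions[t][l]
            row[l] = (score, best[-1][prev][1] + [l])
        best.append(row)
    return max(best[-1].values())[1]
```

With sentence-wise argmax the middle sentence below would flip to "R" (Results) between two "M" (Methods) sentences; the strong penalty on the R-to-M transition makes the decoder keep the coherent sequence instead.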
Mainstream Natural Language Processing (NLP) research has ignored the majority of the world's languages. In moving from excluding the majority of the world's languages to blindly adopting what we make for English, we first risk importing the same harms we have at best mitigated and at least measured for English. However, in evaluating and mitigating harms arising from adopting new technologies into such contexts, we often disregard (1) the actual community needs of Language Technologies, and (2) biases and fairness issues within the context of the communities. In this extended abstract, we consider fairness, bias, and inclusion in Language Technologies through the lens of the Capabilities Approach. The Capabilities Approach centers on what people are capable of achieving, given their intersectional social, political, and economic contexts instead of what resources are (theoretically) available to them. We detail the Capabilities Approach, its relationship to multilingual and multicultural evaluation, and how the framework affords meaningful collaboration with community members in defining and measuring the harms of Language Technologies.
Objective: This study examines how well leading Chinese and Western large language models understand and apply Chinese social work principles, focusing on their foundational knowledge within a non-Western professional setting. We test whether Chinese cultural context influences model reasoning and accuracy. Method: Using a published self-study version of the Chinese National Social Work Examination (160 questions) covering jurisprudence and applied knowledge, we administered three testing conditions to eight cloud-based large language models - four Chinese and four Western. We examined their responses following official guidelines and evaluated their explanations' reasoning quality. Results: Seven models exceeded the 60-point passing threshold in both sections. Chinese models performed better in jurisprudence (median = 77.0 vs. 70.3) but slightly lower in applied knowledge (median = 65.5 vs. 67.0). Both groups showed cultural biases, particularly regarding gender equality and family dynamics. Models demonstrated strong professional terminology knowledge but struggled with culturally specific interventions. Valid reasoning in incorrect answers ranged from 16.4% to 45.0%. Conclusions: While both Chinese and Western models show foundational knowledge of Chinese social work principles, technical language proficiency does not ensure cultural competence. Chinese models demonstrate advantages in regulatory content, yet both Chinese and Western models struggle with culturally nuanced practice scenarios. These findings contribute to informing responsible AI integration into cross-cultural social work practice.
In this study, we delve into the validity of conventional personality questionnaires in capturing the human-like personality traits of Large Language Models (LLMs). Our objective is to assess the congruence between the personality traits LLMs claim to possess and their demonstrated tendencies in real-world scenarios. By conducting an extensive examination of LLM outputs against observed human response patterns, we aim to understand the disjunction between self-knowledge and action in LLMs.
Several reviews have assessed the association between greenspace and overweight or obesity, but their conclusions were inconsistent. Moreover, an updated comprehensive review and meta-analysis is warranted because several high-quality papers have been published more recently. The objectives of this study are to systematically and quantitatively assess the evidence for a link between greenspace and overweight/obesity and to make specific recommendations for further research. We searched three English-language databases, four Chinese-language databases and the reference lists of previously published reviews for epidemiological studies on greenspace and overweight/obesity published before January 2020. We developed inclusion criteria, screened the literature and extracted key data from selected papers. We assessed methodological quality and risk of bias, and we graded the credibility of the pooled evidence. We also performed sensitivity analyses. Fifty-seven records met our inclusion criteria and were included in the study. Most studies were cross-sectional designs (81%) and were from developed nations (88%). More than half (55%) of the included studies found beneficial associations between greenspace and overweight/obesity overall or in subpopulations. Our meta-analytical results showed that greater normalized difference vegetation index was associated with statistically significantly lower odds of overweight/obesity (odds ratio [OR]: 0.88; 95% CI: 0.84, 0.91), but residential proximity to greenspace (OR: 0.99; 95% CI: 0.99, 1.00), proportion of greenspace (OR: 0.96; 95% CI: 0.85, 1.08) and number of parks in an area (OR: 0.99; 95% CI: 0.97, 1.01) were not. However, we detected high between-study heterogeneity in two of the four meta-analyses, which reduced the credibility of the pooled evidence. Current evidence indicates that there might be an association between greater access to greenspace and lower odds of overweight/obesity.
However, additional high‐quality studies are needed to more definitively assess the evidence for a causal association.
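The pooled odds ratios reported above come from inverse-variance meta-analysis. A minimal fixed-effect sketch (the review may have used random-effects models, which add a between-study variance term) recovers each study's standard error from its 95% CI on the log scale and weights studies accordingly:

```python
import math

def pool_or(studies):
    """Fixed-effect inverse-variance pooling of odds ratios.

    studies: iterable of (OR, ci_low, ci_high) tuples. The standard error of
    each log-OR is recovered from the 95% CI width via the standard identity
    SE = (ln(hi) - ln(lo)) / (2 * 1.96). Returns (pooled OR, 95% CI).
    """
    num = den = 0.0
    for or_, lo, hi in studies:
        se = (math.log(hi) - math.log(lo)) / (2 * 1.96)
        w = 1.0 / se ** 2                  # inverse-variance weight
        num += w * math.log(or_)
        den += w
    log_pooled = num / den
    se_pooled = math.sqrt(1.0 / den)
    ci = (math.exp(log_pooled - 1.96 * se_pooled),
          math.exp(log_pooled + 1.96 * se_pooled))
    return math.exp(log_pooled), ci
```

With a single study the pooled OR simply reproduces that study's estimate; adding more studies tightens the CI, and high between-study heterogeneity (as detected in two of the four meta-analyses) is what motivates random-effects pooling instead.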
Chinese-language teacher development in Thailand has a long history, and Thai teachers of Chinese are now a key resource driving Chinese language teaching in the country. Establishing a panel of expert Chinese language teachers and defining a development model for such teachers at the primary and secondary levels is therefore an important current education issue. The objective of this research is to study the establishment of a panel of teachers who are experts in the Chinese language and the development model for expert Chinese language teachers in Thai primary and secondary schools. Data were collected from 21 training sessions held nationwide for Thai teachers of Chinese; altogether 1,400 teachers participated in the training during 2020-2022. The research methods include literature analysis, observation, and focus group interviews. The results showed that: 1) a panel of teachers who specialize in Chinese at various levels should be established concretely; 2) the development model for expert Chinese teachers should be organized into distinct developmental stages; and 3) promotion and development efforts should aim to increase the number of expert Chinese teachers.
Ákos Bertalan Apatóczky's review of the volume The Routledge Handbook of the Mongols and Central-Eastern Europe (eds. Alexander V. Maiorov and Roman Hautala, Oxon – New York: Routledge, 2021).
Recent advancements in biological research leverage the integration of molecules, proteins, and natural language to enhance drug discovery. However, current models exhibit several limitations, such as the generation of invalid molecular SMILES, underutilization of contextual information, and equal treatment of structured and unstructured knowledge. To address these issues, we propose BioT5, a comprehensive pre-training framework that enriches cross-modal integration in biology with chemical knowledge and natural language associations. BioT5 utilizes SELFIES for 100% robust molecular representations and extracts knowledge from the surrounding context of bio-entities in unstructured biological literature. Furthermore, BioT5 distinguishes between structured and unstructured knowledge, leading to more effective utilization of information. After fine-tuning, BioT5 shows superior performance across a wide range of tasks, demonstrating its strong capability of capturing underlying relations and properties of bio-entities. Our code is available at https://github.com/QizhiPei/BioT5.
Recently, significant public efforts have been directed towards developing low-cost models with capabilities akin to ChatGPT, thereby fostering the growth of open-source conversational models. However, there remains a scarcity of comprehensive and in-depth evaluations of these models' performance. In this study, we examine the influence of training data factors, including quantity, quality, and linguistic distribution, on model performance. Our analysis is grounded in several publicly accessible, high-quality instruction datasets, as well as our own Chinese multi-turn conversations. We assess various models using an evaluation set of 1,000 samples, encompassing nine real-world scenarios. Our goal is to supplement manual evaluations with quantitative analyses, offering valuable insights for the continued advancement of open-source chat models. Furthermore, to enhance the performance and the training and inference efficiency of models in the Chinese domain, we extend the vocabulary of LLaMA - the model with the closest open-source performance to proprietary language models like GPT-3 - and conduct secondary pre-training on 3.4B Chinese words. We make our model, data, and code publicly available.