Results for "Chinese language and literature"

Showing 20 of ~3,643,479 results · from DOAJ, CrossRef, arXiv

DOAJ Open Access 2025
Administrative practices to promote Chinese language education and cultural exchange at Confucius Institutes around the world: a systematic review

Meng Song, Lies Sercu

As an intercultural educational program to promote Chinese language education and cultural exchange worldwide, the Confucius Institute (CI) has operated through specific managerial models since its establishment in 2004. This contribution reports the results of a systematic review of content relevant to CIs' administration in the post-2010 literature, aiming to map the status quo and salient issues of CIs' administration and to provide take-away messages for administrators and researchers. Following PRISMA guidelines, 94 publications in English with research cases from 38 countries/regions were selected for thematic analysis conducted in NVivo. The analysis revealed that CIs' achievement of official administrative objectives is mainly situated in higher-order aspects (e.g., policies), whereas the practical sides (e.g., daily management) are prone to challenges. The lack of localization is also a recurring issue. To tackle these challenges, suggestions are provided for strengthening localization, improving educational leadership, and enhancing the cultivation of intercultural competence. The research outcomes emphasize significant theoretical advancements, including an improved theoretical framework for studying the administration of CIs and localization guidelines. These guidelines offer structured frameworks for administrators and researchers to systematically localize onsite operations at CIs and conduct research on them.

Education (General)
DOAJ Open Access 2025
Wohlfahrtiimonas chitiniclastica: current insights and complementary review from Chinese cases

Qin Yuan, Cheng Peng, Xin-Lin Sun et al.

Wohlfahrtiimonas chitiniclastica is an emerging zoonotic pathogen associated with bacteremia, myiasis, and soft tissue infections. It is insufficiently identified and underestimated for reasons such as shortcomings of traditional identification techniques and language barriers in local case reports from different regions. In this review, we summarize the currently available literature. In particular, we add previously overlooked cases from Chinese and other medical communities. The clinical characteristics, identification, and treatment of W. chitiniclastica are discussed. This work provides a complete review of previous work, including cases from human, animal, and other sources.

Biology (General)
CrossRef Open Access 2025
Teaching English Translation of Traditional Chinese Medicine Terminology: Lasswell's 5W Model Perspective

Xiaoxian Chen

The English translation of Traditional Chinese Medicine (TCM) terminology is a crucial step in the international dissemination of TCM. Its accuracy directly impacts global understanding and acceptance of TCM culture. From the perspective of Lasswell's 5W Model, this paper explores the current state, challenges, and optimization strategies in teaching the English translation of TCM terminology. The study finds that current teaching practices face issues such as insufficient standardization of terminology, barriers in cross-cultural communication, and a shortage of qualified instructors. By analyzing domestic and international terminology translation standards, cross-cultural communication theories, and practical teaching cases, this paper proposes strengthening terminology standardization, optimizing teaching methods, cultivating interdisciplinary talent, and leveraging new media technologies to enhance communication effectiveness. The research provides theoretical support and practical pathways for the internationalization of TCM education.

CrossRef Open Access 2025
A Functional Study of Chinese and Western Painting in the 13th Century

Weiwei Meng

In the 13th century, China and the West were in different historical periods. The destruction of the Southern Song Dynasty by the Yuan Dynasty in 1279 brought unprecedented difficulty and torment to the intellectuals of the Yuan Dynasty. The literati paintings of the Yuan Dynasty therefore mostly conveyed the pain of losing their country, and their fate, through the use of pen and ink. The rulers also used painting to publicize their power, and many literati hermits made a living by painting. In the West, the Middle Ages were an era of religious rule, and the holy portrait was the representative work of medieval painting, with pictures focused on propagating doctrine and symbolizing power. In both traditions, painting played a significant role in shaping society and had a profound impact on future generations.

CrossRef Open Access 2025
A Comparative Study of Conceptual Metaphors for Sadness in English and Chinese

JiangLin Qiu

Metaphor is a key cognitive tool for understanding abstract concepts, with 'sadness' expressions in English and Chinese rich in conceptual metaphors. This study compares 'sadness' metaphors in both languages to reveal commonalities and differences. Traditionally seen as linguistic decoration, metaphors are now viewed, through cognitive linguistics, as fundamental to human thought. Lakoff and Johnson's work suggests metaphors map concrete domains onto abstract ones, shaping our worldview. Emotional metaphors, especially for 'sadness,' offer insights into cognitive patterns, cultural traditions, and thinking styles in English and Chinese. Commonalities include metaphors based on physical experience, such as 'SADNESS IS DOWN,' reflecting psychological heaviness. Metaphors rooted in natural phenomena, like 'SADNESS IS DARKNESS' and 'SADNESS IS COLD,' show shared perceptions of darkness and coldness evoking loneliness and despair. However, differences exist. Culturally, Chinese often uses 'gray' to symbolize sadness, while English associates 'blue' with melancholy. Seasonal imagery also varies, with 'autumn' linked to sadness in Chinese due to decay connotations, but to harvest in English. Cognitively, English expressions like 'heartbroken' emphasize inner turmoil, whereas Chinese phrases such as '肝肠寸断' highlight physical reactions to grief. These reflect cultural emphases: English on individual feelings, Chinese on bodily sensations influenced by traditional medicine. Differences stem from cultural traditions, religious beliefs, and thinking styles. China's poetic heritage contrasts with Western literature's inner-world focus. Religious influences, like Christian divine punishment in English and Buddhist/Taoist ideas in Chinese, further shape 'sadness' metaphors. This study deepens understanding of emotional linguistic expressions and highlights cognitive and cultural divergences, contributing to cross-cultural communication and mutual understanding.

DOAJ Open Access 2024
An Evaluation of The Compilation of a Country-Specific Mandarin Textbook: Tourism and Hotel Management for Special Purposes

Low Hiang Loon

Driven by the "One Belt One Road Initiative", Malaysia has a huge Chinese tourist market. In cultivating non-native Chinese speakers to provide tourism and hotel services for Chinese tourists, does the Malaysian country-specific "Mandarin Textbook: Tourism and Hotel Management for Special Purposes" meet the writing principles of Mandarin as a foreign language? This article evaluates and analyzes this teaching material against three major principles, namely pertinence, practicality, and scientificity. It mainly uses the literature research and comparative methods to explore country-specific tourism Mandarin textbooks worldwide and related research, and compares and discusses them with the Malaysian Mandarin Textbook. The findings show that this teaching material demonstrates its practicality: (1) it brings out the multi-racial and multi-cultural character of Malaysia; (2) it highlights the characteristics of tourist attractions in West and East Malaysia; (3) it shows the unique culture of Malaysia. However, the pertinence of this textbook has shortcomings. Although the textbook includes the role of the local tour guide, the texts repeatedly have the tour guide travel as a tourist. In addition, the scientificity of this textbook needs to be strengthened: its compilation was based mainly on the writers' years of teaching experience and reference to China's tourism Chinese textbooks. This article recommends that future textbooks be compiled based on learner needs analysis to improve their scientificity, and that relevant content and vocabulary be selected from related corpora or published vocabulary lists to enhance the practicality of the teaching materials.

Chinese language and literature
arXiv Open Access 2024
CHBench: A Chinese Dataset for Evaluating Health in Large Language Models

Chenlu Guo, Nuo Xu, Yi Chang et al.

With the rapid development of large language models (LLMs), assessing their performance on health-related inquiries has become increasingly essential. The use of these models in real-world contexts, where misinformation can lead to serious consequences for individuals seeking medical advice and support, necessitates a rigorous focus on safety and trustworthiness. In this work, we introduce CHBench, the first comprehensive safety-oriented Chinese health-related benchmark designed to evaluate LLMs' capabilities in understanding and addressing physical and mental health issues from a safety perspective across diverse scenarios. CHBench comprises 6,493 entries on mental health and 2,999 entries on physical health, spanning a wide range of topics. Our extensive evaluations of four popular Chinese LLMs highlight significant gaps in their capacity to deliver safe and accurate health information, underscoring the urgent need for further advancements in this critical domain. The code is available at https://github.com/TracyGuo2001/CHBench.

en cs.CL
arXiv Open Access 2024
E-EVAL: A Comprehensive Chinese K-12 Education Evaluation Benchmark for Large Language Models

Jinchang Hou, Chang Ao, Haihong Wu et al.

With the accelerating development of Large Language Models (LLMs), many LLMs are beginning to be used in the Chinese K-12 education domain. The integration of LLMs and education is becoming ever closer; however, there is currently no benchmark for evaluating LLMs that focuses on the Chinese K-12 education domain. There is therefore an urgent need for a comprehensive natural language processing benchmark to accurately assess the capabilities of various LLMs in this domain. To address this, we introduce E-EVAL, the first comprehensive evaluation benchmark specifically designed for the Chinese K-12 education field. E-EVAL consists of 4,351 multiple-choice questions at the primary, middle, and high school levels across a wide range of subjects, including Chinese, English, Politics, History, Ethics, Physics, Chemistry, Mathematics, and Geography. We conducted a comprehensive evaluation on E-EVAL of advanced LLMs, including both English-dominant and Chinese-dominant models. Findings show that Chinese-dominant models perform well compared to English-dominant models, with many scoring even above GPT-4.0. However, almost all models perform poorly in complex subjects such as mathematics. We also found that most Chinese-dominant LLMs did not achieve higher scores at the primary school level than at the middle school level, suggesting that a model's mastery of higher-order knowledge does not necessarily imply mastery of lower-order knowledge as well. Additionally, the experimental results indicate that the Chain of Thought (CoT) technique is effective only for the challenging science subjects, while few-shot prompting is more beneficial for liberal arts subjects. With E-EVAL, we aim to analyze the strengths and limitations of LLMs in educational applications and to contribute to the progress and development of Chinese K-12 education and LLMs.

en cs.CL
arXiv Open Access 2024
CLongEval: A Chinese Benchmark for Evaluating Long-Context Large Language Models

Zexuan Qiu, Jingjing Li, Shijue Huang et al.

Developing Large Language Models (LLMs) with robust long-context capabilities has been a recent research focus, resulting in the emergence of long-context LLMs proficient in Chinese. However, the evaluation of these models remains underdeveloped due to a lack of benchmarks. To address this gap, we present CLongEval, a comprehensive Chinese benchmark for evaluating long-context LLMs. CLongEval is characterized by three key features: (1) sufficient data volume, comprising 7 distinct tasks and 7,267 examples; (2) broad applicability, accommodating models with context window sizes from 1K to 100K; (3) high quality, with over 2,000 manually annotated question-answer pairs in addition to the automatically constructed labels. With CLongEval, we undertake a comprehensive assessment of 6 open-source long-context LLMs and 2 leading commercial counterparts that feature both long-context abilities and proficiency in Chinese. We also provide in-depth analysis based on the empirical results, aiming to shed light on the critical capabilities that present challenges in long-context settings. The dataset, evaluation scripts, and model outputs are released.

en cs.CL
arXiv Open Access 2024
Large Language Models for Classical Chinese Poetry Translation: Benchmarking, Evaluating, and Improving

Andong Chen, Lianzhang Lou, Kehai Chen et al.

Unlike traditional translation tasks, classical Chinese poetry translation requires both adequacy and fluency in translating culturally and historically significant content as well as linguistic poetic elegance. Large language models (LLMs) with impressive multilingual capabilities may bring a ray of hope for meeting this extreme translation demand. This paper first introduces a suitable benchmark (PoetMT) in which each Chinese poem has a recognized elegant translation. Meanwhile, we propose a new metric based on GPT-4 to evaluate the extent to which current LLMs can meet these demands. Our empirical evaluation reveals that existing LLMs fall short on this challenging task. Hence, we propose a Retrieval-Augmented Machine Translation (RAT) method which incorporates knowledge related to classical poetry to advance the translation of Chinese poetry with LLMs. Experimental results show that RAT consistently outperforms all comparison methods on the widely used BLEU, COMET, and BLEURT metrics, our proposed metric, and human evaluation.

en cs.CL, cs.AI
arXiv Open Access 2024
Pragmatic Competence Evaluation of Large Language Models for the Korean Language

Dojun Park, Jiwoo Lee, Hyeyun Jeong et al.

Benchmarks play a significant role in the current evaluation of Large Language Models (LLMs), yet they often overlook the models' abilities to capture the nuances of human language, primarily focusing on evaluating embedded knowledge and technical skills. To address this gap, our study evaluates how well LLMs understand context-dependent expressions from a pragmatic standpoint, specifically in Korean. We use both Multiple-Choice Questions (MCQs) for automatic evaluation and Open-Ended Questions (OEQs) assessed by human experts. Our results show that GPT-4 leads with scores of 81.11 in MCQs and 85.69 in OEQs, closely followed by HyperCLOVA X. Additionally, while few-shot learning generally improves performance, Chain-of-Thought (CoT) prompting tends to encourage literal interpretations, which may limit effective pragmatic inference. Our findings highlight the need for LLMs to better understand and generate language that reflects human communicative norms.

en cs.CL
arXiv Open Access 2024
Part-of-Speech Tagger for Bodo Language using Deep Learning approach

Dhrubajyoti Pathak, Sanjib Narzary, Sukumar Nandi et al.

Language processing systems such as part-of-speech tagging, named entity recognition, machine translation, speech recognition, and language modeling (LM) are well studied in high-resource languages. Nevertheless, research on these systems for several low-resource languages, including Bodo, Mizo, Nagamese, and others, has either yet to commence or is in its nascent stages. Language models play a vital role in the downstream tasks of modern NLP. Extensive studies have been carried out on LMs for high-resource languages, yet languages such as Bodo, Rabha, and Mising continue to lack coverage. In this study, we first present BodoBERT, a language model for the Bodo language. To the best of our knowledge, this work is the first effort to develop a language model for Bodo. Secondly, we present an ensemble DL-based POS tagging model for Bodo. The POS tagging model is based on combinations of BiLSTM with CRF and stacked embeddings of BodoBERT with BytePairEmbeddings. We cover several language models in the experiment to see how well they work on POS tagging tasks. The best-performing model achieves an F1 score of 0.8041. A comparative experiment was also conducted on Assamese POS taggers, considering that the language is spoken in the same region as Bodo.

en cs.CL, cs.AI
arXiv Open Access 2024
MulCogBench: A Multi-modal Cognitive Benchmark Dataset for Evaluating Chinese and English Computational Language Models

Yunhao Zhang, Xiaohan Zhang, Chong Li et al.

Pre-trained computational language models have recently made remarkable progress in harnessing the language abilities which were considered unique to humans. Their success has raised interest in whether these models represent and process language like humans. To answer this question, this paper proposes MulCogBench, a multi-modal cognitive benchmark dataset collected from native Chinese and English participants. It encompasses a variety of cognitive data, including subjective semantic ratings, eye-tracking, functional magnetic resonance imaging (fMRI), and magnetoencephalography (MEG). To assess the relationship between language models and cognitive data, we conducted a similarity-encoding analysis which decodes cognitive data based on its pattern similarity with textual embeddings. Results show that language models share significant similarities with human cognitive data and the similarity patterns are modulated by the data modality and stimuli complexity. Specifically, context-aware models outperform context-independent models as language stimulus complexity increases. The shallow layers of context-aware models are better aligned with the high-temporal-resolution MEG signals whereas the deeper layers show more similarity with the high-spatial-resolution fMRI. These results indicate that language models have a delicate relationship with brain language representations. Moreover, the results between Chinese and English are highly consistent, suggesting the generalizability of these findings across languages.

en cs.CL
DOAJ Open Access 2023
A quantitative study on the effects of an interactive multimodal application to promote students' learning motivation and comprehension in studying Tang poetry

Chuang Chen, Nurullizam Jamiat

Studying China's Tang poetry is a crucial, integrated part of the language curriculum in primary schools because it is an important part of the country's cultural heritage and classical literature. However, because Tang poetry is written in classical Chinese, which is quite different from modern Mandarin, and because of the complex categories of this poetry style, learning Tang poetry can be a challenging experience for many students. To address this problem, this study developed an interactive multimodal application, based on the cognitive-affective theory of learning with media, for learning Tang poetry in an interactive way. To assess the effectiveness of this method, a pretest-posttest control group experiment was conducted. The experiment included eighty third-grade students from an elementary school in Xinzheng, Henan Province, randomly and equally divided into experimental and control groups, to test (1) whether the interactive multimodal application improves students' reading comprehension of Tang poetry and (2) whether the application enhances students' intrinsic and/or extrinsic motivation in learning Tang poetry. The experimental group used the multimodal interactive application to learn Tang poetry, while the control group used a traditional classroom method. The findings show that students' intrinsic motivation and comprehension of Tang poetry improved through use of the interactive multimodal application.

DOAJ Open Access 2023
Hepatitis B reactivation in cancer patients receiving immune checkpoint inhibitors: a systematic review and meta-analysis

Zhengzheng Xia, Jianyu Zhang, Wenjun Chen et al.

Background: Immunotherapy shows promise as a treatment option for various cancers. However, there is growing concern over potential complications from hepatitis B virus (HBV) reactivation after checkpoint blockade immunotherapy. Although most previous clinical trials on immune checkpoint inhibitors (ICIs) excluded patients with HBV, a few case reports and retrospective studies of HBV reactivation have been published. The aim of this study is to assess the risk of hepatitis B virus reactivation (HBVr) in patients receiving ICIs for advanced cancer. Methods: English- and Chinese-language literature published prior to April 30, 2023, was searched in PubMed, EMBASE, Web of Science, Cochrane, SinoMed, CNKI and Wanfang Data for studies reporting HBVr rates in cancer patients treated with ICIs. A pooled risk estimate was calculated for HBVr rates with 95% confidence intervals (CI). Results: Data from 34 studies including 7126 patients were retrieved and analyzed. The pooled HBVr rate in cancer patients treated with ICIs was 1.3% (I² = 90.44%, 95% CI: 0.2–2.9%, P < 0.001). Subgroup analysis revealed that patients diagnosed with hepatocellular carcinoma (HCC), HBV carriers, and patients from Asian regions or developing countries have a higher rate of HBVr. Conclusions: Our meta-analysis demonstrated a low risk of HBVr in patients treated with ICIs for advanced cancer. ICI treatment may be safely used in patients with existing HBV infection or chronic hepatitis B, accompanied by regular monitoring and appropriate antiviral prophylaxis if necessary.

Infectious and parasitic diseases, Public aspects of medicine
arXiv Open Access 2023
CValues: Measuring the Values of Chinese Large Language Models from Safety to Responsibility

Guohai Xu, Jiayi Liu, Ming Yan et al.

With the rapid evolution of large language models (LLMs), there is growing concern that they may pose risks or have negative social impacts. Therefore, evaluation of human values alignment is becoming increasingly important. Previous work mainly focuses on assessing the performance of LLMs on certain knowledge and reasoning abilities, while neglecting alignment to human values, especially in a Chinese context. In this paper, we present CValues, the first Chinese human values evaluation benchmark to measure the alignment ability of LLMs in terms of both safety and responsibility criteria. To this end, we manually collected adversarial safety prompts across 10 scenarios and had professional experts induce responsibility prompts from 8 domains. To provide a comprehensive values evaluation of Chinese LLMs, we not only conduct human evaluation for reliable comparison, but also construct multiple-choice prompts for automatic evaluation. Our findings suggest that while most Chinese LLMs perform well in terms of safety, there is considerable room for improvement in terms of responsibility. Moreover, both automatic and human evaluation are important for assessing human values alignment in different aspects. The benchmark and code are available on ModelScope and GitHub.

en cs.CL
arXiv Open Access 2023
Efficiently Adapting Pretrained Language Models To New Languages

Zoltan Csaki, Pian Pawakapan, Urmish Thakker et al.

Recent large language models (LLMs) exhibit sub-optimal performance on low-resource languages, as the training data of these models is usually dominated by English and other high-resource languages. Furthermore, it is challenging to train models for low-resource languages, especially from scratch, due to a lack of high-quality training data. Adapting pretrained LLMs reduces the need for data in the new language while also providing cross-lingual transfer capabilities. However, naively adapting to new languages leads to catastrophic forgetting and poor tokenizer efficiency. In this work, we study how to efficiently adapt any existing pretrained LLM to a new language without running into these issues. In particular, we improve the encoding efficiency of the tokenizer by adding new tokens from the target language and study the data-mixing recipe to mitigate forgetting. Our experiments on adapting an English LLM to Hungarian and Thai show that our recipe can reach better performance than open-source models on the target language, with minimal regressions on English.

en cs.CL, cs.AI
arXiv Open Access 2023
MedBench: A Large-Scale Chinese Benchmark for Evaluating Medical Large Language Models

Yan Cai, Linlin Wang, Ye Wang et al.

The emergence of various medical large language models (LLMs) in the medical domain has highlighted the need for unified evaluation standards, as manual evaluation of LLMs proves to be time-consuming and labor-intensive. To address this issue, we introduce MedBench, a comprehensive benchmark for the Chinese medical domain, comprising 40,041 questions sourced from authentic examination exercises and medical reports of diverse branches of medicine. In particular, this benchmark is composed of four key components: the Chinese Medical Licensing Examination, the Resident Standardization Training Examination, the Doctor In-Charge Qualification Examination, and real-world clinic cases encompassing examinations, diagnoses, and treatments. MedBench replicates the educational progression and clinical practice experiences of doctors in Mainland China, thereby establishing itself as a credible benchmark for assessing the mastery of knowledge and reasoning abilities in medical language learning models. We perform extensive experiments and conduct an in-depth analysis from diverse perspectives, which culminate in the following findings: (1) Chinese medical LLMs underperform on this benchmark, highlighting the need for significant advances in clinical knowledge and diagnostic precision. (2) Several general-domain LLMs surprisingly possess considerable medical knowledge. These findings elucidate both the capabilities and limitations of LLMs within the context of MedBench, with the ultimate goal of aiding the medical research community.

en cs.CL, cs.AI
arXiv Open Access 2023
Deep Learning Based Code Generation Methods: Literature Review

Zezhou Yang, Sirong Chen, Cuiyun Gao et al.

This paper focuses on the code generation task, which aims to generate relevant code fragments from given natural language descriptions. In the process of software development, developers often encounter two scenarios. In one, they are asked to write a large amount of repetitive, low-technical code to implement common functionalities. In the other, they write code that depends on specific task requirements, which may necessitate the use of external resources such as documentation or other tools. Code generation has therefore received a great deal of attention in academia and industry for assisting developers in coding. Indeed, making machines understand users' requirements and write programs on their own has long been one of the key concerns in the field of software engineering. The recent development of deep learning techniques, especially pre-trained models, has brought promising performance to the code generation task. In this paper, we systematically review the current work on deep learning-based code generation and classify existing methods into three categories: methods based on code features, methods incorporating retrieval, and methods incorporating post-processing. The first category refers to methods that use deep learning algorithms for code generation based on code features, while the second and third categories improve on the performance of the first. For each category, the existing research results are systematically reviewed, summarized, and commented on. In addition, the paper summarizes and analyzes the corpora and the popular evaluation metrics used in existing code generation work. Finally, the paper concludes the overall literature review and offers prospects for future research directions worthy of attention.

Page 11 of 182,174