Results for "English language"

arXiv Open Access 2026

Learning from Child-Directed Speech in Two-Language Scenarios: A French-English Case Study

Liel Binyamin, Elior Sulem

Research on developmentally plausible language models has largely focused on English, leaving open questions about multilingual settings. We present a systematic study of compact language models by extending BabyBERTa to English-French scenarios under strictly size-matched data conditions, covering monolingual, bilingual, and cross-lingual settings. Our design contrasts two types of training corpora: (i) child-directed speech (about 2.5M tokens), following BabyBERTa and related work, and (ii) multi-domain corpora (about 10M tokens), extending the BabyLM framework to French. To enable fair evaluation, we also introduce new resources, including French versions of QAMR and QASRL, as well as English and French multi-domain corpora. We evaluate the models on both syntactic and semantic tasks and compare them with models trained on Wikipedia-only data. The results reveal context-dependent effects: training on Wikipedia consistently benefits semantic tasks, whereas child-directed speech improves grammatical judgments in monolingual settings. Bilingual pretraining yields notable gains for textual entailment, with particularly strong improvements for French. Importantly, similar patterns emerge across BabyBERTa, RoBERTa, and LTG-BERT, suggesting consistent trends across architectures.

en cs.CL, cs.AI


CrossRef Open Access 2025

English as a Lingua Franca and World Englishes in English Language Teaching

Anna Mendoza, Pramod K. Sah, Shakina Rajendram

en


arXiv Open Access 2025

Exploring the Structure of AI-Induced Language Change in Scientific English

Riley Galpin, Bryce Anderson, Tom S. Juzek

Scientific English has undergone rapid and unprecedented changes in recent years, with words such as "delve," "intricate," and "crucial" showing significant spikes in frequency since around 2022. These changes are widely attributed to the growing influence of Large Language Models like ChatGPT in the discourse surrounding bias and misalignment. However, apart from changes in frequency, the exact structure of these linguistic shifts has remained unclear. The present study addresses this and investigates whether these changes involve the replacement of synonyms by suddenly 'spiking words,' for example, "crucial" replacing "essential" and "key," or whether they reflect broader semantic and pragmatic qualifications. To further investigate structural changes, we include part of speech tagging in our analysis to quantify linguistic shifts over grammatical categories and differentiate between word forms, like "potential" as a noun vs. as an adjective. We systematically analyze synonym groups for widely discussed 'spiking words' based on frequency trends in scientific abstracts from PubMed. We find that entire semantic clusters often shift together, with most or all words in a group increasing in usage. This pattern suggests that changes induced by Large Language Models are primarily semantic and pragmatic rather than purely lexical. Notably, the adjective "important" shows a significant decline, which prompted us to systematically analyze decreasing lexical items. Our analysis of "collapsing" words reveals a more complex picture, which is consistent with organic language change and contrasts with the patterns of the abrupt spikes. These insights into the structure of language change contribute to our understanding of how language technology continues to shape human language.

en cs.CL, cs.AI


arXiv Open Access 2025

PersianMedQA: Evaluating Large Language Models on a Persian-English Bilingual Medical Question Answering Benchmark

Mohammad Javad Ranjbar Kalahroodi, Amirhossein Sheikholselami, Sepehr Karimi et al.

Large Language Models (LLMs) have achieved remarkable performance on a wide range of Natural Language Processing (NLP) benchmarks, often surpassing human-level accuracy. However, their reliability in high-stakes domains such as medicine, particularly in low-resource languages, remains underexplored. In this work, we introduce PersianMedQA, a large-scale dataset of 20,785 expert-validated multiple-choice Persian medical questions from 14 years of Iranian national medical exams, spanning 23 medical specialties and designed to evaluate LLMs in both Persian and English. We benchmark 40 state-of-the-art models, including general-purpose, Persian fine-tuned, and medical LLMs, in zero-shot and chain-of-thought (CoT) settings. Our results show that closed-source general models (e.g., GPT-4.1) consistently outperform all other categories, achieving 83.09% accuracy in Persian and 80.7% in English, while Persian fine-tuned models such as Dorna underperform significantly (e.g., 34.9% in Persian), often struggling with both instruction-following and domain reasoning. We also analyze the impact of translation, showing that while English performance is generally higher, 3-10% of questions can only be answered correctly in Persian due to cultural and clinical contextual cues that are lost in translation. Finally, we demonstrate that model size alone is insufficient for robust performance without strong domain or language adaptation. PersianMedQA provides a foundation for evaluating bilingual and culturally grounded medical reasoning in LLMs. The PersianMedQA dataset is available: https://huggingface.co/datasets/MohammadJRanjbar/PersianMedQA .

en cs.CL, cs.IT


arXiv Open Access 2025

Whisper-LM: Improving ASR Models with Language Models for Low-Resource Languages

Xabier de Zuazo, Eva Navas, Ibon Saratxaga et al.

Automatic speech recognition systems have undoubtedly advanced with the integration of multilingual and multitask models such as Whisper, which have shown a promising ability to understand and process speech across a wide range of languages. Despite their robustness, these models often fall short in handling the linguistic distinctions of minority languages. This study addresses this gap by integrating traditional and novel language models with fine-tuned Whisper models to raise their performance in less commonly studied languages. Through rigorous fine-tuning and evaluation across multiple datasets, we demonstrate substantial improvements in word error rate, particularly in low-resource scenarios. Our approach not only takes advantage of the extensive data Whisper was pre-trained on, but also complements its linguistic adaptability by incorporating language models. We obtained improvements of up to 51% for in-distribution datasets and up to 34% for out-of-distribution sentences using statistical language models, while large language models provided moderate but consistently robust improvement across diverse linguistic contexts. The findings reveal that, while the integration reliably benefits all model sizes, the extent of improvement varies, highlighting the importance of optimized language model parameters. Finally, we emphasize the importance of selecting appropriate evaluation parameters when reporting results with transformer-based ASR models. In summary, this research clears the way for more inclusive ASR technologies that perform better across languages by enriching their linguistic knowledge. For further implementation details of this study, the technical documentation and source code are available at http://www.github.com/hitz-zentroa/whisper-lm.

en cs.CL


DOAJ Open Access 2025

Factors Influencing Lecturers' Perceptions of the Use of Role Play Method in Postpartum Midwifery Teaching Using English

Kholifatul Ummah, Bilqis Mustofa, Nurul Fathiyyah et al.

English is the primary language used in global communication. In the context of midwifery services, the application of English aims to equip students with communication skills that support international clinical practice. One instructional method considered effective in achieving this goal is role play, which enhances students’ clinical communication skills and critical thinking abilities. However, the success of its implementation largely depends on lecturers’ perceptions and readiness. This study aims to explore the factors influencing lecturers’ perceptions of using role play in postpartum midwifery care instruction delivered in English. Grounded in constructivist theory and the communicative language teaching approach, this research employs a qualitative design through in-depth interviews with midwifery lecturers at Dr. Soetomo University, Surabaya. Additional data were collected via questionnaires covering demographic backgrounds, teaching experience, English proficiency, and attitudes toward instructional innovation. The findings reveal that lecturers’ perceptions are influenced by English language proficiency, previous experience using role play, positive attitudes toward active learning, and institutional support. This study contributes to a deeper understanding of the dynamics between English language integration and active learning methods in midwifery education. The novelty of this study lies in its qualitative approach, which uncovers pedagogical adaptation processes in bilingual instruction contexts and highlights implications for professional development and the design of supportive learning environments.

English language, English literature


DOAJ Open Access 2025

Antibiotic awareness: exploring knowledge among culturally and linguistically diverse patients

Kylie Tran, Vinushan Kuganathan, Jessica Lam et al.

Background: Effective antimicrobial stewardship (AMS) programs must address the needs of culturally and linguistically diverse (CALD) patients who often experience language barriers and varying cultural beliefs regarding antibiotics. They are at greater risk of receiving suboptimal or inappropriate care, yet guidance to support AMS practices for this population remains limited. Aim: To investigate antibiotic knowledge, perspectives, and experiences of CALD patients. Methods: A cross-sectional survey was conducted between May and November 2023 at a Western Sydney tertiary hospital. Adult patients of CALD background on systemic antibiotics for more than 72 hours under surgical, respiratory, and geriatric specialties were surveyed on their understanding of their antibiotic treatment. Results: Of the 177 patients, the median age was 70 years (21–99 years), and 95/177 (53.7%) were male. Of the 177 patients, 171/177 (96.6%) reported speaking a language other than English at home. While 160/177 (90.4%) patients were told that they were treated with antibiotics, only 67/177 (37.9%) were told about duration, 35/177 (19.8%) were told about the side effects, and 27/177 (15.3%) were given written information. Information was provided by doctors to 125/177 (70.6%) patients, by nurses to 72/177 (40.7%), and by pharmacists to 3/177 (1.7%). Patients preferred to receive information from their doctor (79/177, 44.6%) or any healthcare professional (91/177, 51.4%). Conclusion: Improving antibiotic education for CALD patients is essential to address communication gaps. Enhancing knowledge will support appropriate use, improve adherence and outcomes, and promote shared decision-making. Strengthening health literacy in CALD populations should be a priority for AMS programs.

Infectious and parasitic diseases, Public aspects of medicine


DOAJ Open Access 2025

A mixed-methods approach to the psychological predictors of boredom in second language learning: mindfulness, grit, and self-regulation

Jingjing Lyu

This mixed methods study explores the relationships among mindfulness, grit, self-regulation, and L2 boredom in Chinese undergraduate English majors. Using structural equation modeling (SEM) with a sample of 516 students from various universities, the quantitative phase found that mindfulness and grit were negatively related to L2 boredom, with self-regulation partially mediating these relationships. Mindfulness and self-regulation were the strongest predictors of reduced boredom, while grit had a smaller yet significant impact. Multi-group analysis showed these relationships were consistent across gender and years of English learning experience. The qualitative phase, involving focus group discussions with 40 participants, offered insights into students’ experiences. Boredom was described as a complex emotional, cognitive, and behavioral state leading to disengagement. Mindfulness helped students maintain focus, grit provided perseverance, and self-regulation offered strategies to manage boredom. Additionally, dynamic teaching methods and supportive environments were identified as crucial for reducing boredom and enhancing engagement. These findings enhance the understanding of psychological factors influencing L2 boredom and suggest practical strategies for educators to foster mindfulness, grit, and self-regulation, thereby improving student engagement and learning outcomes.

Psychology


DOAJ Open Access 2025

Exploring Flipped Classroom Integration with Gamified Applications for Junior High School Students

Tommy Hastomo, Yazid Basthomi, Utami Widiati et al.

Flipped classroom models have gained attention for promoting learner-centred instruction, mainly when supported by digital technology. In English as a Foreign Language (EFL) contexts, gamified applications such as Edpuzzle and Kahoot! have improved student engagement and motivation. However, limited research has examined how integrating gamified applications within flipped classrooms affects junior high school students’ learner autonomy, particularly in vocabulary learning. This study aimed to investigate the level of learner autonomy among junior high school students after experiencing flipped instruction with gamified applications and to explore how this instructional model supports the development of autonomy. This study employed a qualitative multi-site case study design, supported by descriptive quantitative data. The participants comprised five English teachers and 50 eighth-grade students from five junior high schools in Bandar Lampung, Indonesia. The researchers used two main instruments: a learner autonomy questionnaire and semi-structured group interviews. Data were collected after the implementation of flipped vocabulary instruction over 20 weeks, during which students engaged with Edpuzzle and Kahoot!. The questionnaire results were analysed using descriptive statistics with SPSS version 26, while interview data were analysed thematically. The findings revealed a consistently high level of learner autonomy across five domains. Thematic analysis identified three key themes supporting this development: self-paced learning, progress tracking, and continuous feedback. These results suggest that integrating gamified applications into flipped classrooms can promote student autonomy in vocabulary learning. The study recommends the broader adoption of the flipped teaching approach with gamified applications to foster independent learning among EFL students.

Philology. Linguistics


CrossRef Open Access 2024

English Teacher Style in English Language Teaching

Grace Natalia Roma Uli Simanjuntak, Siti Aisyah

This study aims to analyze the interactions between teachers and students in the twelfth grade at Senior High School 8 Jambi. The study uses a descriptive qualitative research design. The research instruments were observation and a questionnaire: the data were acquired by giving a questionnaire to the students and observing the interaction between the teacher and the students. The data were analyzed through data reduction, data display, and conclusion drawing. According to the statistics, 63.6% of students frequently paid attention to the teacher's explanation. Furthermore, 33.3% of students stated that they sometimes feel comfortable studying in class. Finally, 27.3% of students reported feeling bored with the learning methods.

en


arXiv Open Access 2024

From English-Centric to Effective Bilingual: LLMs with Custom Tokenizers for Underrepresented Languages

Artur Kiulian, Anton Polishko, Mykola Khandoga et al.

In this paper, we propose a model-agnostic cost-effective approach to developing bilingual base large language models (LLMs) to support English and any target language. The method includes vocabulary expansion, initialization of new embeddings, model training and evaluation. We performed our experiments with three languages, each using a non-Latin script - Ukrainian, Arabic, and Georgian. Our approach demonstrates improved language performance while reducing computational costs. It mitigates the disproportionate penalization of underrepresented languages, promoting fairness and minimizing adverse phenomena such as code-switching and broken grammar. Additionally, we introduce new metrics to evaluate language quality, revealing that vocabulary size significantly impacts the quality of generated text.

en cs.CL, cs.AI


DOAJ Open Access 2024

Defining Human Rights in Times of Covid: Human Rights Discourse in the UK and Devolved Legislatures

Anne Cousson

The British government’s reaction to the Covid-19 pandemic has meant wide-ranging restrictions imposed on people living in the UK with minimal parliamentary oversight. Thus, human rights and civil liberties were affected, as far as both individual freedoms and constitutional guarantees are concerned. However, given the urgency created by the health crisis and the controversial nature of human rights speech in the UK, using it to criticize the government’s measures was bound to be a politically charged choice. Through an analysis of parliamentary discourse in the main Covid-related debates both in the British Parliament and in the devolved legislatures, this article argues that human rights were not used as an expression of common values in a time of national crisis, but as a divisive rhetorical tool. Focusing thus on political discourse rather than on the actual effects of Covid restrictions on human rights allows us to identify ideological fault lines. Indeed, the analysis shows a highly differentiated definition of human rights between political parties on the one hand and between the different nations on the other.

History of Great Britain, English literature


CrossRef Open Access 2023

Culture in English Language Teaching: Let the Language Do the Talking

Kieran Harrington

en


arXiv Open Access 2023

Learning to Plan with Natural Language

Yiduo Guo, Yaobo Liang, Chenfei Wu et al.

Large Language Models (LLMs) have shown remarkable performance in various basic natural language tasks. To complete a complex task, we still need a task plan that guides LLMs to generate specific solutions step by step. LLMs can directly generate task plans, but these plans may still contain factual errors or be incomplete. A high-quality task plan contains correct step-by-step solutions for solving all situations and behavioral instructions for avoiding mistakes. To obtain it, we propose the Learning to Plan method, which involves two phases: (1) In the first phase, learning the task plan, it iteratively updates the task plan with new step-by-step solutions and behavioral instructions, which are obtained by prompting LLMs to derive them from training error feedback. (2) In the subsequent test phase, the LLM uses the learned task plan to guide its inference on the test set. We demonstrate the effectiveness of our method on five different types of reasoning tasks (8 datasets). Further, our analysis experiment shows that the task plan learned by one LLM can directly guide another LLM to improve its performance, which reveals a new transfer learning paradigm. We release the code at https://github.com/Eureka6174/LearnNLPlan

en cs.CL


arXiv Open Access 2023

PyThaiNLP: Thai Natural Language Processing in Python

Wannaphong Phatthiyaphaibun, Korakot Chaovavanich, Charin Polpanumas et al.

We present PyThaiNLP, a free and open-source natural language processing (NLP) library for the Thai language implemented in Python. It provides a wide range of software, models, and datasets for the Thai language. We first provide a brief historical context of tools for the Thai language prior to the development of PyThaiNLP. We then outline the functionalities it provides as well as its datasets and pre-trained language models. We later summarize its development milestones and discuss our experience during its development. We conclude by demonstrating how industrial and research communities utilize PyThaiNLP in their work. The library is freely available at https://github.com/pythainlp/pythainlp.

en cs.CL


CrossRef Open Access 2022

Developing Awareness of ELF in English Language Education

Mona E. Flognfeldt

Language educators in today’s classrooms face the complex responsibility of teaching English to prepare students for a variety of requirements in the field of education and work. At the same time, they need to empower students to make use of their English resources to communicate as effectively as possible with speakers in local and global contexts where English is used as a contact language, i.e., English as a lingua franca (ELF), by people who do not speak and understand each other’s primary languages. The concept of ELF is regarded in diverse ways in various educational settings, and often it is described negatively in comparison with the norms of native-speaker English. However, this deficiency orientation is not conducive to the development of confident language users, which is an aim clearly outlined in the revised national English subject curriculum in Norway. This chapter proposes a post-deficiency approach to the teaching and learning of English, calling for a change of attitude and arguing for the inclusion of ELF discourse in learning resources, heightened genre awareness, and the development of contextually appropriate pragmatic strategies.

en


arXiv Open Access 2022

AfroLM: A Self-Active Learning-based Multilingual Pretrained Language Model for 23 African Languages

Bonaventure F. P. Dossou, Atnafu Lambebo Tonja, Oreen Yousuf et al.

In recent years, multilingual pre-trained language models have gained prominence due to their remarkable performance on numerous downstream Natural Language Processing (NLP) tasks. However, pre-training these large multilingual language models requires a lot of training data, which is not available for African languages. Active learning is a semi-supervised learning algorithm, in which a model consistently and dynamically learns to identify the most beneficial samples to train itself on, in order to achieve better optimization and performance on downstream tasks. Furthermore, active learning effectively and practically addresses real-world data scarcity. Despite all its benefits, active learning, in the context of NLP and especially multilingual language model pretraining, has received little consideration. In this paper, we present AfroLM, a multilingual language model pretrained from scratch on 23 African languages (the largest effort to date) using our novel self-active learning framework. Pretrained on a dataset significantly (14x) smaller than existing baselines, AfroLM outperforms many multilingual pretrained language models (AfriBERTa, XLMR-base, mBERT) on various NLP downstream tasks (NER, text classification, and sentiment analysis). Additional out-of-domain sentiment analysis experiments show that AfroLM is able to generalize well across various domains. We release the source code and the datasets used in our framework at https://github.com/bonaventuredossou/MLM_AL.

en cs.CL, cs.AI


arXiv Open Access 2022

Just-DREAM-about-it: Figurative Language Understanding with DREAM-FLUTE

Yuling Gu, Yao Fu, Valentina Pyatkin et al.

Figurative language (e.g., "he flew like the wind") is challenging to understand, as it is hard to tell what implicit information is being conveyed from the surface form alone. We hypothesize that to perform this task well, the reader needs to mentally elaborate the scene being described to identify a sensible meaning of the language. We present DREAM-FLUTE, a figurative language understanding system that does this, first forming a "mental model" of situations described in a premise and hypothesis before making an entailment/contradiction decision and generating an explanation. DREAM-FLUTE uses an existing scene elaboration model, DREAM, for constructing its "mental model." In the FigLang2022 Shared Task evaluation, DREAM-FLUTE achieved (joint) first place (Acc@60=63.3%), and can perform even better with ensemble techniques, demonstrating the effectiveness of this approach. More generally, this work suggests that adding a reflective component to pretrained language models can improve their performance beyond standard fine-tuning (3.3% improvement in Acc@60).

en cs.CL


arXiv Open Access 2022

Resources for Turkish Natural Language Processing: A critical survey

Çağrı Çöltekin, A. Seza Doğruöz, Özlem Çetinoğlu

This paper presents a comprehensive survey of corpora and lexical resources available for Turkish. We review a broad range of resources, focusing on the ones that are publicly available. In addition to providing information about the available linguistic resources, we present a set of recommendations, and identify gaps in the data available for conducting research and building applications in Turkish Linguistics and Natural Language Processing.

en cs.CL


DOAJ Open Access 2022

Peran Perawat pada Pasien Sepsis di Unit Critical Care: A Narrative Review

Regina Saragih Turnip, Nur Fadilla Bahri, Irmawati Irmawati et al.

Sepsis is a medical emergency that describes the body's systemic immunological response to infectious processes that can lead to end-stage organ dysfunction and death. The role of nurses in managing sepsis patients is critical to identifying the initial condition of sepsis and preventing its progression to more severe sepsis. This narrative review aims to identify the role of nurses in treating sepsis patients in critical care units. The method was a narrative review that analyzed the online search databases PubMed, Google Scholar, and Science Direct. The inclusion criteria were English-language, full-text articles published in 2008-2021, and the search keywords were "sepsis", "the role of nurses", and "critical care". Our search found 11 articles, which showed that the role of nurses in treating sepsis patients comprises early recognition of sepsis to start treatment, team training to ensure a safe and effective approach, and the implementation of infection prevention and control measures to prevent sepsis. The role of nurses in managing sepsis patients is therefore critical to identifying the early conditions of sepsis, controlling and preventing sepsis, preventing the progression of the disease, and contributing to decreased morbidity and mortality.
Keywords: sepsis; the role of nurses; critical care

Medicine (General)

