A. Fernald, V. Marchman, A. Weisleder
Results for "English language"
Showing 20 of ~6,565,864 results · from CrossRef, DOAJ, Semantic Scholar, arXiv
K. Lemhöfer, M. Broersma
The increasing number of experimental studies on second language (L2) processing, frequently with English as the L2, calls for a practical and valid measure of English vocabulary knowledge and proficiency. In a large-scale study with Dutch and Korean speakers of L2 English, we tested whether LexTALE, a 5-min vocabulary test, is a valid predictor of English vocabulary knowledge and, possibly, even of general English proficiency. Furthermore, the validity of LexTALE was compared with that of self-ratings of proficiency, a measure frequently used by L2 researchers. The results showed the following in both speaker groups: (1) LexTALE was a good predictor of English vocabulary knowledge; (2) it also correlated substantially with a measure of general English proficiency; and (3) LexTALE was generally superior to self-ratings in its predictions. LexTALE, but not self-ratings, also correlated highly with previous experimental data on two word recognition paradigms. The test can be carried out on or downloaded from www.lextale.com.
Mofareh Alqahtani
Vocabulary learning is an essential part of foreign language learning, as the meanings of new words are very often emphasized, whether in books or in classrooms. It is also central to language teaching and is of paramount importance to a language learner. Recent research indicates that teaching vocabulary may be problematic because many teachers are not confident about best practice in vocabulary teaching and at times don't know where to begin to form an instructional emphasis on word learning (Berne & Blachowicz, 2008). In this article, I summarize important research on the importance of vocabulary and explain many techniques used by English teachers when teaching English, as well as my own personal views on these issues.
Viorica Marian, Henrike K. Blumenfeld, Margarita Kaushanskaya
L. Vygotsky
J. Werker, R. C. Tees
M. Egger, Tanja Zellweger-Zähner, Martina Schneider et al.
N. Sloane, A. Wyner
M. Rice, K. Wexler
Lindy Woodrow
J. Paradis
Masudul Hasan Masud Bhuiyan, Manish Kumar Bala Kumar, Cristian-Alexandru Staicu
The open-source software (OSS) community has historically been dominated by English as the primary language for code, documentation, and developer interactions. However, with growing global participation and better support for non-Latin scripts through standards like Unicode, OSS is gradually becoming more multilingual. This study investigates the extent to which OSS is becoming more multilingual, analyzing 9.14 billion GitHub issues, pull requests, and discussions, and 62,500 repositories across five programming languages and 30 natural languages, covering the period from 2015 to 2025. We examine six research questions to track changes in language use across communication, code, and documentation. We find that multilingual participation has steadily increased, especially in Korean, Chinese, and Russian. This growth appears not only in issues and discussions but also in code comments, string literals, and documentation files. While this shift reflects greater inclusivity and language diversity in OSS, it also creates language tension. The ability to express oneself in a native language can clash with shared norms around English use, especially in collaborative settings. Non-English or multilingual projects tend to receive less visibility and participation, suggesting that language remains both a resource and a barrier, shaping who gets heard, who contributes, and how open collaboration unfolds.
Tharindu Ranasinghe, Marcos Zampieri
Offensive content is pervasive in social media and a reason for concern to companies and government organizations. Several studies have been recently published investigating methods to detect the various forms of such content (e.g. hate speech, cyberbullying, and cyberaggression). The clear majority of these studies deal with English, partly because most available annotated datasets contain English data. In this paper, we take advantage of the available English data by applying cross-lingual contextual word embeddings and transfer learning to make predictions in languages with fewer resources. We project predictions on comparable data in Bengali, Hindi, and Spanish and we report results of 0.8415 F1 macro for Bengali, 0.8568 F1 macro for Hindi, and 0.7513 F1 macro for Spanish. Finally, we show that our approach compares favorably to the best systems submitted to recent shared tasks on these three languages, confirming the robustness of cross-lingual contextual embeddings and transfer learning for this task.
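The scores reported above are macro-averaged F1. As a reminder of what that metric computes, here is a minimal pure-Python sketch with hypothetical offensive/not-offensive labels (not the authors' evaluation code):

```python
def macro_f1(gold, pred):
    """Macro F1: the unweighted mean of per-class F1 scores."""
    classes = set(gold) | set(pred)
    f1s = []
    for c in classes:
        tp = sum(1 for g, p in zip(gold, pred) if g == c and p == c)
        fp = sum(1 for g, p in zip(gold, pred) if g != c and p == c)
        fn = sum(1 for g, p in zip(gold, pred) if g == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
    return sum(f1s) / len(f1s)

# Hypothetical binary offensive/not-offensive labels
gold = ["OFF", "NOT", "OFF", "NOT", "NOT", "OFF"]
pred = ["OFF", "NOT", "NOT", "NOT", "OFF", "OFF"]
print(round(macro_f1(gold, pred), 4))  # → 0.6667
```

Because macro averaging weights each class equally regardless of frequency, it is the standard choice for imbalanced tasks such as offensive-language detection.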
Anat Prior, Tamar H. Gollan
Monika Zabrocka
This article explores the dynamic evolution of audiovisual translation (AVT) and media accessibility amid technological advances and shifting societal expectations. Once viewed as simple linguistic aids, AVT and media accessibility now play a vital role in inclusive cultural participation and global communication. The text highlights diverse practices – from subtitling and dubbing to game localisation and services such as audio description and closed captions – showcasing their creative, user-focused transformations. It also serves as an introduction, previewing the contributions in this issue that collectively reimagine AVT and accessibility as interconnected, evolving fields essential for fostering equitable, immersive, and culturally sensitive media experiences worldwide.
Nerea Gutiérrez-Fernández, Lourdes Villardón-Gallego, Lirio Flores-Moncada
Perceived competence is considered an essential predictor of learners' performance in language learning. It is therefore important to identify strategies that favor its development. This study aims to analyze whether the perceived linguistic competence in English as an L2 of pre-service teachers improves after implementing Dialogic Pedagogical Gatherings (DPGs), and if so, in which skills improvements have occurred. The study also aims to identify which characteristics of the DPGs can favor this evolution. A DPG is an educational strategy based on egalitarian dialogue among participants. The research involved 26 university students who participated in 8 DPGs over a whole academic year. Data were gathered qualitatively through an open-ended questionnaire and a focus group. The results show that the participants consider that they have improved their level of English after participating in the DPGs, especially in speaking and reading skills, as well as in pronunciation, listening comprehension, and confidence in using the language. They also identify some characteristics of the intervention as key to fostering this improvement: collaboration among peers, solidarity, small groups, and classroom climate. The results regarding this teaching strategy have implications for second language learning.
Rena Gao, Ming-Bin Chen, Lea Frermann et al.
English as a Second Language (ESL) speakers often struggle to engage in group discussions due to language barriers. While moderators can facilitate participation, few studies assess conversational engagement and evaluate moderation effectiveness. To address this gap, we develop a dataset comprising 17 sessions from an online ESL conversation club, which includes both moderated and non-moderated discussions. We then introduce an approach that integrates automatic ESL dialogue assessment and a framework that categorizes moderation strategies. Our findings indicate that moderators help improve the flow of topics and start/end a conversation. Interestingly, we find active acknowledgement and encouragement to be the most effective moderation strategy, while excessive information and opinion sharing by moderators has a negative impact. Ultimately, our study paves the way for analyzing ESL group discussions and the role of moderators in non-native conversation settings.
Avinash Patil, Siru Tao, Aryan Jadon
Accurate translation of bug reports is critical for efficient collaboration in global software development. In this study, we conduct the first comprehensive evaluation of machine translation (MT) performance on bug reports, analyzing the capabilities of DeepL, AWS Translate, and large language models such as ChatGPT, Claude, Gemini, LLaMA, and Mistral using data from the Visual Studio Code GitHub repository, specifically focusing on reports labeled with the english-please tag. To assess both translation quality and source language identification accuracy, we employ a range of MT evaluation metrics, including BLEU, BERTScore, COMET, METEOR, and ROUGE, alongside classification metrics such as accuracy, precision, recall, and F1-score. Our findings reveal that while ChatGPT (gpt-4o) excels in semantic and lexical translation quality, it does not lead in source language identification. Claude and Mistral achieve the highest F1-scores (0.7182 and 0.7142, respectively), and Gemini records the best precision (0.7414). AWS Translate shows the highest accuracy (0.4717) in identifying source languages. These results highlight that no single system dominates across all tasks, reinforcing the importance of task-specific evaluations. This study underscores the need for domain adaptation when translating technical content and provides actionable insights for integrating MT into bug-triaging workflows. The code and dataset for this paper are available at GitHub: https://github.com/av9ash/English-Please
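Several of the metrics named above, notably BLEU, are built on clipped (modified) n-gram precision. A minimal illustrative sketch of that core ingredient, using hypothetical tokenized sentences rather than the paper's evaluation pipeline:

```python
from collections import Counter

def modified_ngram_precision(candidate, reference, n=1):
    """Clipped n-gram precision: each candidate n-gram counts at most
    as many times as it appears in the reference."""
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n])
                       for i in range(len(tokens) - n + 1))
    cand, ref = ngrams(candidate), ngrams(reference)
    clipped = sum(min(count, ref[g]) for g, count in cand.items())
    total = sum(cand.values())
    return clipped / total if total else 0.0

cand = "the the the cat".split()
ref = "the cat sat on the mat".split()
print(modified_ngram_precision(cand, ref))  # → 0.75 (clipping caps "the" at 2)
```

Full BLEU combines these precisions for n = 1..4 with a brevity penalty; clipping is what prevents a candidate from gaming the score by repeating a common word.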
Cong-Thanh Do, Rama Doddipatla, Kate Knill
Chain-of-Thought (CoT) prompting is a widely used method to improve the reasoning capability of Large Language Models (LLMs). More recently, CoT has been leveraged in Knowledge Distillation (KD) to transfer reasoning capability from a larger LLM to a smaller one. This paper examines the role of CoT in distilling the reasoning capability from larger LLMs to smaller LLMs using white-box KD, analysing its effectiveness in improving the performance of the distilled models for various natural language reasoning and understanding tasks. We conduct white-box KD experiments using LLMs from the Qwen and Llama2 families, employing CoT data from the CoT-Collection dataset. The distilled models are then evaluated on natural language reasoning and understanding tasks from the BIG-Bench-Hard (BBH) benchmark, which presents complex challenges for smaller LLMs. Experimental results demonstrate the role of CoT in improving white-box KD effectiveness, enabling the distilled models to achieve better average performance in natural language reasoning and understanding tasks from BBH.
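For context on the white-box distillation setup described above: white-box KD commonly minimizes a KL divergence between the teacher's and student's temperature-softened output distributions over the vocabulary. A minimal pure-Python sketch of that loss term (an illustrative, commonly assumed form; the paper's exact objective may differ):

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_kl_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions: zero when the
    student matches the teacher exactly, positive otherwise."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Toy logits over a 3-token vocabulary
print(kd_kl_loss([2.0, 0.5, -1.0], [1.0, 1.0, 0.0]))
```

With CoT-augmented distillation data, the same objective is applied to sequences that include the teacher's reasoning steps, so the student is trained to reproduce the rationale as well as the answer.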
Ece Takmaz, Lisa Bylinina, Jakub Dotlacil
State-of-the-art vision-and-language models consist of many parameters and learn from enormous datasets, surpassing the amounts of linguistic data that children are exposed to as they acquire a language. This paper presents our approach to the multimodal track of the BabyLM challenge addressing this discrepancy. We develop language-only and multimodal models in low-resource settings using developmentally plausible datasets, with our multimodal models outperforming previous BabyLM baselines. One finding in the multimodal language model literature is that these models tend to underperform in language-only tasks. Therefore, we focus on maintaining language-only abilities in multimodal models. To this end, we experiment with model merging, where we fuse the parameters of multimodal models with those of language-only models using weighted linear interpolation. Our results corroborate the findings that multimodal models underperform in language-only benchmarks that focus on grammar, and model merging with text-only models can help alleviate this problem to some extent, while maintaining multimodal performance.
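The model-merging step described above, weighted linear interpolation of parameters, can be sketched as follows. This is a toy illustration with scalar stand-ins for weight tensors; the function and parameter names are hypothetical, not taken from the paper:

```python
def merge_state_dicts(multimodal, text_only, alpha=0.5):
    """Interpolate two same-architecture checkpoints parameter-by-parameter:
    merged = alpha * multimodal + (1 - alpha) * text_only."""
    assert multimodal.keys() == text_only.keys(), "architectures must match"
    return {name: alpha * multimodal[name] + (1 - alpha) * text_only[name]
            for name in multimodal}

# Scalar stand-ins for weight tensors, keyed like a typical state dict
mm = {"embed.weight": 1.0, "lm_head.weight": 3.0}
txt = {"embed.weight": 2.0, "lm_head.weight": 1.0}
print(merge_state_dicts(mm, txt, alpha=0.25))
# → {'embed.weight': 1.75, 'lm_head.weight': 1.5}
```

With real checkpoints the same elementwise formula is applied to each tensor; sweeping alpha trades off multimodal capability against the text-only model's grammatical competence.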
Page 20 of 328,294