CamemBERT: a Tasty French Language Model
Louis Martin, Benjamin Muller, Pedro Ortiz Suarez
et al.
Pretrained language models are now ubiquitous in Natural Language Processing. Despite their success, most available models have either been trained on English data or on the concatenation of data in multiple languages. This makes practical use of such models –in all languages except English– very limited. In this paper, we investigate the feasibility of training monolingual Transformer-based language models for other languages, taking French as an example and evaluating our language models on part-of-speech tagging, dependency parsing, named entity recognition and natural language inference tasks. We show that the use of web crawled data is preferable to the use of Wikipedia data. More surprisingly, we show that a relatively small web crawled dataset (4GB) leads to results that are as good as those obtained using larger datasets (130+GB). Our best performing model CamemBERT reaches or improves the state of the art in all four downstream tasks.
1072 citations
en
Computer Science
Revisiting Pre-Trained Models for Chinese Natural Language Processing
Yiming Cui, Wanxiang Che, Ting Liu
et al.
Bidirectional Encoder Representations from Transformers (BERT) has shown marvelous improvements across various NLP tasks, and consecutive variants have been proposed to further improve the performance of the pre-trained language models. In this paper, we revisit Chinese pre-trained language models to examine their effectiveness in a non-English language and release the Chinese pre-trained language model series to the community. We also propose a simple but effective model called MacBERT, which improves upon RoBERTa in several ways, especially the masking strategy that adopts MLM as correction (Mac). We carried out extensive experiments on eight Chinese NLP tasks to revisit the existing pre-trained language models as well as the proposed MacBERT. Experimental results show that MacBERT could achieve state-of-the-art performances on many NLP tasks, and we also ablate details with several findings that may help future research. https://github.com/ymcui/MacBERT
834 citations
en
Computer Science
VaTeX: A Large-Scale, High-Quality Multilingual Dataset for Video-and-Language Research
Xin Eric Wang, Jiawei Wu, Junkun Chen
et al.
We present a new large-scale multilingual video description dataset, VATEX, which contains over 41,250 videos and 825,000 captions in both English and Chinese. Among the captions, there are over 206,000 English-Chinese parallel translation pairs. Compared to the widely-used MSR-VTT dataset, VATEX is multilingual, larger, linguistically complex, and more diverse in terms of both video and natural language descriptions. We also introduce two tasks for video-and-language research based on VATEX: (1) Multilingual Video Captioning, aimed at describing a video in various languages with a compact unified captioning model, and (2) Video-guided Machine Translation, to translate a source language description into the target language using the video information as additional spatiotemporal context. Extensive experiments on the VATEX dataset show that, first, the unified multilingual model can not only produce both English and Chinese descriptions for a video more efficiently, but also offer improved performance over the monolingual models. Furthermore, we demonstrate that the spatiotemporal video context can be effectively utilized to align source and target languages and thus assist machine translation. In the end, we discuss the potentials of using VATEX for other video-and-language research.
680 citations
en
Computer Science
The Impact of English as a Global Language on Educational Policies and Practices in the Asia-Pacific Region.
D. Nunan
Language, Identity, and the Ownership of English
B. Norton
The Phonology of English as an International Language: New Models, New Norms, New Goals
J. Jenkins
1292 citations
en
Computer Science, Sociology
The Cambridge Encyclopedia of the English Language
D. Crystal
Now in its third edition, The Cambridge Encyclopedia of the English Language provides the most comprehensive coverage of the history, structure and worldwide use of English. Fully updated and expanded, with a fresh redesigned layout, and over sixty audio resources to bring language extracts to life, it covers all aspects of the English language including the history of English, with new pages on Shakespeare's vocabulary and pronunciation, updated statistics on global English use that now cover all countries and the future of English in a post-Brexit Europe, regional and social variations, with fresh insights into the growing cultural identities of 'new Englishes', English in everyday use with new sections on gender identities, forensic studies, and 'big data' in corpus linguistics, and digital developments, including the emergence of new online varieties in social media platforms such as Facebook, Twitter and WhatsApp. Packed with brand new colour illustrations, photographs, maps, tables and graphs, this new edition is an essential tool for a new generation of twenty-first-century English language enthusiasts.
Does English proficiency matter? Testing its moderating role in the TAM for AI-enhanced MOOC adoption in vocational education
Shuhua Hou, Xianhe Liu, Xiaoqing Shen
et al.
This study investigates whether English proficiency moderates core Technology Acceptance Model (TAM) pathways in the context of AI-enhanced English MOOCs for vocational students. Drawing on an extended TAM that links Perceived Ease of Use (PEOU), Perceived Usefulness (PU), Behavioral Intention (BI), and Perceived Learning Outcomes (PLO), we surveyed 516 learners from a provincial AI-powered MOOC. Confirmatory factor analysis confirmed strong measurement properties (all factor loadings > 0.74, AVE > 0.57, CR > 0.80). Structural analysis revealed robust direct effects: PEOU → PU (β = 0.756), PU → BI (β = 0.696), and BI → PLO (β = 0.814). Hierarchical regression showed no significant moderation by English proficiency on any TAM path, though a small positive direct effect on BI was observed (β = 0.064, p = 0.042). Results suggest that well-designed AI personalization can mitigate language-related barriers, allowing core TAM mechanisms to operate consistently across proficiency levels. The findings highlight the potential of adaptive AI tools to foster equitable engagement in vocational language learning. Future research should employ multi-item or objective proficiency measures and incorporate actual usage data to further validate these insights.
Investigating Undergraduate Students’ Learning Styles and Preferences in English as a Second Language: The Case of a Public University in Ghana
Gifty Budu, Edward Owusu, Levina N. Abunya
et al.
This study examines selected undergraduate students’ learning styles and preferences at a public university in Ghana. The participants were Level 100 students pursuing the Bachelor of Information Technology programme. The study sought to answer three research questions: What are the students’ learning styles? What are the students’ preferred learning styles? What characteristics do students demonstrate to establish their learning styles? Thirty (30) participants were purposefully sampled from the Faculty of Computing and Information Systems of a public university in Accra, Ghana. The research instruments used for data collection were Fleming’s questionnaire on learning styles and semi-structured interviews. The outcome of this study revealed that the participants used all the learning styles: Visual, Auditory, Read and Write, and Kinesthetic (VARK). The most favoured learning style indicated by the questionnaire responses was Auditory. The interview responses indicated that the participants used different learning styles, but each participant had a most preferred learning style. The study concluded that learners should be allowed to integrate varied learning styles to improve their learning and make learning interesting and relaxed. The study will be beneficial in formulating policies (at the tertiary education level) that seek to provide different opportunities for students regarding learning styles and preferences for studying English as a second language.
Utilization for Non-Communicable Diseases Management in Southeast Asia
Farah Luthfi Kaulina, Sukihananto Sukihananto
Non-communicable diseases (NCDs) are still a morbidity and mortality problem in Southeast Asia. However, NCDs in Southeast Asia still need to be handled faster. WHO recommends the use of digital technology in treating NCDs in Southeast Asia. Therefore, this literature review study aims to describe how mHealth is utilized to overcome the problem of NCDs in Southeast Asian countries. The author collected articles using Google Scholar and Proquest, which were published in 2019-2023. The focus of the search was articles published in English-language research journals. Researchers used advanced search with the keywords NCD, Non-communicable diseases, mHealth, Mobile Health, Nursing, and Health Promotion. Keywords were combined using Boolean AND/OR operators in the online databases that the researchers chose. The filtered articles were then filtered again by selecting research locations in Southeast Asian countries. The ten articles obtained came from research in the following Southeast Asian countries: Indonesia (n=4), Malaysia (n=1), Singapore (n=1), Vietnam (n=1), Thailand (n=1), Cambodia (n=1), Philippines (n=1). All articles discussed the use of mHealth for NCD management in their countries and aimed to determine the barriers (n=3), feasibility (n=1), effectiveness (n=2), impact (n=2), potential (n=1), perception (n=1), and perspective (n=1) of service providers, as well as the experience of using mHealth in remote areas (n=1). It can be concluded that mHealth can be used for independent screening for NCDs, providing education about NCDs, and can be applied in rural areas as a comprehensive effort to handle NCDs.
Medicine, Medicine (General)
Vitamin D and Osteogenesis Imperfecta in Pediatrics
Francesco Coccia, Angelo Pietrobelli, Thomas Zoller
et al.
Osteogenesis Imperfecta (OI) is a heterogeneous group of inherited skeletal dysplasias characterized by bone fragility. The study of bone metabolism in these diseases is problematic because of their clinical and genetic variability. The aims of our study were to evaluate the importance of Vitamin D levels in OI bone metabolism, reviewing studies performed on this topic and providing advice reflecting our experience using vitamin D supplementation. A comprehensive review of all English-language articles was conducted in order to analyze the influence of vitamin D in OI bone metabolism in pediatric patients. Reviewing the studies, contradictory data were found on the relationship between 25OH vitamin D levels and bone parameters in OI, and in several studies the baseline levels of 25OH D were below the threshold value of 75 nmol/L. In conclusion, according to the literature and to our experience, we highlight the importance of adequate vitamin D supplementation in children with OI.
Medicine, Pharmacy and materia medica
Developing Mobile Learning Application as An Instructional Medium for Reading Comprehension
Ahmad Thoyyib Shofi, Wardatul Jannah
Students nowadays no longer enjoy reading printed books. They mostly read from their mobile phone or computer screens. Teachers need to encourage students to read and comprehend texts. Teachers should adopt and adapt technology in teaching reading comprehension to students in light of the rapidly changing existence of technology. The purpose of this research is to develop appropriate media for learning reading. This research employed a Research and Development design based on the ADDIE model. This model is used for media development and comprises 5 stages, namely: 1) Analysis, 2) Design, 3) Development, 4) Implementation, and 5) Evaluation. The research subjects are the tenth graders and the English teacher. Both the media and language experts validated the developed product. The instruments used in the needs analysis are questionnaires for the students and interviews with the English teacher. The final result is in the form of a mobile application that suits the needs of teachers and students. This media consists of 63 slides. The size of this mobile application is 52 Mb. The developed application consists of three main menus and eleven sub-menus. They are the intro page, loading section, menu page, materials page, core competence page, vocabulary building page, learning summary page, exercise page, glossary page, verb form page, user’s guide page, lesson plan page, description of the product page, and exit page. The result of this research could be seen in the students’ enthusiasm for, development through, and interest in mobile application media. Therefore, it is recommended for teachers to use mobile applications in teaching reading and for other developers to develop applications for the learning process.
English language, Philology. Linguistics
English linguistic knowledge of police trainees in SAPS training academies
Tebogo Johannes Kekana, Malesela Edward Montle
This article reports on the findings of a study about South African Police Services (SAPS) training with specific reference to the English linguistic knowledge of police trainees. English linguistic training in SAPS training academies has become central to both training and teaching and learning. Despite several benefits identified in the literature regarding adequate English linguistic knowledge, the training in SAPS leaves much to be desired. Therefore, the impetus of this paper is to make a case to challenge the tacit and poignant factors affecting the effectiveness of the SAPS training program with specific reference to English linguistic competence. The researcher makes a case that the training program in its current state is faced with many challenges and intricacies that hamper it from achieving one of its goals, which is to produce police officers with adequate workplace English linguistic competency. A mixed research approach was adopted to investigate the phenomena. The research instruments were a locally designed questionnaire complemented by in-depth interviews with a selected sample and an extensive review of scholarly literature on the matter. Needs Analysis theory was adopted as the pillar in this study. Among the findings was the lack of expertise in teaching English writing by police instructors. The study also found that the SAPS language policy is ‘completely’ silent as far as pedagogy is concerned in SAPS training academies. The study also found another systemic problem, called the ‘placement conundrum’. Furthermore, the study found that ineffective English writing screening measures for police recruits contribute to the problem. In addition, another unsurprising finding was the over-dominance of physical training over academic teaching in the training colleges. This study underscores the crucial aspect of reflective research as a source of information for improving training in SAPS training academies.
Education (General), English language
Tracing writing progression in English for academic purposes: A data-driven possibility in the post-COVID era in Hong Kong
Dennis Foung, Julia Chen
It is rare to use “big data” in writing progression studies in the field of second language acquisition around the globe. The difficulty of recruiting participants for longitudinal studies often results in sample sizes that are too small for quantitative analysis. Due to the global pandemic, students began to face more academic and emotional challenges, and it became more important to track the progression of their writing across courses. This study utilizes big data in a study of over 4,500 students who took a basic English for Academic Purposes (EAP) course followed by an advanced one at a university in Hong Kong. The findings suggest that analytics studies can provide a range of insights into course design and strategic planning, including how students’ language use and citation skills improve. They can also allow researchers to study the progression of students based on the level of achievement and the time elapsed between the two EAP courses. Further, studies using mega-sized datasets will be more generalizable than previous studies with smaller sample sizes. These results indicate that data-driven analytics can be a helpful approach to writing progression studies, especially in the post-COVID era.
La Lézarde (1958) et Malemort (1975) d’Édouard Glissant : dire l’esthétique archipélique depuis la Martinique et l’aire des Caraïbes
Rhimi, Mohamed Lamine
In this work, we will mainly focus on Edouard Glissant’s archipelago aesthetics. Indeed, the West Indian novelist-orator uses in La Lézarde (1958) and Malemort (1975) a kind of rhetoric which reflects not only the history of the slave trade, but also the Caribbean island landscape. This is how the writer cultivates Caribbean ethnopoetics, without falling into the trap of standardisation or reductionism. In other words, the fiery indictment that the writer has drawn up against the oppressors is inextricably linked to a plea made to defend the cause of West Indian culture, by exhorting the Caribbean people to recover their historical memory and to take their destiny into their own hands. It is specifically in that context that Edouard Glissant’s romantic epic and sublime beauty are fully in line with his archipelago aesthetics.
English literature, French literature - Italian literature - Spanish literature - Portuguese literature
Violence and Rejection: The Hegemony of White Culture and Its Influence on the Mother–Daughter Relationship in Toni Morrison’s The Bluest Eye
Magda Szolc
For quite a long time, mainstream academic discourse has ignored the significance of the mother–daughter relationship and excluded it from thorough scholarly analysis. However, the theme developed the interest of twentieth-century women’s literature, and the bond between a mother and her daughter marked its presence with the emergence of motherhood studies in the 1970s. Toni Morrison – one of the finest black female writers – in her debut novel, The Bluest Eye (1970), illustrates the complex bond between Pauline and Pecola Breedlove. Their relationship is shaped by the women’s fascination with white culture and the standards it promotes. In the novel, Morrison raises awareness of black women’s marginalisation and the way in which white culture shapes a woman’s vision of herself. The aim of this paper is to analyse the destructive influence of white hegemony on the perception of the black female self and its devastating effect on the mother–daughter relationship in The Bluest Eye.
English language, English literature
English as a Lingua Franca in the International University: The Politics of Academic English Language Policy
J. Jenkins
272 citations
en
Political Science
“Secularism” or “no-secularism”? A complex case of Bangladesh
Abdul Wohab
The incidents (in 2017) of changing the secular content of textbooks and removing a sculpture from the Supreme Court premises in Bangladesh raise a question among people who are sympathetic to secularism that Bangladesh is moving towards a theological state like Pakistan or becoming an Islamic country. They also refer to the remark that the current Prime Minister (Sheikh Hasina) made in 2014 that Bangladesh’s state administration would run under the rule of the Medina Charter (an Islamic constitution based on the Holy Quran and Sunnah, which aims to establish peace and unity by creating universal rules), as an indication of the religious characteristic that would remain at the centre of the state political activities in Bangladesh. By examining the historical and social context of Bangladesh since 1971 and reviewing the relevant contents of four newspapers—the Daily Inqilab (Bengali), the Prothom Alo (Bengali), the Daily Naya Diganta (Bengali) and the Daily Star (English)—from 2014 to 2017, this article rejects the claim made by the people who are sympathetic to secularism. This article, however, argues that Islam was traditionally/historically integrated in Bangladeshi society and culture as a unique (syncretistic) tradition in which political parties were forced to apply religious symbols and language in the political environments to stay in the government’s power. The article concludes by raising a question about the current integration of a secular political party and an Islamist force (Hefazat-e-Islam): although a functional relationship remains between secularism and Islam at the state level, is Bangladesh stepping into a “no secularism” era?
Strategies used by Grade Four educators to decode science terminology: A case study
Monde Kazeni, Morongwa Maleka
In this paper, we discuss the results of a case study about the teaching strategies used by three primary school educators to decode Grade 4 science terminology. In South Africa, the study of science is formally introduced to learners in Grade 4. Additionally, Grade 4 is the year when learners transition from being taught in their native languages in Grades 1 to 3 to being taught in English. This presents the challenge of learning a new subject in an unfamiliar language. Research shows that the majority of South African primary school learners find science terminology difficult to comprehend due to linguistic challenges, which could account for their poor performance in science assessments. The way educators decode science terminology during science lessons could affect learners’ comprehension of science vocabulary and consequently their performance in science. Semi-structured interviews were used to collect data in a qualitative case study in order to determine the strategies used by three science educators to teach and decode science terminology in Grade 4. The study findings suggest that the participating educators use ad hoc, teacher-centered teaching strategies to decode science concepts. These findings have implications for the preparation of primary school science educators in teacher training institutions.
Education (General), Special aspects of education
Learning and Consolidation of Declarative Memory in Good and Poor Readers of English as a Second Language
Kuppuraj Sengottuvel, Arpitha Vasudevamurthy
et al.
Declarative memory abilities may be important for children who are learning to read in a second language. In the present study, we investigated declarative memory in a recognition memory task in 7-to-13-year-old, Kannada native-speaking, good (n = 22) and poor (n = 22) readers of English, in Karnataka, India. Recognition memory was tested shortly (∼10 min) after encoding (day 1) and again on the next day (day 2). Analyses revealed that the two groups did not differ in recognition memory performance on day 1. On day 2, the good readers improved from day 1, whereas poor readers did not. A partial correlation analysis suggests that consolidation – the change in performance in recognition memory between the 2 days – is associated with reading skills in good readers, but not in poor readers. Taken together, these results suggest that children who struggle to read in a second language may have deficits in declarative memory consolidation.