Individual and Combined Effects of English as a Second Language and Typos on LLM Performance
Serena Liu, Yutong Yang, Prisha Sheth
et al.
Large language models (LLMs) are used globally, and because much of their training data is in English, they typically perform best on English inputs. As a result, many non-native English speakers interact with them in English as a second language (ESL), and these inputs often contain typographical errors. Prior work has largely studied the effects of ESL variation and typographical errors separately, even though they often co-occur in real-world use. In this study, we use the Trans-EnV framework to transform standard English inputs into eight ESL variants and apply MulTypo to inject typos at three levels: low, moderate, and severe. We find that combining ESL variation and typos generally leads to larger performance drops than either factor alone, though the combined effect is not simply additive. This pattern is clearest on closed-ended tasks, where performance degradation can be characterized more consistently across ESL variants and typo levels, while results on open-ended tasks are more mixed. Overall, these findings suggest that evaluations on clean standard English may overestimate real-world model performance, and that evaluating ESL variation and typographical errors in isolation does not fully capture model behavior in realistic settings.
Rational Exponents for General Graphs
Sean English, Sam Spiro
A rational number $r$ is a \textbf{realizable exponent} for a graph $H$ if there exists a finite family of graphs $\mathcal{F}$ such that $\mathrm{ex}(n,H,\mathcal{F})=\Theta(n^r)$, where $\mathrm{ex}(n,H,\mathcal{F})$ denotes the maximum number of copies of $H$ that an $n$-vertex $\mathcal{F}$-free graph can have. Results for realizable exponents are currently known only when $H$ is either a star or a clique, with the full resolution of the $H=K_2$ case being a major breakthrough of Bukh and Conlon. In this paper, we establish the first set of results for realizable exponents which hold for arbitrary graphs $H$ by showing that for any graph $H$ with maximum degree $\Delta\ge 1$, every rational in the interval $\left[v(H)-\frac{e(H)}{2\Delta^2},\ v(H)\right]$ is realizable for $H$. We also prove a ``stability'' result for generalized Turán numbers of trees which implies that if $T\ne K_2$ is a tree with $\ell$ leaves, then $T$ has no realizable exponents in $[0,\ell]\setminus \mathbb{Z}$. Our proof of this latter result uses a new variant of the classical Helly theorem for trees, which may be of independent interest.
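To make the interval in this abstract concrete, it can be evaluated for two small graphs. The instances below are computed here directly from the stated formula and are illustrative only; they are not taken from the paper.

```latex
% Interval [ v(H) - e(H) / (2 \Delta^2), v(H) ] evaluated for two small graphs.
% H = K_2: v(H) = 2, e(H) = 1, \Delta = 1
\[
  \left[\,2-\tfrac{1}{2\cdot 1^{2}},\ 2\,\right]=\left[\tfrac{3}{2},\ 2\right]
\]
% H = K_3: v(H) = 3, e(H) = 3, \Delta = 2
\[
  \left[\,3-\tfrac{3}{2\cdot 2^{2}},\ 3\,\right]=\left[\tfrac{21}{8},\ 3\right]
\]
```

In both cases the interval ends at $v(H)$, the exponent of the trivial upper bound $n^{v(H)}$ on the number of copies of $H$.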
Automatic Fact-checking in English and Telugu
Ravi Kiran Chikkala, Tatiana Anikina, Natalia Skachkova
et al.
False information poses a significant global challenge, and manually verifying claims is a time-consuming and resource-intensive process. In this research paper, we experiment with different approaches to investigate the effectiveness of large language models (LLMs) in classifying factual claims by their veracity and generating justifications in English and Telugu. The key contributions of this work include the creation of a bilingual English-Telugu dataset and the benchmarking of different veracity classification approaches based on LLMs.
Neuroleptic malignant syndrome and serotonin syndrome: a comparative bibliometric analysis
Waleed M. Sweileh
Objective: This study aimed to analyze and map scientific literature on Neuroleptic Malignant Syndrome (NMS) and Serotonin Syndrome (SS) from prestigious, internationally indexed journals. The objective was to identify key topics, impactful articles, prominent journals, research output, growth patterns, hotspots, and leading countries in the field, providing valuable insights for scholars, medical students, and international funding agencies. Methods: A systematic search strategy was implemented in the PubMed MeSH database using specific keywords for NMS and SS. The search was conducted in the Scopus database, renowned for its extensive coverage of scholarly publications. Inclusion criteria comprised articles published from 1950 to December 31st, 2022, restricted to journal research and review articles written in English. Data were analyzed using Microsoft Excel for descriptive analysis, and VOSviewer was employed for bibliometric mapping. Results: The search yielded 1150 articles on NMS and 587 on SS, with the majority being case reports. Growth patterns revealed a surge in NMS research between 1981 and 1991, while SS research increased notably between 1993 and 1997. Active countries and journals differed between NMS and SS, with psychiatry journals predominating for NMS and pharmacology/toxicology journals for SS. Authorship analysis indicated higher multi-authored articles for NMS. Top impactful articles focused on review articles and pathogenic mechanisms. Research hotspots included antipsychotics and catatonia for NMS, while SS highlighted drug interactions and specific medications like linezolid and tramadol. Conclusions: NMS and SS represent rare but life-threatening conditions, requiring detailed clinical and scientific understanding. Differential diagnosis and management necessitate caution in prescribing medications affecting central serotonin or dopamine systems, with awareness of potential drug interactions. International diagnostic tools and genetic screening tests may aid in safe diagnosis and prevention. Reporting rare cases and utilizing bibliometric analysis enhance knowledge dissemination and research exploration in the field of rare drug-induced medical conditions.
Current status of scales on subjective well-being and proximity concepts
Yuho Shimizu, Yasuyuki Kudo, Shuhei Fukuyama
et al.
Assessing whether people are living comfortably and happily is in great demand not only in psychology but also in such diverse fields as policy making and urban development. In response to this demand, many psychological studies have used a questionnaire method in which participants are presented with several questionnaire items and asked to answer them. In recent years, however, the number of scales that measure people’s subjective well-being has grown very large. To clarify which academic disciplines address these scales and to obtain suggestions for future use of the scales, this study conducted a review of scales measuring people’s subjective well-being and proximity concepts. We conducted a literature review using the Google Scholar and CiNii Research databases. After screening, we found 70 publications that were eligible for the review in this study. The specific constructs addressed by these publications were: 10 reporting on subjective well-being, 12 reporting on happiness, 10 reporting on satisfaction, 10 reporting on quality of life, 7 reporting on purpose in life, 11 reporting on emotions and moods, and 10 reporting on self-esteem. These were examined in a wide range of academic disciplines, not just in psychology. None of the 7 scales measuring purpose in life was translated from English; instead, their items were developed based on research with Japanese participants. Given the variety of scales for subjective well-being and proximity concepts, the need for a new scale in this area should be carefully reconsidered when conducting a survey. If a new scale is not strongly needed, it is important to use representative scales that have been frequently used in previous studies. Our findings are significant for the appropriate use of scales on subjective well-being and proximity concepts by psychological researchers and for their proper advice to researchers in other fields.
Social sciences (General), Environmental sciences
Undifferentiated pleomorphic sarcoma of the adrenal gland: a case report and literature review
Gong Xiaochuan, Zhao Wei, Yuan Chaoyong
et al.
Undifferentiated pleomorphic sarcoma (UPS) is a rare type of tumor, and UPS originating in the adrenal gland is even rarer. To date, there have been no reports in the English-language literature of UPS originating from the adrenal gland. This case report presents a 44-year-old female patient with UPS of the adrenal gland, who has shown no signs of recurrence or metastasis half a year after undergoing resection of a left adrenal tumor. A retrospective analysis of the patient’s diagnosis and treatment process is conducted, with the aim of providing a reference for the diagnosis and treatment of adrenal UPS.
Neoplasms. Tumors. Oncology. Including cancer and carcinogens
Pronominal Clitics in Persian: A Distributed Morphology Approach
Hannah Hosseini, Shoja Tafakkori Rezayi, Amer Gheitury
Abstract
Persian independent pronouns can be considered the head of noun phrases (Shaghaghi, 2015). However, the head of the noun phrase with a prepositional phrase (preposition + pronominal clitic) can be occupied by a pronominal clitic. To deal with this dual behavior, we adopt distributed morphology, developed by Halle & Marantz (1993), to analyze the internal structure of dependent pronouns or pronominal clitics. This research draws upon a descriptive-analytical method, and the data come from the Persian language. In this paper, theoretical arguments as well as linguistic evidence are employed to examine the hypotheses. We suggest that the so-called “pronominal clitics” are actually different elements with different semantic features and structural relationships. Finally, we argue that the (M) index, as a secondary product of morphological merger (Merger) between the two heads, is the distinctive element of the dependent pronouns, not of the independent pronouns.
Keywords: Clitics, Distributed Morphology, pronominal clitics, Object agreement, Morphological merger
Introduction
The present paper examines the nature of Persian pronominal clitics, or dependent personal pronouns, within the framework of distributed morphology. Independent personal pronouns can appear at the head position of a noun phrase. However, this is not true of their clitic forms unless their host is a prepositional phrase, in which case the pronominal clitic can function as the head of the noun phrase. This points to underlying distinctions between dependent personal pronouns. The present research is grounded in these behaviors of pronouns and aims to describe the content features and internal structure of the dependent pronouns.
Materials and methods
In the framework of generative grammar, the intuition of native speakers is an important criterion for judging the well-formedness of sentences. It is thus understood that research into different structures of a language can be based on the intuition of a competent native speaker of the language in question. The data for analysis were collected based on the intuition of one of the co-authors and were then reviewed and confirmed by other members of the research team. In the next step, the data obtained were analyzed for descriptive goals related to clitics in the Persian language and were then explained on the basis of the distributed morphology framework. Distributed morphology is a post-syntactic framework developed by Halle and Marantz (1993). It assumes that there is no generative lexicon, and that the formal features are stored solely in a basic storage. According to Marantz (1998), this non-generative storage (which Marantz calls “pure lexicon”) does not participate in the word formation process. In this theory, the other functions of the generative lexicon are distributed across the syntactic, morphological and phonological components. In Marantz’s (1998) approach, the output of the computational system is manipulated in the morphological component. In other words, the morphological operations that apply at the post-syntactic level can modify the structure, mainly before the phonological component. Furthermore, Bobaljik (2008) argues that agreement features are assigned post-syntactically. In this descriptive-analytic research, we mainly use this framework to investigate the nature of pronominal clitics in Persian.
Discussion of results and conclusion
This paper analyzes the internal structure and nature of the morphemes known as pronominal clitics. These morphemes attach to different hosts, including nouns, prepositions and verbs. We also aim to find out whether they possess the same content feature. This inquiry is significant from two perspectives. First, research in this area claims that one of the most important features placing the pronominal clitics in the category of clitics is their ability to attach to different hosts. If the analysis shows that the content features of these elements differ in structures with different hosts, they can no longer be categorized as clitics. Second, the dual behavior and the morphological peculiarities shown in the combination “base + pronominal clitic” indicate underlying distinctions between dependent personal pronouns. Such questions underlie the proposal of the (M) index. It is assumed that this index, considered a secondary product of morphological merger between two heads, is the distinguishing factor between dependent and independent forms of pronouns. To examine the explanatory efficiency of the (M) index, we investigated its performance in reflexive structures with local dislocation and in some noun phrases. Finally, the current research investigated the structural position of the dependent pronouns attached to verbs and their structural relation to the agreement mechanism. In this part, we describe the content feature and internal structure of these pronouns. The important issue regarding these elements is that these pronominal systems resemble affixation more than clitic systems.
Within the framework of distributed morphology, it was found that the dependent personal pronouns attached to nouns, verbs, and prepositions are separate elements with different content features and internal structures. The dependent personal pronouns can also be placed before other morphemes, which brings the pronouns closer to the stem. In fact, the findings of this research indicate that these pronominal systems do not really resemble clitic systems.
PersianMind: A Cross-Lingual Persian-English Large Language Model
Pedram Rostami, Ali Salemi, Mohammad Javad Dousti
Large language models demonstrate remarkable proficiency in various linguistic tasks and have extensive knowledge across various domains. Although they perform best in English, their ability in other languages is notable too. In contrast, open-source models, such as LLaMa, are primarily trained on English datasets, resulting in poor performance in non-English languages. In this paper, we introduce PersianMind, an open-source bilingual large language model which demonstrates comparable performance to closed-source GPT-3.5-turbo in the Persian language. By expanding LLaMa2's vocabulary with 10,000 Persian tokens and training it on a dataset comprising nearly 2 billion Persian tokens, we show that our approach preserves the model's English knowledge and employs transfer learning to excel at transferring task knowledge from one language to another.
BEnQA: A Question Answering and Reasoning Benchmark for Bengali and English
Sheikh Shafayat, H M Quamran Hasan, Minhajur Rahman Chowdhury Mahim
et al.
In this study, we introduce BEnQA, a dataset comprising parallel Bengali and English exam questions for middle and high school levels in Bangladesh. Our dataset consists of approximately 5K questions covering several subjects in science with different types of questions, including factual, application, and reasoning-based questions. We benchmark several Large Language Models (LLMs) with our parallel dataset and observe a notable performance disparity between the models in Bengali and English. We also investigate some prompting methods, and find that Chain-of-Thought prompting is beneficial mostly on reasoning questions, but not so much on factual ones. We also find that appending English translation helps to answer questions in Bengali. Our findings point to promising future research directions for improving the performance of LLMs in Bengali and more generally in low-resource languages.
Unraveling the Italian and English Telegram Conspiracy Spheres through Message Forwarding
Lorenzo Alvisi, Serena Tardelli, Maurizio Tesconi
Telegram has grown into a significant platform for news and information sharing, favored for its anonymity and minimal moderation. This openness, however, makes it vulnerable to misinformation and conspiracy theories. In this study, we explore the dynamics of conspiratorial narrative dissemination within Telegram, focusing on Italian and English landscapes. In particular, we leverage the mechanism of message forwarding within Telegram and collect two extensive datasets through a snowball strategy. We adopt a network-based approach and build the Italian and English Telegram networks to reveal their respective communities. By employing topic modeling, we uncover distinct narratives and dynamics of misinformation spread. Results highlight differences between Italian and English conspiracy landscapes, with Italian discourse involving assorted conspiracy theories and alternative news sources intertwined with legitimate news sources, whereas English discourse is characterized by a more focused approach on specific narratives such as QAnon and political conspiracies. Finally, we show that our methodology exhibits robustness across initial seed selections, suggesting broader applicability. This study contributes to understanding information and misinformation spread on Italian and English Telegram ecosystems through the mechanism of message forwarding.
JANET: Joint Adaptive predictioN-region Estimation for Time-series
Eshant English, Eliot Wong-Toi, Matteo Fontana
et al.
Conformal prediction provides machine learning models with prediction sets that offer theoretical guarantees, but the underlying assumption of exchangeability limits its applicability to time series data. Furthermore, existing approaches struggle to handle multi-step ahead prediction tasks, where uncertainty estimates across multiple future time points are crucial. We propose JANET (Joint Adaptive predictioN-region Estimation for Time-series), a novel framework for constructing conformal prediction regions that are valid for both univariate and multivariate time series. JANET generalises the inductive conformal framework and efficiently produces joint prediction regions with controlled K-familywise error rates, enabling flexible adaptation to specific application needs. Our empirical evaluation demonstrates JANET's superior performance in multi-step prediction tasks across diverse time series datasets, highlighting its potential for reliable and interpretable uncertainty quantification in sequential data.
Underutilization of Syntactic Processing by Chinese Learners of English in Comprehending English Sentences, Evidenced from Adapted Garden-Path Ambiguity Experiment
Jiapeng Xu
Many studies have revealed that sentence comprehension relies more on semantic processing than on syntactic processing. However, previous studies have predominantly emphasized the preference for semantic processing, focusing on the semantic perspective. In contrast, the current study highlights the under-utilization of syntactic processing, from a syntactic perspective. Building on the traditional garden-path experiment, which involves locally ambiguous but globally unambiguous sentences, this study devised an adapted version featuring semantically ambiguous but syntactically unambiguous sentences to meet its specific research objective. This experiment, involving 140 subjects, demonstrates through descriptive and inferential statistical analyses using SPSS, GraphPad Prism, and Cursor that Chinese learners of English tend to under-utilize syntactic processing when comprehending English sentences. The study identifies two types of parsing under-utilization: partial and complete. Further exploration reveals that trial and error in syntactic processing contributes to both. Consequently, this study lays a foundation for the development of a novel parsing method designed to fully integrate syntactic processing into sentence comprehension, thereby enhancing the level of English sentence comprehension for Chinese learners of English.
Comparative effectiveness of hybrid and laparoscopic techniques for repairing complex incisional ventral hernias: a systematic review and meta-analysis
Quan Wu, Weijie Ma, Qianqian Wang
et al.
Background: The recently developed Hybrid Hernia Repair technique (HHR), an adaptation of the laparoscopic method, has been proposed as a potential alternative for the treatment of complex Incisional Ventral Hernias (IVH). While single-arm studies have reported promising outcomes, a comprehensive meta-analysis affirming these benefits is lacking. This meta-analysis aims to compare the clinical outcomes of HHR and Laparoscopic Hernia Repair (LHR) in the management of IVH. Methods: An exhaustive search of the literature was conducted, targeting publications in both English and Chinese that compare HHR and LHR up to March 31, 2023. The primary outcomes examined were operation time, blood loss, and intestinal injury. Secondary outcomes included rates of seroma, wound infection, post-operative acute/chronic pain, recurrence, and mesh bulging. The RevMan 5.0 software facilitated the statistical meta-analysis. Results: The final analysis incorporated data from 14 studies, encompassing a total of 1158 patients, with 555 undergoing HHR and 603 treated with LHR. Follow-up data, ranging from 12 to 88 months, were available in 12 out of the 14 identified studies. The HHR method was associated with a significantly lower risk of seroma (OR = 0.29, P = 0.0004), but a higher risk of wound infection (OR = 2.10, P = 0.04). No significant differences were observed between the two techniques regarding operation time, blood loss, intestinal injury, intestinal obstruction, post-operative pain, mesh bulging, and recurrence. Conclusions: The HHR technique did not demonstrate a clear advantage over LHR in reducing surgical complications, apart from a lower incidence of postoperative seroma. Surgeons with substantial expertise may choose to avoid incidental conversion or intentional hybrid procedures. Further research is needed to clarify the optimal surgical approach for IVH.
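The odds ratios (OR) reported in this abstract are pooled across studies, but the underlying quantity for a single study is simple to compute. The following is a minimal sketch for one 2x2 table; the function name and the event counts are hypothetical and are not taken from the meta-analysis, which would additionally weight and pool the per-study estimates.

```python
def odds_ratio(events_a: int, total_a: int, events_b: int, total_b: int) -> float:
    """Odds ratio of an event in group A relative to group B for one 2x2 table."""
    non_events_a = total_a - events_a
    non_events_b = total_b - events_b
    if min(events_b, non_events_a, non_events_b) <= 0:
        raise ValueError("odds ratio is undefined when a required cell is zero")
    odds_a = events_a / non_events_a  # odds of the event in group A
    odds_b = events_b / non_events_b  # odds of the event in group B
    return odds_a / odds_b


# Hypothetical counts: 10 events among 100 patients in group A,
# 25 events among 100 patients in group B.
example_or = odds_ratio(10, 100, 25, 100)
print(f"OR = {example_or:.2f}")
```

An OR below 1 indicates lower odds of the event in group A than in group B, which is how the seroma result above should be read; an OR above 1 indicates the opposite, as for wound infection.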
United States Department of Agriculture nutrition assistance programs during the COVID-19 pandemic: A scoping review protocol.
Jessica Soldavini, Margaret Read, Lauren Clay
Objective: The goal of this scoping review is to examine the published research on federal nutrition assistance programs administered by the United States (U.S.) Department of Agriculture during the COVID-19 pandemic, in the U.S., U.S. territories, and tribal nations. The review will identify the scope of the available research and provide research and policy recommendations. Introduction: The COVID-19 pandemic made individuals more vulnerable to experiencing food insecurity. Federal nutrition assistance programs help to address food insecurity and have been rapidly adapting to meet food and nutrition needs among affected communities during the COVID-19 pandemic. It is important to understand the scope of the current research on this topic to help inform future research, practice, and policy recommendations. Inclusion criteria: This review will include studies focused on federal nutrition assistance programs administered by the U.S. Department of Agriculture during the COVID-19 pandemic. The scoping review will consider all primary research designs. Methods: PubMed, CINAHL, Scopus, and ProQuest's Health Management databases will be used for the literature search. Only articles published in English since March 1, 2020 will be considered. Titles/abstracts followed by full-text articles will be reviewed to determine which articles meet the inclusion criteria and should be included in the review. Data will be extracted from each included article using a data extraction template in Covidence that will be developed by the study team. Data extracted will include information on key findings related to the review questions. At each step, two independent reviewers will be assigned to each article. Data will be summarized and presented in tables, charts, and narrative summary.
Cross-Dialect Sentence Transformation: A Comparative Analysis of Language Models for Adapting Sentences to British English
Shruti Dutta, Shashwat Mookherjee
This study explores linguistic distinctions among American, Indian, and Irish English dialects and assesses various Language Models (LLMs) in their ability to generate British English translations from these dialects. Using cosine similarity analysis, the study measures the linguistic proximity between original British English translations and those produced by LLMs for each dialect. The findings reveal that Indian and Irish English translations maintain notably high similarity scores, suggesting strong linguistic alignment with British English. In contrast, American English exhibits slightly lower similarity, reflecting its distinct linguistic traits. Additionally, the choice of LLM significantly impacts translation quality, with Llama-2-70b consistently demonstrating superior performance. The study underscores the importance of selecting the right model for dialect translation, emphasizing the role of linguistic expertise and contextual understanding in achieving accurate translations.
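The cosine similarity measure mentioned in this abstract can be illustrated with a small sketch. The abstract does not say how sentences were turned into vectors, so the version below assumes plain word-count vectors purely for illustration; the function name and example sentences are hypothetical, and the study may well have used learned sentence embeddings instead.

```python
import math
from collections import Counter


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between word-count vectors of two texts (0.0 to 1.0)."""
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())
    # Dot product over the words the two texts share.
    dot = sum(counts_a[w] * counts_b[w] for w in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)


# Hypothetical reference sentence and model output.
reference = "the colour of the lift"
candidate = "the color of the elevator"
print(round(cosine_similarity(reference, candidate), 3))
```

With word counts, spelling variants such as "colour" and "color" count as entirely different words, so this representation understates similarity between dialects; embedding-based vectors are less sensitive to such surface differences.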
TADA: Task-Agnostic Dialect Adapters for English
Will Held, Caleb Ziems, Diyi Yang
Large Language Models, the dominant starting point for Natural Language Processing (NLP) applications, fail at a higher rate for speakers of English dialects other than Standard American English (SAE). Prior work addresses this using task-specific data or synthetic data augmentation, both of which require intervention for each dialect and task pair. This poses a scalability issue that prevents the broad adoption of robust dialectal English NLP. We introduce a simple yet effective method for task-agnostic dialect adaptation by aligning non-SAE dialects using adapters and composing them with task-specific adapters from SAE. Task-Agnostic Dialect Adapters (TADA) improve dialectal robustness on 4 dialectal variants of the GLUE benchmark without task-specific supervision.
Fairness in Language Models Beyond English: Gaps and Challenges
Krithika Ramesh, Sunayana Sitaram, Monojit Choudhury
With language models becoming increasingly ubiquitous, it has become essential to address their inequitable treatment of diverse demographic groups and factors. Most research on evaluating and mitigating fairness harms has been concentrated on English, while multilingual models and non-English languages have received comparatively little attention. This paper presents a survey of fairness in multilingual and non-English contexts, highlighting the shortcomings of current research and the difficulties faced by methods designed for English. We contend that the multitude of diverse cultures and languages across the world makes it infeasible to achieve comprehensive coverage in terms of constructing fairness datasets. Thus, the measurement and mitigation of biases must evolve beyond the current dataset-driven practices that are narrowly focused on specific dimensions and types of biases and, therefore, impossible to scale across languages and cultures.
“Time Has Caught on Fire:” Eco-Anxiety and Anger in Selected Australian Poetry
Anna Kowalcze-Pawlik
This essay discusses fire as a significant factor shaping Australian social and cultural life. It focuses first on the climate-change induced emotions such as eco-anxiety and anger that can be tied with the Australian landscape, and then moves on to a discussion of the presence and function of fire in selected contemporary Australian poetry. The reflection on the poetics of trauma in the second part of the essay is accompanied by a discussion of solastalgia connected with land dispossession as an experience of the First Nations expressed in the Aboriginal literature in English.
Political science (General)
Haïti chérie
Sarner, Eric
English literature, French literature - Italian literature - Spanish literature - Portuguese literature
Evidence-based practice implementation in healthcare in China: a living scoping review
Junqiang Zhao, Wenhui Bai, Qian Zhang
et al.
Background: Evidence-based practice (EBP) implementation plays a crucial role in bridging the knowledge-action gaps and reducing health inequities. Little is known about its development in China. This study aims to provide an overview of the EBP implementation research progress in healthcare in China and identify gaps for future studies. Methods: We conducted a scoping review following the Joanna Briggs Institute scoping review methodology and the Cochrane Collaboration's guidance on living reviews. We performed a literature search in four Chinese databases (i.e., China National Knowledge Infrastructure, Wan Fang Database, The VIP Database, and China Biology Medicine) and three English databases (i.e., Ovid MEDLINE, the Cumulative Index to Nursing and Allied Health Literature, and EMBASE), as well as Google Scholar and Baidu Scholar, from 1996 to 2021. We included EBP implementation studies that were conducted in healthcare settings in China and published in Chinese or English. The search will be run on a regular basis to monitor the development of new literature and determine when to update the review. Findings: Of the 11,276 records identified, we finally included 309 papers. Publications have risen sharply since 2013 and were predominantly from the nursing field (292/309, 94.50%). The commonly researched areas were symptom management (75/309, 24.27%), tube care (46/309, 14.89%), perioperative care (43/309, 13.92%), and fundamental care (43/309, 13.92%). The Joanna Briggs Institute model was the most frequently used model to guide the implementation process (92/159, 59.75%). Implementation teams comprised a median of 8 people, with 113 studies (36.57%) taking a multidisciplinary approach. A total of 204 studies reported using audit criteria to help evaluate the evidence implementation rate, with diverse methods for measuring the criteria. Lack of knowledge, skills, and resources, and incomplete procedures or pathways were the top barriers impeding EBP implementation. Leadership support was considered the most common facilitator. Education and training were the most frequently described implementation strategies for healthcare professionals and patients. Optimizing workflows and developing evaluation tools were the primary strategies adopted by organizations. Patient outcomes were measured in 291 studies and healthcare professional outcomes in 174 studies. Interpretation: To our knowledge, this scoping review is the first to systematically examine the EBP implementation research progress in healthcare in China. Based on this review, we identified contributions that Chinese EBP implementation research has made to the global community, and provided eight recommendations for Chinese researchers conducting implementation studies in the future. Funding: None.
Public aspects of medicine