M. Brüne
Hasil untuk "Language and Literature"
Menampilkan 20 dari ~3360238 hasil · dari CrossRef, DOAJ, arXiv, Semantic Scholar
Anne-Marie Masgoret, R. Gardner
R. Huddleston, G. Pullum
E. Hoff, Cynthia Core, S. Place et al.
Sebastian Holm, MD, Mario Zambrana, MD, Juan E. Berner, MD, PhD et al.
Summary:. Generative artificial intelligence (AI) large language models are an emerging technology, with ChatGPT and Gemini being 2 well-known examples. The current literature discusses clinical applications and limitations of AI, but its role in research has not yet been extensively evaluated. This study aimed to assess the role of ChatGPT and Gemini in developing novel and clinically relevant research ideas (RIs) for systematic reviews (SRs) in head and neck reconstruction. ChatGPT and Gemini were prompted to provide 10 novel and clinically relevant RIs for SRs in the following domains: head and neck reconstruction in general, microsurgery, and complications in reconstructive head and neck procedures. A comprehensive search was then performed for SRs in MEDLINE, Cochrane Library, and Embase to determine the novelty of the RIs generated. A total of 60 RIs were generated, with half created by ChatGPT and the other half by Gemini. Overall, 3613 entries were found through the literature search. After deduplication and screening, a total of 50 studies that partially addressed the AI-generated RIs were identified and were included in the present review. Out of the 60 AI-generated RIs, 42 had not been previously studied and were therefore considered novel. No statistically significant differences were found between the outputs generated by Gemini and ChatGPT. Both ChatGPT and Gemini were able to effectively generate novel and clinically relevant RIs for SRs, although their suggestions were generally broad. This study demonstrated that AI could potentially aid in the process of conducting novel SRs.
Ranjan Sapkota, Manoj Karkee
The fusion of language and vision in large vision-language models (LVLMs) has revolutionized deep learning-based object detection by enhancing adaptability, contextual reasoning, and generalization beyond traditional architectures. This in-depth review presents a structured exploration of the state-of-the-art in LVLMs, systematically organized through a three-step research review process. First, we discuss the functioning of vision language models (VLMs) for object detection, describing how these models harness natural language processing (NLP) and computer vision (CV) techniques to revolutionize object detection and localization. We then explain the architectural innovations, training paradigms, and output flexibility of recent LVLMs for object detection, highlighting how they achieve advanced contextual understanding for object detection. The review thoroughly examines the approaches used in integration of visual and textual information, demonstrating the progress made in object detection using VLMs that facilitate more sophisticated object detection and localization strategies. This review presents comprehensive visualizations demonstrating LVLMs' effectiveness in diverse scenarios including localization and segmentation, and then compares their real-time performance, adaptability, and complexity to traditional deep learning systems. Based on the review, its is expected that LVLMs will soon meet or surpass the performance of conventional methods in object detection. The review also identifies a few major limitations of the current LVLM modes, proposes solutions to address those challenges, and presents a clear roadmap for the future advancement in this field. We conclude, based on this study, that the recent advancement in LVLMs have made and will continue to make a transformative impact on object detection and robotic applications in the future.
Fred Philippy, Siwen Guo, Cedric Lothritz et al.
In NLP, Zero-Shot Classification (ZSC) has become essential for enabling models to classify text into categories unseen during training, particularly in low-resource languages and domains where labeled data is scarce. While pretrained language models (PLMs) have shown promise in ZSC, they often rely on large training datasets or external knowledge, limiting their applicability in multilingual and low-resource scenarios. Recent approaches leveraging natural language prompts reduce the dependence on large training datasets but struggle to effectively incorporate available labeled data from related classification tasks, especially when these datasets originate from different languages or distributions. Moreover, existing prompt-based methods typically rely on manually crafted prompts in a specific language, limiting their adaptability and effectiveness in cross-lingual settings. To address these challenges, we introduce RoSPrompt, a lightweight and data-efficient approach for training soft prompts that enhance cross-lingual ZSC while ensuring robust generalization across data distribution shifts. RoSPrompt is designed for small multilingual PLMs, enabling them to leverage high-resource languages to improve performance in low-resource settings without requiring extensive fine-tuning or high computational costs. We evaluate our approach on multiple multilingual PLMs across datasets covering 106 languages, demonstrating strong cross-lingual transfer performance and robust generalization capabilities over unseen classes.
Carol Griffiths
Jorge Alcón Borrega
Après des années dans un arrière-plan, l'œuvre de Madeleine Bourdouxhe (1906-1996) a été redécouverte et la critique a trouvé dans ses récits un message d'affirmation de l'identité féminine, précoce pour son époque. Sept nouvelles rassemble certaines des nouvelles écrits par l'auteure tout au long de sa vie, qui mettent en scène différentes femmes, contemporaines à l'écrivaine, qui souffrent du manque d'identité auquel elles sont soumises par la société et qui les contraint à être conditionnées par leurs maris. À travers ces histoires, l'auteur exprime le besoin des femmes de la moitié du XXe siècle d'atteindre leur propre identité. Toutefois, ce message n'est pas présenté explicitement dans l'œuvre, mais il est développé à travers différentes ressources littéraires et linguistiques qui parviendront à transmettre au lecteur le vide ressenti par ses personnages. Les brèves interventions des femmes, leurs caractéristiques et, surtout, leurs silences constitueront le point central de notre analyse.
Carrie Brooke-Sumner, Yandisa Sikweyiya, Mercilene T Machisa et al.
Introduction Young people in higher education face various stressors that can make them vulnerable to mental ill-health. Mental health promotion in this group therefore has important potential benefits. Peer-facilitated and group-format interventions may be feasible and sustainable. The scoping review outlined in this protocol aims to map the literature on group-format, peer-facilitated, in-person interventions for mental health promotion for higher education students attending courses on campuses in high and low/middle-income countries.Methods and analysis Relevant studies will be identified through conducting searches of electronic databases, including Medline, CINAHL, Scopus, ERIC and PsycINFO. Searches will be conducted using Boolean operators (AND, OR, NOT) and truncation functions appropriate for each database. We will include a grey literature search. We will include articles from student participants of any gender, and published in peer-reviewed journals between 2008 and 2023. We will include English-language studies and all study types including randomised controlled trials, pilot studies and descriptive studies of intervention development. A draft charting table has been developed, which includes the fields: author, publication date, country/countries, aims, population and sample size, demographics, methods, intervention type, comparisons, peer training, number of sessions/duration of intervention, outcomes and details of measures.Ethics and dissemination No primary data will be collected from research participants to produce this review so ethics committee approval is not required. All data will be collated from published peer-reviewed studies already in the public domain. We will publish the review in an open-access, peer-reviewed journal accessible to researchers in low/middle-income countries. This protocol is registered on Open Science Framework (https://osf.io/agbfj/).
Wengang Zhou, Weichao Zhao, Hezhen Hu et al.
Sign language serves as the primary meaning of communication for the deaf-mute community. Different from spoken language, it commonly conveys information by the collaboration of manual features, i.e., hand gestures and body movements, and non-manual features, i.e., facial expressions and mouth cues. To facilitate communication between the deaf-mute and hearing people, a series of sign language understanding (SLU) tasks have been studied in recent years, including isolated/continuous sign language recognition (ISLR/CSLR), gloss-free sign language translation (GF-SLT) and sign language retrieval (SL-RT). Sign language recognition and translation aims to understand the semantic meaning conveyed by sign languages from gloss-level and sentence-level, respectively. In contrast, SL-RT focuses on retrieving sign videos or corresponding texts from a closed-set under the query-by-example search paradigm. These tasks investigate sign language topics from diverse perspectives and raise challenges in learning effective representation of sign language videos. To advance the development of sign language understanding, exploring a generalized model that is applicable across various SLU tasks is a profound research direction.
Adam Karvonen
Language models have shown unprecedented capabilities, sparking debate over the source of their performance. Is it merely the outcome of learning syntactic patterns and surface level statistics, or do they extract semantics and a world model from the text? Prior work by Li et al. investigated this by training a GPT model on synthetic, randomly generated Othello games and found that the model learned an internal representation of the board state. We extend this work into the more complex domain of chess, training on real games and investigating our model's internal representations using linear probes and contrastive activations. The model is given no a priori knowledge of the game and is solely trained on next character prediction, yet we find evidence of internal representations of board state. We validate these internal representations by using them to make interventions on the model's activations and edit its internal board state. Unlike Li et al's prior synthetic dataset approach, our analysis finds that the model also learns to estimate latent variables like player skill to better predict the next character. We derive a player skill vector and add it to the model, improving the model's win rate by up to 2.6 times.
Dongping Chen, Jiawen Shi, Yao Wan et al.
While Large Language Models (LLMs) have achieved remarkable success across various applications, they also raise concerns regarding self-cognition. In this paper, we perform a pioneering study to explore self-cognition in LLMs. Specifically, we first construct a pool of self-cognition instruction prompts to evaluate where an LLM exhibits self-cognition and four well-designed principles to quantify LLMs' self-cognition. Our study reveals that 4 of the 48 models on Chatbot Arena--specifically Command R, Claude3-Opus, Llama-3-70b-Instruct, and Reka-core--demonstrate some level of detectable self-cognition. We observe a positive correlation between model size, training data quality, and self-cognition level. Additionally, we also explore the utility and trustworthiness of LLM in the self-cognition state, revealing that the self-cognition state enhances some specific tasks such as creative writing and exaggeration. We believe that our work can serve as an inspiration for further research to study the self-cognition in LLMs.
Zhifan Sun, Antonio Valerio Miceli-Barone
Large Language Models (LLMs) are increasingly becoming the preferred foundation platforms for many Natural Language Processing tasks such as Machine Translation, owing to their quality often comparable to or better than task-specific models, and the simplicity of specifying the task through natural language instructions or in-context examples. Their generality, however, opens them up to subversion by end users who may embed into their requests instructions that cause the model to behave in unauthorized and possibly unsafe ways. In this work we study these Prompt Injection Attacks (PIAs) on multiple families of LLMs on a Machine Translation task, focusing on the effects of model size on the attack success rates. We introduce a new benchmark data set and we discover that on multiple language pairs and injected prompts written in English, larger models under certain conditions may become more susceptible to successful attacks, an instance of the Inverse Scaling phenomenon (McKenzie et al., 2023). To our knowledge, this is the first work to study non-trivial LLM scaling behaviour in a multi-lingual setting.
Mohamed Bayan Kmainasi, Rakif Khan, Ali Ezzat Shahroor et al.
Large language models (LLMs) have shown remarkable abilities in different fields, including standard Natural Language Processing (NLP) tasks. To elicit knowledge from LLMs, prompts play a key role, consisting of natural language instructions. Most open and closed source LLMs are trained on available labeled and unlabeled resources--digital content such as text, images, audio, and videos. Hence, these models have better knowledge for high-resourced languages but struggle with low-resourced languages. Since prompts play a crucial role in understanding their capabilities, the language used for prompts remains an important research question. Although there has been significant research in this area, it is still limited, and less has been explored for medium to low-resourced languages. In this study, we investigate different prompting strategies (native vs. non-native) on 11 different NLP tasks associated with 12 different Arabic datasets (9.7K data points). In total, we conducted 197 experiments involving 3 LLMs, 12 datasets, and 3 prompting strategies. Our findings suggest that, on average, the non-native prompt performs the best, followed by mixed and native prompts.
Erik Derner, Sara Sansalvador de la Fuente, Yoan Gutiérrez et al.
Large language models (LLMs) often inherit and amplify social biases embedded in their training data. A prominent social bias is gender bias. In this regard, prior work has mainly focused on gender stereotyping bias - the association of specific roles or traits with a particular gender - in English and on evaluating gender bias in model embeddings or generated outputs. In contrast, gender representation bias - the unequal frequency of references to individuals of different genders - in the training corpora has received less attention. Yet such imbalances in the training data constitute an upstream source of bias that can propagate and intensify throughout the entire model lifecycle. To fill this gap, we propose a novel LLM-based method to detect and quantify gender representation bias in LLM training data in gendered languages, where grammatical gender challenges the applicability of methods developed for English. By leveraging the LLMs' contextual understanding, our approach automatically identifies and classifies person-referencing words in gendered language corpora. Applied to four Spanish-English benchmarks and five Valencian corpora, our method reveals substantial male-dominant imbalances. We show that such biases in training data affect model outputs, but can surprisingly be mitigated leveraging small-scale training on datasets that are biased towards the opposite gender. Our findings highlight the need for corpus-level gender bias analysis in multilingual NLP. We make our code and data publicly available.
Jakub Piskorski, Michał Marcińczuk, Roman Yangarber
This paper presents a corpus manually annotated with named entities for six Slavic languages - Bulgarian, Czech, Polish, Slovenian, Russian, and Ukrainian. This work is the result of a series of shared tasks, conducted in 2017-2023 as a part of the Workshops on Slavic Natural Language Processing. The corpus consists of 5 017 documents on seven topics. The documents are annotated with five classes of named entities. Each entity is described by a category, a lemma, and a unique cross-lingual identifier. We provide two train-tune dataset splits - single topic out and cross topics. For each split, we set benchmarks using a transformer-based neural network architecture with the pre-trained multilingual models - XLM-RoBERTa-large for named entity mention recognition and categorization, and mT5-large for named entity lemmatization and linking.
Nilo Pedrazzini
Languages can encode temporal subordination lexically, via subordinating conjunctions, and morphologically, by marking the relation on the predicate. Systematic cross-linguistic variation among the former can be studied using well-established token-based typological approaches to token-aligned parallel corpora. Variation among different morphological means is instead much harder to tackle and therefore more poorly understood, despite being predominant in several language groups. This paper explores variation in the expression of generic temporal subordination ('when'-clauses) among the languages of Latin America and the Caribbean, where morphological marking is particularly common. It presents probabilistic semantic maps computed on the basis of the languages of the region, thus avoiding bias towards the many world's languages that exclusively use lexified connectors, incorporating associations between character $n$-grams and English $when$. The approach allows capturing morphological clause-linkage devices in addition to lexified connectors, paving the way for larger-scale, strategy-agnostic analyses of typological variation in temporal subordination.
Harzat Abbas, Asif Abbas, Asim Iqbal et al.
This research paper investigates the exploration of trauma within Abdulrazak Gurnah's novel "Afterlives" (2020), examining the profound impact of historical and personal traumas on the characters, particularly the protagonist Hamza. The research adopts a qualitative research paradigm and incorporates primary and secondary sources to analyze the text comprehensively using trauma analysis theory. Literature, as a dominant medium, reflects human experience, with trauma emerging as a pervasive merged theme of stories of suffering and self-discovery. The examination explores treating trauma as a mere narrative device, revealing it as a tangible representation shaping characters' lives. Memories and nightmares in the novel are depicted as echoes of a haunting past, challenging Hamza's sense of self and resilience. The study concludes that "Afterlives" stands out as an exceptional portrayal of trauma in literature, emphasizing the long-lasting impact on the human psyche. Suggestions include a comparative analysis with similar works, an exploration of postcolonial perspectives in Gurnah's literature, and an examination of healing mechanisms portrayed in the aftermath of trauma. Ultimately, the research adds to a broader comprehension of literary trauma, emphasizing its relevance in shaping human experiences and promoting empathy, kindness, and solidarity in adversity.
Dieli Vesaro Palma, Thiago Zilio Passerini
O presente artigo tem como objetivo detalhar as principais contribuições de Leonor Lopes Fávero aos estudos linguísticos empreendidos no Brasil, mais especificamente os relacionados à linguística textual. Para tanto, estabeleceu-se o recorte temporal de 1980 a 1986, que compreende, aproximadamente, ao primeiro momento da linguística textual brasileira, delimitado por Koch (1999). Com relação à perspectiva de análise adotada, partiu-se dos pressupostos da historiografia linguística postulados sobretudo por Koerner (2014) e Swiggers (2012). O corpus selecionado contou com textos que circularam no intervalo estabelecido, entre eles, artigos, anais de congressos, livros e capítulos de livros. Como material epi-historiográfico, utilizaram-se principalmente, as contribuições de Bentes (2001), Fávero (2017, 2019, 2021), Galembeck (2015) e Koch (1997, 1999, 2003). Os resultados da análise mostraram a relevância da autora para o período em questão, no que se refere tanto à introdução quanto ao desenvolvimento dos estudos linguísticos textuais no Brasil.
Halaman 16 dari 168012