Hasil "Language and Literature"

S2 Open Access 2005

"Theory of mind" in schizophrenia: a review of the literature.

M. Brüne

1061 sitasi en Medicine, Psychology

Detail DOI Sumber

S2 Open Access 2003

Attitudes, Motivation, and Second Language Learning: A Meta‐Analysis of Studies Conducted by Gardner and Associates

Anne-Marie Masgoret, R. Gardner

1197 sitasi en Psychology

Detail DOI Sumber

S2 Open Access 2002

The Cambridge Grammar of the English Language

R. Huddleston, G. Pullum

1217 sitasi en History, Sociology

Detail DOI Sumber

S2 Open Access 2011

Dual language exposure and early bilingual development*

E. Hoff, Cynthia Core, S. Place et al.

829 sitasi en Psychology, Medicine

Detail DOI Sumber

DOAJ Open Access 2025

Evaluating Artificial Intelligence’s Role in Developing Research Questions in Head and Neck Reconstruction

Sebastian Holm, MD, Mario Zambrana, MD, Juan E. Berner, MD, PhD et al.

Summary:. Generative artificial intelligence (AI) large language models are an emerging technology, with ChatGPT and Gemini being 2 well-known examples. The current literature discusses clinical applications and limitations of AI, but its role in research has not yet been extensively evaluated. This study aimed to assess the role of ChatGPT and Gemini in developing novel and clinically relevant research ideas (RIs) for systematic reviews (SRs) in head and neck reconstruction. ChatGPT and Gemini were prompted to provide 10 novel and clinically relevant RIs for SRs in the following domains: head and neck reconstruction in general, microsurgery, and complications in reconstructive head and neck procedures. A comprehensive search was then performed for SRs in MEDLINE, Cochrane Library, and Embase to determine the novelty of the RIs generated. A total of 60 RIs were generated, with half created by ChatGPT and the other half by Gemini. Overall, 3613 entries were found through the literature search. After deduplication and screening, a total of 50 studies that partially addressed the AI-generated RIs were identified and were included in the present review. Out of the 60 AI-generated RIs, 42 had not been previously studied and were therefore considered novel. No statistically significant differences were found between the outputs generated by Gemini and ChatGPT. Both ChatGPT and Gemini were able to effectively generate novel and clinically relevant RIs for SRs, although their suggestions were generally broad. This study demonstrated that AI could potentially aid in the process of conducting novel SRs.

Surgery

Detail DOI Sumber

arXiv Open Access 2025

Object Detection with Multimodal Large Vision-Language Models: An In-depth Review

Ranjan Sapkota, Manoj Karkee

The fusion of language and vision in large vision-language models (LVLMs) has revolutionized deep learning-based object detection by enhancing adaptability, contextual reasoning, and generalization beyond traditional architectures. This in-depth review presents a structured exploration of the state-of-the-art in LVLMs, systematically organized through a three-step research review process. First, we discuss the functioning of vision language models (VLMs) for object detection, describing how these models harness natural language processing (NLP) and computer vision (CV) techniques to revolutionize object detection and localization. We then explain the architectural innovations, training paradigms, and output flexibility of recent LVLMs for object detection, highlighting how they achieve advanced contextual understanding for object detection. The review thoroughly examines the approaches used in integration of visual and textual information, demonstrating the progress made in object detection using VLMs that facilitate more sophisticated object detection and localization strategies. This review presents comprehensive visualizations demonstrating LVLMs' effectiveness in diverse scenarios including localization and segmentation, and then compares their real-time performance, adaptability, and complexity to traditional deep learning systems. Based on the review, its is expected that LVLMs will soon meet or surpass the performance of conventional methods in object detection. The review also identifies a few major limitations of the current LVLM modes, proposes solutions to address those challenges, and presents a clear roadmap for the future advancement in this field. We conclude, based on this study, that the recent advancement in LVLMs have made and will continue to make a transformative impact on object detection and robotic applications in the future.

en cs.CV, cs.AI

Detail DOI Sumber

arXiv Open Access 2025

Enhancing Small Language Models for Cross-Lingual Generalized Zero-Shot Classification with Soft Prompt Tuning

Fred Philippy, Siwen Guo, Cedric Lothritz et al.

In NLP, Zero-Shot Classification (ZSC) has become essential for enabling models to classify text into categories unseen during training, particularly in low-resource languages and domains where labeled data is scarce. While pretrained language models (PLMs) have shown promise in ZSC, they often rely on large training datasets or external knowledge, limiting their applicability in multilingual and low-resource scenarios. Recent approaches leveraging natural language prompts reduce the dependence on large training datasets but struggle to effectively incorporate available labeled data from related classification tasks, especially when these datasets originate from different languages or distributions. Moreover, existing prompt-based methods typically rely on manually crafted prompts in a specific language, limiting their adaptability and effectiveness in cross-lingual settings. To address these challenges, we introduce RoSPrompt, a lightweight and data-efficient approach for training soft prompts that enhance cross-lingual ZSC while ensuring robust generalization across data distribution shifts. RoSPrompt is designed for small multilingual PLMs, enabling them to leverage high-resource languages to improve performance in low-resource settings without requiring extensive fine-tuning or high computational costs. We evaluate our approach on multiple multilingual PLMs across datasets covering 106 languages, demonstrating strong cross-lingual transfer performance and robust generalization capabilities over unseen classes.

en cs.CL, cs.AI

Detail Sumber

CrossRef Open Access 2024

Using Literature in the Language Classroom

Carol Griffiths

en

Detail DOI Sumber

DOAJ Open Access 2024

Tout ce qui n’est pas dit par Madeleine Bourdouxhe. L’affirmation féminine de Sept Nouvelles à travers le silence

Jorge Alcón Borrega

Après des années dans un arrière-plan, l'œuvre de Madeleine Bourdouxhe (1906-1996) a été redécouverte et la critique a trouvé dans ses récits un message d'affirmation de l'identité féminine, précoce pour son époque. Sept nouvelles rassemble certaines des nouvelles écrits par l'auteure tout au long de sa vie, qui mettent en scène différentes femmes, contemporaines à l'écrivaine, qui souffrent du manque d'identité auquel elles sont soumises par la société et qui les contraint à être conditionnées par leurs maris. À travers ces histoires, l'auteur exprime le besoin des femmes de la moitié du XXe siècle d'atteindre leur propre identité. Toutefois, ce message n'est pas présenté explicitement dans l'œuvre, mais il est développé à travers différentes ressources littéraires et linguistiques qui parviendront à transmettre au lecteur le vide ressenti par ses personnages. Les brèves interventions des femmes, leurs caractéristiques et, surtout, leurs silences constitueront le point central de notre analyse.

Philology. Linguistics, French literature - Italian literature - Spanish literature - Portuguese literature

Detail DOI Sumber

DOAJ Open Access 2024

Group-format, peer-facilitated mental health promotion interventions for students in higher education settings: a scoping review protocol

Carrie Brooke-Sumner, Yandisa Sikweyiya, Mercilene T Machisa et al.

Introduction Young people in higher education face various stressors that can make them vulnerable to mental ill-health. Mental health promotion in this group therefore has important potential benefits. Peer-facilitated and group-format interventions may be feasible and sustainable. The scoping review outlined in this protocol aims to map the literature on group-format, peer-facilitated, in-person interventions for mental health promotion for higher education students attending courses on campuses in high and low/middle-income countries.Methods and analysis Relevant studies will be identified through conducting searches of electronic databases, including Medline, CINAHL, Scopus, ERIC and PsycINFO. Searches will be conducted using Boolean operators (AND, OR, NOT) and truncation functions appropriate for each database. We will include a grey literature search. We will include articles from student participants of any gender, and published in peer-reviewed journals between 2008 and 2023. We will include English-language studies and all study types including randomised controlled trials, pilot studies and descriptive studies of intervention development. A draft charting table has been developed, which includes the fields: author, publication date, country/countries, aims, population and sample size, demographics, methods, intervention type, comparisons, peer training, number of sessions/duration of intervention, outcomes and details of measures.Ethics and dissemination No primary data will be collected from research participants to produce this review so ethics committee approval is not required. All data will be collated from published peer-reviewed studies already in the public domain. We will publish the review in an open-access, peer-reviewed journal accessible to researchers in low/middle-income countries. This protocol is registered on Open Science Framework (https://osf.io/agbfj/).

Medicine

Detail DOI Sumber

arXiv Open Access 2024

Scaling up Multimodal Pre-training for Sign Language Understanding

Wengang Zhou, Weichao Zhao, Hezhen Hu et al.

Sign language serves as the primary meaning of communication for the deaf-mute community. Different from spoken language, it commonly conveys information by the collaboration of manual features, i.e., hand gestures and body movements, and non-manual features, i.e., facial expressions and mouth cues. To facilitate communication between the deaf-mute and hearing people, a series of sign language understanding (SLU) tasks have been studied in recent years, including isolated/continuous sign language recognition (ISLR/CSLR), gloss-free sign language translation (GF-SLT) and sign language retrieval (SL-RT). Sign language recognition and translation aims to understand the semantic meaning conveyed by sign languages from gloss-level and sentence-level, respectively. In contrast, SL-RT focuses on retrieving sign videos or corresponding texts from a closed-set under the query-by-example search paradigm. These tasks investigate sign language topics from diverse perspectives and raise challenges in learning effective representation of sign language videos. To advance the development of sign language understanding, exploring a generalized model that is applicable across various SLU tasks is a profound research direction.

en cs.CV, cs.MM

Detail Sumber

arXiv Open Access 2024

Emergent World Models and Latent Variable Estimation in Chess-Playing Language Models

Adam Karvonen

Language models have shown unprecedented capabilities, sparking debate over the source of their performance. Is it merely the outcome of learning syntactic patterns and surface level statistics, or do they extract semantics and a world model from the text? Prior work by Li et al. investigated this by training a GPT model on synthetic, randomly generated Othello games and found that the model learned an internal representation of the board state. We extend this work into the more complex domain of chess, training on real games and investigating our model's internal representations using linear probes and contrastive activations. The model is given no a priori knowledge of the game and is solely trained on next character prediction, yet we find evidence of internal representations of board state. We validate these internal representations by using them to make interventions on the model's activations and edit its internal board state. Unlike Li et al's prior synthetic dataset approach, our analysis finds that the model also learns to estimate latent variables like player skill to better predict the next character. We derive a player skill vector and add it to the model, improving the model's win rate by up to 2.6 times.

en cs.LG, cs.CL

Detail Sumber

arXiv Open Access 2024

Self-Cognition in Large Language Models: An Exploratory Study

Dongping Chen, Jiawen Shi, Yao Wan et al.

While Large Language Models (LLMs) have achieved remarkable success across various applications, they also raise concerns regarding self-cognition. In this paper, we perform a pioneering study to explore self-cognition in LLMs. Specifically, we first construct a pool of self-cognition instruction prompts to evaluate where an LLM exhibits self-cognition and four well-designed principles to quantify LLMs' self-cognition. Our study reveals that 4 of the 48 models on Chatbot Arena--specifically Command R, Claude3-Opus, Llama-3-70b-Instruct, and Reka-core--demonstrate some level of detectable self-cognition. We observe a positive correlation between model size, training data quality, and self-cognition level. Additionally, we also explore the utility and trustworthiness of LLM in the self-cognition state, revealing that the self-cognition state enhances some specific tasks such as creative writing and exaggeration. We believe that our work can serve as an inspiration for further research to study the self-cognition in LLMs.

en cs.CL, cs.AI

Detail Sumber

arXiv Open Access 2024

Scaling Behavior of Machine Translation with Large Language Models under Prompt Injection Attacks

Zhifan Sun, Antonio Valerio Miceli-Barone

Large Language Models (LLMs) are increasingly becoming the preferred foundation platforms for many Natural Language Processing tasks such as Machine Translation, owing to their quality often comparable to or better than task-specific models, and the simplicity of specifying the task through natural language instructions or in-context examples. Their generality, however, opens them up to subversion by end users who may embed into their requests instructions that cause the model to behave in unauthorized and possibly unsafe ways. In this work we study these Prompt Injection Attacks (PIAs) on multiple families of LLMs on a Machine Translation task, focusing on the effects of model size on the attack success rates. We introduce a new benchmark data set and we discover that on multiple language pairs and injected prompts written in English, larger models under certain conditions may become more susceptible to successful attacks, an instance of the Inverse Scaling phenomenon (McKenzie et al., 2023). To our knowledge, this is the first work to study non-trivial LLM scaling behaviour in a multi-lingual setting.

en cs.CL

Detail Sumber

arXiv Open Access 2024

Native vs Non-Native Language Prompting: A Comparative Analysis

Mohamed Bayan Kmainasi, Rakif Khan, Ali Ezzat Shahroor et al.

Large language models (LLMs) have shown remarkable abilities in different fields, including standard Natural Language Processing (NLP) tasks. To elicit knowledge from LLMs, prompts play a key role, consisting of natural language instructions. Most open and closed source LLMs are trained on available labeled and unlabeled resources--digital content such as text, images, audio, and videos. Hence, these models have better knowledge for high-resourced languages but struggle with low-resourced languages. Since prompts play a crucial role in understanding their capabilities, the language used for prompts remains an important research question. Although there has been significant research in this area, it is still limited, and less has been explored for medium to low-resourced languages. In this study, we investigate different prompting strategies (native vs. non-native) on 11 different NLP tasks associated with 12 different Arabic datasets (9.7K data points). In total, we conducted 197 experiments involving 3 LLMs, 12 datasets, and 3 prompting strategies. Our findings suggest that, on average, the non-native prompt performs the best, followed by mixed and native prompts.

en cs.CL, cs.AI

Detail Sumber

arXiv Open Access 2024

Leveraging Large Language Models to Measure Gender Representation Bias in Gendered Language Corpora

Erik Derner, Sara Sansalvador de la Fuente, Yoan Gutiérrez et al.

Large language models (LLMs) often inherit and amplify social biases embedded in their training data. A prominent social bias is gender bias. In this regard, prior work has mainly focused on gender stereotyping bias - the association of specific roles or traits with a particular gender - in English and on evaluating gender bias in model embeddings or generated outputs. In contrast, gender representation bias - the unequal frequency of references to individuals of different genders - in the training corpora has received less attention. Yet such imbalances in the training data constitute an upstream source of bias that can propagate and intensify throughout the entire model lifecycle. To fill this gap, we propose a novel LLM-based method to detect and quantify gender representation bias in LLM training data in gendered languages, where grammatical gender challenges the applicability of methods developed for English. By leveraging the LLMs' contextual understanding, our approach automatically identifies and classifies person-referencing words in gendered language corpora. Applied to four Spanish-English benchmarks and five Valencian corpora, our method reveals substantial male-dominant imbalances. We show that such biases in training data affect model outputs, but can surprisingly be mitigated leveraging small-scale training on datasets that are biased towards the opposite gender. Our findings highlight the need for corpus-level gender bias analysis in multilingual NLP. We make our code and data publicly available.

en cs.CL, cs.CY

Detail Sumber

arXiv Open Access 2024

Cross-lingual Named Entity Corpus for Slavic Languages

Jakub Piskorski, Michał Marcińczuk, Roman Yangarber

This paper presents a corpus manually annotated with named entities for six Slavic languages - Bulgarian, Czech, Polish, Slovenian, Russian, and Ukrainian. This work is the result of a series of shared tasks, conducted in 2017-2023 as a part of the Workshops on Slavic Natural Language Processing. The corpus consists of 5 017 documents on seven topics. The documents are annotated with five classes of named entities. Each entity is described by a category, a lemma, and a unique cross-lingual identifier. We provide two train-tune dataset splits - single topic out and cross topics. For each split, we set benchmarks using a transformer-based neural network architecture with the pre-trained multilingual models - XLM-RoBERTa-large for named entity mention recognition and categorization, and mT5-large for named entity lemmatization and linking.

en cs.CL, cs.AI

Detail Sumber

arXiv Open Access 2024

Mapping 'when'-clauses in Latin American and Caribbean languages: an experiment in subtoken-based typology

Nilo Pedrazzini

Languages can encode temporal subordination lexically, via subordinating conjunctions, and morphologically, by marking the relation on the predicate. Systematic cross-linguistic variation among the former can be studied using well-established token-based typological approaches to token-aligned parallel corpora. Variation among different morphological means is instead much harder to tackle and therefore more poorly understood, despite being predominant in several language groups. This paper explores variation in the expression of generic temporal subordination ('when'-clauses) among the languages of Latin America and the Caribbean, where morphological marking is particularly common. It presents probabilistic semantic maps computed on the basis of the languages of the region, thus avoiding bias towards the many world's languages that exclusively use lexified connectors, incorporating associations between character $n$-grams and English $when$. The approach allows capturing morphological clause-linkage devices in addition to lexified connectors, paving the way for larger-scale, strategy-agnostic analyses of typological variation in temporal subordination.

en cs.CL, cs.IR

Detail DOI Sumber

DOAJ Open Access 2023

Echoes of Haunted Memories and Nightmares: Understanding Trauma in Gurnah's Afterlives

Harzat Abbas, Asif Abbas, Asim Iqbal et al.

This research paper investigates the exploration of trauma within Abdulrazak Gurnah's novel "Afterlives" (2020), examining the profound impact of historical and personal traumas on the characters, particularly the protagonist Hamza. The research adopts a qualitative research paradigm and incorporates primary and secondary sources to analyze the text comprehensively using trauma analysis theory. Literature, as a dominant medium, reflects human experience, with trauma emerging as a pervasive merged theme of stories of suffering and self-discovery. The examination explores treating trauma as a mere narrative device, revealing it as a tangible representation shaping characters' lives. Memories and nightmares in the novel are depicted as echoes of a haunting past, challenging Hamza's sense of self and resilience. The study concludes that "Afterlives" stands out as an exceptional portrayal of trauma in literature, emphasizing the long-lasting impact on the human psyche. Suggestions include a comparative analysis with similar works, an exploration of postcolonial perspectives in Gurnah's literature, and an examination of healing mechanisms portrayed in the aftermath of trauma. Ultimately, the research adds to a broader comprehension of literary trauma, emphasizing its relevance in shaping human experiences and promoting empathy, kindness, and solidarity in adversity.

English literature, Language. Linguistic theory. Comparative grammar

Detail Sumber

DOAJ Open Access 2023

Tributo a Leonor Lopes Fávero: o primeiro momento da linguística textual no Brasil

Dieli Vesaro Palma, Thiago Zilio Passerini

O presente artigo tem como objetivo detalhar as principais contribuições de Leonor Lopes Fávero aos estudos linguísticos empreendidos no Brasil, mais especificamente os relacionados à linguística textual. Para tanto, estabeleceu-se o recorte temporal de 1980 a 1986, que compreende, aproximadamente, ao primeiro momento da linguística textual brasileira, delimitado por Koch (1999). Com relação à perspectiva de análise adotada, partiu-se dos pressupostos da historiografia linguística postulados sobretudo por Koerner (2014) e Swiggers (2012). O corpus selecionado contou com textos que circularam no intervalo estabelecido, entre eles, artigos, anais de congressos, livros e capítulos de livros. Como material epi-historiográfico, utilizaram-se principalmente, as contribuições de Bentes (2001), Fávero (2017, 2019, 2021), Galembeck (2015) e Koch (1997, 1999, 2003). Os resultados da análise mostraram a relevância da autora para o período em questão, no que se refere tanto à introdução quanto ao desenvolvimento dos estudos linguísticos textuais no Brasil.

Language and Literature, English language

Detail DOI Sumber

Hasil untuk "Language and Literature"