EMBODIED MEANING CONSTRUCTION IN EFL LITERARY READING: A READER-RESPONSE STUDY OF THE OLD MAN AND THE SEA
Nur Mutmainna Halim, Abd Halim
This study investigates how meaning is constructed through embodied cognitive processes when EFL learners engage with The Old Man and the Sea. Grounded in Barsalou’s Embodied Cognition Theory (1999, 2008), which conceptualizes language comprehension as the reactivation of perceptual, motor, bodily, and affective systems rather than abstract symbol manipulation, the study examines reader responses to a literary text characterized by narrative restraint and minimal explicit emotional description. The participants were undergraduate students from the English Literature Study Program at Universitas Negeri Makassar enrolled in the History of English Language and Literature course (2024 cohort). Of the 142 students across five intact classes (A–E), 57 (40.1%) selected The Old Man and the Sea as their preferred final-test novel and constituted the focal participant group. Data were collected through an open-ended reflective questionnaire eliciting emotional reactions, imagined experiences, reflective pauses, and lingering thoughts after reading. The data were analyzed using thematic analysis with theory-driven coding, guided by embodied cognition categories including sensorimotor imagery, bodily state, action simulation, and affective response. The findings reveal that students consistently relied on embodied simulation to construct meaning, reporting strong experiences of empathy, loneliness, sadness, and admiration derived from imagining Santiago’s physical struggle, pain, fatigue, and isolation. Meaning emerged through experiential inference, as understanding developed from felt bodily and affective engagement rather than explicit textual cues. The study demonstrates the pedagogical potential of literary reading in EFL contexts to foster affective engagement, empathy development, and reader-centred meaning construction, while extending embodied cognition research to authentic classroom-based literary experiences.
Language. Linguistic theory. Comparative grammar
To Write or to Automate Linguistic Prompts, That Is the Question
Marina Sánchez-Torrón, Daria Akselrod, Jason Rauchwerk
LLM performance is highly sensitive to prompt design, yet whether automatic prompt optimization can replace expert prompt engineering in linguistic tasks remains unexplored. We present the first systematic comparison of hand-crafted zero-shot expert prompts, base DSPy signatures, and GEPA-optimized DSPy signatures across translation, terminology insertion, and language quality assessment, evaluating five model configurations. Results are task-dependent. In terminology insertion, optimized and manual prompts produce mostly statistically indistinguishable quality. In translation, each approach wins on different models. In LQA, expert prompts achieve stronger error detection while optimization improves error characterization. Across all tasks, GEPA elevates minimal DSPy signatures, and most expert-versus-optimized comparisons show no statistically significant difference. We note that the comparison is asymmetric: GEPA optimization searches programmatically over gold-standard splits, whereas expert prompts require, in principle, no labeled data, relying instead on domain expertise and iterative refinement.
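As a rough illustration of the three compared conditions, the sketch below contrasts a hand-written expert prompt with a minimal DSPy signature and its GEPA optimization. It assumes a recent DSPy release that ships the GEPA optimizer; the model name, metric, and optimizer arguments are placeholders and may vary across DSPy versions.

```python
# Sketch of the three compared conditions; names and arguments are
# placeholders, not the paper's exact setup.
import dspy

lm = dspy.LM("openai/gpt-4o-mini")  # placeholder model
dspy.configure(lm=lm)

# (a) Expert route: a zero-shot prompt written and refined by hand.
EXPERT_PROMPT = (
    "You are a professional translator. Translate the source text into "
    "Spanish, preserving register and terminology. Output only the translation."
)

# (b) Base DSPy signature: a minimal task declaration with no prompt craft.
class Translate(dspy.Signature):
    """Translate English text into Spanish."""
    source = dspy.InputField()
    translation = dspy.OutputField()

base_program = dspy.Predict(Translate)

# (c) GEPA route: programmatic prompt search against a labeled split.
def quality_metric(gold, pred, trace=None, pred_name=None, pred_trace=None):
    # Placeholder exact-match metric; real runs use task-specific scores.
    return float(pred.translation.strip() == gold.translation.strip())

optimizer = dspy.GEPA(metric=quality_metric, auto="light", reflection_lm=lm)
# trainset: list of dspy.Example(source=..., translation=...).with_inputs("source")
# optimized_program = optimizer.compile(base_program, trainset=trainset)
```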
Beyond the Battlefield: A Cross-European Study of Wartime Disinformation
Rocío Sánchez-del-Vas, Jorge Tuñón-Navarro
Russia’s invasion of Ukraine has profoundly altered the global geopolitical landscape. Owing to its geographical proximity, the conflict has had a considerable impact on Europe. The war, marked by the professionalisation and democratisation of technology, has underscored the growing significance of hybrid warfare, in which disinformation and propaganda serve as additional instruments of war. Within this context, the aim of this article is to examine the characteristics of false information related to the war between Russia and Ukraine in four European countries between 2022 and 2023. To this end, a content analysis of 297 hoaxes was conducted across eight fact-checking platforms, complemented by ten in-depth interviews with specialised professionals. The findings indicate that disinformation is characterised by viral audiovisual hoaxes, particularly on Facebook and X (formerly Twitter), with a notable surge in disinformation flows at the onset of the invasion. In the early months, misleading content predominantly consisted of decontextualised images of the conflict, whereas a year later, the focus shifted to narratives concerning international support and alliances. The primary objective of this disinformation is to polarise public opinion against a perceived common enemy. The conclusions provide a broader and more nuanced understanding of wartime disinformation within the European context.
Journalism. The periodical press, etc., Communication. Mass media
Crosscoding Through Time: Tracking Emergence & Consolidation Of Linguistic Representations Throughout LLM Pretraining
Deniz Bayazit, Aaron Mueller, Antoine Bosselut
Large language models (LLMs) learn non-trivial abstractions during pretraining, like detecting irregular plural noun subjects. However, it is not well understood when and how specific linguistic abilities emerge, as traditional evaluation methods such as benchmarking fail to reveal how models acquire concepts and capabilities. To bridge this gap and better understand model training at the concept level, we use sparse crosscoders to discover and align features across model checkpoints. Using this approach, we track the evolution of linguistic features during pretraining. We train crosscoders between open-source checkpoint triplets with significant performance and representation shifts, and introduce a novel metric, Relative Indirect Effects (RelIE), to trace training stages at which individual features become causally important for task performance. We show that crosscoders can detect feature emergence, maintenance, and discontinuation during pretraining. Our approach is architecture-agnostic and scalable, offering a promising path toward more interpretable and fine-grained analysis of representation learning throughout pretraining.
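The sketch below is a toy reconstruction of the general sparse-crosscoder recipe (a shared sparse latent with per-checkpoint encoders and decoders, trained with an L1 penalty); the paper's exact architecture and the RelIE metric are not reproduced here.

```python
# Toy sparse crosscoder across N model checkpoints; illustrative only.
import torch
import torch.nn as nn

class Crosscoder(nn.Module):
    def __init__(self, d_model: int, d_latent: int, n_checkpoints: int):
        super().__init__()
        self.encoders = nn.ModuleList(
            [nn.Linear(d_model, d_latent) for _ in range(n_checkpoints)])
        self.decoders = nn.ModuleList(
            [nn.Linear(d_latent, d_model) for _ in range(n_checkpoints)])

    def forward(self, acts):  # acts: list of (batch, d_model) tensors
        # Sum per-checkpoint encodings into one shared feature vector,
        # then apply ReLU to obtain sparse, nonnegative features.
        z = torch.relu(sum(enc(a) for enc, a in zip(self.encoders, acts)))
        recons = [dec(z) for dec in self.decoders]
        return z, recons

def crosscoder_loss(acts, recons, z, l1_coef=1e-3):
    # Reconstruct every checkpoint's activations from the shared latent,
    # with an L1 penalty encouraging sparse, alignable features.
    mse = sum(((r - a) ** 2).mean() for r, a in zip(recons, acts))
    return mse + l1_coef * z.abs().mean()
```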
Construction and educational application of a linguistically grounded dependency treebank for Uyghur
Jiaxin Zuo, Yiquan Wang, Yuan Pan
et al.
Developing effective educational technologies for low-resource agglutinative languages like Uyghur is often hindered by the mismatch between existing annotation frameworks and specific grammatical structures. To address this challenge, this study introduces the Modern Uyghur Dependency Treebank (MUDT), a linguistically grounded annotation framework specifically designed to capture the agglutinative complexity of Uyghur, including zero copula constructions and fine-grained case marking. Utilizing a hybrid pipeline that combines Large Language Model pre-annotation with rigorous human correction, a high-quality treebank consisting of 3,456 sentences was constructed. Intrinsic structural evaluation reveals that MUDT significantly improves dependency projectivity by reducing the crossing-arc rate from 7.35% in the Universal Dependencies standard to 0.06%. Extrinsic parsing experiments using UDPipe and Stanza further demonstrate that models trained on MUDT achieve superior in-domain accuracy and cross-domain generalization compared to UD-based baselines. To validate the practical utility of this computational resource, an AI-assisted grammar tutoring system was developed to translate MUDT-based syntactic analyses into interpretable pedagogical feedback. A controlled experiment involving 35 second-language learners indicated that students receiving syntax-aware feedback achieved significantly higher learning gains compared to those in a control group. These findings establish MUDT as a robust foundation for syntactic analysis and underscore the critical role of linguistically informed natural language processing resources in bridging the gap between computational models and the cognitive needs of second-language learners.
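The reported crossing-arc rates can be computed from any CoNLL-U file with a standard projectivity check; the sketch below uses the `conllu` package, with the file path as a placeholder.

```python
# Crossing-arc (non-projectivity) rate over a CoNLL-U treebank.
from itertools import combinations
from conllu import parse_incr

def crossing(a, b):
    # Two dependency arcs cross iff their endpoints strictly interleave.
    (a1, a2), (b1, b2) = sorted(a), sorted(b)
    return a1 < b1 < a2 < b2 or b1 < a1 < b2 < a2

def crossing_arc_rate(path: str) -> float:
    total, crossed = 0, 0
    with open(path, encoding="utf-8") as f:
        for sent in parse_incr(f):
            # Skip multiword-token ranges and the root attachment.
            arcs = [(tok["head"], tok["id"]) for tok in sent
                    if isinstance(tok["id"], int) and tok["head"] not in (0, None)]
            total += len(arcs)
            bad = set()
            for x, y in combinations(range(len(arcs)), 2):
                if crossing(arcs[x], arcs[y]):
                    bad.update((x, y))
            crossed += len(bad)
    return crossed / total if total else 0.0

# print(crossing_arc_rate("mudt-test.conllu"))  # placeholder path
```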
LinguaSynth: Heterogeneous Linguistic Signals for News Classification
Duo Zhang, Junyi Mo
Deep learning has significantly advanced NLP, but its reliance on large black-box models introduces critical interpretability and computational efficiency concerns. This paper proposes LinguaSynth, a novel text classification framework that strategically integrates five complementary linguistic feature types (lexical, syntactic, entity-level, word-level semantic, and document-level semantic) within a transparent logistic regression model. Unlike transformer-based architectures, LinguaSynth maintains interpretability and computational efficiency, achieving 84.89% accuracy on the 20 Newsgroups dataset and surpassing a robust TF-IDF baseline by 3.32%. Through rigorous feature interaction analysis, we show that syntactic and entity-level signals provide essential disambiguation and effectively complement distributional semantics. LinguaSynth sets a new benchmark for interpretable, resource-efficient NLP models and challenges the prevailing assumption that deep neural networks are necessary for high-performing text classification.
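A simplified sketch of the overall design follows, with two TF-IDF blocks standing in for the paper's five feature types; it shows the shape of the approach (heterogeneous feature blocks concatenated into a transparent logistic regression), not the authors' exact pipeline.

```python
# Simplified LinguaSynth-style pipeline on 20 Newsgroups.
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import FeatureUnion, Pipeline

features = FeatureUnion([
    ("lexical", TfidfVectorizer(ngram_range=(1, 1), max_features=20000)),
    # Character n-grams as a rough stand-in for sub-lexical cues; the
    # paper uses genuine syntactic, entity, and semantic features.
    ("subword", TfidfVectorizer(analyzer="char_wb", ngram_range=(3, 5),
                                max_features=20000)),
])

clf = Pipeline([("features", features),
                ("logreg", LogisticRegression(max_iter=1000))])

train = fetch_20newsgroups(subset="train", remove=("headers", "footers", "quotes"))
test = fetch_20newsgroups(subset="test", remove=("headers", "footers", "quotes"))
clf.fit(train.data, train.target)
print("accuracy:", clf.score(test.data, test.target))
```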
Linguistic Generalizations are not Rules: Impacts on Evaluation of LMs
Leonie Weissweiler, Kyle Mahowald, Adele Goldberg
Linguistic evaluations of how well LMs generalize to produce or understand language often implicitly take for granted that natural languages are generated by symbolic rules. According to this perspective, grammaticality is determined by whether sentences obey such rules. Interpretation is compositionally generated by syntactic rules operating on meaningful words. Semantic parsing maps sentences into formal logic. Failures of LMs to obey strict rules are presumed to reveal that LMs do not produce or understand language like humans. Here we suggest that LMs' failures to obey symbolic rules may be a feature rather than a bug, because natural languages are not based on neatly separable, compositional rules. Rather, new utterances are produced and understood by a combination of flexible, interrelated, and context-dependent constructions. Considering gradient factors such as frequencies, context, and function will help us reimagine new benchmarks and analyses to probe whether and how LMs capture the rich, flexible generalizations that comprise natural languages.
A Methodology for Studying Linguistic and Cultural Change in China, 1900-1950
Spencer Dean Stewart
This paper presents a quantitative approach to studying linguistic and cultural change in China during the first half of the twentieth century, a period that remains understudied in computational humanities research. The dramatic changes in Chinese language and culture during this time call for greater reflection on the tools and methods used for text analysis. This preliminary study offers a framework for analyzing Chinese texts from the late nineteenth and twentieth centuries, demonstrating how established methods such as word counts and word embeddings can provide new historical insights into the complex negotiations between Western modernity and Chinese cultural discourse.
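As an illustration of one such established method, the sketch below trains separate word embeddings per period with gensim and compares a term's nearest neighbours across periods; the tiny corpora are placeholders for segmented period-specific Chinese texts.

```python
# Diachronic embedding comparison; corpora here are toy placeholders.
from gensim.models import Word2Vec

docs_1900s = [["科学", "格致", "西学"]] * 50  # placeholder early-period sentences
docs_1940s = [["科学", "技术", "建设"]] * 50  # placeholder later-period sentences

m_early = Word2Vec(docs_1900s, vector_size=50, window=5, min_count=1, seed=0)
m_late = Word2Vec(docs_1940s, vector_size=50, window=5, min_count=1, seed=0)

# Shifts in a term's nearest neighbours across the two models hint at
# semantic change, e.g. for 科学 ("science").
print(m_early.wv.most_similar("科学", topn=3))
print(m_late.wv.most_similar("科学", topn=3))
```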
ExpliCa: Evaluating Explicit Causal Reasoning in Large Language Models
Martina Miliani, Serena Auriemma, Alessandro Bondielli
et al.
Large Language Models (LLMs) are increasingly used in tasks requiring interpretive and inferential accuracy. In this paper, we introduce ExpliCa, a new dataset for evaluating LLMs in explicit causal reasoning. ExpliCa uniquely integrates both causal and temporal relations presented in different linguistic orders and explicitly expressed by linguistic connectives. The dataset is enriched with crowdsourced human acceptability ratings. We tested LLMs on ExpliCa through prompting and perplexity-based metrics. We assessed seven commercial and open-source LLMs, revealing that even top models struggle to reach 0.80 accuracy. Interestingly, models tend to confound temporal relations with causal ones, and their performance is also strongly influenced by the linguistic order of the events. Finally, perplexity-based scores and prompting performance are differently affected by model size.
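A minimal sketch of a perplexity-based preference test in this spirit follows; the model name and sentence pair are illustrative, not items from ExpliCa.

```python
# Perplexity-based preference between connective orderings.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "gpt2"  # placeholder open model
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name).eval()

def perplexity(text: str) -> float:
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean token negative log-likelihood
    return torch.exp(loss).item()

causal = "The road was wet because it had rained."    # illustrative pair
temporal = "The road was wet after it had rained."
print(min([causal, temporal], key=perplexity))  # lower-perplexity reading
```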
Linguistic Indirectness in Public Cheap-Talk Games
Liping Tang, Michiko Ogaku
We study linguistic indirectness when speakers attend to social ties. Social ties are modeled by a graph, and conferences are the sets of nodes that hear a message. Conference worth is a distance polynomial on the graph; allocations are given by the Myerson value of the conference-restricted worth, which yields the bargaining-power components for each participant. Aggregating these components gives an effective bias that, via a Partition-Threshold rule, pins down the number of equilibrium message partitions in a cheap-talk game. Results: (i) among trees, stars maximize worth, leading to weakly fewer equilibrium partitions; (ii) on stars, we derive closed-form effective biases, with a witness-hub marginal effect of adding leaves changing sign at $\delta^{\ast}=0.6$; (iii) for two stars joined by one link, two-star (hub-hub) vs big-star (hub-leaf) precision flips at 8/15 for the same number of nodes; private leaf-leaf conferences are most informative.
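For orientation, partition-threshold rules of this kind typically build on the uniform-quadratic benchmark of Crawford and Sobel (1982); the paper's aggregated effective bias presumably plays the role of $b$ below.

```latex
% Uniform-quadratic cheap-talk benchmark (Crawford & Sobel, 1982):
% with effective bias b > 0, an N-cell partition equilibrium exists iff
\[
  2N(N-1)\,b \;<\; 1 ,
\]
% so the maximal number of equilibrium cells is
\[
  N^{\ast}(b) \;=\; \left\lceil -\tfrac{1}{2} + \tfrac{1}{2}\sqrt{1 + \tfrac{2}{b}}\,\right\rceil ,
\]
% which is weakly decreasing in b: a smaller effective bias supports
% weakly more (finer) equilibrium partitions.
```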
The Ideological Polarization of Spanish Journalists in the Face of Institutional Corruption
Azahara Ortiz González, Rosa Berganza, Beatriz Herrero-Jiménez
Journalists are the ones who, through the media, cover and frame corruption scandals. They therefore play a significant role in shaping the information the public receives about this phenomenon. This article seeks to determine whether information professionals are influenced by their political ideology when assessing the level of corruption in different institutions (both political and regulatory). It also examines whether they believe that media coverage of this phenomenon is influenced by polarization. To this end, a representative survey of 391 Spanish journalists from different types of media was conducted between March and July 2023, asking primarily about their political ideology and their perception of corruption in various institutions. The results reveal that journalists tend to perceive levels of corruption differently depending on their political ideology, with a tendency to consider corruption greater in the parties and institutions they regard as opposed to their own ideological orientation. This perception arises not only when evaluating political parties (which, obviously, hold an explicit ideological position) but also with other institutions that are a priori neutral or not politically aligned. Moreover, most journalists agree that the political polarization existing in Spain encourages the media to seek out and prioritize the publication of scandals occurring within political parties of the opposing ideology.
Communication. Mass media, Journalism. The periodical press, etc.
Word- or root-derived? A semantic test for instrumental denominal verbs in Italian
Alice Suozzi, Anna Cardinaletti
Denominal verbs, in spite of their name, can be derived from either a noun or a root. In non-morphologically transparent languages, only semantic cues help distinguish the two classes, i.e., the entailment of existence of the corresponding noun (Kiparsky 1982, 1997). In this work, we present a novel semantic test, the first attempt to distinguish noun-derived from root-derived Instrumental Denominal Verbs (IDVs) on a purely semantic basis, overcoming the flaws observed in previous syntactic tests. By explicitly asking Italian native speakers to name the instruments that can be used to perform the action denoted by the verb, we measured the entailment of existence through the number of instrument nouns produced and the frequency of production of the corresponding instrument noun. Our test also contained parasynthetic verbs, whose behavior was influenced by the interaction between their derivation process and their meaning.
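The two elicitation measures can be read off a long-format response table; a toy pandas sketch follows, with hypothetical column names and one illustrative verb (martellare 'to hammer', target noun martello 'hammer').

```python
# Toy sketch of the two elicitation measures; column names and rows
# are hypothetical stand-ins for the real questionnaire data.
import pandas as pd

responses = pd.DataFrame({
    "verb": ["martellare", "martellare", "martellare"],
    "instrument": ["martello", "martello", "mazza"],
})
TARGETS = {"martellare": "martello"}  # verb -> corresponding instrument noun

measures = responses.groupby("verb").agg(
    n_distinct_instruments=("instrument", "nunique"),
    n_responses=("instrument", "size"),
)
# Frequency of production of the corresponding instrument noun:
measures["target_rate"] = [
    (responses.loc[responses.verb == v, "instrument"] == TARGETS[v]).mean()
    for v in measures.index
]
print(measures)
```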
Romanic languages, Philology. Linguistics
ViKL: A Mammography Interpretation Framework via Multimodal Aggregation of Visual-knowledge-linguistic Features
Xin Wei, Yaling Tao, Changde Du
et al.
Mammography is the primary imaging tool for breast cancer diagnosis. Despite significant strides in applying deep learning to interpret mammography images, efforts that focus predominantly on visual features often struggle with generalization across datasets. We hypothesize that integrating additional modalities from radiology practice, notably the linguistic features of reports and manifestation features embodying radiological insights, offers a more powerful, interpretable and generalizable representation. In this paper, we introduce MVKL, the first multimodal mammography dataset encompassing multi-view images, detailed manifestations and reports. Based on this dataset, we focus on the challenging task of unsupervised pretraining and propose ViKL, an innovative framework that synergizes Visual, Knowledge, and Linguistic features. This framework relies solely on pairing information, without the need for pathology labels, which are often challenging to acquire. ViKL employs a triple contrastive learning approach to merge linguistic and knowledge-based insights with visual data, enabling both inter-modality and intra-modality feature enhancement. Our research yields significant findings: 1) Integrating reports and manifestations with unsupervised visual pretraining, ViKL substantially enhances pathological classification and fosters multimodal interactions. 2) Manifestations can introduce a novel hard negative sample selection mechanism. 3) The multimodal features demonstrate transferability across different datasets. 4) The multimodal pretraining approach curbs miscalibration and crafts a high-quality representation space. The MVKL dataset and ViKL code are publicly available at https://github.com/wxwxwwxxx/ViKL to support a broad spectrum of future research.
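A minimal sketch of a symmetric pairwise contrastive objective over the three modality embeddings (visual, knowledge, linguistic) is shown below; the paper's exact triple contrastive formulation and encoders may differ.

```python
# Pairwise InfoNCE over three modality embeddings; illustrative only.
import torch
import torch.nn.functional as F

def info_nce(a, b, temperature=0.07):
    # Symmetric InfoNCE between two batches of matched embeddings.
    a, b = F.normalize(a, dim=-1), F.normalize(b, dim=-1)
    logits = a @ b.t() / temperature            # (batch, batch) similarities
    labels = torch.arange(a.size(0), device=a.device)
    return (F.cross_entropy(logits, labels) +
            F.cross_entropy(logits.t(), labels)) / 2

def triple_contrastive(v, k, l):
    # Pull matched image / manifestation / report triples together,
    # pairwise, across all three modality combinations.
    return info_nce(v, k) + info_nce(v, l) + info_nce(k, l)
```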
Decoding Linguistic Nuances in Mental Health Text Classification Using Expressive Narrative Stories
Jinwen Tang, Qiming Guo, Yunxin Zhao
et al.
Recent advancements in NLP have spurred significant interest in analyzing social media text data for identifying linguistic features indicative of mental health issues. However, the domain of Expressive Narrative Stories (ENS), deeply personal and emotionally charged narratives that offer rich psychological insights, remains underexplored. This study bridges this gap by utilizing a dataset sourced from Reddit, focusing on ENS from individuals with and without self-declared depression. Our research evaluates the utility of advanced language models, BERT and MentalBERT, against traditional models. We find that traditional models are sensitive to the absence of explicit topic-related words, which may limit their applicability to ENS that lack clear mental health terminology. Although MentalBERT is designed to better handle psychiatric contexts, it demonstrated a dependency on specific topic words for classification accuracy, raising concerns about its application when explicit mental health terms are sparse (P-value < 0.05). In contrast, BERT exhibited minimal sensitivity to the absence of topic words in ENS, suggesting a superior capability to understand deeper linguistic features and making it more effective for real-world applications. Both BERT and MentalBERT excel at recognizing linguistic nuances and maintain classification accuracy even when narrative order is disrupted; the effect of sentence shuffling on model performance was nonetheless statistically significant (P-value < 0.05), especially in ENS comparisons between individuals with and without mental health declarations. These findings underscore the importance of exploring ENS for deeper insights into mental health-related narratives, advocating for a nuanced approach to mental health text analysis that moves beyond mere keyword detection.
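The sentence-shuffling probe can be sketched as a paired perturbation test; below, `predict_proba` stands in for any fitted classifier (e.g. a fine-tuned BERT returning a depression probability), and the naive sentence splitter is a placeholder for a proper tokenizer.

```python
# Paired sentence-shuffling robustness probe; illustrative sketch.
import random
from scipy.stats import wilcoxon

def shuffle_sentences(text: str, seed: int = 0) -> str:
    # Crude splitter for illustration; use a real sentence tokenizer in practice.
    sents = [s for s in text.split(". ") if s]
    random.Random(seed).shuffle(sents)
    return ". ".join(sents)

def shuffling_effect(narratives, predict_proba):
    # Compare classifier scores on original vs. shuffled narratives.
    orig = [predict_proba(t) for t in narratives]
    shuf = [predict_proba(shuffle_sentences(t)) for t in narratives]
    return wilcoxon(orig, shuf)  # paired nonparametric significance test
```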
Toward Cultural Interpretability: A Linguistic Anthropological Framework for Describing and Evaluating Large Language Models (LLMs)
Graham M. Jones, Shai Satran, Arvind Satyanarayan
This article proposes a new integration of linguistic anthropology and machine learning (ML) around convergent interests in both the underpinnings of language and making language technologies more socially responsible. While linguistic anthropology focuses on interpreting the cultural basis for human language use, the ML field of interpretability is concerned with uncovering the patterns that Large Language Models (LLMs) learn from human verbal behavior. Through the analysis of a conversation between a human user and an LLM-powered chatbot, we demonstrate the theoretical feasibility of a new, conjoint field of inquiry, cultural interpretability (CI). By focusing attention on the communicative competence involved in the way human users and AI chatbots co-produce meaning in the articulatory interface of human-computer interaction, CI emphasizes how the dynamic relationship between language and culture makes contextually sensitive, open-ended conversation possible. We suggest that, by examining how LLMs internally "represent" relationships between language and culture, CI can: (1) provide insight into long-standing linguistic anthropological questions about the patterning of those relationships; and (2) aid model developers and interface designers in improving value alignment between language models and stylistically diverse speakers and culturally diverse speech communities. Our discussion proposes three critical research axes: relativity, variation, and indexicality.
Concurrent Linguistic Error Detection (CLED): a New Methodology for Error Detection in Large Language Models
Jinhua Zhu, Javier Conde, Zhen Gao
et al.
The wide adoption of large language models (LLMs) makes their dependability a pressing concern. Detecting errors is the first step to mitigating their impact on a system, and thus efficient error detection for LLMs is an important issue. In many settings, the LLM is treated as a black box with no access to its internal nodes; this prevents the use of many error detection schemes that require such access. An interesting observation is that the output of an LLM in error-free operation should be valid, normal text. Therefore, when the text is invalid or differs significantly from normal text, an error is likely. Based on this observation, we propose Concurrent Linguistic Error Detection (CLED), a scheme that extracts linguistic features of the text generated by the LLM and feeds them to a concurrent classifier that detects errors. Since the proposed error detection mechanism relies only on the outputs of the model, it can be used on LLMs whose internal nodes are inaccessible. The proposed CLED scheme has been evaluated on the T5 model when used for news summarization and on the OPUS-MT model when used for translation. In both cases, the same set of linguistic features has been used for error detection to illustrate the applicability of the proposed scheme beyond a specific case. The results show that CLED can detect most errors at a low overhead penalty. The use of the concurrent classifier also enables a trade-off between error detection effectiveness and its associated overhead, thus providing flexibility to a designer.
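A minimal sketch of the CLED idea follows, with an illustrative feature set (not the paper's) feeding a small concurrent classifier.

```python
# CLED-style concurrent error detector over output-only features;
# the feature set here is illustrative, not the paper's exact set.
from sklearn.ensemble import RandomForestClassifier

def linguistic_features(text: str):
    tokens = text.split()
    n = max(len(tokens), 1)
    return [
        len(tokens),                                        # output length
        len(set(tokens)) / n,                               # type-token ratio
        sum(len(t) for t in tokens) / n,                    # mean token length
        sum(tokens.count(t) > 2 for t in set(tokens)) / n,  # repetition rate
    ]

detector = RandomForestClassifier(n_estimators=100)
# X = [linguistic_features(t) for t in generated_texts]
# y = per-text labels: 0 for error-free output, 1 for erroneous output.
# detector.fit(X, y); thresholding detector.predict_proba(...) trades
# detection effectiveness against overhead, as described above.
```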
Differential Privacy, Linguistic Fairness, and Training Data Influence: Impossibility and Possibility Theorems for Multilingual Language Models
Phillip Rust, Anders Søgaard
Language models such as mBERT, XLM-R, and BLOOM aim to achieve multilingual generalization or compression to facilitate transfer to a large number of (potentially unseen) languages. However, these models should ideally also be private, linguistically fair, and transparent, by relating their predictions to training data. Can these requirements be simultaneously satisfied? We show that multilingual compression and linguistic fairness are compatible with differential privacy, but that differential privacy is at odds with training data influence sparsity, an objective for transparency. We further present a series of experiments on two common NLP tasks and evaluate multilingual compression and training data influence sparsity under different privacy guarantees, exploring these trade-offs in more detail. Our results suggest that we need to develop ways to jointly optimize for these objectives in order to find practical trade-offs.
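For context, the differential-privacy guarantees varied in such experiments are typically enforced with DP-SGD; the sketch below shows an Opacus setup, with the model, data, and noise settings as placeholders (a linear head stands in for a multilingual encoder).

```python
# DP-SGD training setup via Opacus; all components are placeholders.
import torch
from torch.utils.data import DataLoader, TensorDataset
from opacus import PrivacyEngine

model = torch.nn.Linear(768, 2)                      # stand-in classifier head
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
data = TensorDataset(torch.randn(64, 768), torch.randint(0, 2, (64,)))
data_loader = DataLoader(data, batch_size=8)

privacy_engine = PrivacyEngine()
model, optimizer, data_loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=data_loader,
    noise_multiplier=1.0,   # larger -> stronger privacy, lower utility
    max_grad_norm=1.0,      # per-sample gradient clipping bound
)
```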
Challenges for Linguistically-Driven Computer-Based Sign Recognition from Continuous Signing for American Sign Language
Carol Neidle
There have been recent advances in computer-based recognition of isolated, citation-form signs from video. There are many challenges for such a task, not least the naturally occurring inter- and intra-signer synchronic variation in sign production, including sociolinguistic variation in the realization of certain signs. However, there are several significant factors that make recognition of signs from continuous signing an even more difficult problem. This article presents an overview of such challenges, based in part on findings from a large corpus of linguistically annotated video data for American Sign Language (ASL). Some linguistic regularities in the structure of signs that can boost handshape and sign recognition are also discussed.
A Predictive Model of Digital Information Engagement: Forecasting User Engagement With English Words by Incorporating Cognitive Biases, Computational Linguistics and Natural Language Processing
Nimrod Dvir, Elaine Friedman, Suraj Commuri
et al.
This study introduces and empirically tests a novel predictive model for digital information engagement (IE): the READ model, an acronym for the four pivotal attributes of engaging information, namely Representativeness, Ease-of-use, Affect, and Distribution. Conceptualized within the theoretical framework of Cumulative Prospect Theory, the model integrates key cognitive biases with computational linguistics and natural language processing to develop a multidimensional perspective on information engagement. A rigorous testing protocol was implemented, involving 50 randomly selected pairs of synonymous words (100 words in total) from the WordNet database. The words' engagement levels were evaluated through a large-scale online survey (n = 80,500) to derive empirical IE metrics. The READ attributes for each word were then computed and their predictive efficacy examined. The findings affirm the READ model's robustness: it accurately predicts a word's IE level and distinguishes the more engaging word in a pair of synonyms with 84% accuracy. The READ model's potential extends across various domains, including business, education, government, and healthcare, where it could enhance content engagement and inform the development of AI language models and generative text applications. Future research should address the model's scalability and adaptability across different domains and languages, thereby broadening its applicability and efficacy.
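The reported pairwise evaluation can be sketched as a logistic model over READ attribute differences within each synonym pair; attribute scores are assumed precomputed, and all names and values below are hypothetical.

```python
# Synonym-pair engagement prediction from READ attribute differences.
import numpy as np
from sklearn.linear_model import LogisticRegression

READ = ["representativeness", "ease_of_use", "affect", "distribution"]

def pair_features(scores_a: dict, scores_b: dict) -> np.ndarray:
    # Difference of the two synonyms' READ attribute vectors.
    return np.array([scores_a[k] - scores_b[k] for k in READ])

# Toy invocation with hypothetical attribute scores:
x = pair_features(
    {"representativeness": 0.8, "ease_of_use": 0.6, "affect": 0.7, "distribution": 0.5},
    {"representativeness": 0.4, "ease_of_use": 0.5, "affect": 0.3, "distribution": 0.6},
)

# X: one row of pair_features per synonym pair; y: 1 if word A was the
# more engaging member in the survey, else 0. The held-out score of
# LogisticRegression().fit(X, y) corresponds to the pairwise accuracy
# reported above (84%).
```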
Editorial
Ana Bocanegra Valle
Language and Literature, Philology. Linguistics