Results for "Language. Linguistic theory. Comparative grammar"

Showing 20 of ~51,313 results · from DOAJ, arXiv, Semantic Scholar

DOAJ Open Access 2025
(Re)conceptualizing translation as a dynamic dialogue of constraints

Anna Rędzioch-Korkuz

The intersection between translation studies and semiotics has been addressed in numerous publications and approaches to researching translation. However, at least within translation studies, this intersection has been visible mainly in cases in which non-linguistic signs come into play, whereas genuine semiotic frameworks have proven to be too broad or abstract. This has led to a conceptual and ontological paradox: on the one hand, translation scholars have been struggling to move beyond linguistics, but, on the other hand, most of them still place a strong emphasis on lingual translation. As a result, ‘translation’ is no longer a term precise enough, yet broad enough, to include contemporary types of this activity. This paper proposes reconceptualizing translation as a dynamic process of dialogue between relevant constraints. Situated between translation semiotics and translation studies, the theoretical model described here underlines the role of constraints as well as the universal semiotic nature of translation.

Language. Linguistic theory. Comparative grammar
DOAJ Open Access 2025
La testualità come fulcro della competenza comunicativa

Balboni, Paolo E.

The notion of communicative competence, the theoretical basis of the communicative approach, was widely discussed in the 1970s–80s, but in our century it has no longer been analysed systematically, turning it into an intuitive rather than a scientific notion. The author considers the increasing interest of educational linguists in the textual component of communicative competence and proposes a shift from the theoretical model of competence he designed in the 1990s to a new model in which textual elements play a central role.

Language. Linguistic theory. Comparative grammar
arXiv Open Access 2025
Deaf in AI: AI language technologies and the erosion of linguistic rights

Maartje De Meulder

This paper explores the interplay of AI language technologies, sign language interpreting, and linguistic access, highlighting the complex interdependencies shaping access frameworks and the trade-offs these technologies bring. While AI tools promise innovation, they also perpetuate biases, reinforce technoableism, and deepen inequalities through systemic and design flaws. The historical and contemporary privileging of sign language interpreting as the dominant access model, and the broader inclusion ideologies it reflects, shape AI's development and deployment, often sidelining deaf languaging practices and introducing new forms of linguistic subordination to technology. Drawing on Deaf Studies, Sign Language Interpreting Studies, and crip technoscience, this paper critiques the framing of AI as a substitute for interpreters and examines its implications for access hierarchies. It calls for deaf-led approaches to foster AI systems that remain equitable, inclusive, and trustworthy, supporting rather than undermining linguistic autonomy and contributing to deaf-aligned futures.

en cs.CY
arXiv Open Access 2025
Linguistic Neuron Overlap Patterns to Facilitate Cross-lingual Transfer on Low-resource Languages

Yuemei Xu, Kexin Xu, Jian Zhou et al.

Current Large Language Models (LLMs) face significant challenges in improving their performance on low-resource languages and urgently need data-efficient methods that avoid costly fine-tuning. From the perspective of language bridging, we propose a simple yet effective method, namely BridgeX-ICL, to improve zero-shot Cross-lingual In-Context Learning (X-ICL) for low-resource languages. Unlike existing works focusing on language-specific neurons, BridgeX-ICL explores whether sharing neurons can improve cross-lingual performance in LLMs. We construct neuron probe data from the ground-truth MUSE bilingual dictionaries, and define a subset of language-overlap neurons accordingly to ensure full activation of these anchored neurons. Subsequently, we propose an HSIC-based metric to quantify LLMs' internal linguistic spectrum based on overlapping neurons, guiding optimal bridge selection. The experiments conducted on 4 cross-lingual tasks and 15 language pairs from 7 diverse families, covering both high-low and moderate-low pairs, validate the effectiveness of BridgeX-ICL and offer empirical insights into the underlying multilingual mechanisms of LLMs. The code is publicly available at https://github.com/xuyuemei/BridgeX-ICL.

en cs.CL, cs.AI
arXiv Open Access 2025
Emissions and Performance Trade-off Between Small and Large Language Models

Anandita Garg, Uma Gaba, Deepan Muthirayan et al.

The advent of Large Language Models (LLMs) has raised concerns about their enormous carbon footprint, starting with energy-intensive training and continuing through repeated inference. This study investigates the potential of using fine-tuned Small Language Models (SLMs) as a sustainable alternative for predefined tasks. Here, we present a comparative analysis of the performance-emissions trade-off between LLMs and fine-tuned SLMs across selected tasks under Natural Language Processing, Reasoning and Programming. Our results show that in four out of the six selected tasks, SLMs maintained comparable performances for a significant reduction in carbon emissions during inference. Our findings demonstrate the viability of smaller models in mitigating the environmental impact of resource-heavy LLMs, thus advancing towards sustainable, green AI.

en cs.CL, cs.AI
S2 Open Access 2024
Flat structure: a minimalist program for syntax

Giuseppe Varaschin, P. Culicover

Abstract We explore the possibility of assuming largely flat syntactic structures in Simpler Syntax, suggesting that these are plausible alternatives to conventional hierarchical structures. We consider the implications of flat structure for analyses of various linguistic phenomena in English, including heavy NP shift, extraposition, topicalization and constituent order variation in the VP. We also sketch a general strategy to circumvent some of the problems flat structure is said to cause for semantic interpretation. Our proposals eliminate the need for movement, unpronounced copies and feature-bearing nodes postulated to trigger syntactic operations. We assume the Parallel Architecture and use declarative schemas to establish direct correspondences between phonology on the one hand and syntactic and semantic structures on the other. The resulting picture is one in which narrow syntax can be relatively stable across languages and constructions, largely reflecting the structure of human thought, and the main source of linguistic variation is the linearization of conceptual and syntactic structures. Unlike other minimalist theories that reach a similar conclusion, the theory we propose takes mappings to phonology to be central to the architecture of grammar.

arXiv Open Access 2024
Fotheidil: an Automatic Transcription System for the Irish Language

Liam Lonergan, Ibon Saratxaga, John Sloan et al.

This paper sets out the first web-based transcription system for the Irish language - Fotheidil, a system that utilises speech-related AI technologies as part of the ABAIR initiative. The system includes both off-the-shelf pre-trained voice activity detection and speaker diarisation models and models trained specifically for Irish automatic speech recognition and capitalisation and punctuation restoration. Semi-supervised learning is explored to improve the acoustic model of a modular TDNN-HMM ASR system, yielding substantial improvements for out-of-domain test sets and dialects that are underrepresented in the supervised training set. A novel approach to capitalisation and punctuation restoration involving sequence-to-sequence models is compared with the conventional approach using a classification model. Experimental results here also show substantial improvements in performance. The system will be made freely available for public use, and represents an important resource for researchers and others who transcribe Irish language materials. Human-corrected transcriptions will be collected and included in the training dataset as the system is used, which should lead to incremental improvements to the ASR model in a cyclical, community-driven fashion.

en cs.CL, cs.SD
arXiv Open Access 2024
Evaluating Telugu Proficiency in Large Language Models: A Comparative Analysis of ChatGPT and Gemini

Katikela Sreeharsha Kishore, Rahimanuddin Shaik

The growing prominence of large language models (LLMs) necessitates the exploration of their capabilities beyond English. This research investigates the Telugu language proficiency of ChatGPT and Gemini, two leading LLMs. Through a designed set of 20 questions encompassing greetings, grammar, vocabulary, common phrases, task completion, and situational reasoning, the study delves into their strengths and weaknesses in handling Telugu. The analysis aims to identify the LLM that demonstrates a deeper understanding of Telugu grammatical structures, possesses a broader vocabulary, and exhibits superior performance in tasks like writing and reasoning. By comparing their ability to comprehend and use everyday Telugu expressions, the research sheds light on their suitability for real-world language interaction. Furthermore, the evaluation of adaptability and reasoning capabilities provides insights into how each LLM leverages Telugu to respond to dynamic situations. This comparative analysis contributes to the ongoing discussion on multilingual capabilities in AI and paves the way for future research in developing LLMs that can seamlessly integrate with Telugu-speaking communities.

en cs.CL, cs.HC
arXiv Open Access 2024
A Comprehensive Evaluation of Semantic Relation Knowledge of Pretrained Language Models and Humans

Zhihan Cao, Hiroaki Yamada, Simone Teufel et al.

Recently, much work has concerned itself with the enigma of what exactly pretrained language models (PLMs) learn about different aspects of language, and how they learn it. One stream of this type of research investigates the knowledge that PLMs have about semantic relations. However, many aspects of semantic relations were left unexplored. Generally, only one relation has been considered, namely hypernymy. Furthermore, previous work did not measure humans' performance on the same task as that performed by the PLMs. This means that at this point in time, there is only an incomplete view of the extent of these models' semantic relation knowledge. To address this gap, we introduce a comprehensive evaluation framework covering five relations beyond hypernymy, namely hyponymy, holonymy, meronymy, antonymy, and synonymy. We use five metrics (two newly introduced here) for recently untreated aspects of semantic relation knowledge, namely soundness, completeness, symmetry, prototypicality, and distinguishability. Using these, we can fairly compare humans and models on the same task. Our extensive experiments involve six PLMs, four masked and two causal language models. The results reveal a significant knowledge gap between humans and models for all semantic relations. In general, causal language models, despite their wide use, do not always perform significantly better than masked language models. Antonymy is the outlier relation where all models perform reasonably well. The evaluation materials can be found at https://github.com/hancules/ProbeResponses.

arXiv Open Access 2024
Shifting social norms as a driving force for linguistic change: Struggles about language and gender in the German Bundestag

Carolin Müller-Spitzer, Samira Ochs

This paper focuses on language change based on shifting social norms, in particular with regard to the debate on language and gender. It is a recurring argument in this debate that language develops "naturally" and that "severe interventions" - such as gender-inclusive language is often claimed to be - in the allegedly "organic" language system are inappropriate and even "dangerous". Such interventions are, however, not unprecedented. Socially motivated processes of language change are neither unusual nor new. We focus in our contribution on one important political-social space in Germany, the German Bundestag. Taking other struggles about language and gender in the plenaries of the Bundestag as a starting point, our article illustrates that language and gender has been a recurring issue in the German Bundestag since the 1980s. We demonstrate how this is reflected in linguistic practices of the Bundestag, by the use of a) designations for gays and lesbians; b) pair forms such as Bürgerinnen und Bürger (female and male citizens); and c) female forms of addresses and personal nouns ('Präsidentin' in addition to 'Präsident'). Lastly, we will discuss implications of these earlier language battles for the currently very heated debate about gender-inclusive language, especially regarding new forms with gender symbols like the asterisk or the colon (Lehrer*innen, Lehrer:innen; male*female teachers) which are intended to encompass all gender identities.

en cs.CL
arXiv Open Access 2024
A Capabilities Approach to Studying Bias and Harm in Language Technologies

Hellina Hailu Nigatu, Zeerak Talat

Mainstream Natural Language Processing (NLP) research has ignored the majority of the world's languages. In moving from excluding the majority of the world's languages to blindly adopting what we make for English, we first risk importing the same harms we have at best mitigated and at least measured for English. However, in evaluating and mitigating harms arising from adopting new technologies into such contexts, we often disregard (1) the actual community needs of Language Technologies, and (2) biases and fairness issues within the context of the communities. In this extended abstract, we consider fairness, bias, and inclusion in Language Technologies through the lens of the Capabilities Approach. The Capabilities Approach centers on what people are capable of achieving, given their intersectional social, political, and economic contexts instead of what resources are (theoretically) available to them. We detail the Capabilities Approach, its relationship to multilingual and multicultural evaluation, and how the framework affords meaningful collaboration with community members in defining and measuring the harms of Language Technologies.

en cs.CL, cs.CY
arXiv Open Access 2024
Dependency Transformer Grammars: Integrating Dependency Structures into Transformer Language Models

Yida Zhao, Chao Lou, Kewei Tu

Syntactic Transformer language models aim to achieve better generalization through simultaneously modeling syntax trees and sentences. While prior work has been focusing on adding constituency-based structures to Transformers, we introduce Dependency Transformer Grammars (DTGs), a new class of Transformer language model with explicit dependency-based inductive bias. DTGs simulate dependency transition systems with constrained attention patterns by modifying attention masks, incorporate the stack information through relative positional encoding, and augment dependency arc representation with a combination of token embeddings and operation embeddings. When trained on a dataset of sentences annotated with dependency trees, DTGs achieve better generalization while maintaining comparable perplexity with Transformer language model baselines. DTGs also outperform recent constituency-based models, showing that dependency can better guide Transformer language models. Our code is released at https://github.com/zhaoyd1/Dep_Transformer_Grammars.

en cs.CL, cs.AI
DOAJ Open Access 2023
Post-apocalyptic Subjectivity and Nature/Culture Duality in Lois Lowry’s The Giver

Younes Poorghorban, Bakhtiar Sadjadi

The present inquiry endeavors to scrutinize the process of identity formation with regard to the Culture/Nature dichotomy within the milieu of Lois Lowry's post-apocalyptic dystopian narrative, The Giver. The antipodal forces of Culture and Nature are instrumental in shaping the social subjectivities of individuals. Lowry's post-apocalyptic dystopia portrays a society in which these antitheses are comprehensively epitomized. Our objective is to explicate the genesis of post-apocalyptic identities and to elucidate the representation of Nature/Culture within the social context of the aforementioned literary work. Furthermore, the polarity between power and resistance, which is of notable import to cultural studies, is nonexistent within this post-apocalyptic dystopia. Consequently, the establishment of identities transpires not at the site of contention between power and resistance, but exclusively through the ascendency of the imperializing power. As a corollary, the elimination of the recollections of those individuals who are unable to oppose the imperializing power is integral to the construction of homogeneous identities.

Language. Linguistic theory. Comparative grammar, Style. Composition. Rhetoric
DOAJ Open Access 2023
Hybrid agreement with English quantifier partitives

Troy Messick

This paper presents a novel case study of a 3/4 agreement pattern typically found with hybrid nouns. This case study involves agreement and binding with Quantified NPs in English. I propose an analysis that relies on different classes of agreement targets agreeing at different times and couple this with a condition on the access to semantic agreement features. This new analysis can account for the novel data presented here as well as the data from the literature. This paper hence broadens both our empirical knowledge of 3/4 patterns as well as refines our theory of features and agreement that underlie such patterns.

Language. Linguistic theory. Comparative grammar
DOAJ Open Access 2023
Images of a Good and Evil Person in Russian and Chinese Paremiological Pictures of the World

Licheng Zhang, Olga P. Kormazina

Linguoculturological studies form one of the most actively developing areas of paremiology; researchers have repeatedly analyzed the concepts of GOOD and EVIL on the material of paremias of particular languages. However, the character traits of «good» and «evil» have not yet been studied on the basis of Russian and Chinese proverbs, which constitutes the novelty of this study. The article studies paremiological units of the Russian and Chinese languages representing a good and an evil person. The goal of the work is to identify common and ethno-specific features reflected in the proverbs of the Russian language against the background of their counterparts in the Chinese language, as well as to establish their motivation. To achieve this goal, methods of comparative and linguoculturological analysis were applied. The sources of the material were proverb dictionaries, as well as data from Russian and Chinese corpora. The paper concludes that the proverbs in question contain far more divergent cultural attitudes than common ones, which causes a considerable difference in how native Russian and Chinese speakers understand the character traits of «good» and «evil». Understanding the universal and nationally marked ideas of Russians and Chinese effectively contributes to cultural communication between peoples.

Language. Linguistic theory. Comparative grammar, Semantics
arXiv Open Access 2023
The language of prompting: What linguistic properties make a prompt successful?

Alina Leidinger, Robert van Rooij, Ekaterina Shutova

The latest generation of LLMs can be prompted to achieve impressive zero-shot or few-shot performance in many NLP tasks. However, since performance is highly sensitive to the choice of prompts, considerable effort has been devoted to crowd-sourcing prompts or designing methods for prompt optimisation. Yet, we still lack a systematic understanding of how linguistic properties of prompts correlate with task performance. In this work, we investigate how LLMs of different sizes, pre-trained and instruction-tuned, perform on prompts that are semantically equivalent, but vary in linguistic structure. We investigate both grammatical properties such as mood, tense, aspect and modality, as well as lexico-semantic variation through the use of synonyms. Our findings contradict the common assumption that LLMs achieve optimal performance on lower perplexity prompts that reflect language use in pretraining or instruction-tuning data. Prompts transfer poorly between datasets or models, and performance cannot generally be explained by perplexity, word frequency, ambiguity or prompt length. Based on our results, we put forward a proposal for a more robust and comprehensive evaluation standard for prompting research.

en cs.CL, cs.AI
arXiv Open Access 2023
Syntactic Variation Across the Grammar: Modelling a Complex Adaptive System

Jonathan Dunn

While language is a complex adaptive system, most work on syntactic variation observes a few individual constructions in isolation from the rest of the grammar. This means that the grammar, a network which connects thousands of structures at different levels of abstraction, is reduced to a few disconnected variables. This paper quantifies the impact of such reductions by systematically modelling dialectal variation across 49 local populations of English speakers in 16 countries. We perform dialect classification with both an entire grammar as well as with isolated nodes within the grammar in order to characterize the syntactic differences between these dialects. The results show, first, that many individual nodes within the grammar are subject to variation but, in isolation, none perform as well as the grammar as a whole. This indicates that an important part of syntactic variation consists of interactions between different parts of the grammar. Second, the results show that the similarity between dialects depends heavily on the sub-set of the grammar being observed: for example, New Zealand English could be more similar to Australian English in phrasal verbs but at the same time more similar to UK English in dative phrases.

en cs.CL
DOAJ Open Access 2022
Ecosophical Exploration of War and Violence in Graphic Novel Vanni: A Representational Visual Meta-Function Analysis

Afshan Abbas, Fauzia Janjua

The present study focuses on the correlation between ecosophy and visual grammar. For this purpose, it incorporates Guattari’s ecosophy through the model of Kress and Van Leeuwen’s visual grammar (2006) to trace the environmental crisis in the graphic novel ‘Vanni: A Family’s Struggle through the Sri Lankan Conflict’ (2009). The study is qualitative in nature, based on multimodal discourse analysis. Its findings develop arguments for an ecosophical lens as a way of creating a change of vision within our ethical, social, and political spaces. Through the representational, interactive, and compositional meanings represented in Vanni's visuals, Felix Guattari's ecosophies highlight the trauma of war and its impact on people and the environment.

English literature, Language. Linguistic theory. Comparative grammar
DOAJ Open Access 2022
Transmission ou transformation ?

Erwan Le Pipec

With the end of the practice of Breton within most families for a few decades, schools appear today as the main place for the transmission of this language. This leads some people to express doubts about the institutionalisation of an artificial language that is opaque to vernacular speakers, a criticism that echoes various studies in other contexts. In order to document the facts, I analysed the Breton spoken by pupils in a bilingual class in central Brittany. The aim was to ascertain the extent to which the Breton variety of this region was influenced, or even replaced, by a new standardised school variety. The initial results unsurprisingly show an influence of French. They also testify to the breakthrough of a form of Breton which is foreign to the region. But the resistance of some local peculiarities also appears, making the Breton of these new speakers a fundamentally composite phenomenon.

Language. Linguistic theory. Comparative grammar

Page 30 of 2566