Machine translation (MT) and generative artificial intelligence (AI) chatbots have transformed global communication by facilitating the transmission of information across languages and, by extension, across cultures. However, they also pose ethical challenges owing to linguistic biases. In particular, these biases negatively affect the terminology that represents women and the LGTBIQ+ community in the translations these technologies generate. Our working hypothesis is that both machine translation technologies and AI chatbots have difficulty translating gender markers and LGTBIQ+ terminology correctly from English into Spanish. They frequently default to the generic masculine when insufficient contextual information is provided, or to inappropriate terms, although the most recent chatbots are expected to perform better in this respect. To test this hypothesis, we designed an analysis methodology based on collecting quantitative and qualitative data from translations generated by conventional machine translation systems (DeepL and Google Translate) and AI chatbots such as ChatGPT and Gemini. The data were evaluated with an adaptation of the Multidimensional Quality Metrics (MQM) framework, which provides a standardised scheme for measuring translation quality. The results of the analysis show a persistent bias towards the masculine gender, with inconsistent identification of the feminine. We therefore conclude that the output of systems based on generative AI shows no significant improvement over conventional machine translation systems.
It is therefore necessary to develop more inclusive, equitable, and bias-free language technologies, and to foster the design of fairer systems that respect diversity, both essential for meeting the challenges of an increasingly interconnected and globalised context.
Iran Ferreira de Melo, Gustavo José Barbosa Paraíso, Amanda Monteiro da Silva
This article presents a particular view of non-binary language through a reading of Michel Foucault (1996, 2005, 2012) and Norman Fairclough (2003, 2016) applied to the analysis of legislative discourse. The theoretical apparatus draws on Queer Studies and Queer Linguistics, and is justified as a piece of writing that strengthens reflection-action towards life policies for our resistant bodies in a 'necropolitical' country such as ours. Its central aim is to foster a discussion of disruptive uses of language to mark gender in Brazilian Portuguese, describing contemporary mechanisms and analysing their repercussions. Its contribution to professional and critical-reflexive training in our country lies above all in strengthening narratives that enable the visibility and representation of non-binary subjects and gender dissidents in general.
My contribution to this Homage is a reflection on the presence of the romancero in the academic life of Aurelio González. The study of the romancero runs through all his lines of research, approaching the genre from different angles: the old romancero, the modern traditional romancero, the American romancero. The poetics and grammar of the romancero concerned and occupied him from his 1984 doctoral thesis, devoted to the study of the forms and functions of openings in the old romancero, to the American romancero, to which he dedicated the last twenty years of his life. He also took part in the great surveys of the modern oral tradition carried out in Spain at the end of the twentieth century under the coordination of Diego Catalán. These pages trace a journey through Aurelio González's romancero studies that highlights his indispensable contribution to the field.
Natural Language Understanding (NLU) is a fundamental task in Natural Language Processing (NLP). The evaluation of NLU capabilities has become a trending research topic that has attracted researchers in recent years, resulting in the development of numerous benchmarks. These benchmarks include various tasks and datasets in order to evaluate the results of pretrained models via public leaderboards. Notably, several benchmarks contain diagnostic datasets designed for investigation and fine-grained error analysis across a wide range of linguistic phenomena. This survey provides a comprehensive review of available English, Arabic, and multilingual NLU benchmarks, with a particular emphasis on their diagnostic datasets and the linguistic phenomena they cover. We present a detailed comparison and analysis of these benchmarks, highlighting their strengths and limitations in evaluating NLU tasks and providing in-depth error analysis. In highlighting the gaps in the state of the art, we note that there is no naming convention for macro and micro categories, nor even a standard set of linguistic phenomena that should be covered. Consequently, we formulate a research question regarding the evaluation metrics of diagnostic benchmarks: "Why do we not have an evaluation standard for NLU evaluation diagnostics benchmarks?", analogous to ISO standards in industry. We conduct a deep analysis and comparison of the covered linguistic phenomena in order to support experts in building a global hierarchy of linguistic phenomena in the future. We believe that having evaluation metrics for diagnostic evaluation could be valuable for gaining more insights when comparing the results of the studied models on different diagnostics benchmarks.
Automating chest radiograph interpretation using Deep Learning (DL) models has the potential to significantly improve clinical workflows, decision-making, and large-scale health screening. However, in medical settings, merely optimising predictive performance is insufficient, as the quantification of uncertainty is equally crucial. This paper investigates the relationship between predictive uncertainty, derived from Bayesian Deep Learning approximations, and human/linguistic uncertainty, as estimated from free-text radiology reports labelled by rule-based labellers. Utilising BERT as the model of choice, this study evaluates different binarisation methods for uncertainty labels and explores the efficacy of Monte Carlo Dropout and Deep Ensembles in estimating predictive uncertainty. The results demonstrate good model performance, but also a modest correlation between predictive and linguistic uncertainty, highlighting the challenges in aligning machine uncertainty with human interpretation nuances. Our findings suggest that while Bayesian approximations provide valuable uncertainty estimates, further refinement is necessary to fully capture and utilise the subtleties of human uncertainty in clinical applications.
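The Monte Carlo Dropout approximation mentioned above can be illustrated with a minimal sketch (NumPy only; the toy forward pass, dropout rate, and layer sizes are illustrative assumptions, not the paper's BERT setup): dropout is kept active at inference, the model is run for T stochastic forward passes, and the mean softmax gives the prediction while the predictive entropy gives an uncertainty estimate.

```python
import numpy as np

rng = np.random.default_rng(0)

W1 = rng.normal(size=(16, 32))   # toy hidden layer
W2 = rng.normal(size=(32, 2))    # toy binary classifier head

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def forward(x, p_drop=0.5):
    """One stochastic forward pass with dropout kept ON at inference."""
    h = np.maximum(x @ W1, 0.0)             # ReLU
    mask = rng.random(h.shape) > p_drop     # Bernoulli dropout mask
    h = h * mask / (1.0 - p_drop)           # inverted dropout scaling
    return softmax(h @ W2)

def mc_dropout_predict(x, T=100):
    """Mean prediction and predictive entropy over T stochastic passes."""
    probs = np.stack([forward(x) for _ in range(T)])
    mean = probs.mean(axis=0)
    entropy = -np.sum(mean * np.log(mean + 1e-12))
    return mean, entropy

x = rng.normal(size=16)
mean, entropy = mc_dropout_predict(x)
print(mean, entropy)
```

Deep Ensembles follow the same recipe, except that the T passes come from independently trained models rather than from dropout masks.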
Prompts are the interface for eliciting the capabilities of large language models (LLMs). Understanding their structure and components is critical for analyzing LLM behavior and optimizing performance. However, the field lacks a comprehensive framework for systematic prompt analysis and understanding. We introduce PromptPrism, a linguistically-inspired taxonomy that enables prompt analysis across three hierarchical levels: functional structure, semantic component, and syntactic pattern. By applying linguistic concepts to prompt analysis, PromptPrism bridges traditional language understanding and modern LLM research, offering insights that purely empirical approaches might miss. We show the practical utility of PromptPrism by applying it to three applications: (1) a taxonomy-guided prompt refinement approach that automatically improves prompt quality and enhances model performance across a range of tasks; (2) a multi-dimensional dataset profiling method that extracts and aggregates structural, semantic, and syntactic characteristics from prompt datasets, enabling comprehensive analysis of prompt distributions and patterns; (3) a controlled experimental framework for prompt sensitivity analysis by quantifying the impact of semantic reordering and delimiter modifications on LLM performance. Our experimental results validate the effectiveness of our taxonomy across these applications, demonstrating that PromptPrism provides a foundation for refining, profiling, and analyzing prompts.
Discourse markers are currently the object of numerous linguistic studies, in which they are examined from the perspective of various scientific theories and approaches, both in foreign scholarship and in the work of Russian scientists. This article is devoted to the analysis and synthesis of individual approaches to the study of discourse markers that currently exist in Russian linguistics. The phenomenon is considered from the perspectives of sociolinguistics, cognitive science, Construction Grammar, gender linguistics, and politeness theory, as well as within the framework of several combined approaches. The purpose of the article is therefore to critically review, analyse, and generalise the views of Russian linguists, identifying common ground and differences with respect to the phenomenon in question. The article raises the problem of whether discourse markers, on a par with other linguistic units, form a separate class or an independent category, and analyses their characteristics and modes of functioning depending on the type of discourse. An attempt is made to explain the varying terminological designations of discourse markers, and their classifications and typologies within the designated theoretical approaches are compared. The article concludes that there is no connection between the theoretical approach and the choice of term for the phenomenon under analysis. In addition, the author notes that a single criterion is insufficient for the typologisation of discourse markers, given the versatility of their nature and their multifunctionality. The author also points out the difficulty of creating a unified classification of discourse markers, but sees this, along with the peculiarities of their functioning in various discursive practices, as a prospect for further research.
Large language models (LLMs), despite their breakthroughs on many challenging benchmark tasks, tend to generate verbose responses and lack controllability over output complexity, a property human users usually prefer in practice. In this paper, we study how to precisely control multiple linguistic complexities of LLM output by finetuning on off-the-shelf data. To this end, we propose multi-control tuning (MCTune), which includes multiple linguistic complexity values of ground-truth responses as controls in the input for instruction tuning. We finetune LLaMA2-7B on the Alpaca-GPT4 and WizardLM datasets. Evaluations on widely used benchmarks demonstrate that our method not only substantially improves LLMs' multi-complexity controllability but also retains, or even enhances, the quality of the responses as a side benefit.
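The control mechanism described above can be sketched in a few lines (a minimal illustration; the specific complexity metrics and the prompt template are our assumptions, not the exact MCTune format): complexity values are computed from a ground-truth response and serialised into the instruction input used for tuning, so the model learns to condition its output on them.

```python
def complexity_controls(response: str) -> dict:
    """Compute simple linguistic complexity values for a response (illustrative metrics)."""
    words = response.split()
    sentences = [s for s in response.replace("!", ".").replace("?", ".").split(".") if s.strip()]
    return {
        "words_per_sentence": round(len(words) / max(len(sentences), 1), 1),
        "chars_per_word": round(sum(len(w) for w in words) / max(len(words), 1), 1),
    }

def build_control_input(instruction: str, response: str) -> str:
    """Prepend complexity controls to the instruction, in the spirit of multi-control tuning."""
    controls = complexity_controls(response)
    header = "; ".join(f"{k}={v}" for k, v in controls.items())
    return f"[controls: {header}]\n{instruction}"

example = build_control_input(
    "Explain photosynthesis.",
    "Plants convert light into chemical energy. They release oxygen.",
)
print(example)
```

At inference time, a user would set the control values directly to request, say, shorter sentences or simpler vocabulary.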
Fauzan Farooqui, Thanmay Jayakumar, Pulkit Mathur
et al.
Open Information Extraction (OIE) is a structured prediction (SP) task in Natural Language Processing (NLP) that aims to extract structured $n$-ary tuples - usually subject-relation-object triples - from free text. The word embeddings in the input text can be enhanced with linguistic features, usually Part-of-Speech (PoS) and Syntactic Dependency Parse (SynDP) labels. However, past enhancement techniques cannot leverage the power of pretrained language models (PLMs), which themselves have hardly been used for OIE. To bridge this gap, we are the first to leverage linguistic features with a Seq2Seq PLM for OIE. We do so by introducing two methods - Weighted Addition and Linearized Concatenation. Our work can give any neural OIE architecture the key performance boost from both PLMs and linguistic features in one go. In our settings, this yields broad improvements of up to 24.9%, 27.3%, and 14.9% on Precision, Recall, and F1 scores respectively over the baseline. Beyond this, we address other important challenges in the field: to reduce compute overheads from the features, we are the first to exploit Semantic Dependency Parse (SemDP) tags; to address flaws in current datasets, we create a clean synthetic dataset; finally, we contribute the first known study of OIE behaviour in SP models.
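The two feature-injection methods named above can be sketched as follows (dimensions, weights, and tag inventories are illustrative assumptions): Weighted Addition combines a token embedding with its PoS and dependency label embeddings through a weighted sum, while Linearized Concatenation interleaves feature labels with tokens in the input sequence itself, so a Seq2Seq PLM consumes them as ordinary text.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 8                              # toy embedding dimension
word_emb = rng.normal(size=d)      # embedding of a token, e.g. "barks"
pos_emb = rng.normal(size=d)       # embedding of its PoS tag, e.g. "VERB"
dep_emb = rng.normal(size=d)       # embedding of its dependency label, e.g. "ROOT"

# Weighted Addition: a weighted combination of token and feature embeddings
# (the weights would be learned in practice).
alpha, beta = 0.6, 0.2
enhanced = alpha * word_emb + beta * pos_emb + (1 - alpha - beta) * dep_emb
assert enhanced.shape == (d,)      # same dimensionality as the original embedding

# Linearized Concatenation: features become extra symbols in the input text,
# so the Seq2Seq PLM consumes them like ordinary tokens.
tokens = ["The", "dog", "barks"]
pos = ["DET", "NOUN", "VERB"]
linearized = " ".join(f"<{p}> {t}" for p, t in zip(pos, tokens))
print(linearized)   # <DET> The <NOUN> dog <VERB> barks
```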
Multimodal Large Language Models (MLLMs) excel at descriptive tasks within images but often struggle with precise object localization, a critical element for reliable visual interpretation. In contrast, traditional object detection models provide high localization accuracy but frequently generate detections lacking contextual coherence due to limited modeling of inter-object relationships. To address this fundamental limitation, we introduce the \textbf{Visual-Linguistic Agent} (VLA), a collaborative framework that combines the relational reasoning strengths of MLLMs with the precise localization capabilities of traditional object detectors. In the VLA paradigm, the MLLM serves as a central Linguistic Agent, working collaboratively with specialized Vision Agents for object detection and classification. The Linguistic Agent evaluates and refines detections by reasoning over spatial and contextual relationships among objects, while the classification Vision Agent offers corrective feedback to improve classification accuracy. This collaborative approach enables VLA to significantly enhance both spatial reasoning and object localization, addressing key challenges in multimodal understanding. Extensive evaluations on the COCO dataset demonstrate substantial performance improvements across multiple detection models, highlighting VLA's potential to set a new benchmark in accurate and contextually coherent object detection.
Kushal Tatariya, Vladimir Araujo, Thomas Bauwens
et al.
Pixel-based language models have emerged as a compelling alternative to subword-based language modelling, particularly because they can represent virtually any script. PIXEL, a canonical example of such a model, is a vision transformer that has been pre-trained on rendered text. While PIXEL has shown promising cross-script transfer abilities and robustness to orthographic perturbations, it falls short of outperforming monolingual subword counterparts like BERT in most other contexts. This discrepancy raises questions about the amount of linguistic knowledge learnt by these models and whether their performance in language tasks stems more from their visual capabilities than their linguistic ones. To explore this, we probe PIXEL using a variety of linguistic and visual tasks to assess its position on the vision-to-language spectrum. Our findings reveal a substantial gap between the model's visual and linguistic understanding. The lower layers of PIXEL predominantly capture superficial visual features, whereas the higher layers gradually learn more syntactic and semantic abstractions. Additionally, we examine variants of PIXEL trained with different text rendering strategies, discovering that introducing certain orthographic constraints at the input level can facilitate earlier learning of surface-level features. With this study, we hope to provide insights that aid the further development of pixel-based language models.
After a period of governmental pax at the start of an administration, governments continually face communication and governance crises that destabilise the image built during the campaign, so the next challenge is to construct an image of government. Once those first hundred days have passed, administrations must bear in mind that public opinion has moved to digital spaces, where public affairs are discussed and deliberated. Communication teams must therefore be prepared for digital onslaughts and daily crises. What follows is a short manual on how to face these situations and how to prepare communication teams to decide which battles matter.
Samyadeep Basu, Shell Xu Hu, Maziar Sanjabi
et al.
Image-text contrastive models like CLIP have wide applications in zero-shot classification, image-text retrieval, and transfer learning. However, they often struggle on compositional visio-linguistic tasks (e.g., attribute-binding or object-relationships) where their performance is no better than random chance. To address this, we introduce SDS-CLIP, a lightweight and sample-efficient distillation method to enhance CLIP's compositional visio-linguistic reasoning. Our approach fine-tunes CLIP using a distillation objective borrowed from large text-to-image generative models like Stable-Diffusion, which are known for their strong visio-linguistic reasoning abilities. On the challenging Winoground benchmark, SDS-CLIP improves the visio-linguistic performance of various CLIP models by up to 7%, while on the ARO dataset, it boosts performance by up to 3%. This work underscores the potential of well-designed distillation objectives from generative models to enhance contrastive image-text models with improved visio-linguistic reasoning capabilities.
Royal Tour Picture Album, by Elizabeth Morton. London, UK: Sunday Graphic/Pitkin Pictorials Ltd, 1953. 104 pages.
ONE of the joys of travelling the world and collecting books is the historical oddities that turn up in the most unexpected places. I have a splendid copy of the complete works of Shakespeare dating to the Second World War, completely re-set, so the frontispiece notes, due to the original plates having been ‘destroyed by enemy action’. One wonders at the perfidy of the Luftwaffe in trying to blow up the Bard.
A standard measure of the influence of a research paper is the number of times it is cited. However, papers may be cited for many reasons, and citation count offers limited information about the extent to which a paper affected the content of subsequent publications. We therefore propose a novel method to quantify linguistic influence in timestamped document collections. There are two main steps: first, identify lexical and semantic changes using contextual embeddings and word frequencies; second, aggregate information about these changes into per-document influence scores by estimating a high-dimensional Hawkes process with a low-rank parameter matrix. We show that this measure of linguistic influence is predictive of $\textit{future}$ citations: the estimate of linguistic influence from the two years after a paper's publication is correlated with and predictive of its citation count in the following three years. This is demonstrated using an online evaluation with incremental temporal training/test splits, in comparison with a strong baseline that includes predictors for initial citation counts, topics, and lexical features.
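The second step above relies on a Hawkes process, in which each past event raises the intensity (expected rate) of subsequent events. A minimal sketch of the intensity computation (the exponential kernel, base rate, and influence value below are illustrative assumptions; the paper estimates a full high-dimensional, low-rank parameter matrix):

```python
import math

def hawkes_intensity(t, events, mu=0.1, beta=1.0):
    """Intensity lambda(t) = mu + sum over past events of a_j * exp(-beta * (t - t_j)).

    events: list of (t_j, a_j) pairs, where a_j is the influence weight of the
    source document of event j (in the paper, an entry of the parameter matrix).
    """
    return mu + sum(a * math.exp(-beta * (t - tj)) for tj, a in events if tj < t)

# A document published at t=0 with influence 0.8; its effect on later
# adoption events decays exponentially.
events = [(0.0, 0.8)]
print(hawkes_intensity(0.5, events))  # shortly after publication: elevated
print(hawkes_intensity(5.0, events))  # later: influence has largely decayed
```

Fitting the model amounts to choosing the per-document influence weights so that the observed event times (lexical/semantic changes in later documents) are most likely under these intensities.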
Object detection is a computer vision task of predicting a set of bounding boxes and category labels for each object of interest in a given image. Each category corresponds to a linguistic symbol such as 'dog' or 'person', and there should be relationships among these symbols. However, the object detector only learns to classify the categories and does not treat them as linguistic symbols. Multi-modal models often use a pre-trained object detector to extract object features from the image, but these models are separated from the detector, and the extracted visual features do not change with the linguistic input. We rethink object detection as a vision-and-language reasoning task. We then propose the targeted detection task, where detection targets are specified in natural language and the goal is to detect all and only the target objects in a given image; if no target is given, nothing is detected. Commonly used modern object detectors have many hand-designed components, such as anchors, which make it difficult to fuse textual inputs into the complex pipeline. We therefore propose the Language-Targeted Detector (LTD) for targeted detection, based on a recently proposed Transformer-based detector. LTD is an encoder-decoder architecture, and our conditional decoder allows the model to reason about the encoded image with the textual input as linguistic context. We evaluate the detection performance of LTD on the COCO object detection dataset and show that our model improves detection results by grounding the textual input to the visual objects.
Varāhamihira’s Sanskrit astrological and divinatory compendium, Bṛhatsaṃhitā (6th century CE), is distinguished for its adaptation of the kāvya style and aesthetics to several divinatory prognostications. Accordingly, the entire work may be classified as kāvyaśāstra, a scholarly treatise that incorporates elements of poetry. The uniqueness of its twelfth chapter, Agastyacārādhyāyaḥ ‘On the course of sage Agastya’ lies in the fact that the astrologer fashions it into a deliberate display of his poetic proficiency. In this chapter, the practical instructions concerning the observation and divinatory import of the star Agastya (Canopus) merge with poetic stanzas meant to demonstrate Varāhamihira’s acquaintance with various constituents of the kāvya style. The first aim of this study is to specify the poetic devices employed in the chapter, including a variety of classical Sanskrit metres, canonical themes, figures of speech, plot construction and intertextual references. The second aim is to recognise the purpose and significance of the chapter within the context of the entire work.
This article analyses, from a glottopolitical perspective, a series of four monolingual lexicographic works on Argentine Spanish published between 1890 and 1903, grouped under the category of "dictionaries of barbarisms". These are linguistic instruments that claim a prescriptive character, since they include and, by extension, exclude certain usages and particular words, in addition to describing, judging, and evaluating the lexical divergences between American usage (in this case, Argentine) and Peninsular usage at the height of the mass migration into the country. Broadly speaking, these linguistic instruments record barbarisms, neologisms, and foreign borrowings and censure their use, taking the norm of Madrid Spanish as their yardstick. We argue here that these devices expose dominant value systems and conceptions of language that go far beyond the strictly linguistic.
Adriana D. Correia, Henk T. C. Stoof, Michael Moortgat
Extended versions of the Lambek Calculus currently used in computational linguistics rely on unary modalities to allow for the controlled application of structural rules affecting word order and phrase structure. These controlled structural operations give rise to derivational ambiguities that are missed by the original Lambek Calculus or its pregroup simplification. Proposals for compositional interpretation of extended Lambek Calculus in the compact closed category of FVect and linear maps have been made, but in these proposals the syntax-semantics mapping ignores the control modalities, effectively restricting their role to the syntax. Our aim is to turn the modalities into first-class citizens of the vectorial interpretation. Building on the directional density matrix semantics, we extend the interpretation of the type system with an extra spin density matrix space. The interpretation of proofs then results in ambiguous derivations being tensored with orthogonal spin states. Our method introduces a way of simultaneously representing co-existing interpretations of ambiguous utterances, and provides a uniform framework for the integration of lexical and derivational ambiguity.
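The construction described above, tensoring ambiguous derivations with orthogonal spin states, can be illustrated numerically (a toy two-reading example with made-up matrices, not the paper's lexical data): each reading's density matrix ρ_k is paired with an orthogonal spin projector |k⟩⟨k|, and the readings coexist in the sum Σ_k ρ_k ⊗ |k⟩⟨k|.

```python
import numpy as np

def projector(k, dim):
    """Spin projector |k><k| in a dim-dimensional spin space."""
    v = np.zeros(dim)
    v[k] = 1.0
    return np.outer(v, v)

# Two toy density matrices, one per derivational reading of an ambiguous phrase.
rho1 = np.array([[0.7, 0.0], [0.0, 0.3]])
rho2 = np.array([[0.2, 0.0], [0.0, 0.8]])

# Tensor each reading with an orthogonal spin state and sum: both readings
# co-exist in one object and remain separable, since the spin states are orthogonal.
readings = [rho1, rho2]
total = sum(np.kron(rho, projector(k, len(readings)) / len(readings))
            for k, rho in enumerate(readings))

assert np.isclose(np.trace(total), 1.0)   # still a valid (normalised) state
```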
People tend to align their use of language to the linguistic behaviour of their own ingroup and to simultaneously diverge from the language use of outgroups. This paper proposes to model this phenomenon of sociolinguistic identity maintenance as an evolutionary game in which individuals play the field and the dynamics are supplied by a multi-population extension of the replicator-mutator equation. Using linearization, the stabilities of all dynamic equilibria of the game in its fully symmetric two-population special case are found. The model is then applied to an empirical test case from adolescent sociolinguistic behaviour. It is found that the empirically attested population state corresponds to one of a number of stable equilibria of the game under an independently plausible value of a parameter controlling the rate of linguistic mutations. An asymmetric three-population extension of the game, explored with numerical solution methods, furthermore predicts to which specific equilibrium the system converges.
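The dynamics referred to above are supplied by the replicator-mutator equation; in its standard single-population form it reads as follows (the multi-population extension used in the paper adds population indices and cross-population payoffs, which are omitted here):

```latex
% Replicator-mutator dynamics for the frequency x_i of linguistic variant i:
% f_j(x) is the payoff to variant j in population state x,
% Q_{ji} the probability that variant j mutates into variant i,
% and \bar{f}(x) = \sum_j x_j f_j(x) the average payoff.
\dot{x}_i \;=\; \sum_{j} x_j \, f_j(x) \, Q_{ji} \;-\; \bar{f}(x)\, x_i
```

With the identity mutation matrix $Q_{ji} = \delta_{ji}$ this reduces to the ordinary replicator equation; the mutation-rate parameter mentioned in the abstract governs how far $Q$ departs from the identity.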