CausalVLBench: Benchmarking Visual Causal Reasoning in Large Vision-Language Models
Aneesh Komanduri, Karuna Bhaila, Xintao Wu
Large language models (LLMs) have shown remarkable ability in various language tasks, especially with their emergent in-context learning capability. Extending LLMs to incorporate visual inputs, large vision-language models (LVLMs) have shown impressive performance in tasks such as recognition and visual question answering (VQA). Despite increasing interest in the utility of LLMs in causal reasoning tasks such as causal discovery and counterfactual reasoning, there has been relatively little work showcasing the abilities of LVLMs on visual causal reasoning tasks. We take this opportunity to formally introduce a comprehensive causal reasoning benchmark for multi-modal in-context learning with LVLMs. Our CausalVLBench encompasses three representative tasks: causal structure inference, intervention target prediction, and counterfactual prediction. We evaluate state-of-the-art open-source LVLMs on our causal reasoning tasks across three causal representation learning datasets and demonstrate their fundamental strengths and weaknesses. We hope that our benchmark elucidates the drawbacks of existing vision-language models and motivates new directions and paradigms for improving the visual causal reasoning abilities of LVLMs.
Language Bias in Self-Supervised Learning For Automatic Speech Recognition
Edward Storey, Naomi Harte, Peter Bell
Self-supervised learning (SSL) is used in deep learning to train on large datasets without the need for expensive labelling of the data. Recently, large Automatic Speech Recognition (ASR) models such as XLS-R have utilised SSL to train on over one hundred different languages simultaneously. However, deeper investigation shows that the bulk of the training data for XLS-R comes from a small number of languages. Biases learned through SSL have been shown to exist in multiple domains, but language bias in multilingual SSL ASR has not been thoroughly examined. In this paper, we utilise the Lottery Ticket Hypothesis (LTH) to identify language-specific subnetworks within XLS-R and test the performance of these subnetworks on a variety of different languages. We are able to show that when fine-tuning, XLS-R bypasses traditional linguistic knowledge and builds only on weights learned from the languages with the largest data contribution to the pretraining data.
Gender Encoding Patterns in Pretrained Language Model Representations
Mahdi Zakizadeh, Mohammad Taher Pilehvar
Gender bias in pretrained language models (PLMs) poses significant social and ethical challenges. Despite growing awareness, there is a lack of comprehensive investigation into how different models internally represent and propagate such biases. This study adopts an information-theoretic approach to analyze how gender biases are encoded within various encoder-based architectures. We focus on three key aspects: identifying how models encode gender information and biases, examining the impact of bias mitigation techniques and fine-tuning on the encoded biases and their effectiveness, and exploring how model design differences influence the encoding of biases. Through rigorous and systematic investigation, our findings reveal a consistent pattern of gender encoding across diverse models. Surprisingly, debiasing techniques often exhibit limited efficacy, sometimes inadvertently increasing the encoded bias in internal representations while reducing bias in model output distributions. This highlights a disconnect between mitigating bias in output distributions and addressing its internal representations. This work provides valuable guidance for advancing bias mitigation strategies and fostering the development of more equitable language models.
A Legal Framework for Natural Language Processing Model Training in Portugal
Rúben Almeida, Evelin Amorim
Recent advances in deep learning have promoted the advent of many computational systems capable of performing intelligent actions that, until then, were restricted to the human intellect. In the particular case of human languages, these advances allowed the introduction of applications like ChatGPT that are capable of generating coherent text without being explicitly programmed to do so. Instead, these models use large volumes of textual data to learn meaningful representations of human languages. Associated with these advances, concerns about copyright and data privacy infringements caused by these applications have emerged. Despite these concerns, the pace at which new natural language processing applications continue to be developed has largely outpaced the introduction of new regulations. Today, communication barriers between legal experts and computer scientists motivate many unintentional legal infringements during the development of such applications. In this paper, a multidisciplinary team intends to bridge this communication gap and promote more compliant Portuguese NLP research by presenting a series of everyday NLP use cases, while highlighting the Portuguese legislation that may apply to their development.
Learning and communication pressures in neural networks: Lessons from emergent communication
Lukas Galke, Limor Raviv
Finding and facilitating commonalities between the linguistic behaviors of large language models and humans could lead to major breakthroughs in our understanding of the acquisition, processing, and evolution of language. However, most findings on human-LLM similarity can be attributed to training on human data. The field of emergent machine-to-machine communication provides an ideal testbed for discovering which pressures neural agents are naturally exposed to when learning to communicate in isolation, without any human language to start with. Here, we review three cases where mismatches between the emergent linguistic behavior of neural agents and humans were resolved by introducing theoretically motivated inductive biases. By contrasting humans, large language models, and emergent communication agents, we then identify key pressures at play for language learning and emergence: communicative success, production effort, learnability, and other psycho-/sociolinguistic factors. We discuss their implications and relevance to the field of language evolution and acquisition. By mapping out the necessary inductive biases that make agents' emergent languages more human-like, we not only shed light on the underlying principles of human cognition and communication, but also inform and improve the very use of these models as valuable scientific tools for studying language learning, processing, use, and representation more broadly.
Performance Comparison of Turkish Language Models (Türkçe Dil Modellerinin Performans Karşılaştırması)
Eren Dogan, M. Egemen Uzun, Atahan Uz
et al.
The capabilities that language models have demonstrated in fulfilling almost all kinds of tasks have attracted the attention not only of researchers but also of society, and have enabled these models to become products. Commercially successful language models are available. However, users may prefer open-source language models due to cost, data privacy, or regulations. Yet, despite the increasing number of these models, there is no comprehensive comparison of their performance for Turkish. This study aims to fill this gap in the literature. A comparison is made among seven selected language models based on their in-context learning and question-answering abilities. Turkish datasets for in-context learning and question answering were prepared, and both automatic and human evaluations were conducted. The results show that, for question answering, continuing pretraining before fine-tuning with instruction datasets is more successful in adapting multilingual models to Turkish, and that in-context learning performance is not strongly related to question-answering performance.
Effectiveness of Back Care Education Programmes Among School Children: A Systematic Review of Randomized Controlled Trials
Canice Chukwudi Anyachukwu, Confidence Chinemerem Amarah, Blessing Chiagozikam Atueyi
et al.
Abstract Study design Systematic review of randomised controlled trials. Objectives With the increasing incidence of back pain among children and its untold implications for their future, back education delivered in an effective way would be indicated. However, the literature appears unsettled. This study aims to review the available literature to determine the effect of school-based back education in preventing and managing low back pain in school children. Methods Randomized controlled trials carried out on elementary and secondary school children aged 6 to 18 years and published in the English language were included. Back education taught in hospitals or other settings was excluded. The primary outcome was back pain prevalence, and secondary outcomes were drawn from the characteristics of the selected studies, which include: back behaviour, knowledge, postural habits, physical activity, fear-avoidance beliefs, backpack carriage, pain intensity, skills, and self-efficacy. The databases searched were PEDro, HINARI, PubMed, Cochrane, and Google Scholar. Available studies from 2000 to March 2022 were retrieved. The quality of studies was assessed using the PEDro scale. The obtained studies were descriptively analyzed. Results A total of 8420 studies were retrieved and 8 studies (with 1239 participants) were included in this review. Four studies each assessed back knowledge and back behaviour, and two assessed back pain prevalence. There were improvements in back knowledge and back behaviour, but the effectiveness of back care education on back pain prevalence was not conclusive. The forms of education used involved the indirect method of conditioning the environment and the direct method, which made use of theory, practical lessons, and educational books and materials. Conclusion Back care education programmes in schools are effective in improving back care knowledge and behaviour and in reducing low back pain frequency. The reduction in back pain prevalence is not conclusive.
Back care education could be incorporated as part of schools' education programmes. Limitations include the exclusion of non-English-language studies and inconsistent outcome measures. Funding source None. Registration This review protocol was registered on the International Platform of Registered Systematic Review and Meta-analysis Protocols (INPLASY) with the registration number INPLASY202310044 and DOI https://doi.org/10.37766/inplasy2023.1.0044
Expressive and Didactic Values of the Anthroponymy of Gulmance First Names in Burkina Faso
Germain OUALLY
Abstract: Naming systems in traditional African society obey precise rules and codes that express a vision and a message, generally with a pedagogical aim. This article reflects on the anthroponymic system of the Gulmance ethnic group in Burkina Faso in order to reveal its pedagogical potential. To this end, the system is analyzed through its different modalities and its main lines of force as a mark of identity. Indeed, by giving a first name to a child, the Gulmance family group confers on the child an identity, a personality, and above all a path to follow or to point out to others. In this community, the individual name thus refers to a virtue, a power of creation and incantation that makes it possible to influence social reality, by virtue of the strong belief in metempsychosis. This leads to the following research question: in what way does the Gulmance anthroponymic system possess pedagogical potential? It is this fundamental question that this article attempts to answer by reflecting on the beliefs and conceptions that underlie naming. We analyze anthroponymy as a pedagogical, didactic, and linguistic tool contributing to personal development. Our contribution aims to present the individual name in traditional Gulmance existential thought as a cultural marker. It is based on documentary research and field surveys, with sociocriticism as the analytical framework.
Keywords: anthroponymy, values, didactics, expressiveness, identity, analysis
Indian Language Summarization using Pretrained Sequence-to-Sequence Models
Ashok Urlana, Sahil Manoj Bhatt, Nirmal Surange
et al.
The ILSUM shared task focuses on text summarization for two major Indian languages- Hindi and Gujarati, along with English. In this task, we experiment with various pretrained sequence-to-sequence models to find out the best model for each of the languages. We present a detailed overview of the models and our approaches in this paper. We secure the first rank across all three sub-tasks (English, Hindi and Gujarati). This paper also extensively analyzes the impact of k-fold cross-validation while experimenting with limited data size, and we also perform various experiments with a combination of the original and a filtered version of the data to determine the efficacy of the pretrained models.
A Survey on Multimodal Large Language Models
Shukang Yin, Chaoyou Fu, Sirui Zhao
et al.
Recently, Multimodal Large Language Models (MLLMs), represented by GPT-4V, have become a rising research hotspot, using powerful Large Language Models (LLMs) as a brain to perform multimodal tasks. The surprising emergent capabilities of MLLMs, such as writing stories based on images and OCR-free math reasoning, are rare in traditional multimodal methods, suggesting a potential path to artificial general intelligence. To this end, both academia and industry have endeavored to develop MLLMs that can compete with or even outperform GPT-4V, pushing the limit of research at a surprising speed. In this paper, we aim to trace and summarize the recent progress of MLLMs. First of all, we present the basic formulation of MLLMs and delineate related concepts, including architecture, training strategy and data, as well as evaluation. Then, we introduce research topics about how MLLMs can be extended to support more granularity, modalities, languages, and scenarios. We continue with multimodal hallucination and extended techniques, including Multimodal ICL (M-ICL), Multimodal CoT (M-CoT), and LLM-Aided Visual Reasoning (LAVR). To conclude the paper, we discuss existing challenges and point out promising research directions. In light of the fact that the era of MLLMs has only just begun, we will keep updating this survey and hope it can inspire more research. An associated GitHub link collecting the latest papers is available at https://github.com/BradyFU/Awesome-Multimodal-Large-Language-Models.
Language-Conditioned Change-point Detection to Identify Sub-Tasks in Robotics Domains
Divyanshu Raj, Chitta Baral, Nakul Gopalan
In this work, we present an approach to identify sub-tasks within a demonstrated robot trajectory using language instructions. We identify these sub-tasks using language provided during demonstrations as guidance to identify sub-segments of a longer robot trajectory. Given a sequence of natural language instructions and a long trajectory consisting of image frames and discrete actions, we want to map an instruction to a smaller fragment of the trajectory. Unlike previous instruction following works which directly learn the mapping from language to a policy, we propose a language-conditioned change-point detection method to identify sub-tasks in a problem. Our approach learns the relationship between constituent segments of a long language command and corresponding constituent segments of a trajectory. These constituent trajectory segments can be used to learn subtasks or sub-goals for planning or options as demonstrated by previous related work. Our insight in this work is that the language-conditioned robot change-point detection problem is similar to the existing video moment retrieval works used to identify sub-segments within online videos. Through extensive experimentation, we demonstrate a $1.78_{\pm 0.82}\%$ improvement over a baseline approach in accurately identifying sub-tasks within a trajectory using our proposed method. Moreover, we present a comprehensive study investigating sample complexity requirements on learning this mapping, between language and trajectory sub-segments, to understand if the video retrieval-based methods are realistic in real robot scenarios.
UzbekStemmer: Development of a Rule-Based Stemming Algorithm for Uzbek Language
Maksud Sharipov, Ollabergan Yuldashov
In this paper, we present a rule-based stemming algorithm for the Uzbek language. Uzbek is an agglutinative language, so many words are formed by adding suffixes, and the number of suffixes is large. For this reason, it is difficult to find the stem of a word. We propose a methodology for stemming Uzbek words with an affix-stripping approach that does not rely on any database of normal Uzbek word forms. Word affixes are classified into fifteen classes and designed as finite state machines (FSMs), one per class, according to morphological rules. We created fifteen FSMs and linked them together to create the basic FSM. A lexicon of affixes in XML format was created, and a stemming application for Uzbek words was developed based on the FSMs.
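The linked-FSM idea can be sketched as follows: each affix class is a state that may strip one suffix from the end of the word and then hand over to the next class. The suffix classes and transitions below are illustrative placeholders for a minimal sketch, not the paper's actual fifteen Uzbek affix classes or its XML lexicon.

```python
# Minimal sketch of FSM-style affix stripping for an agglutinative language.
# Each "state" lists the suffixes it may strip and the state to visit next,
# mirroring how the linked FSMs consume affixes from the end of a word.
FSM = {
    "case":       {"suffixes": ["ni", "da", "dan", "ga"], "next": "possessive"},
    "possessive": {"suffixes": ["im", "ing", "i"], "next": "plural"},
    "plural":     {"suffixes": ["lar"], "next": "stem"},
}

def stem(word: str, start: str = "case", min_stem: int = 2) -> str:
    """Strip suffixes right-to-left by walking the linked FSM states."""
    state = start
    while state in FSM:
        node = FSM[state]
        # Try longest suffixes first; never strip below a minimal stem length.
        for suf in sorted(node["suffixes"], key=len, reverse=True):
            if word.endswith(suf) and len(word) - len(suf) >= min_stem:
                word = word[: -len(suf)]
                break
        state = node["next"]
    return word
```

For example, `stem("kitoblarni")` strips the case suffix `-ni` and then the plural `-lar`, leaving the stem `kitob` ("book"); a word with no matching suffixes passes through all states unchanged.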
Utilizing Language-Image Pretraining for Efficient and Robust Bilingual Word Alignment
Tuan Dinh, Jy-yong Sohn, Shashank Rajput
et al.
Word translation without parallel corpora has become feasible, rivaling the performance of supervised methods. Recent findings have shown that the accuracy and robustness of unsupervised word translation (UWT) can be improved by making use of visual observations, which are universal representations across languages. In this work, we investigate the potential of using not only visual observations but also pretrained language-image models for enabling a more efficient and robust UWT. Specifically, we develop a novel UWT method dubbed Word Alignment using Language-Image Pretraining (WALIP), which leverages visual observations via the shared embedding space of images and texts provided by CLIP models (Radford et al., 2021). WALIP has a two-step procedure. First, we retrieve word pairs with high similarity confidence, computed using our proposed image-based fingerprints; these pairs define the initial pivot for the word alignment. Second, we apply our robust Procrustes algorithm to estimate the linear mapping between the two embedding spaces, iteratively correcting and refining the estimated alignment. Our extensive experiments show that WALIP improves upon the state-of-the-art performance of bilingual word alignment for a few language pairs across different word embeddings and displays great robustness to the dissimilarity of language pairs or training corpora for the two word embeddings.
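The core Procrustes step, estimating an orthogonal map from seed word pairs, can be illustrated in two dimensions, where it has a closed form. This is only a sketch of the idea: WALIP's actual algorithm works in high-dimensional embedding spaces (typically via SVD) and adds robust iterative refinement, neither of which is shown here.

```python
import math

def procrustes_2d(X, Y):
    """Angle of the rotation R minimizing sum ||R x_i - y_i||^2
    over paired 2-D points (the 2-D orthogonal Procrustes problem)."""
    dot = sum(x0 * y0 + x1 * y1 for (x0, x1), (y0, y1) in zip(X, Y))
    cross = sum(x0 * y1 - x1 * y0 for (x0, x1), (y0, y1) in zip(X, Y))
    return math.atan2(cross, dot)

def rotate(p, theta):
    """Apply the rotation by theta to a 2-D point."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1])
```

Given seed pairs where Y is a rotated copy of X, `procrustes_2d` recovers the rotation angle exactly; with noisy pairs it returns the least-squares optimum, which is why the pivot pairs' quality matters.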
Do Not Fire the Linguist: Grammatical Profiles Help Language Models Detect Semantic Change
Mario Giulianelli, Andrey Kutuzov, Lidia Pivovarova
Morphological and syntactic changes in word usage (as captured, e.g., by grammatical profiles) have been shown to be good predictors of a word's meaning change. In this work, we explore whether large pre-trained contextualised language models, a common tool for lexical semantic change detection, are sensitive to such morphosyntactic changes. To this end, we first compare the performance of grammatical profiles against that of a multilingual neural language model (XLM-R) on 10 datasets, covering 7 languages, and then combine the two approaches in ensembles to assess their complementarity. Our results show that ensembling grammatical profiles with XLM-R improves semantic change detection performance for most datasets and languages. This indicates that language models do not fully cover the fine-grained morphological and syntactic signals that are explicitly represented in grammatical profiles. An interesting exception is the test sets where the time spans under analysis are much longer than the time gap between them (for example, century-long spans with a one-year gap between them). Morphosyntactic change is slow, so grammatical profiles fail to detect change in such cases. In contrast, language models, thanks to their access to lexical information, are able to detect fast topical changes.
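One simple way to ensemble two change-detection signals on different scales is to z-score each method's per-word change scores and average them. This is a hedged sketch of the general technique; the paper's actual ensembling of grammatical profiles with XLM-R may combine the signals differently.

```python
import statistics

def zscore(scores):
    """Standardize a list of scores to zero mean, unit (population) std."""
    mu = statistics.mean(scores)
    sd = statistics.pstdev(scores) or 1.0  # guard against constant scores
    return [(s - mu) / sd for s in scores]

def ensemble(profile_scores, lm_scores, w=0.5):
    """Weighted average of z-scored signals; higher = more semantic change."""
    return [w * p + (1 - w) * m
            for p, m in zip(zscore(profile_scores), zscore(lm_scores))]
```

Standardizing first keeps one method's larger numeric range from dominating the combined ranking; the weight `w` can then be tuned on a development set.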
Embracing AI-Based Education: Perceived Social Presence of Human Teachers and Expectations About Machine Teachers in Online Education
Jihyun Kim, Kelly Merrill Jr, Kun Xu
et al.
Technological advancements in education have turned the idea of machines as teachers into a reality. To better understand this phenomenon, the present study explores how college students develop expectations (or anticipations) about a machine teacher, particularly an AI teaching assistant. Specifically, the study examines whether students’ previous experiences with online courses taught by a human teacher would influence their expectations about AI teaching assistants in future online courses. An online survey was conducted to collect data from college students in the United States. Findings indicate that positively experienced social presence of a human teacher helps develop positive expectations about an AI teaching assistant. The study provides meaningful implications and contributions to our understanding of a machine agent in education.
Francophone : un terme qui pose problème ou/et une réalité qui dérange ?
Marc Quaghebeur
A problematic term and/or a disturbing reality
Reactions to the word "Francophone", a term whose meaning is nevertheless clear, are often wary or allergic, even downright negative, and do not fail to raise questions. They are particularly strong in the literary field, where more and more Francophone literatures are developing, and where their emergence, study, and recognition always meet resistance without equal in other linguistic areas resulting from European colonization. The explanation lies at the very heart of the history of France and of the Franco-French structures for apprehending the world, particularly through the place and conception of the language and of the literature that signifies it, which the author calls the French ideology. The effects of Parisian editorial centralism, unique in the world, are also studied, as well as the contrasting consequences of the political use made of the French language and its supposed universality. Diverse historical strata and contemporary contradictions are meticulously analysed, as well as the obstacles to conceiving and building a plural Franco-Francophone space. What the rejection of the word "Francophone" points to is the realities that it designates and fundamentally forces us to recognize. They call into question a habitus.
Speak Global, Sell Local? Digital Linguistic Landscape of Local Small Businesses in the Social Media
Enikő Biró
This paper focuses on the online presence of languages and linguistic patterns of local small businesses in a bilingual, Hungarian-Romanian ethnic community in Romania. By capturing linguistic diversity and creativity via netnographic research, patterns of linguistic landscape elements in social media, such as the marketing strategies of local small businesses, can be analysed. The findings suggest that despite the need to advertise in the state language, Romanian, in order to maximize the target audience, the concentration of Hungarian landscape elements is the highest. Businesses construct their linguistic identity through their language choices and practices, aligned with the collective linguistic identity of a bilingual community and the need for a global representation, in order to secure a place in the local market.
Theorizing race, marginalization, and language in the digital media
Deepali Mallya, Rini Susanti
Digitization of the communication medium has transformed the mute, marginalized 'audience' into a heterogeneous and credible content 'producer.' Drawing on these dynamics and the operation of digital media, this shift urges a re-theorization of 'marginalization' and 'race.' Hence, this paper critiques blogs, a digital-media tool, using rhetoric-textual analysis and critical discourse analysis of the fictional text Americanah. These methods employ a psychoanalytical-Althusserian critique of Adichie's fictional narrative. In the psychoanalytical sense, blog-writing can qualify as a mechanism of 'sublimation' in the post-modern world. In the Althusserian sense, blogs become persuasive mechanisms for a subject's interpellation into non-dominant ideology. Among the plethora of marginalized global communities, African-Americans are increasingly embracing virtual communication trends for socio-political motives. This paper theorizes the correlations between race-related blogging, psychoanalytic sublimation, and the socio-political repudiation of power structures by employing the literary text as material evidence. Accordingly, the literary study concludes that digital mediums (in this case, political blogs) can depose the power vested in the ideological state apparatuses and offer high potential for the expression of the unrestrained, credible, and democratic voice of the marginalized. It also validates that blogging influences and moulds national, political, and racial discourses by lending a liberated voice and context-independent perspective to the racially oppressed.
Classification Benchmarks for Under-resourced Bengali Language based on Multichannel Convolutional-LSTM Network
Md. Rezaul Karim, Bharathi Raja Chakravarthi, John P. McCrae
et al.
The exponential growth of social media and micro-blogging sites not only provides platforms for empowering freedom of expression and individual voices but also enables people to express anti-social behaviour such as online harassment, cyberbullying, and hate speech. Numerous works have been proposed to utilize these data for social and anti-social behaviour analysis, document characterization, and sentiment analysis by predicting the contexts, mostly for highly resourced languages such as English. However, some languages are under-resourced, e.g., South Asian languages like Bengali, Tamil, Assamese, and Telugu, which lack computational resources for NLP tasks. In this paper, we provide several classification benchmarks for Bengali, an under-resourced language. We prepared three datasets of expressed hate, commonly used topics, and opinions for hate speech detection, document classification, and sentiment analysis, respectively. We built the largest Bengali word embedding models to date, based on 250 million articles, which we call BengFastText. We perform three different experiments, covering document classification, sentiment analysis, and hate speech detection. We incorporate word embeddings into a Multichannel Convolutional-LSTM (MConv-LSTM) network for predicting different types of hate speech, document classes, and sentiment. Experiments demonstrate that BengFastText can correctly capture the semantics of words from their respective contexts. Evaluations against several baseline embedding models, e.g., Word2Vec and GloVe, yield up to 92.30%, 82.25%, and 90.45% F1-scores for document classification, sentiment analysis, and hate speech detection, respectively, during 5-fold cross-validation tests.
Situated Data, Situated Systems: A Methodology to Engage with Power Relations in Natural Language Processing Research
Lucy Havens, Melissa Terras, Benjamin Bach
et al.
We propose a bias-aware methodology to engage with power relations in natural language processing (NLP) research. NLP research rarely engages with bias in social contexts, limiting its ability to mitigate bias. While researchers have recommended actions, technical methods, and documentation practices, no methodology exists to integrate critical reflections on bias with technical NLP methods. In this paper, after an extensive and interdisciplinary literature review, we contribute a bias-aware methodology for NLP research. We also contribute a definition of biased text, a discussion of the implications of biased NLP systems, and a case study demonstrating how we are executing the bias-aware methodology in research on archival metadata descriptions.