About the Authors
R. Wilson
Art Graesser is a Professor in the Department of Psychology and the Institute of Intelligent Systems at the University of Memphis as well as an Honorary Research Fellow at the University of Oxford. His primary research interests are in cognitive science, discourse processing, and the learning sciences. More specific interests include knowledge representation, question asking and answering, tutoring, text comprehension, conversation, emotions, artificial intelligence, and computational linguistics. He served as editor of the journals Discourse Processes (1996–2005) and Journal of Educational Psychology (2009–2014). His service in professional societies includes having been president of the Empirical Studies of Literature, Art, and Media (1989–1992), the Society for Text and Discourse (2007–2010), the International Society for Artificial Intelligence in Education (2007–2009), and the Federation of Associations in the Behavioral and Brain Sciences Foundation (2012–2013). He and his colleagues have developed and tested software in learning, language, and discourse technologies, including those that hold a conversation in natural language and interact with multimedia (such as AutoTutor) and those that analyze text on multiple levels of language and discourse (Coh-Metrix and QUAID). He has served on four Organisation for Economic Co-operation and Development (OECD) expert panels on problem solving, including acting as chair of the Programme for International Student Assessment (PISA) 2015 Collaborative Problem Solving.
THE THEORY OF CELEBRITY POLITICAL MARKETING AND THE VISIT OF FRENCH PRESIDENT EMMANUEL MACRON TO KINSHASA IN 2023. MARKETING AND MARKETOLOGY: THE EPISTEMOLOGICAL BREAK
Bob BOBUTAKA Bateko
Abstract: The conception of the theory of Celebrity Political Marketing is a contribution toward establishing a scientific framework in the context of market studies. This scientific theory follows the logic of the epistemological break in the scientific status of marketing; consequently, since the end of the 2010s, we have conceptualized Marketology, which is the science of the market, as distinct from Marketing, which is the action of the market. The visit of French President Emmanuel Macron to Kinshasa in 2023 is the event that prompted the construction of this theory. Keywords: Theory of Celebrity Political Marketing, Emmanuel Macron, Marketing, Marketology, Epistemological break
Arts in general, Computational linguistics. Natural language processing
Automatic detection of manipulated Bangla news: A new knowledge-driven approach
Aysha Akther, Kazi Masudul Alam, Rameswar Debnath
In recent years, dissemination of misleading news has become easier than ever due to the simplicity of creating and distributing news content on online media platforms. Misleading news detection has become a global topic of interest due to its significant impact on society, economics, and politics. Automatic detection of the veracity of news remains challenging because of its diversity and close resemblance to true events. In many languages, fake news detection has been studied from different perspectives. However, in Bangla, existing work on fake news detection has generally relied on linguistic style analysis and on latent representation-based machine learning and deep learning models. These models primarily rely on manually labeled annotations. To address these challenges, we propose a knowledge-based Bangla fake news detection model that does not require model training. In our proposed manipulation detection approach, a news article is automatically labeled as fake or authentic based on an authenticity score that relies on the consistency of knowledge and semantics, the underlying sentiment, and the credibility of the news source. We also propose a consistent and context-aware manipulated news generation technique to facilitate the detection of partially manipulated Bangla news. We found the proposed model to be reliable for the detection of both fake news and partially manipulated news. We also developed a dataset for Bangla fake news detection that is balanced between authentic and fake news items, collected from multiple domains and various news sources. The experimental evaluation of our proposed knowledge-driven approach on the developed dataset has shown 97.08% accuracy for fake news detection alone.
Computational linguistics. Natural language processing
Part-of-Speech Tagging of 16th-Century Latin with GPT
Elina Stüssi, Phillip Ströbel
Part-of-speech tagging is foundational to natural language processing, transcending mere linguistic functions. However, taggers optimized for Classical Latin struggle when faced with diverse linguistic eras shaped by the language’s evolution. Exploring 16th-century Latin from the correspondence and assessing five Latin treebanks, we focused on carefully evaluating tagger accuracy and refining Large Language Models for improved performance in this nuanced linguistic context. Our discoveries unveiled the competitive accuracies of different versions of GPT, particularly after fine-tuning. Notably, our best fine-tuned model soared to an average accuracy of 88.99% over the treebank data, underscoring the remarkable adaptability and learning capabilities when fine-tuned to the specific intricacies of Latin texts. Next to emphasising GPT’s part-of-speech tagging capabilities, our second aim is to strengthen taggers’ adaptability across different periods. We establish solid groundwork for using Large Language Models in specific natural language processing tasks where part-of-speech tagging is often employed as a pre-processing step. This work significantly advances the use of modern language models in interpreting historical language, bridging the gap between past linguistic epochs and modern computational linguistics.
Parbala: An Important Dramatic Character
Majid Mushtaq Dr
Urdu literature has many genres, such as the ghazal, the nazm, the novel, the short story, and drama. Sudarshan is one of the early renowned short story writers in Urdu and had an equal talent for writing drama. His stories have a romantic aura. Parbala is an important character in his drama Parbal, which he wrote in 1925. In this article, using close reading methods, the salient features of this character are brought forth. It is argued that, despite the social and cultural bounds of his time, Sudarshan created a female character who has a courageous personality and the capacity to turn the tables. In Parbala we can find a balance between hard-core feminism and the sublime qualities of traditional culture.
Language. Linguistic theory. Comparative grammar, Computational linguistics. Natural language processing
THE PROBLEM OF THE LIVING CONDITIONS OF CONGOLESE PROFESSORS IN RELATION TO THEIR SCIENTIFIC RESEARCH
MWANZA ILOTI Jean Claude
Abstract: The role and living conditions of university professors in the Democratic Republic of the Congo are not safeguarded by the supervising authority, which gives rise to multiple ills that afflict teaching staff in their profession. The lives of these men and their families depend on the State to keep them from peril and to avoid spreading a gradual disengagement across the country as a whole and among the young intellectuals of tomorrow. Everything that is vital comes from long research, matured and tested by scientists around the world and in the Democratic Republic of the Congo, following the principles laid down by the state authority, which sets the conditions for obtaining the coveted title conferred on intellectuals, above all through education, which must serve the soul of the people and the State in general, and the future scholar in particular.
Keywords: Education, State, Professor, University, Conditions and Role.
Arts in general, Computational linguistics. Natural language processing
OSFS‐Vague: Online streaming feature selection algorithm based on vague set
Jie Yang, Zhijun Wang, Guoyin Wang
et al.
Abstract: Online streaming feature selection (OSFS), an online learning approach for handling streaming features, is critical in addressing high‐dimensional data. In real big data‐related applications, the patterns and distributions of streaming features constantly change over time due to dynamic data generation environments. However, existing OSFS methods rely on preset and fixed hyperparameters, which undoubtedly leads to poor selection performance when encountering dynamic features. To address these shortcomings, the authors propose a novel OSFS algorithm based on vague set, named OSFS‐Vague. Its main idea is to combine uncertainty and three‐way decision theories to improve feature selection from the traditional dichotomous method to the trichotomous method. OSFS‐Vague also improves the calculation method of correlation between features and labels. Moreover, OSFS‐Vague uses the distance correlation coefficient to classify streaming features into relevant features, weakly redundant features, and redundant features. Finally, the relevant features and weakly redundant features are filtered for an optimal feature set. To evaluate the proposed OSFS‐Vague, extensive empirical experiments have been conducted on 11 datasets. The results demonstrate that OSFS‐Vague outperforms six state‐of‐the‐art OSFS algorithms in terms of selection accuracy and computational efficiency.
Computational linguistics. Natural language processing, Computer software
Private Ownership of Land and the communio fundi originaria: Between Hostility and Hospitality
Emmanuel LOKULI IYELE
Abstract: Yesterday and today, the spread of evil, or rather of violence, among the peoples of the world is, in my view, to be located on one single ultimate axis: private ownership of land. Civil wars, wars between states, and even ethnic wars result from each party's unbridled pursuit of ever larger private landholdings and of all the advantages that come with them, in order to swell its own portfolio while trampling on the idea of the general interest of humanity. This violence, which often manifests itself in diplomatic relations, creates a climate in which humanity tears itself apart cosmologically and ontologically. Yet humanity, by its nature, remains One and Multiple. While its unity consists in all human beings belonging to the same species, the human species, and to the same original universal community, its multiplicity lies in cultural diversity. On the basis of this unity, a primordial principle, this article, inspired by our research on Locke and Kant, posits as its postulate the "communio fundi originaria," the "original community." It is from this postulate that we attempt to justify the need to promote universal hospitality, a minimum subsistence income (RME), and reasoned cohabitation among the peoples of the earth, together with a supervisory mechanism.
Keywords: property, community, inequalities, RME and hospitality
Arts in general, Computational linguistics. Natural language processing
Introduction to the Special Issue on Summarization
Dragomir R. Radev, E. Hovy, K. McKeown
592 citations
en
Computer Science
AI Nüshu (Women's scripts) - An Exploration of Language Emergence in Sisterhood
Yuying Tang, Yuqian Sun, Ze Gao
et al.
This paper presents "AI Nüshu," an emerging language system inspired by Nüshu (women's scripts), the unique language created and used exclusively by ancient Chinese women who were illiterate in a patriarchal society. Through an interactive art installation, two artificial intelligence (AI) agents continuously observe their environment and communicate with each other, developing a writing system that encodes Chinese. In this system, the two AI agents observe the environment through cameras, record the unconscious behaviors of the audience, and generate summaries of their observations through visual recognition. Subsequently, each agent associates the corresponding original Nüshu poetry lines and generates new poetry text through a large language model (LLM), representing its reflection. To develop their language, they continuously switch roles between speaker and listener, constantly communicating their reflections and encrypting a word in the poetry line with their self-created AI Nüshu character, allowing the other to guess and learn. Gradually, they reach a consensus on AI Nüshu, forming a unique "AI Nüshu Dictionary" for machines. This language, algorithmically combined into corresponding characters, has components derived from Nüshu, similar to Chinese characters and traditional textile patterns. Thus, like ancient women, the two agents gradually developed their own Chinese writing system, corresponding one-to-one with Chinese characters. In contrast, humans, as the authority of the language system, became an object observed, interpreted, and inspired by machines to stimulate non-human language. This is the first media art project to interpret Nüshu from a computational linguistics perspective, infusing AI and art research with non-English natural language processing, Chinese cultural heritage, and a feminist viewpoint. This encourages the creation of more non-English, linguistically-oriented artworks for diverse cultures.
We simulate communication in sisterhood through a multi-agent learning system, which questions knowledge authority between humans and machines through the lens of language development.
10 citations
en
Computer Science
A new fuzzy support vector machine with pinball loss
Ram Nayan Verma, Rahul Deo, Rakesh Srivastava
et al.
Abstract: The fuzzy support vector machine (FSVM) assigns each sample a fuzzy membership value based on its relevance, making it less sensitive to noise or outliers in the data. Although FSVM has had some success in avoiding the negative effects of noise, it uses hinge loss, which maximizes the shortest distance between two classes and is ineffective in dealing with feature noise near the decision boundary. Furthermore, whereas FSVM concentrates on misclassification errors, it neglects the critical within-class scatter minimization. To improve the performance of FSVM, we present a fuzzy support vector machine with pinball loss (FPin-SVM), which is a fuzzy extension of a reformulation of a recently proposed support vector machine with pinball loss (Pin-SVM) with several significant improvements. First, because we use the squared L2-norm of the error variables instead of the L1 norm, our FPin-SVM is a strongly convex minimization problem. Second, to speed up the training procedure, solutions of the proposed FPin-SVM, as an unconstrained minimization problem, are obtained using the functional iterative and Newton methods. Third, it is proposed to solve the minimization problem directly in the primal. Unlike FSVM and Pin-SVM, our FPin-SVM does not require a toolbox for optimization. We dig deeper into the features of FPin-SVM, such as noise insensitivity and within-class scatter minimization. We conducted experiments on synthetic and real-world datasets with various kinds of noise to validate the usefulness of the suggested approach. Compared to the SVM, FSVM, and Pin-SVM, the presented approaches demonstrate equivalent or superior generalization performance in less training time.
Computational linguistics. Natural language processing, Electronic computers. Computer science
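For readers unfamiliar with the two losses contrasted in the abstract above, the following is a minimal sketch of the standard hinge loss and the standard pinball loss for a single sample. It is not the authors' FPin-SVM: the function names, the parameter name tau, and the example values are assumptions made for illustration, and the fuzzy membership weighting and squared-norm reformulation described in the abstract are not included.

```python
def hinge_loss(y, score):
    """Standard hinge loss for a label y in {-1, +1} and a real-valued score."""
    u = 1.0 - y * score          # margin violation; negative when the margin exceeds 1
    return max(0.0, u)


def pinball_loss(y, score, tau=0.5):
    """Standard pinball loss with parameter tau in [0, 1] (name assumed here).

    Unlike hinge loss, samples whose margin exceeds 1 still incur a
    penalty of tau * |u|, so the loss depends on how the scores are
    spread around the margin, not only on the samples nearest the boundary.
    """
    u = 1.0 - y * score
    return u if u >= 0 else -tau * u


# A sample with margin 2: hinge loss ignores it, pinball loss does not.
print(hinge_loss(1, 2.0))             # 0.0
print(pinball_loss(1, 2.0, tau=0.5))  # 0.5
```

With tau set to 0 the pinball loss coincides with the hinge loss, which is one way to see it as a generalization rather than a replacement.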
Introduction to Mathematical Language Processing: Informal Proofs, Word Problems, and Supporting Tasks
Jordan Meadows, André Freitas
Computational linguistics. Natural language processing
AI-powered narrative building for facilitating public participation and engagement
Fernando Marmolejo-Ramos, Thomas Workman, Clint Walker
et al.
Abstract: Algorithms, data, and AI (ADA) technologies permeate most societies worldwide because of their proven benefits in different areas of life. Governments are the entities in charge of harnessing the benefits of ADA technologies above and beyond providing government services digitally. ADA technologies have the potential to transform the way governments develop and deliver services to citizens, and the way citizens engage with their governments. Conventional public engagement strategies employed by governments have limited both the quality and diversity of deliberation between the citizen and their governments, and the potential for ADA technologies to be employed to improve the experience for both governments and the citizens they serve. In this article we argue that ADA technologies can improve the quality, scope, and reach of public engagement by governments, particularly when coupled with other strategies to ensure legitimacy and accessibility among a broad range of communities and other stakeholders. In particular, we explore the role “narrative building” (NB) can play in facilitating public engagement through the use of ADA technologies. We describe a theoretical implementation of NB enhanced by adding natural language processing, expert knowledge elicitation, and semantic differential rating scales capabilities to increase gains in scale and reach. The theoretical implementation focuses on the public’s opinion on ADA-related technologies, and it derives implications for ethical governance.
Computational linguistics. Natural language processing, Electronic computers. Computer science
A Systematic Literature Review of Automated ICD Coding and Classification Systems using Discharge Summaries
R. Kaur, J. A. Ginige, O. Obst
Codification of free-text clinical narratives has long been recognised as beneficial for secondary uses such as funding, insurance claim processing and research. Currently, assigning codes is a manual process that is very expensive, time-consuming and error-prone. In recent years, many researchers have studied the use of Natural Language Processing (NLP) and related Machine Learning (ML) and Deep Learning (DL) methods and techniques to resolve the problem of manual coding of clinical narratives and to help human coders assign clinical codes more accurately and efficiently. This systematic literature review provides a comprehensive overview of automated clinical coding systems that utilise appropriate NLP, ML and DL methods and techniques to assign ICD codes to discharge summaries. We followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines and conducted a comprehensive search of publications from January 2010 to December 2020 in four academic databases: PubMed, ScienceDirect, the Association for Computing Machinery (ACM) Digital Library, and the Association for Computational Linguistics (ACL) Anthology. We reviewed 7,556 publications; 38 met the inclusion criteria. This review identified datasets containing discharge summaries, NLP techniques along with other data extraction processes, and different feature extraction and embedding techniques. To measure the performance of classification methods, different evaluation metrics are used. Lastly, future research directions are provided for scholars who are interested in automated ICD code assignment. Efforts are still required to improve ICD code prediction accuracy and the availability of large-scale de-identified clinical corpora with the latest version of the classification system. This can be a platform to guide and share knowledge with less experienced coders and researchers.
30 citations
en
Computer Science
Argumentation Mining
Marco Lippi, Paolo Torroni
Argumentation is pervasive, in everyday life as well as in politics and media. People exchange arguments to persuade each other, to achieve agreement, to make decisions, and more. Argumentation has been studied since ancient times (Aristotle 2007), and it is an active research topic across disciplines today, from logic to rhetoric to linguistics (van Eemeren et al. 2014). In a time of alternative facts and filter bubbles, arguments are of ever-increasing importance; when the truth of facts is unclear, we need to compare reasons for opposing claims, and we should do so beyond our own view. Computational research on argumentation is still young. It first evolved in the AI community, oriented toward formal argumentation (Dung 1995). Except for some early pioneering works, the natural language side has been getting attention only since the publication of the first approaches to mine arguments from text (Palau and Moens 2009). Since then, computational linguistics research on argumentation has grown constantly, and impressive industrial applications such as IBM’s Debater have started to appear. The ArgMining workshop series has existed since 2014, taking place annually at ACL, EMNLP, or NAACL. Argumentation Mining is the first textbook on the topic. In line with ongoing research activities, it tackles not only the identification and classification of claims, reasons, and their relations, but also the assessment and generation of argumentation. Although the focus is on natural language processing (NLP) techniques, the book covers connections to formal argumentation and some underlying techniques. The complementary pair of authors was a smart choice in this regard: Manfred Stede is one of the leading computational linguists for discourse processing, and Jodi Schneider represents the AI community and has specific expertise on scholarly communication and knowledge organization.
Both have co-authored several papers on argumentation, and they obviously have a comprehensive overview of the topic; I couldn’t think of more than a few papers that I missed being discussed. The book contains ten chapters. After definitions and relevant basics (Chapters 1–2), it outlines common argument models and existing corpora (Chapters 3–4). Chapters 5 to 7 present computational aspects of argumentation mining, followed by assessment and generation approaches (Chapters 8–9), and a final conclusion (Chapter 10). Chapter 1 begins with a very compressed definition of argumentation from the literature. It then explicates each single term in the definition one after another, directly
122 citations
en
Computer Science, Philosophy
A Refutation of Finite-State Language Models through Zipf’s Law for Factual Knowledge
L. Debowski
We present a hypothetical argument against finite-state processes in statistical language modeling that is based on semantics rather than syntax. In this theoretical model, we suppose that the semantic properties of texts in a natural language could be approximately captured by a recently introduced concept of a perigraphic process. Perigraphic processes are a class of stochastic processes that satisfy a Zipf-law accumulation of a subset of factual knowledge, which is time-independent, compressed, and effectively inferrable from the process. We show that the classes of finite-state processes and of perigraphic processes are disjoint, and we present a new simple example of perigraphic processes over a finite alphabet called Oracle processes. The disjointness result makes use of the Hilberg condition, i.e., the almost sure power-law growth of algorithmic mutual information. Using a strongly consistent estimator of the number of hidden states, we show that finite-state processes do not satisfy the Hilberg condition whereas Oracle processes satisfy the Hilberg condition via the data-processing inequality. We discuss the relevance of these mathematical results for theoretical and computational linguistics.
9 citations
en
Computer Science, Medicine
Parts-of-Speech tagging for Malayalam using deep learning techniques
K. Akhil, R. Rajimol, V. Anoop
35 citations
en
Computer Science
Classifying Non-Sentential Utterances in Dialogue: A Machine Learning Approach
Raquel Fernández, Jonathan Ginzburg, Shalom Lappin
Computational linguistics. Natural language processing
Toward Gender-Inclusive Coreference Resolution: An Analysis of Gender and Bias Throughout the Machine Learning Lifecycle
Yang Trista Cao, Hal Daumé
Abstract: Correctly resolving textual mentions of people fundamentally entails making inferences about those people. Such inferences raise the risk of systematic biases in coreference resolution systems, including biases that can harm binary and non-binary trans and cis stakeholders. To better understand such biases, we foreground nuanced conceptualizations of gender from sociology and sociolinguistics, and investigate where in the machine learning pipeline such biases can enter a coreference resolution system. We inspect many existing data sets for trans-exclusionary biases, and develop two new data sets for interrogating bias in both crowd annotations and in existing coreference resolution systems. Through these studies, conducted on English text, we confirm that without acknowledging and building systems that recognize the complexity of gender, we will build systems that fail for: quality of service, stereotyping, and over- or under-representation, especially for binary and non-binary trans users.
Computational linguistics. Natural language processing
The Polish Language in Egypt: Current Status and Speakers’ Preferences
Katarzyna Dzierżawin
The article describes the multilingualism and language preferences of the Polish community living in Egypt. It is the first study of multilingualism involving both Polish and Arabic. The field study was carried out from October to December 2020 using surveys and face-to-face interviews. It covered 54 people, who belong mainly to the first generation of migrants, as well as the second and third generations of the Polish diaspora. The research aims to explore the process of becoming and being trilingual, and how the use of languages changes in emigration, depending on the spheres of life and self-realization among different generations of the Polish diaspora.
The Status of the Polish Language and the Language Preferences of the Polish Diaspora in Egypt
The author describes the multilingualism and language preferences of Poles living in Egypt. The article presents the first study of multilingualism involving Polish and Arabic. The study was carried out in October 2020 using surveys and supplementary interviews. It covered 54 people belonging mainly to the first generation of migrants, as well as the second and third generations of the Polish diaspora. The study aims to explore the process of becoming (and being) trilingual and how, under conditions of emigration, language use changes among different generations of the Polish diaspora depending on the spheres of life and self-realization.
Computational linguistics. Natural language processing, Semantics