Sequential Classification of Aviation Safety Occurrences with Natural Language Processing
Aziida Nanyonga, Hassan Wasswa, Ugur Turhan
et al.
Safety is a critical aspect of the air transport system, given that even slight operational anomalies can result in serious consequences. To reduce the likelihood of aviation safety occurrences, accidents and incidents are reported and investigated to establish the root cause and propose safety recommendations. However, the narratives describing pre-accident events are presented as human-readable, raw, unstructured text that a computer system cannot directly process. The ability to classify and categorise safety occurrences from their textual narratives would help aviation industry stakeholders make informed safety-critical decisions. To classify and categorise safety occurrences, we applied natural language processing (NLP) and artificial intelligence (AI) models to the text narratives. The study aimed to answer the question: how well can the damage level caused to the aircraft in a safety occurrence be inferred from the text narrative using natural language processing? The classification performance of various deep learning models, including LSTM, BLSTM, GRU, and sRNN, as well as their combinations (LSTM+GRU, BLSTM+GRU, sRNN+LSTM, sRNN+BLSTM, sRNN+GRU, sRNN+BLSTM+GRU, and sRNN+LSTM+GRU), was evaluated on a set of 27,000 safety occurrence reports from the NTSB. The results indicate that all models investigated performed competitively, each recording an accuracy above 87.9%, well above the 25% random-guess baseline for a four-class classification problem. The models also recorded high precision, recall, and F1 scores, above 80%, 88%, and 85%, respectively. sRNN slightly outperformed the other single models in terms of recall (90%) and accuracy (90%), while LSTM reported slightly better precision (87%).
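As a rough illustration of the model family evaluated in this paper, the following is a minimal sketch (not the authors' implementation) of a recurrent text classifier in Keras; the vocabulary size, sequence length, and label set are assumptions for a four-class damage-level task.

```python
# Minimal sketch (not the authors' implementation): an LSTM classifier for
# four damage-level classes predicted from occurrence narratives.
import tensorflow as tf
from tensorflow.keras import layers

NUM_CLASSES = 4          # assumed damage levels, e.g. destroyed / substantial / minor / none
VOCAB_SIZE = 20000
MAX_LEN = 300

vectorizer = layers.TextVectorization(
    max_tokens=VOCAB_SIZE, output_sequence_length=MAX_LEN)

model = tf.keras.Sequential([
    vectorizer,
    layers.Embedding(VOCAB_SIZE, 128, mask_zero=True),
    layers.LSTM(64),                      # swap for GRU or Bidirectional(LSTM) to compare variants
    layers.Dense(64, activation="relu"),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# train_texts: list of narrative strings; train_labels: integer labels in [0, 3]
# vectorizer.adapt(train_texts)
# model.fit(tf.constant(train_texts), tf.constant(train_labels), epochs=5)
```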
Can Large Language Models Predict Audio Effects Parameters from Natural Language?
Seungheon Doh, Junghyun Koo, Marco A. Martínez-Ramírez
et al.
In music production, manipulating audio effects (Fx) parameters through natural language has the potential to reduce technical barriers for non-experts. We present LLM2Fx, a framework leveraging Large Language Models (LLMs) to predict Fx parameters directly from textual descriptions without requiring task-specific training or fine-tuning. Our approach addresses the text-to-effect parameter prediction (Text2Fx) task by mapping natural language descriptions to the corresponding Fx parameters for equalization and reverberation. We demonstrate that LLMs can generate Fx parameters in a zero-shot manner that elucidates the relationship between timbre semantics and audio effects in music production. To enhance performance, we introduce three types of in-context examples: audio Digital Signal Processing (DSP) features, DSP function code, and few-shot examples. Our results demonstrate that LLM-based Fx parameter generation outperforms previous optimization approaches, offering competitive performance in translating natural language descriptions to appropriate Fx settings. Furthermore, LLMs can serve as text-driven interfaces for audio production, paving the way for more intuitive and accessible music production tools.
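A minimal sketch of how such zero-shot text-to-Fx prompting could be set up; this is not the authors' code, `call_llm` is a placeholder for whatever LLM API is available, and the equalizer schema below is illustrative only.

```python
# Sketch of zero-shot text-to-Fx-parameter prompting in the spirit of LLM2Fx
# (not the authors' code). `call_llm` is a placeholder for any chat/completion
# API; the parameter schema is an illustrative assumption.
import json

EQ_SCHEMA = {
    "low_shelf_gain_db": "float, -12 to 12",
    "mid_peak_gain_db": "float, -12 to 12",
    "mid_peak_freq_hz": "float, 200 to 5000",
    "high_shelf_gain_db": "float, -12 to 12",
}

def build_prompt(description: str, dsp_features: dict | None = None) -> str:
    """Compose a prompt asking for EQ parameters as JSON; optional in-context
    DSP features (e.g. spectral centroid) can be appended as extra evidence."""
    prompt = (
        "You are an audio engineer. Given the timbre description below, "
        "return equalizer parameters as JSON matching this schema:\n"
        f"{json.dumps(EQ_SCHEMA, indent=2)}\n\n"
        f"Description: {description}\n"
    )
    if dsp_features:
        prompt += f"Input audio DSP features: {json.dumps(dsp_features)}\n"
    prompt += "Answer with JSON only."
    return prompt

def parse_fx_parameters(llm_output: str) -> dict:
    """Extract the JSON object from the model's reply."""
    start, end = llm_output.find("{"), llm_output.rfind("}") + 1
    return json.loads(llm_output[start:end])

# reply = call_llm(build_prompt("make the guitar sound warm and less harsh",
#                               {"spectral_centroid_hz": 3400.0}))
# params = parse_fx_parameters(reply)
```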
Language Bias in Self-Supervised Learning For Automatic Speech Recognition
Edward Storey, Naomi Harte, Peter Bell
Self-supervised learning (SSL) is used in deep learning to train on large datasets without the need for expensive labelling of the data. Recently, large Automatic Speech Recognition (ASR) models such as XLS-R have utilised SSL to train on over one hundred different languages simultaneously. However, deeper investigation shows that the bulk of the training data for XLS-R comes from a small number of languages. Biases learned through SSL have been shown to exist in multiple domains, but language bias in multilingual SSL ASR has not been thoroughly examined. In this paper, we utilise the Lottery Ticket Hypothesis (LTH) to identify language-specific subnetworks within XLS-R and test the performance of these subnetworks on a variety of different languages. We are able to show that when fine-tuning, XLS-R bypasses traditional linguistic knowledge and builds only on weights learned from the languages with the largest data contribution to the pretraining data.
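For illustration, one step of a Lottery-Ticket-style subnetwork extraction can be sketched with PyTorch's pruning utilities; this is an assumption-laden simplification, not the paper's exact procedure.

```python
# Sketch of Lottery-Ticket-style subnetwork identification (not the paper's
# exact method): fine-tune on a target language, prune the smallest-magnitude
# weights, and keep the resulting binary mask as the "language-specific subnetwork".
import torch
import torch.nn.utils.prune as prune

def extract_subnetwork_masks(model: torch.nn.Module, amount: float = 0.7) -> dict:
    """Prune `amount` of each Linear layer's weights by L1 magnitude and
    return the surviving masks (1 = weight kept in the subnetwork)."""
    masks = {}
    for name, module in model.named_modules():
        if isinstance(module, torch.nn.Linear):
            prune.l1_unstructured(module, name="weight", amount=amount)
            masks[name] = module.weight_mask.detach().clone()
            prune.remove(module, "weight")  # make pruning permanent, drop reparametrization
    return masks

# masks_a = extract_subnetwork_masks(model_finetuned_on_language_a)
# masks_b = extract_subnetwork_masks(model_finetuned_on_language_b)
# overlap = {k: (masks_a[k] * masks_b[k]).mean().item() for k in masks_a}
```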
Data Augmentation with Back Translation for Low-Resource Languages: A Case of English and Luganda
Richard Kimera, Dongnyeong Heo, Daniela N. Rim
et al.
In this paper, we explore the application of back translation (BT) as a semi-supervised technique to enhance Neural Machine Translation (NMT) models for the English-Luganda language pair, specifically addressing the challenges faced by low-resource languages. The purpose of our study is to demonstrate how BT can mitigate the scarcity of bilingual data by generating synthetic data from monolingual corpora. Our methodology involves developing custom NMT models using both publicly available and web-crawled data, and applying iterative and incremental back translation techniques. We strategically select datasets for incremental back translation across multiple small datasets, which is a novel element of our approach. The results of our study show significant improvements, with translation performance for the English-Luganda pair exceeding previous benchmarks by more than 10 BLEU points in all translation directions. Additionally, our evaluation incorporates comprehensive assessment metrics such as SacreBLEU, ChrF2, and TER, providing a nuanced understanding of translation quality. The conclusion drawn from our research confirms the efficacy of BT when strategically curated datasets are utilized, establishing new performance benchmarks and demonstrating the potential of BT in enhancing NMT models for low-resource languages.
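A minimal back-translation sketch under stated assumptions (the checkpoint name below is a placeholder, not a published model), showing how a reverse Luganda-to-English model could generate synthetic parallel data for the forward direction.

```python
# Minimal back-translation sketch (not the authors' pipeline): a reverse model
# (target -> source) generates synthetic source sentences for monolingual target
# text; the synthetic pairs are then added to the forward model's training data.
# The checkpoint name is a placeholder, not a real published model.
from transformers import MarianMTModel, MarianTokenizer

REVERSE_CKPT = "your-org/nmt-lug-to-eng"   # hypothetical Luganda->English model

tokenizer = MarianTokenizer.from_pretrained(REVERSE_CKPT)
reverse_model = MarianMTModel.from_pretrained(REVERSE_CKPT)

def back_translate(monolingual_lug: list[str]) -> list[tuple[str, str]]:
    """Return (synthetic English, original Luganda) pairs for en->lg training."""
    batch = tokenizer(monolingual_lug, return_tensors="pt",
                      padding=True, truncation=True)
    generated = reverse_model.generate(**batch, max_new_tokens=128)
    synthetic_en = tokenizer.batch_decode(generated, skip_special_tokens=True)
    return list(zip(synthetic_en, monolingual_lug))

# synthetic_pairs = back_translate(["<monolingual Luganda sentence>", ...])
# Incremental BT: retrain the forward (en->lg) model on real + synthetic pairs,
# then repeat with the improved models on the next small dataset.
```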
Enhancing Construction Site Safety: Natural Language Processing for Hazards Identification and Prevention
Shrutika Ballal, K. A. Patel, D. A. Patel
Construction sites are well known for inherent risks that negatively impact the safety and well-being of workers. Identifying and minimising these hazards is critical for preventing accidents and creating a safe working environment. Traditional techniques of hazard identification in construction rely on visual assessments and professional expertise, which can be time-consuming and subjective. The goal of this research is to identify traits that indicate potential dangers in the construction industry by extracting meaningful information from accident narratives. This is achieved through a rule-based iteration approach, using the Natural Language Toolkit (NLTK) for keyword extraction and text tokenization. NLTK supports natural language processing (NLP), a branch of artificial intelligence and computational linguistics concerned with the interaction between computers and human language. The research methodology combines NLTK with a rule-based iteration approach to extract hazards from construction-related accident narratives. The proposed approach includes gathering accident narratives, pre-processing the data, performing textual analysis with NLP tools for information extraction, and training the algorithm on the identified attributes. The textual analysis ultimately extracts the significant sources of danger that cause accidents. The study contributes to the developing field of construction safety management by utilizing the capabilities of NLP to enhance hazard detection, resulting in safer construction practices and fewer occupational hazards. The findings emphasise the accuracy with which NLP approaches detect dangers, allowing construction professionals to proactively reduce risks and enhance overall safety on construction sites.
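A small sketch of the kind of NLTK-based tokenization and keyword extraction described above; the hazard lexicon and matching rule are illustrative assumptions, not the paper's rule set.

```python
# Sketch of NLTK-based tokenization and keyword extraction for accident
# narratives (illustrative only; the paper's iterative rules are not reproduced).
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

nltk.download("punkt", quiet=True)
nltk.download("stopwords", quiet=True)

# Illustrative hazard lexicon; a real system would grow this iteratively
# from annotated narratives.
HAZARD_TERMS = {"scaffold", "ladder", "crane", "trench", "electrocution", "fall"}

def extract_hazard_keywords(narrative: str) -> dict:
    tokens = [t.lower() for t in word_tokenize(narrative) if t.isalpha()]
    tokens = [t for t in tokens if t not in stopwords.words("english")]
    freq = nltk.FreqDist(tokens)
    return {term: freq[term] for term in HAZARD_TERMS if freq[term] > 0}

# extract_hazard_keywords("Worker fell from an unsecured ladder near the trench.")
# -> {'ladder': 1, 'trench': 1}
```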
Industrial engineering. Management engineering
State of What Art? A Call for Multi-Prompt LLM Evaluation
Moran Mizrahi, Guy Kaplan, Dan Malkin
et al.
Computational linguistics. Natural language processing
The role of topic shift and conversation turn in the intonation of Italian wh-questions
Patrizia Sorianello
The aim of this study is to analyse the intonation patterns of Italian wh-questions in relation to epistemic orientation, topic shift and turn organisation. The research conducted so far has shown that wh-questions have both falling and rising final contours. Linguistic, sociolinguistic, and pragmatic factors are recognised to affect intonation patterns. Nevertheless, the influence of conversational elements on intonation contours remains unclear and requires further research.
This study analyses a corpus of unplanned conversations to evaluate how the retention/change of topic and the maintenance/transfer of dialogic turn affect the intonation of questions. The research revealed two significant findings. Firstly, wh-question categorisation shows that the degree of epistemic certainty has a profound impact on topic shift and floor passing. Secondly, the results suggest that conversational aspects do not significantly affect the final intonation patterns. However, they do have a relevant effect on the initial pitch level of the questions, leading to an increase in the onset and overall pitch range.
Computational linguistics. Natural language processing, Language. Linguistic theory. Comparative grammar
Real-Time Multilingual Sign Language Processing
Amit Moryossef
Sign Language Processing (SLP) is an interdisciplinary field combining Natural Language Processing (NLP) and Computer Vision. It is focused on the computational understanding, translation, and production of signed languages. Traditional approaches have often been constrained by the use of gloss-based systems that are both language-specific and inadequate for capturing the multidimensional nature of sign language. These limitations have hindered the development of technology capable of processing signed languages effectively. This thesis aims to revolutionize the field of SLP by proposing a simple paradigm that can bridge this existing technological gap. We propose the use of SignWiring, a universal sign language transcription notation system, to serve as an intermediary link between the visual-gestural modality of signed languages and text-based linguistic representations. We contribute foundational libraries and resources to the SLP community, thereby setting the stage for a more in-depth exploration of the tasks of sign language translation and production. These tasks encompass the translation of sign language from video to spoken language text and vice versa. Through empirical evaluations, we establish the efficacy of our transcription method as a pivot for enabling faster, more targeted research that can lead to more natural and accurate translations across a range of languages. The universal nature of our transcription-based paradigm also paves the way for real-time, multilingual applications in SLP, thereby offering a more inclusive and accessible approach to language technology. This is a significant step toward universal accessibility, enabling a wider reach of AI-driven language technologies to include the deaf and hard-of-hearing community.
A Tutorial on the Pretrain-Finetune Paradigm for Natural Language Processing
Yu Wang, Wen Qu
Given that natural language serves as the primary conduit for expressing thoughts and emotions, text analysis has become a key technique in psychological research. It enables the extraction of valuable insights from natural language, facilitating endeavors like personality trait assessment, mental health monitoring, and sentiment analysis in interpersonal communications. In text analysis, existing studies often resort to human coding, which is time-consuming; pre-built dictionaries, which often fail to cover all possible scenarios; or training models from scratch, which requires large amounts of labeled data. In this tutorial, we introduce the pretrain-finetune paradigm, a transformative approach in text analysis and natural language processing. This paradigm distinguishes itself through the use of large pretrained language models, demonstrating remarkable efficiency in finetuning tasks even with limited training data. This efficiency is especially beneficial for research in the social sciences, where the number of annotated samples is often quite limited. Our tutorial offers a comprehensive introduction to the pretrain-finetune paradigm. We first delve into the fundamental concepts of pretraining and finetuning, followed by practical exercises using real-world applications. We demonstrate the application of the paradigm across various tasks, including multi-class classification and regression. Emphasizing its efficacy and user-friendliness, the tutorial aims to encourage broader adoption of this paradigm. To this end, we have provided open access to all our code and datasets. The tutorial is highly beneficial across various psychology disciplines, providing a comprehensive guide to employing text analysis in diverse research settings.
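A minimal sketch of the pretrain-finetune workflow using Hugging Face Transformers; the model checkpoint, dataset, and hyperparameters are illustrative choices, not necessarily those used in the tutorial.

```python
# Minimal pretrain-finetune sketch (illustrative choices throughout): load a
# pretrained encoder and fine-tune it for multi-class text classification.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL = "bert-base-uncased"
dataset = load_dataset("ag_news")           # any labeled text dataset works here
tokenizer = AutoTokenizer.from_pretrained(MODEL)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length",
                     max_length=128)

dataset = dataset.map(tokenize, batched=True)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=4)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1,
                           per_device_train_batch_size=16),
    train_dataset=dataset["train"].shuffle(seed=0).select(range(2000)),
    eval_dataset=dataset["test"],
)
trainer.train()
```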
Grounding Toxicity in Real-World Events across Languages
Wondimagegnhue Tsegaye Tufa, Ilia Markov, Piek Vossen
Social media conversations frequently suffer from toxicity, creating significant issues for users, moderators, and entire communities. Events in the real world, like elections or conflicts, can initiate and escalate toxic behavior online. Our study investigates how real-world events influence the origin and spread of toxicity in online discussions across various languages and regions. We gathered Reddit data comprising 4.5 million comments from 31 thousand posts in six different languages (Dutch, English, German, Arabic, Turkish and Spanish). We target fifteen major social and political world events that occurred between 2020 and 2023. We observe significant variations in toxicity, negative sentiment, and emotion expressions across different events and language communities, showing that toxicity is a complex phenomenon in which many different factors interact and still need to be investigated. We will release the data for further research along with our code.
Statements: Universal Information Extraction from Tables with Large Language Models for ESG KPIs
Lokesh Mishra, Sohayl Dhibi, Yusik Kim
et al.
Environment, Social, and Governance (ESG) KPIs assess an organization's performance on issues such as climate change, greenhouse gas emissions, water consumption, waste management, human rights, diversity, and policies. ESG reports convey this valuable quantitative information through tables. Unfortunately, extracting this information is difficult due to high variability in table structure as well as content. We propose Statements, a novel domain-agnostic data structure for extracting quantitative facts and related information. We propose translating tables to statements as a new supervised deep-learning universal information extraction task. We introduce SemTabNet, a dataset of over 100K annotated tables. Investigating a family of T5-based Statement Extraction Models, our best model generates statements that are 82% similar to the ground truth (compared to a baseline of 21%). We demonstrate the advantages of statements by applying our model to over 2700 tables from ESG reports. The homogeneous nature of statements permits exploratory data analysis on the expansive information found in large collections of ESG reports.
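For intuition, a statement-like record could be sketched as follows; the field names are assumptions of this example, not the paper's actual schema.

```python
# Illustrative sketch only: the paper's exact statement schema is not reproduced
# here. The idea is a flat, homogeneous record for each quantitative fact, so
# that heterogeneous ESG tables can be analysed with ordinary dataframe tools.
from dataclasses import dataclass, asdict

@dataclass
class Statement:
    subject: str        # entity the fact is about, e.g. "Company X"
    property: str       # measured quantity, e.g. "Scope 1 GHG emissions"
    value: str          # the extracted numeric value, kept as text
    unit: str           # e.g. "tCO2e"
    period: str         # reporting period, e.g. "2022"

rows = [
    Statement("Company X", "Scope 1 GHG emissions", "12,300", "tCO2e", "2022"),
    Statement("Company X", "Water consumption", "4.1", "million m3", "2022"),
]
records = [asdict(r) for r in rows]   # ready for pandas.DataFrame(records)
```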
Deep Learning and Machine Learning -- Natural Language Processing: From Theory to Application
Keyu Chen, Cheng Fei, Ziqian Bi
et al.
With a focus on natural language processing (NLP) and the role of large language models (LLMs), we explore the intersection of machine learning, deep learning, and artificial intelligence. As artificial intelligence continues to revolutionize fields from healthcare to finance, NLP techniques such as tokenization, text classification, and entity recognition are essential for processing and understanding human language. This paper discusses advanced data preprocessing techniques and the use of frameworks like Hugging Face for implementing transformer-based models. Additionally, it highlights challenges such as handling multilingual data, reducing bias, and ensuring model robustness. By addressing key aspects of data processing and model fine-tuning, this work aims to provide insights into deploying effective and ethically sound AI solutions.
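As a quick illustration of the Hugging Face workflow mentioned above, two of the listed NLP tasks can be run with off-the-shelf pipelines; model choices default to the library's standard checkpoints.

```python
# Brief sketch of ready-made Hugging Face pipelines for two of the tasks the
# paper discusses: text classification and named entity recognition.
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
ner = pipeline("ner", aggregation_strategy="simple")

text = "Apple opened a new research lab in Zurich last spring."
print(classifier(text))   # e.g. [{'label': 'POSITIVE', 'score': ...}]
print(ner(text))          # grouped entities such as ORG 'Apple', LOC 'Zurich'
```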
The neural correlates of logical-mathematical symbol systems processing resemble that of spatial cognition more than natural language processing
Yuannan Li, Shan Xu, Jia Liu
The ability to manipulate logical-mathematical symbols (LMS), encompassing tasks such as calculation, reasoning, and programming, is a cognitive skill arguably unique to humans. Considering the relatively recent emergence of this ability in human evolutionary history, it has been suggested that LMS processing may build upon more fundamental cognitive systems, possibly through neuronal recycling. Previous studies have pinpointed two primary candidates, natural language processing and spatial cognition. Existing comparisons between these domains largely relied on task-level comparison, which may be confounded by task idiosyncrasy. The present study instead compared the neural correlates at the domain level with both automated meta-analysis and synthesized maps based on three representative LMS tasks, reasoning, calculation, and mental programming. Our results revealed a more substantial cortical overlap between LMS processing and spatial cognition, in contrast to language processing. Furthermore, in regions activated by both spatial and language processing, the multivariate activation pattern for LMS processing exhibited greater multivariate similarity to spatial cognition than to language processing. A hierarchical clustering analysis further indicated that typical LMS tasks were indistinguishable from spatial cognition tasks at the neural level, suggesting an inherent connection between these two cognitive processes. Taken together, our findings support the hypothesis that spatial cognition is likely the basis of LMS processing, which may shed light on the limitations of large language models in logical reasoning, particularly those trained exclusively on textual data without explicit emphasis on spatial content.
Aspect based sentiment analysis using multi‐criteria decision‐making and deep learning under COVID‐19 pandemic in India
Rakesh Dutta, Nilanjana Das, Mukta Majumder
et al.
The COVID-19 pandemic has had a significant impact on the global economy and health. While the pandemic continues to cause casualties in the millions, many countries have gone under lockdown. During this period, people have had to stay within walls and have become more addicted to social networks. They express their emotions and sympathy via these online platforms. Thus, popular social media (Twitter and Facebook) have become rich sources of information for opinion mining and sentiment analysis on COVID-19-related issues. We have used Aspect-Based Sentiment Analysis to anticipate the polarity of public opinion underlying different aspects from Twitter during the lockdown and stepwise unlock phases. The goal of this study is to find the feelings of Indians about the lockdown initiative taken by the Government of India to stop the spread of Coronavirus. India-specific COVID-19 tweets have been annotated for analysing the sentiment of the general public. To classify the Twitter dataset, a deep learning model has been proposed, which achieved accuracies of 82.35% on the Lockdown dataset and 83.33% on the Unlock dataset. The suggested method outperforms many contemporary approaches (long short-term memory, bi-directional long short-term memory, gated recurrent unit, etc.). This study highlights the public sentiment on the lockdown and stepwise unlocks imposed by the Indian Government across various aspects during the Corona outbreak.
Computational linguistics. Natural language processing, Computer software
Temporal Effects on Pre-trained Models for Language Processing Tasks
Oshin Agarwal, Ani Nenkova
Keeping the performance of language technologies optimal as time passes is of great practical interest. We study temporal effects on model performance on downstream language tasks, establishing a nuanced terminology for such discussion and identifying factors essential to conduct a robust study. We present experiments for several tasks in English where the label correctness is not dependent on time and demonstrate the importance of distinguishing between temporal model deterioration and temporal domain adaptation for systems using pre-trained representations. We find that, depending on the task, temporal model deterioration is not necessarily a concern. Temporal domain adaptation, however, is beneficial in all cases, with better performance for a given time period possible when the system is trained on temporally more recent data. Therefore, we also examine the efficacy of two approaches for temporal domain adaptation without human annotations on new data. Self-labeling shows consistent improvement and notably, for named entity recognition, leads to better temporal adaptation than even human annotations.
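A generic self-labeling loop can be sketched as follows; `train` and `predict` are stand-ins for any supervised learner, and this is not the paper's exact experimental setup.

```python
# Generic self-labeling sketch (not the paper's exact setup): a model trained on
# older data labels new, unannotated data from a later period, and is then
# retrained on the union of real and pseudo-labeled examples.
def temporal_self_label(old_texts, old_labels, new_unlabeled_texts,
                        train, predict):
    """Return a model adapted to the newer time period without new human labels."""
    base_model = train(old_texts, old_labels)                 # model for period t0
    pseudo_labels = predict(base_model, new_unlabeled_texts)  # label period t1 data
    adapted_model = train(old_texts + new_unlabeled_texts,
                          old_labels + pseudo_labels)         # retrain on the union
    return adapted_model
```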
Computational linguistics. Natural language processing
Toward an Ecocritical Reading of Sami Tchak's Les Fables du moineau
Baguissoga SATRA
Innovation is not the preserve of the exact sciences. The tendency to reduce everything to the technological sphere alone is a mistaken stance. Indeed, any renewal is a cognitive process inseparable from cultural and socio-economic factors. Literature lies at the very heart of these factors, insofar as literary works represent the complexity of humanity and of the living world. Thus, in Les Fables du moineau, Sami Tchak presents an imaginary dialogue between Aboubakar and a sparrow about interactions within nature. The aim of this study is to shed light on how this dialogue, composed of a series of fables punctuated by anaphora, conveys an ecological way of thinking rooted in the careful observation of beings and things. Reading this work therefore calls on ecocriticism, which makes it possible to analyse how the relationships between humans and nature are represented. The study reveals the different aspects of the sometimes gratuitous violence inflicted on the smallest living beings. It posits the need for respect for the smallest animals as much as for the largest. Moreover, it shows how the most fragile beings are sometimes the strongest, thanks to their extraordinary capacity for mobility and adaptation to their environment. These lessons encourage awareness of the fragility of all living beings and, by extrapolation, of all human institutions, which quickly forget that their gestation always begins with a small seed. The ecological question is therefore a collective quest.
Arts in general, Computational linguistics. Natural language processing
SciNLI: A Corpus for Natural Language Inference on Scientific Text
Mobashir Sadat, Cornelia Caragea
Existing Natural Language Inference (NLI) datasets, while being instrumental in the advancement of Natural Language Understanding (NLU) research, are not related to scientific text. In this paper, we introduce SciNLI, a large dataset for NLI that captures the formality in scientific text and contains 107,412 sentence pairs extracted from scholarly papers on NLP and computational linguistics. Given that the text used in scientific literature differs vastly from the text used in everyday language both in terms of vocabulary and sentence structure, our dataset is well suited to serve as a benchmark for the evaluation of scientific NLU models. Our experiments show that SciNLI is harder to classify than the existing NLI datasets. Our best performing model with XLNet achieves a Macro F1 score of only 78.18% and an accuracy of 78.23% showing that there is substantial room for improvement.
Unsupervised Acquisition of Comprehensive Multiword Lexicons using Competition in an n-gram Lattice
Julian Brooke, Jan Šnajder, Timothy Baldwin
Computational linguistics. Natural language processing
Large-Scale Induction and Evaluation of Lexical Resources from the Penn-II and Penn-III Treebanks
Ruth O'Donovan, Michael Burke, Aoife Cahill
et al.
Computational linguistics. Natural language processing
Preprocessing Arabic Dialect for Sentiment Mining: State of the Art
Z. Nassr, N. Sael, F. Benabbou
Sentiment analysis concerns the analysis of ideas, emotions, evaluations, values, attitudes, and feelings about products, services, companies, individuals, tasks, events, titles, and their characteristics. With the increase in applications on the Internet and social networks, sentiment analysis has become more crucial in the field of text mining research and has since been used to explore users' opinions on various products or topics discussed on the Internet. Developments in the fields of Natural Language Processing and Computational Linguistics have contributed positively to sentiment analysis studies, especially for sentiments written in non-structured or semi-structured languages. In this paper, we present a literature review of the pre-processing task in the field of sentiment analysis, together with an analytical and comparative study of the research conducted on Arabic social networks. This study allowed us to conclude that several works have dealt with the generation of a stop-word dictionary. Two approaches are adopted in this context: a manual one, which yields a limited list, and an automatic one, in which the list of stop words is extracted from social networks based on defined rules. For stemming, two algorithms have been proposed to isolate prefixes and suffixes from words in dialects. However, few works have addressed dialects directly, without translation. The Moroccan dialect in particular is the fifth most studied Arabic dialect, after the Jordanian, Egyptian, Tunisian, and Algerian dialects. Despite the significant lack of studies on Arabic dialects, this comparative study allowed us to draw several conclusions about the difficulties and challenges encountered, as well as possible directions for any dialect sentiment analysis pre-processing solution.
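To make the two pre-processing steps concrete, here is a minimal sketch using NLTK's ISRI stemmer for Arabic; the stop-word set is illustrative, and dialect-specific lists would be substituted as the survey describes.

```python
# Sketch of the two pre-processing steps surveyed above, stop-word removal and
# stemming, using NLTK's ISRI stemmer for Arabic. The stop-word set is
# illustrative Modern Standard Arabic; dialect lists are usually built manually
# or mined from social-media text, as the survey describes.
from nltk.stem.isri import ISRIStemmer

STOP_WORDS = {"في", "من", "على", "و", "ما", "هذا"}

stemmer = ISRIStemmer()

def preprocess(text: str) -> list[str]:
    tokens = text.split()
    tokens = [t for t in tokens if t not in STOP_WORDS]
    return [stemmer.stem(t) for t in tokens]
```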
Technology, Engineering (General). Civil engineering (General)