Breast cancer image classification using deep learning augmented with attention mechanism
Musa Idris, Edgar Osaghae, Taiwo Kolajo
Abstract Advances in deep learning have made medical imaging-based breast cancer diagnosis increasingly reliable. However, many existing models struggle to generalize across diverse imaging modalities and heterogeneous datasets. This study addresses that limitation: the restricted ability of deep learning models to generalize across diverse breast cancer datasets and image formats. It presents a custom deep learning model that combines a dilated Convolutional Neural Network (CNN) with an attention mechanism, using model infusion techniques, for classifying breast cancer images. The model was trained on a dataset comprising various imaging modalities, including histopathology, ultrasound, Magnetic Resonance Imaging (MRI), and mammograms. Experimental evaluation shows that the model achieves high overall performance, with accuracy, precision, recall, and AUC all approaching optimal values. The confusion matrix analysis revealed few false positives (11 benign cases misclassified as malignant) and few false negatives (19 malignant cases misclassified as benign), underscoring the model’s reliability in medical applications where diagnostic accuracy is critical. The model was also tested on unseen data, showing robust generalization across different imaging modalities. Although the results are promising, limitations such as dataset diversity and the reliance on image-based diagnosis are acknowledged. Further testing and validation in clinical settings, as well as the integration of multi-modal data, are suggested to enhance the model’s performance and applicability in real-world breast cancer diagnosis. Overall, this study indicates the potential of advanced CNN architectures in improving breast cancer detection and classification.
Computational linguistics. Natural language processing, Electronic computers. Computer science
Informationen zu den Beitragenden/Information about the authors
Computational linguistics. Natural language processing, Language. Linguistic theory. Comparative grammar
Customer perceptions driving the adoption of artificial intelligence products in Ethiopian market
Biruk Tessema, Esayas Degago Demissie, Swati Prasad
et al.
Abstract Artificial intelligence (AI) is revolutionizing how businesses function. Products with AI integration are the newest trend in several stores, where consumers interact with fully automated technology. It is therefore important to consider carefully the factors that lead consumers to intend to buy these products. This paper investigates the factors influencing customers' intention to buy AI-integrated products in Ethiopia. A survey questionnaire was used to gather data from customers at different shopping centers in the capital city. Respondents were drawn by simple random sampling from accessible shopping centers, and 255 responses, an actual response rate of 85%, were used in the analysis. Partial least squares structural equation modeling (PLS-SEM) was used to analyze the data. The study finds that customers' behavioral intentions are strongly influenced by perceived usefulness, perceived ease of use, attitude, subjective norm, and enjoyment. However, perceived cost and performance risk were found to be insignificant. This study offers important academic and managerial implications in the fields of business and technology. Businesses and policy makers need to address affordability and the digital literacy gap to enhance AI acceptance in the Ethiopian market. This study thus adds to the expanding body of research on emerging trends in services using AI-based technologies.
Computational linguistics. Natural language processing, Electronic computers. Computer science
A survey on intelligent secure and distributed frameworks for Healthcare 5.0
Syed Rizwan Hassan, Ayesha Hassan, Aysha Maqsood
et al.
Abstract Healthcare 5.0 represents the next significant evolution in healthcare, emphasizing the integration of cutting-edge technologies such as the Internet of Medical Things, Artificial Intelligence, and Blockchain to offer more personalized, efficient, and secure medical services. Unlike its predecessors, Healthcare 5.0 focuses on enhancing human capabilities through technology rather than replacing human roles, aiming for a collaborative environment between patients, healthcare providers, and advanced systems. This review explores the historical development of healthcare systems from basic human-driven care to the automated and digitalized Healthcare 4.0, ultimately leading to the human-centric vision of Healthcare 5.0. This paper provides a comprehensive review of machine learning techniques in improving disease detection and diagnosis. Moreover, a detailed review of the advanced technologies used for predictive care and real-time health monitoring is presented. This article also provides an inclusive review of Federated Learning and Blockchain-based privacy preservation and security techniques used to secure personal healthcare data in the Healthcare 5.0 paradigm. Practical examples such as AI-assisted chronic heart disease prediction, blockchain-enabled secure electronic health records, and federated learning models for COVID-19 diagnosis across multiple hospitals are highlighted to demonstrate real-world impact. Key issues like real-time monitoring, system interoperability, and model transparency are critically assessed, with a focus on current technological solutions such as explainable AI, fog computing, and cloud computing. The paper concludes with a discussion of challenges and future research directions, focusing on further integrating innovative technologies in healthcare while addressing the challenges of data security and system integration, ultimately paving the way for more efficient, secure, and patient-centered care.
Computational linguistics. Natural language processing, Electronic computers. Computer science
ENTRE L’ETRE ET LE PARAITRE : L’IDENTITE EN QUESTION DANS MAMAN A UN AMANT DE CALIXTHE BEYALA ET BEL-AMI DE GUY DE MAUPASSANT
Louis Hervé NGAFOMO
Abstract: This article offers a sociocritical analysis of the modalities of identity in Maman a un amant by Calixthe Beyala and Bel-Ami by Guy de Maupassant. The expression of being and seeming recasts the question of identity through the lens of trajectories of encounter with otherness. The study of characters such as Georges Duroy in Bel-Ami and Loukoum in Beyala's Maman a un amant provides a critical opportunity to examine the social representations of the text. This study seeks to answer the following question: how are language and discursive strategies used to construct and negotiate identity in different sociocultural contexts? The aim is to glimpse, on the horizon, the prospect of an opening toward plural identity.
Keywords: Being, seeming, identity, character, Francophone novel, Beyala, Maupassant.
Arts in general, Computational linguistics. Natural language processing
Psycho-Linguistic Communication Strategies Employed by Second-Hand Clothing Vendors to Influence Consumer Buying Behaviour at Gikomba Market, Kenya
Lucy Mandillah
Language plays a critical role in communication, shaping and being shaped by cognitive processes and contextual meaning. This study explores the psycho-linguistic communication strategies (PLCS) used by second-hand clothing (SHC) sellers/vendors at Gikomba Market in Nairobi, Kenya to influence buyer behaviour. Despite the growing popularity of SHC, research on language use in this sector remains scarce and inconclusive. This study examines how SHC sellers use language to engage and persuade customers in a competitive marketplace. The study seeks to identify the PLCS used by sellers, evaluate their impact on buyer behaviour, and assess their effectiveness in influencing consumer decisions. Guided by Robert Cialdini’s psycho-linguistic theory, the research adopts qualitative methods, including observation and in-depth interviews with 20 SHC vendors and 10 consumers. Data were analysed thematically to identify recurring communication strategies. The findings reveal that vendors employ strategies such as code-switching, deceptive pricing, repetition, narratives, hyperbole, and euphemism to attract buyers. These techniques create a sense of urgency, pride, and cultural connection, which significantly influence purchasing decisions. The study is limited to a specific market (Gikomba) and population, restricting generalisability. Further research is needed to explore PLCS in diverse contexts and their long-term impact on buyer behaviour. The study findings offer valuable insights into consumer behaviour in informal markets, highlighting the role of language in marketing and informing future research. It also provides a basis for vendor training programmes to improve customer engagement and sales in competitive settings.
Language. Linguistic theory. Comparative grammar, Computational linguistics. Natural language processing
Dialect Normalization using Large Language Models and Morphological Rules
Antonios Dimakis, John Pavlopoulos, Antonios Anastasopoulos
Natural language understanding systems struggle with low-resource languages, including many dialects of high-resource ones. Dialect-to-standard normalization attempts to tackle this issue by transforming dialectal text so that it can be used by standard-language tools downstream. In this study, we tackle this task by introducing a new normalization method that combines rule-based linguistically informed transformations and large language models (LLMs) with targeted few-shot prompting, without requiring any parallel data. We implement our method for Greek dialects and apply it on a dataset of regional proverbs, evaluating the outputs using human annotators. We then use this dataset to conduct downstream experiments, finding that previous results regarding these proverbs relied solely on superficial linguistic information, including orthographic artifacts, while new observations can still be made through the remaining semantics.
Resampling Filter Design for Multirate Neural Audio Effect Processing
Alistair Carson, Vesa Välimäki, Alec Wright
et al.
Neural networks have become ubiquitous in audio effects modelling, especially for guitar amplifiers and distortion pedals. One limitation of such models is that the sample rate of the training data is implicitly encoded in the model weights and therefore not readily adjustable at inference. Recent work explored modifications to recurrent neural network architecture to approximate a sample rate independent system, enabling audio processing at a rate that differs from the original training rate. This method works well for integer oversampling and can reduce aliasing caused by nonlinear activation functions. For small fractional changes in sample rate, fractional delay filters can be used to approximate sample rate independence, but in some cases this method fails entirely. Here, we explore the use of real-time signal resampling at the input and output of the neural network as an alternative solution. We investigate several resampling filter designs and show that a two-stage design consisting of a half-band IIR filter cascaded with a Kaiser window FIR filter can give results similar to or better than those of the previously proposed model adjustment method, with far fewer filtering operations per sample and less than one millisecond of latency at typical audio rates. Furthermore, we investigate interpolation and decimation filters for the task of integer oversampling and show that cascaded half-band IIR and FIR designs can be used in conjunction with the model adjustment method to reduce aliasing in a range of distortion effect models.
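As an illustration of the Kaiser window FIR stage mentioned in the abstract above, the following is a minimal sketch of a windowed-sinc lowpass design in plain Python. It is not the filter specification from the paper: the function names, the tap count, the cutoff and the beta value are all assumptions chosen for illustration, and the half-band IIR stage is not covered.

```python
import math


def bessel_i0(x: float) -> float:
    """Zeroth-order modified Bessel function of the first kind (power series)."""
    total, term, k = 1.0, 1.0, 1
    while term > 1e-12 * total:
        term *= (x / (2.0 * k)) ** 2
        total += term
        k += 1
    return total


def kaiser_lowpass(num_taps: int, cutoff: float, beta: float) -> list[float]:
    """Windowed-sinc lowpass FIR taps (hypothetical helper, needs num_taps >= 2).

    cutoff is a fraction of the sample rate (0 < cutoff < 0.5);
    beta trades main-lobe width against stopband attenuation.
    """
    mid = (num_taps - 1) / 2.0
    taps = []
    for n in range(num_taps):
        t = n - mid
        # Ideal lowpass impulse response (sinc), with the t == 0 limit handled.
        if t == 0:
            ideal = 2.0 * cutoff
        else:
            ideal = math.sin(2.0 * math.pi * cutoff * t) / (math.pi * t)
        # Kaiser window value at this tap position.
        r = t / mid
        window = bessel_i0(beta * math.sqrt(max(0.0, 1.0 - r * r))) / bessel_i0(beta)
        taps.append(ideal * window)
    gain = sum(taps)
    return [h / gain for h in taps]  # normalise to unity gain at DC


# Illustrative values only: 31 taps, cutoff at a quarter of the sample rate.
taps = kaiser_lowpass(31, 0.25, 8.0)
```

A larger beta gives more stopband attenuation at the cost of a wider transition band, so in a resampler the tap count and beta would be chosen together against the aliasing and latency budget.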
Studies with impossible languages falsify LMs as models of human language
Jeffrey S. Bowers, Jeff Mitchell
According to Futrell and Mahowald [arXiv:2501.17047], both infants and language models (LMs) find attested languages easier to learn than impossible languages that have unnatural structures. We review the literature and show that LMs often learn attested and many impossible languages equally well. Difficult-to-learn impossible languages are simply more complex (or random). LMs are missing human inductive biases that support language acquisition.
Opportunities, challenges, and benefits of AI innovation in government services: a review
Khalifa Alhosani, Saadat M. Alhashmi
Abstract Artificial intelligence (AI) has emerged as an excellent tool across multiple industries and holds great promise for the government, society, and economy. However, the absence of a distinct consensus regarding the definition and scope of artificial intelligence hinders its practical implementation in government settings. This article examines the various methodologies, emphases, and goals within artificial intelligence, emphasizing its ability to enhance human capabilities in critical situations. Considering the present advantages and enhanced productivity brought about by AI adoption in trailblazing government departments, this study explores the possible benefits and limitations of AI usage in the public sector. By looking at the cross-disciplinary difficulties of public AI applications, such as language hurdles and service delays, this study highlights the necessity for a thorough knowledge of the risks, impediments, and incentives of employing AI for government services. The study hopes to provide insight into AI research's ultimate aims, including object manipulation, natural language processing, and reasoning. This study emphasizes the potential for greater productivity, simplified procedures, and reduced obligations by analyzing the pros and cons of using AI in the public sector. Further, organizational theory is considered a tool for figuring out how to deal with challenges and maximize possibilities associated with AI deployment. The theory is used as the conceptual framework to understand the benefits, opportunities, and challenges involved in using AI when providing government services. The results of this research help us better understand how AI may revolutionize public service delivery by stimulating new ideas and improving efficiency. 
This study covers critical questions about organizational theory's role in improving government AI adoption, the challenges governments have in adopting AI, and the potential benefits AI might offer public service delivery. The research recommends a strategic approach to AI adoption in the public sector, considering organizational, ethical, and societal implications while recognizing the possibility of AI's transformative impacts on governments' service provision.
Computational linguistics. Natural language processing, Electronic computers. Computer science
Lingue e culture aumentate.
Martina Bellinzona, Martina Manna
As an emerging technology, Augmented Reality (AR) has demonstrated several advantages in education. However, the implementation of AR for plurilingual and intercultural education is still at an early stage and more studies are needed. The research presented here therefore aims to evaluate the impact of AR on the development of adolescent students’ plurilingual and intercultural competence, as well as on their motivation. To this end, several educational activities centered on authentic stories of migrants were developed through a Game-Based Learning approach and implemented. The stories were drawn from the DiMMi project, which collects autobiographical testimonies related to the themes of migration. Results showed that the integration of AR in plurilingual and intercultural education is an effective strategy for bringing adolescent students closer to contemporary migration issues. Moreover, it enables the development of civic and cultural awareness, as well as plurilingual and digital skills and competences. Furthermore, the study demonstrated the potential of AR to stimulate students’ interest in the topics covered by the activities implemented.
Computational linguistics. Natural language processing, Language. Linguistic theory. Comparative grammar
A Proposal to Study of Cross Language Information Retrieval (CLIR) System Users' Information Seeking Behavior
YooJin Ha
Computational linguistics. Natural language processing
The Power of Metaphor in the Representation of Mental Images in the Language of Tourism Print Advertising: Lakoff and Johnson’s Model of Conceptual Metaphor
Jawad El Bakri
This study investigates, from a cognitive perspective, the effectiveness of metaphor use in the communication of tourism print advertising-related messages (visual images). The researcher examines how the use of metaphorical language generates an effective visual image based on Lakoff and Johnson's (1980) model of metaphor. The study analyzes the mental impacts of metaphor usage in tourism print advertisements and investigates how they can contribute to an effective communication of tourist messages (images of certain tourist destinations). It investigates the power of metaphor as a figure of speech that increases the imagery and memorability of the described tourist service and product. It takes on the question of how metaphorical language constructs beautiful images of a given tourist destination and examines which kind of metaphor is most frequent and pervasive within the gathered data, applying Lakoff and Johnson’s model-based cognitive perspective. The study uses both content analysis, as a qualitative research tool, and frequency count, as a quantitative research tool, to analyze the written language of tourism print advertising in various countries all over the world. It conducts a comprehensive analysis of nineteen distinct tourist prints encompassing diverse tourist destinations around the globe, in order to find the most frequent type of metaphor and its impact on the communication of attractive and appealing images of certain tourist destinations. This mixed approach is established on appropriately constructed principles that enable the author to answer the study's questions. Lakoff and Johnson’s (1980) model-based cognitive perspective, as the theoretical framework of this study, provides the bases and the analytical tools to interpret the inferred information.
The findings indicate that approximately 68.4% (13 out of 19) of the analyzed ads contained structural metaphors, whereas around 31.6% (6 out of 19) contained ontological metaphors. Advertisers resort to structural metaphors because they offer a concise and comprehensible means of conveying complex ideas (abstract notions). They facilitate clear communication of fundamental aspects and experiences. They resonate with the preference for the visual communication of compelling images of certain tourist destinations. They help to create captivating imagery that attracts the audience's attention and evokes specific emotions or perceptions, which is one of the main marketing techniques in advertising practices. The results, which rest on content analysis of the collected data, show that metaphor-based messages tend to be strongly present in tourism print advertisements. The findings suggest that the use of metaphor is highly effective when it comes to intelligibly communicating abstract complex notions (images) related to tourist destinations. The most frequent type of metaphor in the examined data is the so-called structural metaphor. It is found to be much more effective in communicating highly intricate thoughts and conveying the unique and special qualities of a destination that make it stand out. It is one of the three main types of metaphor defined by Lakoff and Johnson (1980) (orientational, ontological, and structural metaphor). It constructs evocative and memorable images of the target tourist destination. The main limitation of this research article is that it relies heavily on Lakoff and Johnson's model-based cognitive perspective, which is not a very widely accepted theory within the academic arena, and does not consider other theoretical perspectives in the investigation of this research problem.
Another limitation is that the study does not take into account the context of the examined tourism print advertisements (visual presentation and target audience). These limitations provide a framework for future research aimed at gaining more reliable and valid insights into the subject.
Language. Linguistic theory. Comparative grammar, Computational linguistics. Natural language processing
Rentabilité des stratégies de gestion d’une entreprise de services en période de pandémie de COVID 19 : cas de Canal+ RDC
Isaac Ngoie Kaluhunga kaseba
Abstract: Canal+ RDC is a subsidiary of the Canal+ group whose activities are focused on the audiovisual sector. The production and broadcasting of programmes, through a series of shows, on television, and the marketing of satellite television channels constitute the corporate purpose of this company. Having emerged in China in December 2019, in the city of Wuhan, Hubei province, a coronavirus (Covid 19) outbreak spread rapidly across the world. This pandemic caused shocks to demand (a fall in consumption following distancing measures and confinement of the population) and to supply (disruption of the production chain at the international level, originating in China), and led to speculation on financial markets. A deterioration in consumer and business confidence led to expectations of a fall in demand, particularly in current and investment spending. This exacerbated business closures, job losses, and a cascading decline in aggregate demand. Nevertheless, Canal+ RDC relied on a few management strategies to keep its activities profitable during this troubled period.
Keywords: Management, profitability, pandemic, Covid 19, Canal+.
Arts in general, Computational linguistics. Natural language processing
L’émotion esthétique ou la mise en récit des blessures mémorielles dans le roman 1994 de Adlène Meddi
Goucem Nadira KHODJA
Abstract: Emotion is what is most intimate and, paradoxically, most visible in human beings. Expressed or silenced, it reveals the secret and incomprehensible part of interiority. It is through the prism of emotion that Adlène Meddi evokes, in his novel 1994, a dramatic period in the contemporary history of Algeria. He places his main characters in a specific context (1994-2001), an unstable period dominated by the violence of war, and draws them little by little into a whirlwind of dramatic events and actions. We propose to study, through this novel, the aesthetic stakes of the new trend in Francophone Algerian literature, insofar as Meddi is concerned with reproducing, by means of original writing techniques, social, historical and political situations experienced dramatically by the whole of Algerian society. Meddi's literary project seems to stem from a need to put words to little-explored traumas and unspeakable moral wounds. Through an artistic inquiry concerned with the fate of the individual in relation to the world and to others, this Algerian writer probes the resources of language when it must express a deeply wounded interiority.
Keywords: Emotion, aesthetics, interiority, history, Francophone literature.
Arts in general, Computational linguistics. Natural language processing
A Survey of Large Language Models for Arabic Language and its Dialects
Malak Mashaabi, Shahad Al-Khalifa, Hend Al-Khalifa
This survey offers a comprehensive overview of Large Language Models (LLMs) designed for the Arabic language and its dialects. It covers key architectures, including encoder-only, decoder-only, and encoder-decoder models, along with the datasets used for pre-training, spanning Classical Arabic, Modern Standard Arabic, and Dialectal Arabic. The study also explores monolingual, bilingual, and multilingual LLMs, analyzing their architectures and performance across downstream tasks such as sentiment analysis, named entity recognition, and question answering. Furthermore, it assesses the openness of Arabic LLMs based on factors such as source code availability, training data, model weights, and documentation. The survey highlights the need for more diverse dialectal datasets and emphasizes the importance of openness for research reproducibility and transparency. It concludes by identifying key challenges and opportunities for future research and stressing the need for more inclusive and representative models.
modeLing: A Novel Dataset for Testing Linguistic Reasoning in Language Models
Nathan A. Chi, Teodor Malchev, Riley Kong
et al.
We introduce modeLing, a novel benchmark of Linguistics Olympiad-style puzzles which tests few-shot reasoning in AI systems. Solving these puzzles necessitates inferring aspects of a language's grammatical structure from a small number of examples. Such puzzles provide a natural testbed for language models, as they require compositional generalization and few-shot inductive reasoning. Consisting solely of new puzzles written specifically for this work, modeLing has no risk of appearing in the training data of existing AI systems: this ameliorates the risk of data leakage, a potential confounder for many prior evaluations of reasoning. Evaluating several large open source language models and GPT on our benchmark, we observe non-negligible accuracy, demonstrating few-shot emergent reasoning ability which cannot merely be attributed to shallow memorization. However, imperfect model performance suggests that modeLing can be used to measure further progress in linguistic reasoning.
Linguistic Minimal Pairs Elicit Linguistic Similarity in Large Language Models
Xinyu Zhou, Delong Chen, Samuel Cahyawijaya
et al.
We introduce a novel analysis that leverages linguistic minimal pairs to probe the internal linguistic representations of Large Language Models (LLMs). By measuring the similarity between LLM activation differences across minimal pairs, we quantify the linguistic similarity and gain insight into the linguistic knowledge captured by LLMs. Our large-scale experiments, spanning 100+ LLMs and 150k minimal pairs in three languages, reveal properties of linguistic similarity from four key aspects: consistency across LLMs, relation to theoretical categorizations, dependency on semantic context, and cross-lingual alignment of relevant phenomena. Our findings suggest that 1) linguistic similarity is significantly influenced by training data exposure, leading to higher cross-LLM agreement in higher-resource languages; 2) linguistic similarity strongly aligns with fine-grained theoretical linguistic categories but weakly with broader ones; 3) linguistic similarity shows a weak correlation with semantic similarity, showing its context-dependent nature; and 4) LLMs exhibit limited cross-lingual alignment in their understanding of relevant linguistic phenomena. This work demonstrates the potential of minimal pairs as a window into the neural representations of language in LLMs, shedding light on the relationship between LLMs and linguistic theory. Code and data are available at https://github.com/ChenDelong1999/Linguistic-Similarity
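The idea of comparing activation differences across minimal pairs can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not name the similarity measure, so cosine similarity is assumed here, the function names are hypothetical, and the short vectors are made-up stand-ins for real LLM activations.

```python
import math


def activation_difference(acceptable: list[float], unacceptable: list[float]) -> list[float]:
    """Element-wise difference between the activations of the two sentences in a minimal pair."""
    return [a - b for a, b in zip(acceptable, unacceptable)]


def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Cosine of the angle between two difference vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


# Toy activations for two minimal pairs (hypothetical values, not model outputs).
diff_a = activation_difference([1.0, 2.0, 3.0], [0.0, 2.0, 1.0])
diff_b = activation_difference([2.0, 0.0, 4.0], [0.0, 0.0, 0.0])
similarity = cosine_similarity(diff_a, diff_b)
```

Under this reading, two minimal pairs that shift the activations in the same direction score close to 1, and pairs that shift them in unrelated directions score close to 0.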
On‐device audio‐visual multi‐person wake word spotting
Yidi Li, Guoquan Wang, Zhan Chen
et al.
Abstract Audio‐visual wake word spotting is a challenging multi‐modal task that exploits visual information of lip motion patterns to supplement acoustic speech to improve overall detection performance. However, most audio‐visual wake word spotting models are only suitable for simple single‐speaker scenarios and require high computational complexity. Further development is hindered by complex multi‐person scenarios and computational limitations in mobile environments. In this paper, a novel audio‐visual model is proposed for on‐device multi‐person wake word spotting. Firstly, an attention‐based audio‐visual voice activity detection module is presented, which generates an attention score matrix of audio and visual representations to derive active speaker representation. Secondly, the knowledge distillation method is introduced to transfer knowledge from the large model to the on‐device model to control the size of our model. Moreover, a new audio‐visual dataset, PKU‐KWS, is collected for sentence‐level multi‐person wake word spotting. Experimental results on the PKU‐KWS dataset show that this approach outperforms the previous state‐of‐the‐art methods.
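The attention score matrix between audio and visual representations described above could look roughly like the sketch below. The abstract does not give the formulation, so scaled dot-product scoring with a row-wise softmax is an assumption, the function names are hypothetical, and the tiny embeddings are invented for illustration.

```python
import math


def softmax(row: list[float]) -> list[float]:
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_scores(audio: list[list[float]], visual: list[list[float]]) -> list[list[float]]:
    """Row i holds how strongly audio frame i attends to each visual (speaker) embedding."""
    scale = math.sqrt(len(audio[0]))  # scaled dot-product, assumed here
    return [
        softmax([sum(a * v for a, v in zip(a_vec, v_vec)) / scale for v_vec in visual])
        for a_vec in audio
    ]


# Two audio frames and three candidate speakers (made-up 2-d embeddings).
audio_frames = [[1.0, 0.0], [0.0, 1.0]]
speaker_tracks = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = attention_scores(audio_frames, speaker_tracks)
```

In such a scheme, the active speaker for a frame would be the visual track with the highest score in that frame's row.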
Computational linguistics. Natural language processing, Computer software
Partial Tensorized Transformers for Natural Language Processing
Subhadra Vadlamannati, Ryan Solgi
The transformer architecture has revolutionized Natural Language Processing (NLP) and other machine-learning tasks, due to its unprecedented accuracy. However, its extensive memory and parameter requirements often hinder its practical application. In this work, we study the use of tensor-train decomposition to improve the accuracy of, and compress, transformer vision-language neural networks, namely BERT and ViT. We focus both on embedding-layer compression and partial tensorization of neural networks (PTNN) through an algorithmic approach. Our novel PTNN approach significantly improves the accuracy of existing models by up to 5%, all without the need for post-training adjustments, breaking new ground in the field of tensor decomposition.