This study examines the ethical dimensions of artificial intelligence (AI) and explores users’ awareness and understanding of how they interact with AI and the potential consequences of these interactions. In recent years, growing awareness of the risks of AI has driven the development of ethical guidelines by various organizations. These guidelines aim to ensure that AI is developed and deployed responsibly, with a focus on human well-being, societal benefits and minimizing risks. However, despite this global movement, there is a lack of consensus on key ethical issues such as bias, discrimination, privacy and human rights. The study focuses on the perceptions of 127 participants from 11 countries with diverse professional backgrounds in technology, education and finance. A survey with a 5-point Likert scale was used to assess participants’ attitudes towards AI ethics on topics such as transparency. The study examines differences in responses across professions and countries using a Multivariate Analysis of Variance (MANOVA) test. The results reveal variations in ethical priorities, suggesting that while global ethical frameworks are emerging, further efforts are needed to achieve uniformity in AI ethical standards. The findings emphasize the importance of increasing awareness and understanding of AI ethics to mitigate potential harms.
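The MANOVA statistic most commonly reported for designs like this, Wilks' lambda, can be computed directly from group-wise scatter matrices. A minimal sketch on synthetic data (not the study's actual survey responses):

```python
import numpy as np

def wilks_lambda(X, groups):
    """Wilks' lambda for a one-way MANOVA: det(W) / det(B + W),
    where W and B are the within- and between-group SSCP matrices.
    Values near 1 indicate little group separation; small values
    indicate that group mean vectors differ."""
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    grand_mean = X.mean(axis=0)
    p = X.shape[1]
    W = np.zeros((p, p))
    B = np.zeros((p, p))
    for g in np.unique(groups):
        Xg = X[groups == g]
        mg = Xg.mean(axis=0)
        centered = Xg - mg
        W += centered.T @ centered          # within-group scatter
        d = (mg - grand_mean).reshape(-1, 1)
        B += len(Xg) * (d @ d.T)            # between-group scatter
    return np.linalg.det(W) / np.linalg.det(B + W)
```

With two well-separated groups the statistic drops well below the value obtained under identical group distributions, which is the basis of the significance test.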
The emergence of large language models (LLMs) represents a significant technological shift within the scientific ecosystem, particularly within the field of artificial intelligence (AI). This paper examines structural changes in the AI research landscape using a dataset of arXiv preprints (cs.AI) from 2021 through 2025. Given the rapid pace of AI development, the preprint ecosystem has become a critical barometer for real-time scientific shifts, often preceding formal peer-reviewed publication by months or years. By employing a multi-stage data collection and enrichment pipeline in conjunction with LLM-based institution classification, we analyze the evolution of publication volumes, author team sizes, and academic--industry collaboration patterns. Our results reveal an unprecedented surge in publication output following the introduction of ChatGPT, with academic institutions continuing to provide the largest volume of research. However, we observe that academic--industry collaboration is still suppressed, as measured by a Normalized Collaboration Index (NCI) that remains significantly below the random-mixing baseline across all major subfields. These findings highlight a continuing institutional divide and suggest that the capital-intensive nature of generative AI research may be reshaping the boundaries of scientific collaboration.
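The abstract does not give the NCI formula; one plausible toy reading normalizes the observed share of mixed academic-industry papers by the share expected if author affiliations were assigned by random mixing (all names and the exact definition here are illustrative assumptions, not the paper's):

```python
def normalized_collaboration_index(papers):
    """Toy NCI: observed fraction of papers with both academic and
    industry authors, divided by the fraction expected under random
    mixing of author affiliations. NCI < 1 means collaboration is
    suppressed relative to the random-mixing baseline.
    `papers` is a list of per-paper affiliation lists, e.g.
    [["acad", "ind"], ["acad", "acad"], ...]."""
    authors = [a for p in papers for a in p]
    p_acad = authors.count("acad") / len(authors)
    p_ind = 1.0 - p_acad
    observed = sum(1 for p in papers if "acad" in p and "ind" in p) / len(papers)
    # Expected probability that a k-author paper mixes both sectors
    # if each author's affiliation were an independent random draw.
    expected = 0.0
    for p in papers:
        k = len(p)
        expected += 1.0 - p_acad ** k - p_ind ** k
    expected /= len(papers)
    return observed / expected
```

Fully segregated corpora yield NCI = 0, while corpora that mix sectors more often than chance yield values above 1.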
In this work, the authors present a novel framework for improving the effectiveness of AI writing assessment systems by integrating state-of-the-art deep learning networks, user feedback mechanisms, and knowledge graph frameworks. Most writing assessment tools cannot give personalized, detailed feedback. To tackle this problem, we employ the transformer models BERT and GPT-3 to analyze and score writing on various features, including phrase structure, semantics, and vocabulary usage. In our system, we propose a dynamic relational knowledge graph that encodes writing concepts and their relations, making it easier for the system to generate contextualized vocabulary suggestions. Graph neural networks (GNNs) strengthen the model by learning over this knowledge graph and improving its comprehension of complex semantics. Additionally, we include an iterative design whereby user feedback is collected and the system adjusts the feedback it gives in light of historical feedback and changes in a user’s writing behavior over time. The system reconceptualizes user-AI interaction by embracing its dynamic nature: the system adapts to the user rather than the reverse, achieving higher efficiency. To assess user satisfaction and improvements in the quality of the produced texts, the authors conduct a series of user studies evaluating the efficiency of this integrated system. Preliminary data from the task performance analysis show that the proposed framework substantially outperforms traditional methods, achieving better engagement and feedback during assessment. This study underscores the potential of integrating deep learning, user feedback, and knowledge graphs in writing education, with the potential to transform learners’ capabilities and enable them to write better and more effectively.
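Stripped of the neural components, a relational knowledge graph of the kind described reduces to labeled edges between concept nodes from which suggestions are retrieved; a heavily simplified sketch (class, relation, and concept names are all hypothetical, not the paper's schema):

```python
from collections import defaultdict

class WritingKG:
    """Toy relational knowledge graph: concept nodes connected by
    labeled edges, some of which point at writing suggestions."""

    def __init__(self):
        self.edges = defaultdict(list)  # concept -> [(relation, target)]

    def relate(self, concept, relation, target):
        self.edges[concept].append((relation, target))

    def suggest(self, concept):
        """Return the suggestions directly attached to a concept."""
        return [t for rel, t in self.edges[concept] if rel == "suggests"]

# Illustrative content only.
kg = WritingKG()
kg.relate("passive_voice", "suggests", "prefer active constructions")
kg.relate("passive_voice", "related_to", "verb_choice")
```

In the full system, a GNN would propagate information along such edges instead of the direct lookup used here.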
Modern machine learning (ML) methods typically fail to adequately capture causal information. Consequently, such models do not handle data distribution shifts, are vulnerable to adversarial examples, and often learn spurious correlations (Schölkopf and von Kügelgen 2022, arXiv:2204.00607 [cs.AI]). Causal ML, or causal inference, aims to solve these issues by estimating the expected outcome of counterfactual events, using observational and/or interventional data, where causal relationships are typically depicted as directed acyclic graphs. It is an open question whether these causal algorithms provide opportunities for quantum enhancement. In this paper we consider a recently developed family of non-parametric, continuous causal estimators and derive quantum algorithms for these tasks. Kernel evaluation and large matrix inversion are critical sub-routines of the classical algorithms, which makes them particularly amenable to a quantum treatment. Unlike other quantum ML algorithms, closed-form solutions for the estimators exist, negating the need for gradient evaluation and variational learning. We describe several new hybrid quantum–classical algorithms and show that uniform consistency of the estimators is retained. Furthermore, if one is satisfied with a quantum state output that is proportional to the true causal estimand, then these algorithms inherit the exponential complexity advantages offered by quantum linear system solvers.
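The closed-form, inversion-based structure being exploited can be illustrated with classical kernel ridge regression (a stand-in, not the paper's causal estimator): the dual coefficients solve a regularized linear system, which is exactly the kind of sub-routine a quantum linear system solver targets.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def kernel_ridge_fit(X, y, lam=1e-3, gamma=1.0):
    """Closed-form dual solution alpha = (K + lam*I)^{-1} y.
    Kernel evaluation builds K; the linear solve is the large
    matrix inversion amenable to quantum speedup."""
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def kernel_ridge_predict(X_train, alpha, X_new, gamma=1.0):
    """Predict via f(x) = sum_i alpha_i k(x_i, x)."""
    return rbf_kernel(X_new, X_train, gamma) @ alpha
```

No gradients or variational loops appear anywhere: the estimator is fully determined by one kernel evaluation and one linear solve, mirroring the property the abstract highlights.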
Incorporating generative artificial intelligence (GAI) in education has become crucial in contemporary educational environments. This research article thoroughly investigates the ramifications of implementing GAI in the higher education context of Saudi Arabia, employing a blend of quantitative and qualitative research approaches. Survey-based quantitative data reveals a noteworthy correlation between educators’ awareness of GAI and the frequency of its application. Notably, around half of the surveyed educators are at stages characterized by understanding and familiarity with GAI integration, indicating a tangible readiness for its adoption. Moreover, the study’s quantitative findings underscore the perceived value and ease associated with integrating GAI, thus reinforcing the assumption that educators are motivated and inclined to integrate GAI tools like ChatGPT into their teaching methodologies. In addition to the quantitative analysis, qualitative insights from in-depth interviews with educators unveil a rich tapestry of perspectives. The qualitative data emphasizes GAI’s role as a catalyst for collaborative learning, contributing to professional development, and fostering innovative teaching practices.
This article presents a usability evaluation and comparison of generative AI applications through the analysis of user reviews from popular digital marketplaces, specifically Apple’s App Store and Google Play. The study aims to bridge the research gap in real-world usability assessments of generative AI tools. A total of 11,549 reviews were extracted and analyzed from January to March 2024 for five generative AI apps: ChatGPT, Bing AI, Microsoft Copilot, Gemini AI, and Da Vinci AI. The dataset has been made publicly available, allowing for further analysis by other researchers. The evaluation follows ISO 9241 usability standards, focusing on effectiveness, efficiency, and user satisfaction. This study is believed to be the first usability evaluation for generative AI applications using user reviews across digital marketplaces. The results show that ChatGPT achieved the highest compound usability scores among Android and iOS users, with scores of 0.504 and 0.462, respectively. Conversely, Gemini AI scored the lowest among Android apps at 0.016, and Da Vinci AI had the lowest among iOS apps at 0.275. Satisfaction scores were critical in usability assessments, with ChatGPT obtaining the highest rates of 0.590 for Android and 0.565 for iOS, while Gemini AI had the lowest satisfaction rate at −0.138 for Android users. The findings revealed usability issues related to ease of use, functionality, and reliability in generative AI tools, providing valuable insights from user opinions and feedback. Based on the analysis, actionable recommendations were proposed to enhance the usability of generative AI tools, aiming to address identified usability issues and improve the overall user experience. This study contributes to a deeper understanding of user experiences and offers valuable guidance for enhancing the usability of generative AI applications.
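The reported compound scores lie in [-1, 1], consistent with lexicon-style sentiment aggregation over reviews. A deliberately toy stand-in (not the paper's pipeline; the lexicon and scoring rule are invented for illustration):

```python
# Hypothetical mini-lexicon mapping review words to polarities.
LEXICON = {"great": 1.0, "helpful": 0.8, "fast": 0.5,
           "slow": -0.5, "crash": -1.0, "useless": -0.9}

def review_score(text):
    """Mean polarity of the lexicon words found in one review."""
    hits = [LEXICON[w] for w in text.lower().split() if w in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0

def compound_score(reviews):
    """An app's compound score: mean review polarity, in [-1, 1]."""
    return sum(review_score(r) for r in reviews) / len(reviews)
```

A real pipeline would use a validated sentiment model and then map scores onto the ISO 9241 dimensions, but the aggregation shape is the same.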
Markus Bläser, Julian Dörfler, Maciej Liskiewicz, et al.
To characterize the computational complexity of satisfiability problems for probabilistic and causal reasoning within Pearl's Causal Hierarchy, the authors of arXiv:2305.09508 [cs.AI] introduce a new natural class, named succ-$\exists$R. This class can be viewed as a succinct variant of the well-studied class $\exists$R based on the Existential Theory of the Reals (ETR). Analogously to $\exists$R, succ-$\exists$R is an intermediate class between NEXP and EXPSPACE, the exponential versions of NP and PSPACE. The main contributions of this work are threefold. Firstly, we characterize the class succ-$\exists$R in terms of nondeterministic real RAM machines and develop structural complexity-theoretic results for real RAMs, including translation and hierarchy theorems. Notably, we demonstrate the separation of $\exists$R and succ-$\exists$R. Secondly, we examine the complexity of model checking and satisfiability for fragments of existential second-order logic and probabilistic independence logic. We show succ-$\exists$R-completeness of several of these problems, for which the best-known complexity lower and upper bounds were previously NEXP-hardness and EXPSPACE, respectively. Thirdly, while succ-$\exists$R is characterized in terms of ordinary (non-succinct) ETR instances enriched by exponential sums and a mechanism to index exponentially many variables, we prove that when only exponential sums are added, the corresponding class $\exists$R^{\Sigma} is contained in PSPACE. We conjecture that this inclusion is strict, as this class is equivalent to adding a VNP oracle to a polynomial-time nondeterministic real RAM. Conversely, the addition of exponential products to ETR yields PSPACE. Additionally, we study the satisfiability problem for probabilistic reasoning with the additional requirement of a small model, and prove that this problem is complete for $\exists$R^{\Sigma}.
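The class relationships stated in the abstract can be collected in one display:

```latex
\mathrm{NEXP} \;\subseteq\; \text{succ-}\exists\mathbb{R} \;\subseteq\; \mathrm{EXPSPACE},
\qquad
\exists\mathbb{R} \;\subsetneq\; \text{succ-}\exists\mathbb{R},
\qquad
\exists\mathbb{R}^{\Sigma} \;\subseteq\; \mathrm{PSPACE},
```

where the middle separation is the paper's result for real RAMs and the last inclusion is conjectured to be strict.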
Colorectal cancer is an enormous health concern, as it is among the most lethal types of malignancy. Manual examination has its limitations, including subjectivity and data overload. To overcome these challenges, computer-aided diagnostic systems focusing on image segmentation and abnormality classification have been developed. This study presents a two-stage approach for the automatic detection of five types of colorectal abnormalities in addition to a control group: polyp, low-grade intraepithelial neoplasia, high-grade intraepithelial neoplasia, serrated adenoma, and adenocarcinoma. In the first stage, UNet3+ was used for image segmentation to locate the anomalies; in the second stage, the Cross-Attention Multi-Scale Vision Transformer deep learning model was used to predict the type of anomaly after highlighting it on the raw images. In anomaly segmentation, UNet3+ achieved values of 0.9872, 0.9422, 0.9832, and 0.9560 for Dice coefficient, Jaccard index, sensitivity, and specificity, respectively. In anomaly detection, the Cross-Attention Multi-Scale Vision Transformer model attained a classification performance of 0.9340, 0.9037, 0.9446, 0.8723, 0.9102, and 0.9849 for accuracy, F1 score, precision, recall, Matthews correlation coefficient, and specificity, respectively. The proposed approach proves its capacity to relieve the workload of pathologists and enhance the accuracy of colorectal cancer diagnosis by achieving high performance in both the identification of anomalies and the segmentation of regions.
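The four segmentation metrics reported above are standard functions of the confusion counts between a predicted and a ground-truth binary mask; a self-contained sketch:

```python
import numpy as np

def seg_metrics(pred, truth):
    """Dice, Jaccard, sensitivity, and specificity for binary masks."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    tp = np.logical_and(pred, truth).sum()
    fp = np.logical_and(pred, ~truth).sum()
    fn = np.logical_and(~pred, truth).sum()
    tn = np.logical_and(~pred, ~truth).sum()
    dice = 2 * tp / (2 * tp + fp + fn)        # harmonic overlap measure
    jaccard = tp / (tp + fp + fn)             # intersection over union
    sensitivity = tp / (tp + fn)              # true positive rate
    specificity = tn / (tn + fp)              # true negative rate
    return dice, jaccard, sensitivity, specificity
```

Note that Dice and Jaccard are monotonically related (Dice = 2J / (1 + J)), which is why both typically move together in evaluations like the one above.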
Christoph Leiter, Jonas Belouadi, Yanran Chen, et al.
The NLLG (Natural Language Learning & Generation) arXiv reports assist in navigating the rapidly evolving landscape of NLP and AI research across the cs.CL, cs.CV, cs.AI, and cs.LG categories. This fourth installment captures a transformative period in AI history, from January 1, 2023, following ChatGPT's debut, through September 30, 2024. Our analysis reveals substantial new developments in the field, with 45% of the top 40 most-cited papers being new entries since our last report eight months ago, and offers insights into emerging trends and major breakthroughs, such as novel multimodal architectures, including diffusion and state space models. Natural Language Processing (NLP; cs.CL) remains the dominant main category in our top-40 list, but its dominance is declining in favor of computer vision (cs.CV) and general machine learning (cs.LG). This report also presents novel findings on the integration of generative AI in academic writing, documenting its increasing adoption since 2022 while revealing an intriguing pattern: top-cited papers show notably fewer markers of AI-generated content than random samples. Furthermore, we track the evolution of AI-associated language, identifying declining trends in previously common indicators such as "delve".
Citations are a key ingredient of scientific research, relating a paper to others published in the community. Recently, it has been noted that there is a citation age bias in the Natural Language Processing (NLP) community, one of the currently fastest-growing AI subfields: the mean age of the bibliography of NLP papers has become ever younger in the last few years, leading to 'citation amnesia' in which older knowledge is increasingly forgotten. In this work, we put such claims into perspective by analyzing the bibliographies of ~300k papers across 15 different scientific fields submitted to the popular preprint server arXiv between 2013 and 2022. We find that all AI subfields (in particular cs.AI, cs.CL, cs.CV, and cs.LG) show similar trends of citation amnesia, with the age of the bibliography having roughly halved in the last 10 years (from above 12 in 2013 to below 7 in 2022), on average. Rather than diagnosing this as a citation age bias in the NLP community, we believe this pattern is an artefact of the dynamics of these research fields, in which new knowledge is produced in ever shorter time intervals.
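The quantity being tracked, the mean age of a paper's bibliography, reduces to a simple average of year differences; a minimal sketch:

```python
def mean_citation_age(paper_year, cited_years):
    """Mean bibliography age: average difference between the citing
    paper's year and the publication year of each cited work."""
    return sum(paper_year - y for y in cited_years) / len(cited_years)

def field_mean_age(papers):
    """Aggregate over a field: average of per-paper mean ages.
    `papers` is a list of (paper_year, [cited_years]) pairs."""
    return sum(mean_citation_age(y, refs) for y, refs in papers) / len(papers)
```

A field-level trend like the reported drop from above 12 to below 7 is then just `field_mean_age` computed per submission year.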
With this special issue of the Journal of Data Mining and Digital Humanities (JDMDH), we bring together in one single volume several experiments, projects and reflections related to automatic text recognition applied to historical documents. More and more research projects now include automatic text acquisition in their data processing chain, and this is true not only for projects focussed on Digital or Computational Humanities but increasingly also for those that simply use existing digital tools as a means to an end. The increasing use of this technology has led to an automation of tasks that affects the role of the researcher in the textual production process. This new data-intensive practice makes it urgent to collect and harmonise the corpora necessary for the constitution of training sets, and also to make them available for exploitation. This special issue is therefore an opportunity to present articles combining philological and technical questions in order to make a scientific assessment of the use of automatic text recognition for ancient documents, its results, its contributions and the new practices induced by its use in the process of editing and exploring texts. We hope that practical aspects will be questioned on this occasion, while raising methodological challenges and the technology's impact on research data. The special issue on Automatic Text Recognition (ATR) is therefore dedicated to providing a comprehensive overview of the use of ATR in the humanities, particularly concerning historical documents in the early 2020s. It presents a fusion of engineering and philological aspects, catering to both beginners and experienced users interested in launching projects with ATR. The collection encompasses a diverse array of approaches, covering topics such as data creation and collection for training generic models or reaching specific objectives, HTR engine architecture, segmentation methods, and image processing.
History of scholarship and learning. The humanities, Bibliography. Library science. Information resources
In the context of digital usages, acting on practices, avoiding risky behaviors and moving towards more responsible mobile consumption have become central questions in our hyperconnected lives. Corporate Sustainability and Responsibility (CSR) strategies are underway, but companies are struggling to translate them into action in some technocentric areas, such as the telco world. This paper examines the interest of the reflexive approach, as described by Vacher (2011), for the user-centered design of innovative and participatory services. Two educational devices were tested with French customers. Within Orange Innovation Research, these two devices were designed to support customers in their digital uses. The first focuses on cybersecurity risk management and the second deals with the appropriation of responsible behavior regarding the purchase and usage of smartphones. We fall within the scope of persuasive technologies, as defined by Foulonneau (2017), exerting an influence that can be described as responsible and respectful of the recipient, “without the use of coercion or deception”. The reflexive process, at the heart of our approach, encourages individuals to analyze and question their actions and their consequences, thus facilitating the development and transformation of ethical and sustainable behaviors. Reflexivity can be described as experiential learning (Dewey 1938) and mobilizes a metacognitive ability bearing on how to learn. The first device addresses a priority topic for Orange: the digital protection of its customers. Through a chatbot, customers are encouraged to verbalize the risks they encounter, analyze their experience and take a step back to consider changing their behavior. The results of experiments with a group of young people show that reflective practice allows participants to become aware of their protective behaviors and to put in place effective and informed strategies aimed at countering daily risks.
The second device is a service model designed to assist users in selecting a new smartphone via a questionnaire. The model provides information on the ecological and social impacts of smartphone manufacturing and suggests ways to reduce them. The qualitative test carried out in the laboratory shows that use of the device generates "socio-eco-reflections" as well as a meta-reflection, both steps validating the start of a reflexive process. This questioning is accompanied by projections onto practices built from the reflexive experience. We postulate that this reflexive approach, in an industrial environment, allows individuals to build and transform their own behaviors through a guided experiential journey. It also promotes the emergence of changes that are significant, adapted and sustainable. This hypothesis highlights the importance of ethics and of the persistence of behaviors within the framework of CSR, offering promising perspectives for providing enabling tools to our clients. Vacher, Y. (2011). La pratique réflexive. Un concept et des mises en œuvre à définir. Recherche et formation, 66, 65-78. Foulonneau, A. (2017). Les technologies persuasives adaptatives. PhD thesis in artificial intelligence [cs.AI], Université Grenoble Alpes. Dewey, J. (1938). Experience and Education. Macmillan, pp. 9-10.
This study stems from the Desenrollando el cordel (Untangling the cordel) project, which focuses on the editing of 19th-century Spanish prints. It evaluates the impact of image enhancement methods on the automatic transcription of low-quality documents, in terms of both printing and digitisation. We compare different methods (binarisation, deblurring) and present the results obtained during the training of models with the Kraken tool. We demonstrate that binarisation methods give better results than the others, and that combining several techniques did not significantly improve transcription prediction. This study shows the significance of using image enhancement methods with Kraken. It paves the way for further experiments with larger and more varied corpora to help future projects design their automatic transcription workflows.
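Binarisation of the kind compared here is commonly done with Otsu's global threshold, which picks the gray level that best separates foreground ink from background; a self-contained sketch (not necessarily the exact method the authors tested):

```python
import numpy as np

def otsu_threshold(gray):
    """Otsu's method: choose the threshold that maximizes the
    between-class variance of the two resulting pixel classes."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0, w1 = prob[:t].sum(), prob[t:].sum()
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (np.arange(t) * prob[:t]).sum() / w0
        mu1 = (np.arange(t, 256) * prob[t:]).sum() / w1
        var = w0 * w1 * (mu0 - mu1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

def binarise(gray):
    """Map a uint8 grayscale image to a 0/1 mask."""
    return (gray >= otsu_threshold(gray)).astype(np.uint8)
```

On strongly bimodal historical scans this kind of global threshold works well; degraded prints often call for adaptive (local) variants instead.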
Handwritten Text Recognition (HTR) techniques aim to accurately recognize sequences of characters in input manuscript images by training artificial intelligence models to capture historical writing features. Efficient HTR models can transform digitized manuscript collections into indexed and quotable corpora, providing valuable research insights for various historical inquiries. However, several challenges must be addressed, including the scarcity of relevant training corpora, the consequential variability introduced by different scribal hands and writing scripts, and the complexity of page layouts. This paper presents two models and one cross-model approach for automatic transcription of Latin and French medieval documentary manuscripts, particularly charters and registers, written between the 12th and 15th centuries and classified into two major writing scripts: Textualis (from the late 11th to the 13th century) and Cursiva (from the 13th to the 15th century). The architecture of the models is based on a Convolutional Recurrent Neural Network (CRNN) coupled with a Connectionist Temporal Classification (CTC) loss. The training and evaluation of the models, involving 120k lines of text and almost 1M tokens, were conducted using three available ground-truth corpora: the e-NDP corpus, the Alcar-HOME database and the Himanis project. This paper describes the training architecture and corpora used, while discussing the main training challenges, results, and potential applications of HTR techniques on medieval documentary manuscripts.
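The CTC loss mentioned above pairs with a simple greedy decoding rule at inference time: take the CRNN's per-frame argmax labels, merge consecutive repeats, then drop the blank symbol. A minimal sketch of that decoding step:

```python
def ctc_greedy_decode(frame_labels, blank=0):
    """Collapse per-frame label indices into an output sequence,
    CTC-style: merge adjacent repeats, then remove the blank.
    Repeated characters survive only if separated by a blank frame."""
    decoded, prev = [], None
    for lab in frame_labels:
        if lab != prev and lab != blank:
            decoded.append(lab)
        prev = lab
    return decoded
```

For example, the frame sequence blank,a,a,blank,a,b,b,blank decodes to "a a b": the blank between the two runs of `a` is what lets a doubled letter be recognized.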
Layout Analysis (the identification of zones and their classification) is, along with line segmentation, the first step in Optical Character Recognition and similar tasks. The ability to distinguish the main body of text from marginal text or running titles makes the difference between extracting the full text of a digitized work and producing noisy output. We show that most segmenters focus on pixel classification and that polygonization of this output has not been used as a target in the latest competitions on historical documents (ICDAR 2017 and onwards), despite being the focus in the early 2010s. We propose to shift the task, for efficiency, from pixel-classification-based polygonization to object detection using isothetic rectangles. We compare the output of Kraken and YOLOv5 in terms of segmentation and show that the latter severely outperforms the former on small datasets (1110 samples and below). We release two datasets for training and evaluation on historical documents, as well as a new package, YALTAi, which injects YOLOv5 into the segmentation pipeline of Kraken 4.1.
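Evaluating detectors like YOLOv5 on isothetic (axis-aligned) rectangles typically rests on intersection-over-union between predicted and ground-truth boxes; a minimal sketch:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two isothetic rectangles,
    each given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    iw = max(0, min(ax2, bx2) - max(ax1, bx1))  # overlap width
    ih = max(0, min(ay2, by2) - max(ay1, by1))  # overlap height
    inter = iw * ih
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    return inter / (area_a + area_b - inter)
```

A predicted zone is usually counted as a correct detection when its IoU with a ground-truth zone exceeds a fixed threshold (0.5 is a common choice).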