Results for "Greek language and literature. Latin language and literature"

Showing 20 of ~2,871,557 results · from CrossRef, DOAJ, arXiv, Semantic Scholar

arXiv Open Access 2025
Implicit Behavioral Alignment of Language Agents in High-Stakes Crowd Simulations

Yunzhe Wang, Gale M. Lucas, Burcin Becerik-Gerber et al.

Language-driven generative agents have enabled large-scale social simulations with transformative uses, from interpersonal training to aiding global policy-making. However, recent studies indicate that generative agent behaviors often deviate from expert expectations and real-world data, a phenomenon we term the Behavior-Realism Gap. To address this, we introduce a theoretical framework called Persona-Environment Behavioral Alignment (PEBA), formulated as a distribution matching problem grounded in Lewin's behavior equation stating that behavior is a function of the person and their environment. Leveraging PEBA, we propose PersonaEvolve (PEvo), an LLM-based optimization algorithm that iteratively refines agent personas, implicitly aligning their collective behaviors with realistic expert benchmarks within a specified environmental context. We validate PEvo in an active shooter incident simulation we developed, achieving an 84% average reduction in distributional divergence compared to no steering and a 34% improvement over explicit instruction baselines. Results also show PEvo-refined personas generalize to novel, related simulation scenarios. Our method greatly enhances behavioral realism and reliability in high-stakes social simulations. More broadly, the PEBA-PEVO framework provides a principled approach to developing trustworthy LLM-driven social simulations.
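PEvo itself uses an LLM to rewrite personas; as a loose illustration of the underlying distribution-matching loop the abstract describes, here is a toy sketch in which persona parameters are perturbed and changes are kept only when they reduce divergence from an expert behavior benchmark. All names and the hill-climbing update rule are hypothetical, not the paper's algorithm:

```python
import math
import random

def kl_divergence(p, q, eps=1e-9):
    """KL(P || Q) between two discrete behavior distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def simulate(persona):
    """Toy stand-in for running the simulation: persona weights -> behavior distribution."""
    total = sum(persona)
    return [w / total for w in persona]

def refine_personas(persona, expert, steps=200, lr=0.1, seed=0):
    """Iteratively perturb persona parameters, keeping only changes that
    reduce divergence from the expert benchmark (random hill-climbing)."""
    rng = random.Random(seed)
    best = kl_divergence(expert, simulate(persona))
    for _ in range(steps):
        i = rng.randrange(len(persona))
        candidate = persona[:]
        candidate[i] = max(1e-3, candidate[i] + rng.uniform(-lr, lr))
        d = kl_divergence(expert, simulate(candidate))
        if d < best:
            persona, best = candidate, d
    return persona, best
```

The accept-only-if-better loop guarantees the divergence never increases across iterations, mirroring the paper's reported reduction in distributional divergence.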

en cs.CL, cs.AI
arXiv Open Access 2025
Beyond the Leaderboard: Understanding Performance Disparities in Large Language Models via Model Diffing

Sabri Boughorbel, Fahim Dalvi, Nadir Durrani et al.

As fine-tuning becomes the dominant paradigm for improving large language models (LLMs), understanding what changes during this process is increasingly important. Traditional benchmarking often fails to explain why one model outperforms another. In this work, we use model diffing, a mechanistic interpretability approach, to analyze the specific capability differences between Gemma-2-9b-it and a SimPO-enhanced variant. Using crosscoders, we identify and categorize latent representations that differentiate the two models. We find that SimPO acquired latent concepts predominantly enhance safety mechanisms (+32.8%), multilingual capabilities (+43.8%), and instruction-following (+151.7%), while its additional training also reduces emphasis on model self-reference (-44.1%) and hallucination management (-68.5%). Our analysis shows that model diffing can yield fine-grained insights beyond leaderboard metrics, attributing performance gaps to concrete mechanistic capabilities. This approach offers a transparent and targeted framework for comparing LLMs.

en cs.CL
arXiv Open Access 2025
From Textbook to Talkbot: A Case Study of a Greek-Language RAG-Based Chatbot in Higher Education

Maria Eleni Koutsiaki, Marina Delianidi, Chaido Mizeli et al.

The integration of AI chatbots into educational settings has opened new pathways for transforming teaching and learning, offering enhanced support to both educators and learners. This study investigates the design and application of an AI chatbot as an educational tool in higher education. Designed to operate in the Greek language, the chatbot addresses linguistic challenges unique to Greek while delivering accurate, context-grounded support aligned with the curriculum. The AI chatbot is built on the Retrieval-Augmented Generation (RAG) framework, grounding its responses in specific course content. The RAG architecture significantly enhances the chatbot's reliability by providing accurate, context-aware responses while mitigating common challenges associated with large language models (LLMs), such as hallucinations and misinformation. The AI chatbot serves a dual purpose: it enables students to access accurate, on-demand academic support and assists educators in the rapid creation of relevant educational materials. This dual functionality promotes learner autonomy and streamlines the instructional design process. The study aims to evaluate the effectiveness, reliability, and perceived usability of RAG-based chatbots in higher education, exploring their potential to enhance educational practices and outcomes as well as to support the broader adoption of AI technologies in language-specific educational contexts. Findings from this research are expected to contribute to the emerging field of AI-driven education by demonstrating how intelligent systems can be effectively aligned with pedagogical goals.
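The grounding step of a RAG pipeline can be sketched minimally: retrieve the course-content chunks most relevant to the query, then build a prompt that instructs the LLM to answer only from them. This sketch uses token overlap as a stand-in for the embedding similarity a real system would use; the function names and prompt wording are illustrative, not the paper's implementation:

```python
def tokenize(text):
    """Crude whitespace tokenization into a lowercase token set."""
    return set(text.lower().split())

def retrieve(query, chunks, k=2):
    """Rank course-content chunks by token overlap with the query."""
    scored = sorted(chunks,
                    key=lambda c: len(tokenize(query) & tokenize(c)),
                    reverse=True)
    return scored[:k]

def build_prompt(query, chunks):
    """Ground the LLM's answer in the retrieved course material only."""
    context = "\n".join(f"- {c}" for c in retrieve(query, chunks))
    return (f"Answer using only this course material:\n{context}\n\n"
            f"Question: {query}")
```

Restricting the model to the retrieved context is what mitigates hallucination: the answer is anchored to curriculum text rather than the model's parametric memory.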

en cs.CY, cs.AI
arXiv Open Access 2025
SURGE: On the Potential of Large Language Models as General-Purpose Surrogate Code Executors

Bohan Lyu, Siqiao Huang, Zichen Liang et al.

Neural surrogate models are powerful and efficient tools in data mining. Meanwhile, large language models (LLMs) have demonstrated remarkable capabilities in code-related tasks, such as generation and understanding. However, an equally important yet underexplored question is whether LLMs can serve as surrogate models for code execution prediction. To systematically investigate it, we introduce SURGE, a comprehensive benchmark with $1160$ problems covering $8$ key aspects: multi-language programming tasks, competition-level programming problems, repository-level code analysis, high-cost scientific computing, time-complexity-intensive algorithms, buggy code analysis, programs dependent on specific compilers or execution environments, and formal mathematical proof verification. Through extensive analysis of $21$ open-source and proprietary LLMs, we examine scaling laws, data efficiency, and predictive accuracy. Our findings reveal important insights about the feasibility of LLMs as efficient surrogates for computational processes. The benchmark and evaluation framework are available at https://github.com/Imbernoulli/SURGE.

en cs.LG, cs.CL
arXiv Open Access 2025
Can Vision-Language Models Solve Visual Math Equations?

Monjoy Narayan Choudhury, Junling Wang, Yifan Hou et al.

Despite strong performance in visual understanding and language-based reasoning, Vision-Language Models (VLMs) struggle with tasks requiring integrated perception and symbolic computation. We study this limitation through visual equation solving, where mathematical equations are embedded in images, variables are represented by object icons, and coefficients must be inferred by counting. While VLMs perform well on textual equations, they fail on visually grounded counterparts. To understand this gap, we decompose the task into coefficient counting and variable recognition, and find that counting is the primary bottleneck, even when recognition is accurate. We also observe that composing recognition and reasoning introduces additional errors, highlighting challenges in multi-step visual reasoning. Finally, as equation complexity increases, symbolic reasoning itself becomes a limiting factor. These findings reveal key weaknesses in current VLMs and point toward future improvements in visually grounded mathematical reasoning.
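The decomposition the authors describe, recover coefficients by counting icons, then solve symbolically, can be made concrete with a toy linear case. Here the "counting" step is trivialized to `len()` (the very step VLMs reportedly fail at), and the equation form `a*x + b = c` is a hypothetical simplification for illustration:

```python
from fractions import Fraction

def solve_visual_equation(icons, constant, rhs):
    """Toy decomposition of visual equation solving:
    1) coefficient counting: count icon occurrences (the reported bottleneck),
    2) symbolic reasoning: solve a*x + b = c exactly."""
    a = len(icons)  # e.g. three apple icons -> coefficient 3
    if a == 0:
        raise ValueError("no icons to count")
    return Fraction(rhs - constant, a)
```

For instance, three apple icons plus 2 equaling 11 yields x = 3; errors in either sub-step (counting or solving) compound in the composed task, which is the multi-step failure mode the abstract highlights.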

en cs.CL, cs.AI
DOAJ Open Access 2024
Apocalypse Now? Kate Atkinson Reads Ovid’s Metamorphoses

Barbara Weiden Boyd

Kate Atkinson’s short story collection, Not the End of the World (2002), has typically puzzled reviewers, unsure of its goals and hesitant about its success. This essay offers an alternative by reading the collection through the lens of Ovid’s Metamorphoses. Atkinson clearly signals her debt to Ovid in several epigraphs, but the overall impression left with readers who know Ovid only as a repository of classical myth has caused Atkinson’s remarkably inventive reception of Ovidian poetics to be misread. Atkinson uses the Latin poet’s interest in change and its often strange permutations as a way of interrogating contemporary concerns about consumerism, environmental degradation, and cultural forgetfulness, against a backdrop of post-apocalyptic fantasy. The result is both a new appreciation for Atkinson’s brilliance and a fresh approach to Ovid’s transformative masterpiece.

Greek language and literature. Latin language and literature
arXiv Open Access 2024
The Mystery of the Pathological Path-star Task for Language Models

Arvid Frydenlund

The recently introduced path-star task is a minimal task designed to exemplify limitations to the abilities of language models (Bachmann and Nagarajan, 2024). It involves a path-star graph where multiple arms radiate from a single starting node and each node is unique. Given the start node and a specified target node that ends an arm, the task is to generate the arm containing that target node. This is straightforward for a human but surprisingly difficult for language models, which did not outperform the random baseline. The authors hypothesized this is due to a deficiency in teacher-forcing and the next-token prediction paradigm. We demonstrate the task is learnable using teacher-forcing in alternative settings and that the issue is partially due to representation. We introduce a regularization method using structured samples of the same graph but with differing target nodes, improving results across a variety of model types. We provide RASP proofs showing the task is theoretically solvable. Finally, we find settings where an encoder-only model can consistently solve the task.
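The path-star task itself is simple to state in code: build a graph whose arms radiate from a shared start node, then, given a target leaf, emit the full arm containing it. This sketch (hypothetical naming, not the authors' code) shows why the task is trivial for a symbolic solver yet, per the abstract, hard for next-token-trained language models:

```python
def make_path_star(num_arms, arm_len):
    """Path-star graph: `num_arms` disjoint paths radiating from node 0,
    each with `arm_len` unique nodes after the shared start."""
    arms, nxt = [], 1
    for _ in range(num_arms):
        arms.append([0] + list(range(nxt, nxt + arm_len)))
        nxt += arm_len
    return arms

def solve(arms, target):
    """The task: given the start node (0) and a target leaf, return the
    entire arm ending at that leaf."""
    for arm in arms:
        if arm[-1] == target:
            return arm
    raise ValueError("target is not a leaf")
```

The difficulty for a language model is that the first generated token after the start node already commits it to one arm, before any evidence distinguishing the arms has been produced, which is why teacher-forcing and representation choices matter.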

en cs.CL, cs.LG
arXiv Open Access 2024
RoMemes: A multimodal meme corpus for the Romanian language

Vasile Păiş, Sara Niţă, Alexandru-Iulius Jerpelea et al.

Memes are becoming increasingly popular in online media, especially in social networks. They usually combine graphical representations (images, drawings, animations or video) with text to convey powerful messages. In order to extract, process and understand these messages, AI applications need to employ multimodal algorithms. In this paper, we introduce a curated dataset of real memes in the Romanian language, with multiple annotation levels. Baseline algorithms were employed to demonstrate the usability of the dataset. Results indicate that further research is needed to improve the processing capabilities of AI tools when faced with Internet memes.

en cs.CL
arXiv Open Access 2024
Overview of the 2023 ICON Shared Task on Gendered Abuse Detection in Indic Languages

Aatman Vaidya, Arnav Arora, Aditya Joshi et al.

This paper reports the findings of the ICON 2023 Shared Task on Gendered Abuse Detection in Indic Languages. The shared task deals with the detection of gendered abuse in online text. It was conducted as part of ICON 2023, based on a novel dataset in Hindi, Tamil and the Indian dialect of English. The participants were given three subtasks, with the train dataset consisting of approximately 6500 posts sourced from Twitter. For the test set, approximately 1200 posts were provided. The shared task received a total of 9 registrations. The best F-1 scores are 0.616 for subtask 1, 0.572 for subtask 2, and 0.616 and 0.582 for subtask 3. The paper contains examples of hateful content owing to its topic.

en cs.CL, cs.LG
arXiv Open Access 2024
Mitigating Translationese in Low-resource Languages: The Storyboard Approach

Garry Kuwanto, Eno-Abasi E. Urua, Priscilla Amondi Amuok et al.

Low-resource languages often face challenges in acquiring high-quality language data due to the reliance on translation-based methods, which can introduce the translationese effect. This phenomenon results in translated sentences that lack fluency and naturalness in the target language. In this paper, we propose a novel approach for data collection by leveraging storyboards to elicit more fluent and natural sentences. Our method involves presenting native speakers with visual stimuli in the form of storyboards and collecting their descriptions without direct exposure to the source text. We conducted a comprehensive evaluation comparing our storyboard-based approach with traditional text translation-based methods in terms of accuracy and fluency. Human annotators and quantitative metrics were used to assess translation quality. The results indicate a preference for text translation in terms of accuracy, while our method demonstrates worse accuracy but better fluency in the target language.

en cs.CL
arXiv Open Access 2024
From 'Showgirls' to 'Performers': Fine-tuning with Gender-inclusive Language for Bias Reduction in LLMs

Marion Bartl, Susan Leavy

Gender bias is not only prevalent in Large Language Models (LLMs) and their training data, but also firmly ingrained into the structural aspects of language itself. Therefore, adapting linguistic structures within LLM training data to promote gender-inclusivity can make gender representations within the model more inclusive. The focus of our work are gender-exclusive affixes in English, such as in 'show-girl' or 'man-cave', which can perpetuate gender stereotypes and binary conceptions of gender. We use an LLM training dataset to compile a catalogue of 692 gender-exclusive terms along with gender-neutral variants and from this, develop a gender-inclusive fine-tuning dataset, the 'Tiny Heap'. Fine-tuning three different LLMs with this dataset, we observe an overall reduction in gender-stereotyping tendencies across the models. Our approach provides a practical method for enhancing gender inclusivity in LLM training data and contributes to incorporating queer-feminist linguistic activism in bias mitigation research in NLP.
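The preprocessing step described, mapping gender-exclusive terms to neutral variants across a fine-tuning corpus, can be sketched as a catalogue-driven rewrite. The example pairs below are illustrative stand-ins, not entries from the paper's actual 692-term catalogue or the 'Tiny Heap' dataset:

```python
import re

# Hypothetical sample of a term catalogue: gender-exclusive -> neutral.
NEUTRAL_VARIANTS = {
    "showgirl": "performer",
    "chairman": "chairperson",
    "mankind": "humankind",
}

def neutralize(text):
    """Rewrite gender-exclusive terms to gender-neutral variants
    (whole words only, case-insensitive), as one might preprocess
    training data before fine-tuning."""
    for term, neutral in NEUTRAL_VARIANTS.items():
        text = re.sub(rf"\b{term}\b", neutral, text, flags=re.IGNORECASE)
    return text
```

A production pipeline would also need to handle inflected forms, capitalization preservation, and ambiguous contexts, which is why the paper compiles a curated catalogue rather than relying on naive substitution.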

en cs.CL
arXiv Open Access 2024
Kreyòl-MT: Building MT for Latin American, Caribbean and Colonial African Creole Languages

Nathaniel R. Robinson, Raj Dabre, Ammon Shurtz et al.

A majority of language technologies are tailored for a small number of high-resource languages, while many low-resource languages are neglected. One such group, Creole languages, has long been marginalized in academic study, though their speakers could benefit from machine translation (MT). These languages are predominantly used in much of Latin America, Africa and the Caribbean. We present the largest cumulative dataset to date for Creole language MT, including 14.5M unique Creole sentences with parallel translations -- 11.6M of which we release publicly, and the largest bitexts gathered to date for 41 languages -- the first ever for 21. In addition, we provide MT models supporting all 41 Creole languages in 172 translation directions. Given our diverse dataset, we produce a model for Creole language MT exposed to more genre diversity than ever before, which outperforms a genre-specific Creole MT model on its own benchmark for 26 of 34 translation directions.

en cs.CL
DOAJ Open Access 2023
Studio preliminare del ms Atheniensis EBE 1089, con appunti sulle tradizioni manoscritte e sui testi dell’Ecloga di Frinico e del Lessico di Meride

Sandri, Maria Giovanna

This paper offers a preliminary study of MS Atheniensis EBE 1089, a neglected lexicographic and grammatical miscellany consisting of several different sections, to be dated between the 13th and the 15th century. Among other texts, this manuscript transmits Phrynichus’ Eclogue and Moeris’ Lexicon. After providing a survey of the contents of this codex, the first part of the article deals with some of its main material features, with a description of its different sections and the scribes who copied them. Additionally, it is argued that the codex as a whole was the product of a single ‘editorial’ project carried out by a certain Μᾶρκος, active in the middle of the 16th century. The second part of the article offers a philological analysis of the folios containing the lexica of Phrynichus and Moeris; this provides the occasion to develop some new observations on their texts and manuscript traditions.

Greek language and literature. Latin language and literature, History of the Greco-Roman World
DOAJ Open Access 2023
Bulletin bibliographique

Stefano Rozzi

The Bulletin presents, in alphabetical order, all titles announced during the previous semester by the free online Newsletter of the SIAC, which is sent to subscribers every four weeks.

Philology. Linguistics, Greek language and literature. Latin language and literature
DOAJ Open Access 2023
Bárbaros y otros extranjeros en la Atenas clásica: el testimonio de los epitafios

Torben Vestegaard

Notes on foreigners and their ethnic origins in classical Athenian literature are few and scattered. Funerary inscriptions provide more comprehensive and detailed information, supplying a wealth of names with ethnics. They record above all free-born immigrants, among them many women, who probably enjoyed a more independent life than Athenian women did. The great majority of immigrants bear ethnics indicating citizenship of various Greek city-states. Foreigners with a barbarian ethnic make up no more than 10%, an interesting fact in light of Xenophon's claim (Poroi 2.3), which is in all probability an exaggeration that is psychologically easy to explain.

Greek language and literature. Latin language and literature
arXiv Open Access 2023
Humans and language models diverge when predicting repeating text

Aditya R. Vaidya, Javier Turek, Alexander G. Huth

Language models that are trained on the next-word prediction task have been shown to accurately model human behavior in word prediction and reading speed. In contrast with these findings, we present a scenario in which the performance of humans and LMs diverges. We collected a dataset of human next-word predictions for five stimuli that are formed by repeating spans of text. Human and GPT-2 LM predictions are strongly aligned in the first presentation of a text span, but their performance quickly diverges when memory (or in-context learning) begins to play a role. We traced the cause of this divergence to specific attention heads in a middle layer. Adding a power-law recency bias to these attention heads yielded a model that performs much more similarly to humans. We hope that this scenario will spur future work in bringing LMs closer to human behavior.
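The fix the authors describe, adding a power-law recency bias to specific attention heads, amounts to down-weighting attention logits by the log of the key's distance from the query, so that post-softmax weights scale as (distance+1)^(-alpha). This is a generic sketch of that mechanism, not the paper's exact head-level intervention:

```python
import math

def attention_with_recency(scores, alpha=1.0):
    """Softmax over attention logits with a power-law recency bias:
    a key d positions back gets its logit reduced by alpha*log(d+1),
    so its softmax weight is multiplied by (d+1)**(-alpha)."""
    n = len(scores)
    # key i (0-indexed) sits at distance n-1-i from the final query position
    biased = [s - alpha * math.log(n - i) for i, s in enumerate(scores)]
    m = max(biased)  # subtract max for numerical stability
    exps = [math.exp(b - m) for b in biased]
    total = sum(exps)
    return [e / total for e in exps]
```

With uniform logits, the bias alone makes recent tokens dominate, which pushes the model away from verbatim long-range copying and toward the more human-like, recency-weighted predictions the study reports.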

en cs.CL
arXiv Open Access 2023
Challenging the Validity of Personality Tests for Large Language Models

Tom Sühr, Florian E. Dorner, Samira Samadi et al.

With large language models (LLMs) like GPT-4 appearing to behave increasingly human-like in text-based interactions, it has become popular to attempt to evaluate personality traits of LLMs using questionnaires originally developed for humans. While reusing measures is a resource-efficient way to evaluate LLMs, careful adaptations are usually required to ensure that assessment results are valid even across human subpopulations. In this work, we provide evidence that LLMs' responses to personality tests systematically deviate from human responses, implying that the results of these tests cannot be interpreted in the same way. Concretely, reverse-coded items ("I am introverted" vs. "I am extraverted") are often both answered affirmatively. Furthermore, variation across prompts designed to "steer" LLMs to simulate particular personality types does not follow the clear separation into five independent personality factors from human samples. In light of these results, we believe that it is important to investigate tests' validity for LLMs before drawing strong conclusions about potentially ill-defined concepts like LLMs' "personality".
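The reverse-coded-item failure the abstract describes is easy to operationalize: on a Likert scale, a consistent respondent's answer to "I am introverted" should roughly mirror the reverse-coding of their answer to "I am extraverted". This toy check (scale and function names are illustrative, not from the paper) quantifies the inconsistency of affirming both:

```python
def reverse_code(score, scale_max=5):
    """Standard reverse-coding on a 1..scale_max Likert scale."""
    return scale_max + 1 - score

def inconsistency(item, reversed_item, scale_max=5):
    """|item - reverse_code(reversed_item)|: 0 means the two answers are
    perfectly consistent; scale_max - 1 is maximally contradictory."""
    return abs(item - reverse_code(reversed_item, scale_max))
```

A human strongly agreeing with "introverted" (5) and strongly disagreeing with "extraverted" (1) scores 0; an LLM affirming both with 5 scores 4, the pattern the authors flag as invalidating a direct human-style interpretation of test results.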

en cs.CL, cs.AI

Page 34 of 143,578