Greco-Roman Antiquity in the Library for Reading Journal, 1841–1848
Ekaterina L. Smirnova
The article attempts to identify the characteristics of the representation of classical antiquity in O.I. Senkovsky's Library for Reading during 1841–1848, a time of declining popularity for the journal and a period of ambiguous perception of the classical heritage in Russian literature. Analysing twenty-four literary and more than thirty scholarly items on Greco-Roman subjects, the study reconstructs Senkovsky's multifaceted editorial strategy for the popularisation of antiquity. In the literary section, the extracts from classical authors that Senkovsky selected for publication aimed to highlight romantic features in Horace's poetry and Virgil's "Aeneid," the woman question in Aristophanes' "Lysistrata," and the potential for a Christian reading of Sophocles' "Antigone." They also displayed innovative approaches by a younger generation of translators to interpreting the masterpieces of classical literature. Works by contemporary poets posited antiquity not as a rejection of modernity but as a means of better understanding modern conflicts, values, and anxieties. The key features of the scholarly and critical materials, many of them written or extensively edited by Senkovsky, were: first, a focus not on political history but on the everyday life, material culture, arts, and intellectual history of ancient Greece and Rome; second, an interest not only in the results of recent research but also in the very process of generating new knowledge about the classical world; and finally, a deliberately lucid, engaging, public-facing style. Antiquity was thus presented as an appealing civilization with a unique cultural-historical experience, whose mysteries still await researchers, and as an inexhaustible source for creative dialogue across the ages.
This model allowed Senkovsky to address both a broad reading public, thereby fulfilling the journal's educational mission, and the learned elite, inviting them to reflect on the methodology of ancient history, translation from the classical languages, and effective ways of presenting specialized issues in classical philology and archaeology to the public.
The Importance of Facial Features in Vision-based Sign Language Recognition: Eyes, Mouth or Full Face?
Dinh Nam Pham, Eleftherios Avramidis
Non-manual facial features play a crucial role in sign language communication, yet their importance in automatic sign language recognition (ASLR) remains underexplored. While prior studies have shown that incorporating facial features can improve recognition, related work often relies on hand-crafted feature extraction and fails to go beyond the comparison of manual features versus the combination of manual and facial features. In this work, we systematically investigate the contribution of distinct facial regions (eyes, mouth, and full face) using two different deep learning models (a CNN-based model and a transformer-based model) trained on an SLR dataset of isolated signs with randomly selected classes. Through quantitative performance and qualitative saliency map evaluation, we reveal that the mouth is the most important non-manual facial feature, significantly improving accuracy. Our findings highlight the necessity of incorporating facial features in ASLR.
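The region-by-region comparison implies a preprocessing step that can be sketched minimally: keep only one facial region and blank out the rest before feeding frames to the recognizer. The bounding boxes below are invented placeholders, not the paper's setup; a real pipeline would derive them from a facial-landmark detector.

```python
import numpy as np

# Illustrative region boxes (y0, y1, x0, x1) for a 96x96 face crop.
# These coordinates are assumptions for the sketch, not from the paper.
REGIONS = {
    "full_face": (0, 96, 0, 96),
    "eyes":      (20, 45, 10, 86),
    "mouth":     (60, 90, 25, 70),
}

def keep_region(frame, region):
    """Zero out all pixels outside the chosen facial region."""
    y0, y1, x0, x1 = REGIONS[region]
    masked = np.zeros_like(frame)
    masked[y0:y1, x0:x1] = frame[y0:y1, x0:x1]
    return masked

frame = np.ones((96, 96), dtype=np.float32)  # dummy grayscale face crop
mouth_only = keep_region(frame, "mouth")
print(mouth_only.sum())  # (90-60)*(70-25) = 1350.0 pixels kept
```

Training one model per masked variant and comparing accuracies is then what isolates each region's contribution.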
mCLM: A Modular Chemical Language Model that Generates Functional and Makeable Molecules
Carl Edwards, Chi Han, Gawon Lee
et al.
Despite their ability to understand chemical knowledge, large language models (LLMs) remain limited in their capacity to propose novel molecules with desired functions (e.g., drug-like properties). In addition, the molecules that LLMs propose can often be challenging to make, and are almost never compatible with automated synthesis approaches. To better enable the discovery of functional small molecules, LLMs need to learn a new molecular language that is more effective in predicting properties and inherently synced with automated synthesis technology. Current molecular LLMs are limited by atom-based representations of molecules. In this paper, we argue that just like tokenizing texts into meaning-bearing (sub-)word tokens instead of characters, molecules should be tokenized at the level of functional building blocks, i.e., parts of molecules that bring unique functions and serve as effective building blocks for real-world automated laboratory synthesis. This motivates us to propose mCLM, a modular Chemical-Language Model that comprises a bilingual language model that understands both natural language descriptions of functions and molecular blocks. mCLM front-loads synthesizability considerations while improving the predicted functions of molecules in a principled manner. Experiments on FDA-approved drugs showed that mCLM is capable of significantly improving chemical functions. mCLM, with only 3B parameters, also achieves improvements in synthetic accessibility relative to 7 other leading generative AI methods including GPT-5. When tested on 122 out-of-distribution medicines using only building blocks/tokens that are compatible with automated modular synthesis, mCLM outperforms all baselines in property scores and synthetic accessibility. mCLM can also reason on multiple functions and iteratively self-improve to rescue drug candidates that failed late in clinical trials ("fallen angels").
Large Language Models Meet Text-Attributed Graphs: A Survey of Integration Frameworks and Applications
Guangxin Su, Hanchen Wang, Jianwei Wang
et al.
Large Language Models (LLMs) have achieved remarkable success in natural language processing through strong semantic understanding and generation. However, their black-box nature limits structured and multi-hop reasoning. In contrast, Text-Attributed Graphs (TAGs) provide explicit relational structures enriched with textual context, yet often lack semantic depth. Recent research shows that combining LLMs and TAGs yields complementary benefits: enhancing TAG representation learning and improving the reasoning and interpretability of LLMs. This survey provides the first systematic review of LLM--TAG integration from an orchestration perspective. We introduce a novel taxonomy covering two fundamental directions: LLM for TAG, where LLMs enrich graph-based tasks, and TAG for LLM, where structured graphs improve LLM reasoning. We categorize orchestration strategies into sequential, parallel, and multi-module frameworks, and discuss advances in TAG-specific pretraining, prompting, and parameter-efficient fine-tuning. Beyond methodology, we summarize empirical insights, curate available datasets, and highlight diverse applications across recommendation systems, biomedical analysis, and knowledge-intensive question answering. Finally, we outline open challenges and promising research directions, aiming to guide future work at the intersection of language and graph learning.
Does Localization Inform Unlearning? A Rigorous Examination of Local Parameter Attribution for Knowledge Unlearning in Language Models
Hwiyeong Lee, Uiji Hwang, Hyelim Lim
et al.
Large language models often retain unintended content, prompting growing interest in knowledge unlearning. Recent approaches emphasize localized unlearning, restricting parameter updates to specific regions in an effort to remove target knowledge while preserving unrelated general knowledge. However, their effectiveness remains uncertain due to the lack of robust and thorough evaluation of the trade-off between the competing goals of unlearning. In this paper, we begin by revisiting existing localized unlearning approaches. We then conduct controlled experiments to rigorously evaluate whether local parameter updates causally contribute to unlearning. Our findings reveal that the set of parameters that must be modified for effective unlearning is not strictly determined, challenging the core assumption of localized unlearning that parameter locality is inherently indicative of effective knowledge removal.
Editors’ Words
Nikolina Tsvetkova
Issue 63 of the Rhetoric and Communications journal is dedicated to themes that align with the publication’s philosophy and editorial policy, specifically rhetoric, as well as pedagogical, academic, and intercultural communication. This issue features nine scholarly contributions, categorized into distinct sections following the established editorial traditions of previous issues. The first section explores two interconnected themes: rhetoric and pedagogical communication. Foteini Egglezou, President of the Hellenic Institute of Rhetorical & Communication Studies and an expert in pedagogy, presents best practices in rhetoric education at the elementary school level in Greece, including the organization of festivals and the development of public speaking, argumentation, and debating skills. Svetla Tsankova introduces a novel topic concerning media literacy, with findings based on research conducted in the early stages of the Bulgarian education system. Ilyan Vasilev offers a distinctive perspective on education by presenting research on how technical disciplines can incorporate and enhance communicative and cognitive skills within the pedagogical discourse. Irena Shunina discusses fundamental principles of pedagogical communication based on Thomas Gordon’s methodology and examines its potential for effective teaching. The second section comprises four studies focusing on academic and intercultural communication. Katya Issa analyzes communication and eloquence within academic environments, emphasizing the factors influencing dialogicity and the role of ethics in this specialized communicative sphere. Tiha Boncheva investigates the impact of artificial intelligence on academic communication, offering a contemporary analysis of its evolving parameters. Radeya Gesheva highlights the intersection of academic and intercultural communication in the context of university education, particularly in the didactics of Italian literature. 
Jo Joseph examines business communication in the IT sector across multiple continents, identifying national characteristics and intercultural perspectives. Denitsa Hinkova systematically presents the roles and functions of academic conferences and research centers, emphasizing creativity, innovation, and future-oriented dimensions. Her scholarly work is included in the section Contemporary Research Methods on Artificial Intelligence. Prof. DSc. Tsvetan Davidkov is a lecturer at Sofia University “St. Kliment Ohridski”, where he teaches courses in national and organizational cultures, management fundamentals, entrepreneurship, and organizational behavior. His academic research focuses on intercultural communication, business communication, entrepreneurship, and organizational culture. He has published numerous monographs, textbooks, and articles, including National and Organizational Cultures (2009), The Values of Wealth: Entrepreneurs in Bulgaria (1991-2004) (2009), and Studies on Cultures: Cultural Guidelines for Management (2019). His co-authored work with Iya Petkova-Gurbalova, Master’s Thesis: Theory, Practice, Challenges – Chapter 3: General Structure of the Master’s Thesis, provides a comprehensive structural framework for academic writing (2020). Prof. Yovka Tisheva is a Doctor of Philology and a member of the Department of Bulgarian Language at the Faculty of Slavic Philology, Sofia University “St. Kliment Ohridski”. She lectures on academic and business communication for undergraduate and graduate students in the Faculty of Slavic Philology and the Faculty of Philosophy, alongside courses in linguistics. Her research interests include academic and business communication, oral and written communication, and linguistic pragmatics. 
She has authored several books and articles, including Academic Communication (2010), Academic Writing for Doctoral and Postdoctoral Students (2014), and From Abstract to Master's Thesis: Academic Writing for Students (2016), co-authored with Ivanka Mavrodieva. Issue 63 of the Rhetoric and Communications journal (April 2025) is published with the financial support of the Scientific Research Fund, Contract No. KP-06-NP6/48, dated December 4, 2024.
Mea culpa - thanks for the clarification! Translating with ChatGPT in Latin Class
Lorenzo di Maggio
Greek language and literature; Latin language and literature; Philology; Linguistics
Presentation
De Vido, Stefania
This issue of Axon marks a further change which, while keeping the objectives and aims of our work intact, makes it, we hope, better suited to the new times, to the needs of our authors, who are often young, and to the characteristics of digital publishing.
The journal abandons its twice-yearly schedule: it now appears, and will continue to appear, over the course of each calendar year whenever a group of contributions is ready for publication that we consider worth making immediately available to the scholarly community.
Ancient history, Greek philology and language
AI and Textbook Work in the Second Year of Learning
Sabine Jung
Greek language and literature; Latin language and literature; Philology; Linguistics
An Empirical Study of Gendered Stereotypes in Emotional Attributes for Bangla in Multilingual Large Language Models
Jayanta Sadhu, Maneesha Rani Saha, Rifat Shahriyar
The influence of Large Language Models (LLMs) is rapidly growing, automating more jobs over time. Assessing the fairness of LLMs is crucial due to their expanding impact. Studies reveal the reflection of societal norms and biases in LLMs, which creates a risk of propagating societal stereotypes in downstream tasks. Many studies on bias in LLMs focus on gender bias in various NLP applications. However, there's a gap in research on bias in emotional attributes, despite the close societal link between emotion and gender. This gap is even larger for low-resource languages like Bangla. Historically, women are associated with emotions like empathy, fear, and guilt, while men are linked to anger, bravado, and authority. This pattern reflects societal norms in Bangla-speaking regions. We offer the first thorough investigation of gendered emotion attribution in Bangla for both closed and open source LLMs in this work. Our aim is to elucidate the intricate societal relationship between gender and emotion specifically within the context of Bangla. We have been successful in showing the existence of gender bias in the context of emotions in Bangla through analytical methods and also show how emotion attribution changes on the basis of gendered role selection in LLMs. All of our resources including code and data are made publicly available to support future research on Bangla NLP. Warning: This paper contains explicit stereotypical statements that many may find offensive.
RoundTripOCR: A Data Generation Technique for Enhancing Post-OCR Error Correction in Low-Resource Devanagari Languages
Harshvivek Kashid, Pushpak Bhattacharyya
Optical Character Recognition (OCR) technology has revolutionized the digitization of printed text, enabling efficient data extraction and analysis across various domains. Just like Machine Translation systems, OCR systems are prone to errors. In this work, we address the challenge of data generation and post-OCR error correction, specifically for low-resource languages. We propose an approach for synthetic data generation for Devanagari languages, RoundTripOCR, that tackles the scarcity of the post-OCR Error Correction datasets for low-resource languages. We release post-OCR text correction datasets for Hindi, Marathi, Bodo, Nepali, Konkani and Sanskrit. We also present a novel approach for OCR error correction by leveraging techniques from machine translation. Our method involves translating erroneous OCR output into a corrected form by treating the OCR errors as mistranslations in a parallel text corpus, employing pre-trained transformer models to learn the mapping from erroneous to correct text pairs, effectively correcting OCR errors.
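The parallel-data idea can be sketched with a toy corruptor: pair each clean string with a noisy variant standing in for OCR output, yielding (erroneous, correct) training pairs for a translation-style model. The character confusions below are illustrative stand-ins, not the actual render-and-recognize round trip the method uses.

```python
import random

# Visually similar Devanagari character pairs (illustrative assumptions,
# not taken from the RoundTripOCR confusion statistics).
CONFUSIONS = {"र": "त", "ब": "व", "घ": "ध"}

def corrupt(text, p=0.5, seed=0):
    """Randomly swap confusable characters to simulate OCR noise."""
    rng = random.Random(seed)
    out = []
    for ch in text:
        if ch in CONFUSIONS and rng.random() < p:
            out.append(CONFUSIONS[ch])
        else:
            out.append(ch)
    return "".join(out)

clean = "भारत"
pair = (corrupt(clean, p=1.0), clean)  # (pseudo-OCR output, ground truth)
print(pair)  # ('भातत', 'भारत')
```

A seq2seq transformer trained on many such pairs then "translates" noisy OCR text back into clean text, which is the error-correction framing described above.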
A Federated Learning Approach to Privacy Preserving Offensive Language Identification
Marcos Zampieri, Damith Premasiri, Tharindu Ranasinghe
The spread of various forms of offensive speech online is an important concern in social media. While platforms have been investing heavily in ways of coping with this problem, the question of privacy remains largely unaddressed. Models trained to detect offensive language on social media are trained and/or fine-tuned using large amounts of data often stored in centralized servers. Since most social media data originates from end users, we propose a privacy-preserving decentralized architecture for identifying offensive language online by introducing Federated Learning (FL) in the context of offensive language identification. FL is a decentralized architecture that allows multiple models to be trained locally without the need for data sharing, hence preserving users' privacy. We propose a model fusion approach to perform FL. We trained multiple deep learning models on four publicly available English benchmark datasets (AHSD, HASOC, HateXplain, OLID) and evaluated their performance in detail. We also present initial cross-lingual experiments in English and Spanish. We show that the proposed model fusion approach outperforms baselines in all the datasets while preserving privacy.
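The model-fusion step can be sketched as a weighted per-parameter average in the spirit of FedAvg: each client trains locally and shares only parameters, never raw posts. The paper's exact fusion rule may differ; this is a minimal sketch.

```python
def fuse_models(client_weights, client_sizes):
    """Average per-parameter weights, weighting clients by dataset size."""
    total = sum(client_sizes)
    fused = {}
    for name in client_weights[0]:
        fused[name] = sum(
            w[name] * (n / total)
            for w, n in zip(client_weights, client_sizes)
        )
    return fused

# Two toy "clients", each holding a single scalar parameter.
clients = [{"w": 1.0}, {"w": 3.0}]
sizes = [100, 300]  # the second client holds 3x more data
print(fuse_models(clients, sizes))  # {'w': 2.5}
```

In a real deployment the dicts would be model state dicts of tensors, and the fused model would be redistributed to clients for the next training round.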
Performance Comparison of Turkish Language Models (Türkçe Dil Modellerinin Performans Karşılaştırması)
Eren Dogan, M. Egemen Uzun, Atahan Uz
et al.
The developments that language models have brought to almost every kind of task have attracted the attention not only of researchers but also of society at large, and have enabled these models to become products. Commercially successful language models are available. However, users may prefer open-source language models due to cost, data privacy, or regulations. Yet, despite the increasing number of these models, there is no comprehensive comparison of their performance for Turkish. This study aims to fill this gap in the literature. A comparison is made among seven selected language models based on their in-context learning and question-answering abilities. Turkish datasets for in-context learning and question answering were prepared, and both automatic and human evaluations were conducted. The results show that, for question answering, continuing pretraining before fine-tuning with instruction datasets is more successful at adapting multilingual models to Turkish, and that in-context learning performance is not strongly related to question-answering performance.
A Comprehensive Survey of Scientific Large Language Models and Their Applications in Scientific Discovery
Yu Zhang, Xiusi Chen, Bowen Jin
et al.
In many scientific fields, large language models (LLMs) have revolutionized the way text and other modalities of data (e.g., molecules and proteins) are handled, achieving superior performance in various applications and augmenting the scientific discovery process. Nevertheless, previous surveys on scientific LLMs often concentrate on one or two fields or a single modality. In this paper, we aim to provide a more holistic view of the research landscape by unveiling cross-field and cross-modal connections between scientific LLMs regarding their architectures and pre-training techniques. To this end, we comprehensively survey over 260 scientific LLMs, discuss their commonalities and differences, as well as summarize pre-training datasets and evaluation tasks for each field and modality. Moreover, we investigate how LLMs have been deployed to benefit scientific discovery. Resources related to this survey are available at https://github.com/yuzhimanhua/Awesome-Scientific-Language-Models.
Why We Build Local Large Language Models: An Observational Analysis from 35 Japanese and Multilingual LLMs
Koshiro Saito, Sakae Mizuki, Masanari Ohi
et al.
Why do we build local large language models (LLMs)? What should a local LLM learn from the target language? Which abilities can be transferred from other languages? Do language-specific scaling laws exist? To explore these research questions, we evaluated 35 Japanese, English, and multilingual LLMs on 19 evaluation benchmarks for Japanese and English, taking Japanese as a local language. Adopting an observational approach, we analyzed correlations of benchmark scores, and conducted principal component analysis (PCA) on the scores to derive "ability factors" of local LLMs. We found that training on English text can improve the scores of academic subjects in Japanese (JMMLU). In addition, it is unnecessary to specifically train on Japanese text to enhance abilities for solving Japanese code generation, arithmetic reasoning, commonsense, and reading comprehension tasks. In contrast, training on Japanese text could improve question-answering tasks about Japanese knowledge and English-Japanese translation, which indicates that abilities for solving these two tasks can be regarded as "Japanese abilities" for LLMs. Furthermore, we confirmed that the Japanese abilities scale with the computational budget for Japanese text.
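The observational analysis can be sketched as PCA on a (models x benchmarks) score matrix: standardize each benchmark column, decompose, and read off component loadings as candidate "ability factors". The scores below are random placeholders, not the paper's data.

```python
import numpy as np

rng = np.random.default_rng(0)
scores = rng.random((35, 19))  # 35 LLMs x 19 benchmarks (placeholder data)

# Standardize each benchmark so no single score scale dominates.
z = (scores - scores.mean(axis=0)) / scores.std(axis=0)

# PCA via SVD: singular values come back sorted in decreasing order.
u, s, vt = np.linalg.svd(z, full_matrices=False)

explained = s**2 / np.sum(s**2)  # fraction of variance per component
factors = z @ vt.T               # each model's coordinates on each factor
print(explained[:3], factors.shape)
```

Inspecting which benchmarks load heavily on each component is what lets one label a component as, say, a "Japanese ability" factor.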
Why do objects have many names? A study on word informativeness in language use and lexical systems
Eleonora Gualdoni, Gemma Boleda
Human lexicons contain many different words that speakers can use to refer to the same object, e.g., "purple" or "magenta" for the same shade of color. On the one hand, studies on language use have explored how speakers adapt their referring expressions to successfully communicate in context, without focusing on properties of the lexical system. On the other hand, studies in language evolution have discussed how competing pressures for informativeness and simplicity shape lexical systems, without tackling in-context communication. We aim at bridging the gap between these traditions, and explore why a soft mapping between referents and words is a good solution for communication, by taking into account both in-context communication and the structure of the lexicon. We propose a simple measure of informativeness for words and lexical systems, grounded in a visual space, and analyze color naming data for English and Mandarin Chinese. We conclude that optimal lexical systems are those where multiple words can apply to the same referent, conveying different amounts of information. Such systems allow speakers to maximize communication accuracy and minimize the amount of information they convey when communicating about referents in contexts.
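The informativeness idea can be illustrated with a toy surprisal-style measure: a word is informative to the extent that hearing it narrows down which referent is meant, i.e., reduces entropy over referents. The shades and naming probabilities below are invented for illustration, not the paper's measure or data.

```python
import math

def entropy(probs):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# P(referent | word): which shades each word picks out (invented numbers).
usage = {
    "magenta": {"shade_1": 0.9, "shade_2": 0.1},            # specific word
    "purple":  {"shade_1": 0.25, "shade_2": 0.25,
                "shade_3": 0.25, "shade_4": 0.25},          # general word
}

prior = entropy([0.25] * 4)  # 4 equally likely shades -> 2 bits of uncertainty
for word, dist in usage.items():
    info = prior - entropy(dist.values())
    print(f"{word}: {info:.2f} bits")
```

On this toy measure the specific word carries more information than the general one, and a lexicon offering both lets speakers choose how much information to convey about the same referent.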
Dedication by Lucius Mummius to Olympian Zeus
Guénette, Maxime
In the aftermath of the defeat of the Achaean League and the destruction of Corinth in 146 BCE, the Roman general Lucius Mummius Achaicus undertook a tour of Greece to reorganize the Greek territory now under Roman rule. Besides settling political disputes between cities, Mummius left in his wake numerous offerings and monuments in important cities, temples, and sanctuaries. Taking this inscription as a starting point, a dedication by Lucius Mummius of an equestrian statue to Olympian Zeus, we analyze the various media and communication strategies that Lucius Mummius employed to fix his victory and exploits in the collective memory of the Greeks.
Ancient history, Greek philology and language
Systematic Offensive Stereotyping (SOS) Bias in Language Models
Fatma Elsafoury
In this paper, we propose a new metric to measure Systematic Offensive Stereotyping (SOS) bias in language models (LMs). We then validate the SOS bias and investigate the effectiveness of removing it. Finally, we investigate the impact of the SOS bias in LMs on their performance and fairness in hate speech detection. Our results suggest that all the inspected LMs are SOS biased, and that the SOS bias is reflective of the online hate experienced by marginalized identities. The results indicate that applying debiasing methods from the literature worsens the SOS bias in LMs for some sensitive attributes and improves it for others. Finally, our results suggest that the SOS bias in the inspected LMs affects the fairness of their hate speech detection, though there is no strong evidence that it affects detection performance.
Factuality Challenges in the Era of Large Language Models
Isabelle Augenstein, Timothy Baldwin, Meeyoung Cha
et al.
The emergence of tools based on Large Language Models (LLMs), such as OpenAI's ChatGPT, Microsoft's Bing Chat, and Google's Bard, has garnered immense public attention. These incredibly useful, natural-sounding tools mark significant advances in natural language generation, yet they exhibit a propensity to generate false, erroneous, or misleading content -- commonly referred to as "hallucinations." Moreover, LLMs can be exploited for malicious applications, such as generating false but credible-sounding content and profiles at scale. This poses a significant challenge to society in terms of the potential deception of users and the increasing dissemination of inaccurate information. In light of these risks, we explore the kinds of technological innovations, regulatory reforms, and AI literacy initiatives needed from fact-checkers, news organizations, and the broader research and policy communities. By identifying the risks, the imminent threats, and some viable solutions, we seek to shed light on navigating various aspects of veracity in the era of generative AI.
Abstract Visual Reasoning Enabled by Language
Giacomo Camposampiero, Loic Houmard, Benjamin Estermann
et al.
While artificial intelligence (AI) models have achieved human or even superhuman performance in many well-defined applications, they still struggle to show signs of broad and flexible intelligence. The Abstraction and Reasoning Corpus (ARC), a visual intelligence benchmark introduced by François Chollet, aims to assess how close AI systems are to human-like cognitive abilities. Most current approaches rely on carefully handcrafted domain-specific program searches to brute-force solutions for the tasks present in ARC. In this work, we propose a general learning-based framework for solving ARC. It is centered on transforming tasks from the vision to the language domain. This composition of language and vision allows for pre-trained models to be leveraged at each stage, enabling a shift from handcrafted priors towards the learned priors of the models. While not yet beating state-of-the-art models on ARC, we demonstrate the potential of our approach, for instance, by solving some ARC tasks that have not been solved previously.