{"results":[{"id":"arxiv_2602.05150","title":"GreekMMLU: A Native-Sourced Multitask Benchmark for Evaluating Language Models in Greek","authors":[{"name":"Yang Zhang"},{"name":"Mersin Konomi"},{"name":"Christos Xypolopoulos"},{"name":"Konstantinos Divriotis"},{"name":"Konstantinos Skianis"},{"name":"Giannis Nikolentzos"},{"name":"Giorgos Stamou"},{"name":"Guokan Shang"},{"name":"Michalis Vazirgiannis"}],"abstract":"Large Language Models (LLMs) are commonly trained on multilingual corpora that include Greek, yet reliable evaluation benchmarks for Greek-particularly those based on authentic, native-sourced content-remain limited. Existing datasets are often machine-translated from English, failing to capture Greek linguistic and cultural characteristics. We introduce GreekMMLU, a native-sourced benchmark for massive multitask language understanding in Greek, comprising 21,805 multiple-choice questions across 45 subject areas, organized under a newly defined subject taxonomy and annotated with educational difficulty levels spanning primary to professional examinations. All questions are sourced or authored in Greek from academic, professional, and governmental exams. We publicly release 16,857 samples and reserve 4,948 samples for a private leaderboard to enable robust and contamination-resistant evaluation. Evaluations of over 80 open- and closed-source LLMs reveal substantial performance gaps between frontier and open-weight models, as well as between Greek-adapted models and general multilingual ones. Finally, we provide a systematic analysis of factors influencing performance-including model scale, adaptation, and prompting-and derive insights for improving LLM capabilities in Greek.","source":"arXiv","year":2026,"language":"en","subjects":["cs.CL"],"url":"https://arxiv.org/abs/2602.05150","pdf_url":"https://arxiv.org/pdf/2602.05150","is_open_access":true,"published_at":"2026-02-05T00:12:18Z","score":70},{"id":"arxiv_2501.12826","title":"Open or Closed LLM for Lesser-Resourced Languages? Lessons from Greek","authors":[{"name":"John Pavlopoulos"},{"name":"Juli Bakagianni"},{"name":"Kanella Pouli"},{"name":"Maria Gavriilidou"}],"abstract":"Natural Language Processing (NLP) for lesser-resourced languages faces persistent challenges, including limited datasets, inherited biases from high-resource languages, and the need for domain-specific solutions. This study addresses these gaps for Modern Greek through three key contributions. First, we evaluate the performance of open-source (Llama-70b) and closed-source (GPT-4o mini) large language models (LLMs) on seven core NLP tasks with dataset availability, revealing task-specific strengths, weaknesses, and parity in their performance. Second, we expand the scope of Greek NLP by reframing Authorship Attribution as a tool to assess potential data usage by LLMs in pre-training, with high 0-shot accuracy suggesting ethical implications for data provenance. Third, we showcase a legal NLP case study, where a Summarize, Translate, and Embed (STE) methodology outperforms the traditional TF-IDF approach for clustering \\emph{long} legal texts. 
Together, these contributions provide a roadmap to advance NLP in lesser-resourced languages, bridging gaps in model evaluation, task innovation, and real-world impact.","source":"arXiv","year":2025,"language":"en","subjects":["cs.CL","cs.AI","cs.LG"],"url":"https://arxiv.org/abs/2501.12826","pdf_url":"https://arxiv.org/pdf/2501.12826","is_open_access":true,"published_at":"2025-01-22T12:06:16Z","score":69},{"id":"arxiv_2503.21676","title":"How do language models learn facts? Dynamics, curricula and hallucinations","authors":[{"name":"Nicolas Zucchet"},{"name":"Jörg Bornschein"},{"name":"Stephanie Chan"},{"name":"Andrew Lampinen"},{"name":"Razvan Pascanu"},{"name":"Soham De"}],"abstract":"Large language models accumulate vast knowledge during pre-training, yet the dynamics governing this acquisition remain poorly understood. This work investigates the learning dynamics of language models on a synthetic factual recall task, uncovering three key findings: First, language models learn in three phases, exhibiting a performance plateau before acquiring precise factual knowledge. Mechanistically, this plateau coincides with the formation of attention-based circuits that support recall. Second, the training data distribution significantly impacts learning dynamics, as imbalanced distributions lead to shorter plateaus. Finally, hallucinations emerge simultaneously with knowledge, and integrating new knowledge into the model through fine-tuning is challenging, as it quickly corrupts its existing parametric memories. Our results emphasize the importance of data distribution in knowledge acquisition and suggest novel data scheduling strategies to accelerate neural network training.","source":"arXiv","year":2025,"language":"en","subjects":["cs.CL","cs.LG"],"url":"https://arxiv.org/abs/2503.21676","pdf_url":"https://arxiv.org/pdf/2503.21676","is_open_access":true,"published_at":"2025-03-27T16:43:45Z","score":69},{"id":"arxiv_2505.13772","title":"Krikri: Advancing Open Large Language Models for Greek","authors":[{"name":"Dimitris Roussis"},{"name":"Leon Voukoutis"},{"name":"Georgios Paraskevopoulos"},{"name":"Sokratis Sofianopoulos"},{"name":"Prokopis Prokopidis"},{"name":"Vassilis Papavasileiou"},{"name":"Athanasios Katsamanis"},{"name":"Stelios Piperidis"},{"name":"Vassilis Katsouros"}],"abstract":"We introduce Llama-Krikri-8B, a cutting-edge Large Language Model tailored for the Greek language, built on Meta's Llama 3.1-8B. Llama-Krikri-8B has been extensively trained on high-quality Greek data to ensure superior adaptation to linguistic nuances. With 8 billion parameters, it offers advanced capabilities while maintaining efficient computational performance. Llama-Krikri-8B supports both Modern Greek and English, and is also equipped to handle polytonic text and Ancient Greek. The chat version of Llama-Krikri-8B features a multi-stage post-training pipeline, utilizing both human and synthetic instruction and preference data, by applying techniques such as MAGPIE. In addition, for evaluation, we propose three novel public benchmarks for Greek. 
Our evaluation on existing as well as the proposed benchmarks shows notable improvements over comparable Greek and multilingual LLMs in natural language understanding and generation, as well as in code generation.","source":"arXiv","year":2025,"language":"en","subjects":["cs.CL"],"url":"https://arxiv.org/abs/2505.13772","pdf_url":"https://arxiv.org/pdf/2505.13772","is_open_access":true,"published_at":"2025-05-19T23:18:27Z","score":69},{"id":"arxiv_2502.18772","title":"Plutus: Benchmarking Large Language Models in Low-Resource Greek Finance","authors":[{"name":"Xueqing Peng"},{"name":"Triantafillos Papadopoulos"},{"name":"Efstathia Soufleri"},{"name":"Polydoros Giannouris"},{"name":"Ruoyu Xiang"},{"name":"Yan Wang"},{"name":"Lingfei Qian"},{"name":"Jimin Huang"},{"name":"Qianqian Xie"},{"name":"Sophia Ananiadou"}],"abstract":"Despite Greece's pivotal role in the global economy, large language models (LLMs) remain underexplored for the Greek financial context due to the linguistic complexity of Greek and the scarcity of domain-specific datasets. Previous efforts in multilingual financial natural language processing (NLP) have exposed considerable performance disparities, yet no dedicated Greek financial benchmarks or Greek-specific financial LLMs have been developed until now. To bridge this gap, we introduce Plutus-ben, the first Greek Financial Evaluation Benchmark, and Plutus-8B, the pioneering Greek Financial LLM, fine-tuned with Greek domain-specific data. Plutus-ben addresses five core financial NLP tasks in Greek: numeric and textual named entity recognition, question answering, abstractive summarization, and topic classification, thereby facilitating systematic and reproducible LLM assessments. To underpin these tasks, we present three novel, high-quality Greek financial datasets, thoroughly annotated by expert native Greek speakers, augmented by two existing resources. Our comprehensive evaluation of 22 LLMs on Plutus-ben reveals that Greek financial NLP remains challenging due to linguistic complexity, domain-specific terminology, and financial reasoning gaps. These findings underscore the limitations of cross-lingual transfer, the necessity for financial expertise in Greek-trained models, and the challenges of adapting financial LLMs to Greek text. We release Plutus-ben, Plutus-8B, and all associated datasets publicly to promote reproducible research and advance Greek financial NLP, fostering broader multilingual inclusivity in finance.","source":"arXiv","year":2025,"language":"en","subjects":["cs.CL"],"url":"https://arxiv.org/abs/2502.18772","pdf_url":"https://arxiv.org/pdf/2502.18772","is_open_access":true,"published_at":"2025-02-26T03:04:01Z","score":69},{"id":"arxiv_2412.08520","title":"GR-NLP-TOOLKIT: An Open-Source NLP Toolkit for Modern Greek","authors":[{"name":"Lefteris Loukas"},{"name":"Nikolaos Smyrnioudis"},{"name":"Chrysa Dikonomaki"},{"name":"Spyros Barbakos"},{"name":"Anastasios Toumazatos"},{"name":"John Koutsikakis"},{"name":"Manolis Kyriakakis"},{"name":"Mary Georgiou"},{"name":"Stavros Vassos"},{"name":"John Pavlopoulos"},{"name":"Ion Androutsopoulos"}],"abstract":"We present GR-NLP-TOOLKIT, an open-source natural language processing (NLP) toolkit developed specifically for modern Greek. The toolkit provides state-of-the-art performance in five core NLP tasks, namely part-of-speech tagging, morphological tagging, dependency parsing, named entity recognition, and Greeklish-to-Greek transliteration. 
The toolkit is based on pre-trained Transformers; it is freely available and can be easily installed in Python (pip install gr-nlp-toolkit). It is also accessible through a demonstration platform on HuggingFace, along with a publicly available API for non-commercial use. We discuss the functionality provided for each task, the underlying methods, experiments against comparable open-source toolkits, and possible future enhancements. The toolkit is available at: https://github.com/nlpaueb/gr-nlp-toolkit","source":"arXiv","year":2024,"language":"en","subjects":["cs.CL","cs.AI","cs.SE"],"url":"https://arxiv.org/abs/2412.08520","pdf_url":"https://arxiv.org/pdf/2412.08520","is_open_access":true,"published_at":"2024-12-11T16:34:23Z","score":68},{"id":"arxiv_2408.10962","title":"NLP for The Greek Language: A Longer Survey","authors":[{"name":"Katerina Papantoniou"},{"name":"Yannis Tzitzikas"}],"abstract":"The English language is in the spotlight of the Natural Language Processing (NLP) community, with other languages, like Greek, lagging behind in terms of available methods, tools, and resources. Due to the increasing interest in NLP, in this paper we try to condense research efforts for the automatic processing of the Greek language over the last three decades. In particular, we list and briefly discuss related works, resources and tools, categorized according to various processing layers and contexts. We are not restricted to the modern form of the Greek language but also cover Ancient Greek and various Greek dialects. This survey can be useful for researchers and students interested in NLP tasks, Information Retrieval and Knowledge Management for the Greek language.","source":"arXiv","year":2024,"language":"en","subjects":["cs.CL"],"url":"https://arxiv.org/abs/2408.10962","pdf_url":"https://arxiv.org/pdf/2408.10962","is_open_access":true,"published_at":"2024-08-20T15:57:18Z","score":68},{"id":"arxiv_2407.20743","title":"Meltemi: The first open Large Language Model for Greek","authors":[{"name":"Leon Voukoutis"},{"name":"Dimitris Roussis"},{"name":"Georgios Paraskevopoulos"},{"name":"Sokratis Sofianopoulos"},{"name":"Prokopis Prokopidis"},{"name":"Vassilis Papavasileiou"},{"name":"Athanasios Katsamanis"},{"name":"Stelios Piperidis"},{"name":"Vassilis Katsouros"}],"abstract":"We describe the development and capabilities of Meltemi 7B, the first open Large Language Model for the Greek language. Meltemi 7B has 7 billion parameters and is trained on a 40 billion token Greek corpus. For the development of Meltemi 7B, we adapt Mistral by continual pretraining on this Greek corpus. Meltemi 7B contains up-to-date information up to September 2023. Furthermore, we have translated and curated a Greek instruction corpus, which has been used for the instruction-tuning of a chat model, named Meltemi 7B Instruct. Special care has been given to the alignment and the removal of toxic content for Meltemi 7B Instruct. The developed models are evaluated on a broad set of collected evaluation corpora, and examples of prompts and responses are presented. 
Both Meltemi 7B and Meltemi 7B Instruct are available at https://huggingface.co/ilsp under the Apache 2.0 license.","source":"arXiv","year":2024,"language":"en","subjects":["cs.CL"],"url":"https://arxiv.org/abs/2407.20743","pdf_url":"https://arxiv.org/pdf/2407.20743","is_open_access":true,"published_at":"2024-07-30T11:22:52Z","score":68},{"id":"arxiv_2409.02228","title":"Unforgettable Generalization in Language Models","authors":[{"name":"Eric Zhang"},{"name":"Leshem Choshen"},{"name":"Jacob Andreas"}],"abstract":"When language models (LMs) are trained to forget (or \"unlearn\") a skill, how precisely does their behavior change? We study the behavior of transformer LMs in which tasks have been forgotten via fine-tuning on randomized labels. Such LMs learn to generate near-random predictions for individual examples in the \"training\" set used for forgetting. Across tasks, however, LMs exhibit extreme variability in whether LM predictions change on examples outside the training set. In some tasks (like entailment classification), forgetting generalizes robustly, and causes models to produce uninformative predictions on new task instances; in other tasks (like physical commonsense reasoning and scientific question answering) forgetting affects only the training examples, and models continue to perform the \"forgotten\" task accurately even for examples very similar to those that appeared in the training set. Dataset difficulty is not predictive of whether a behavior can be forgotten; instead, generalization in forgetting is (weakly) predicted by the confidence of LMs' initial task predictions and the variability of LM representations of training data, with low confidence and low variability both associated with greater generalization. Perhaps most surprisingly, random-label forgetting appears to be somewhat insensitive to the contents of the training set: for example, models trained on science questions with random labels continue to answer other science questions accurately, but begin to produce random labels on entailment classification tasks. Finally, we show that even generalizable forgetting is shallow: linear probes trained on LMs' representations can still perform tasks reliably after forgetting. Our results highlight the difficulty and unpredictability of performing targeted skill removal from models via fine-tuning.","source":"arXiv","year":2024,"language":"en","subjects":["cs.LG","cs.CL"],"url":"https://arxiv.org/abs/2409.02228","pdf_url":"https://arxiv.org/pdf/2409.02228","is_open_access":true,"published_at":"2024-09-03T18:55:54Z","score":68},{"id":"arxiv_2407.09861","title":"A Systematic Survey of Natural Language Processing for the Greek Language","authors":[{"name":"Juli Bakagianni"},{"name":"Kanella Pouli"},{"name":"Maria Gavriilidou"},{"name":"John Pavlopoulos"}],"abstract":"Comprehensive monolingual Natural Language Processing (NLP) surveys are essential for assessing language-specific challenges, resource availability, and research gaps. However, existing surveys often lack standardized methodologies, leading to selection bias and fragmented coverage of NLP tasks and resources. This study introduces a generalizable framework for systematic monolingual NLP surveys. Our approach integrates a structured search protocol to minimize bias, an NLP task taxonomy for classification, and language resource taxonomies to identify potential benchmarks and highlight opportunities for improving resource availability. 
We apply this framework to Greek NLP (2012-2023), providing an in-depth analysis of its current state, task-specific progress, and resource gaps. The survey results are publicly available (https://doi.org/10.5281/zenodo.15314882) and are regularly updated to provide an evergreen resource. This systematic survey of Greek NLP serves as a case study, demonstrating the effectiveness of our framework and its potential for broader application to other lesser-resourced languages with respect to NLP.","source":"arXiv","year":2024,"language":"en","subjects":["cs.CL","cs.AI"],"url":"https://arxiv.org/abs/2407.09861","pdf_url":"https://arxiv.org/pdf/2407.09861","is_open_access":true,"published_at":"2024-07-13T12:01:52Z","score":68},{"id":"doaj_10.11576/lgnrw-6468","title":"Mit antiken Texten politisch denken und urteilen lernen","authors":[{"name":"Jochen Sauer"}],"abstract":"","source":"DOAJ","year":2023,"language":"","subjects":["Greek language and literature. Latin language and literature","Philology. Linguistics"],"doi":"10.11576/lgnrw-6468","url":"https://www.biejournals.de/index.php/lgnrw/article/view/6468","is_open_access":true,"published_at":"","score":67},{"id":"doaj_10.11576/lgnrw-6481","title":"Impressum","authors":[{"name":"Susanne Aretz"}],"abstract":"","source":"DOAJ","year":2023,"language":"","subjects":["Greek language and literature. Latin language and literature","Philology. Linguistics"],"doi":"10.11576/lgnrw-6481","url":"https://www.biejournals.de/index.php/lgnrw/article/view/6481","is_open_access":true,"published_at":"","score":67},{"id":"doaj_10.24215/23468890e088","title":"Estefanía, D., Eneida. Virgilio, Bahía Blanca, Editorial de la Universidad Nacional del Sur, 2023, 478 pp., ISBN 978-987-655-330-8","authors":[{"name":"María Emilia Cairo"}],"abstract":"\nReview of the book Eneida. Virgilio by Estefanía, D.\n","source":"DOAJ","year":2023,"language":"","subjects":["Philology. Linguistics","Greek language and literature. Latin language and literature"],"doi":"10.24215/23468890e088","url":"https://www.auster.fahce.unlp.edu.ar/article/view/14214","is_open_access":true,"published_at":"","score":67},{"id":"arxiv_2305.01099","title":"Logion: Machine Learning for Greek Philology","authors":[{"name":"Charlie Cowen-Breen"},{"name":"Creston Brooks"},{"name":"Johannes Haubold"},{"name":"Barbara Graziosi"}],"abstract":"This paper presents machine-learning methods to address various problems in Greek philology. After training a BERT model on the largest premodern Greek dataset used for this purpose to date, we identify and correct previously undetected errors made by scribes in the process of textual transmission, in what is, to our knowledge, the first successful identification of such errors via machine learning. Additionally, we demonstrate the model's capacity to fill gaps caused by material deterioration of premodern manuscripts and compare the model's performance to that of a domain expert. We find that best performance is achieved when the domain expert is provided with model suggestions for inspiration. 
With such human-computer collaborations in mind, we explore the model's interpretability and find that certain attention heads appear to encode select grammatical features of premodern Greek.","source":"arXiv","year":2023,"language":"en","subjects":["cs.CL","cs.LG"],"url":"https://arxiv.org/abs/2305.01099","pdf_url":"https://arxiv.org/pdf/2305.01099","is_open_access":true,"published_at":"2023-05-01T21:56:25Z","score":67},{"id":"arxiv_2305.13698","title":"Exploring Large Language Models for Classical Philology","authors":[{"name":"Frederick Riemenschneider"},{"name":"Anette Frank"}],"abstract":"Recent advances in NLP have led to the creation of powerful language models for many languages including Ancient Greek and Latin. While prior work on Classical languages unanimously uses BERT, in this work we create four language models for Ancient Greek that vary along two dimensions to study their versatility for tasks of interest for Classical languages: we explore (i) encoder-only and encoder-decoder architectures using RoBERTa and T5 as strong model types, and create for each of them (ii) a monolingual Ancient Greek and a multilingual instance that includes Latin and English. We evaluate all models on morphological and syntactic tasks, including lemmatization, which demonstrates the added value of T5's decoding abilities. We further define two probing tasks to investigate the knowledge acquired by models pre-trained on Classical texts. Our experiments provide the first benchmarking analysis of existing models of Ancient Greek. Results show that our models provide significant improvements over the SoTA. The systematic analysis of model types can inform future research in designing language models for Classical languages, including the development of novel generative tasks. We make all our models available as community resources, along with a large curated pre-training corpus for Ancient Greek, to support the creation of a larger, comparable model zoo for Classical Philology. Our models and resources are available at https://github.com/Heidelberg-NLP/ancient-language-models.","source":"arXiv","year":2023,"language":"en","subjects":["cs.CL"],"url":"https://arxiv.org/abs/2305.13698","pdf_url":"https://arxiv.org/pdf/2305.13698","is_open_access":true,"published_at":"2023-05-23T05:21:02Z","score":67},{"id":"arxiv_2308.12008","title":"Graecia capta ferum victorem cepit. Detecting Latin Allusions to Ancient Greek Literature","authors":[{"name":"Frederick Riemenschneider"},{"name":"Anette Frank"}],"abstract":"Intertextual allusions hold a pivotal role in Classical Philology, with Latin authors frequently referencing Ancient Greek texts. Until now, the automatic identification of these intertextual references has been constrained to monolingual approaches, seeking parallels solely within Latin or Greek texts. In this study, we introduce SPhilBERTa, a trilingual Sentence-RoBERTa model tailored for Classical Philology, which excels at cross-lingual semantic comprehension and identification of identical sentences across Ancient Greek, Latin, and English. We generate new training data by automatically translating English texts into Ancient Greek. Further, we present a case study, demonstrating SPhilBERTa's capability to facilitate automated detection of intertextual parallels. 
Our models and resources are available at https://github.com/Heidelberg-NLP/ancient-language-models.","source":"arXiv","year":2023,"language":"en","subjects":["cs.CL"],"url":"https://arxiv.org/abs/2308.12008","pdf_url":"https://arxiv.org/pdf/2308.12008","is_open_access":true,"published_at":"2023-08-23T08:54:05Z","score":67},{"id":"doaj_10.11576/lgnrw-5355","title":"Damit wir glücklich sind - Jason und Medea im 2. Epeisodion der Euripideischen Medea","authors":[{"name":"Susanne Aretz"}],"abstract":"","source":"DOAJ","year":2022,"language":"","subjects":["Greek language and literature. Latin language and literature","Philology. Linguistics"],"doi":"10.11576/lgnrw-5355","url":"https://www.biejournals.de/index.php/lgnrw/article/view/5355","is_open_access":true,"published_at":"","score":66},{"id":"doaj_10.13135/2532-5353/6867","title":"Frontespizio e Sommario","authors":[{"name":"Ermanno Malaspina"}],"abstract":"\nTable of contents of the issue\n","source":"DOAJ","year":2022,"language":"","subjects":["Philology. Linguistics","Greek language and literature. Latin language and literature"],"doi":"10.13135/2532-5353/6867","url":"https://www.ojs.unito.it/index.php/COL/article/view/6867","is_open_access":true,"published_at":"","score":66},{"id":"doaj_Impressum","title":"Impressum","authors":[{"name":"Susanne Aretz"}],"abstract":"","source":"DOAJ","year":2022,"language":"","subjects":["Greek language and literature. Latin language and literature","Philology. Linguistics"],"url":"https://www.biejournals.de/index.php/lgnrw/article/view/5598","is_open_access":true,"published_at":"","score":66},{"id":"arxiv_2210.10623","title":"Names from Greek Myth in Fundamental Physics","authors":[{"name":"Nirmal Raj"}],"abstract":"Greek mythology supplies fundamental physics with the names of numerous (100+) experiments, machines, codes, and phenomena. I present the central narrative of Greek mythos via these names. Hyperlinks are provided for their physics counterparts, and the names are collected in myth- and physics-themed indices.","source":"arXiv","year":2022,"language":"en","subjects":["physics.pop-ph","hep-ex","hep-ph","physics.ed-ph"],"url":"https://arxiv.org/abs/2210.10623","pdf_url":"https://arxiv.org/pdf/2210.10623","is_open_access":true,"published_at":"2022-10-11T17:36:08Z","score":66}],"total":1455819,"page":1,"page_size":20,"sources":["DOAJ","arXiv","CrossRef"],"query":"Greek philology and language"}
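The payload above follows a conventional paginated search-API shape: a "results" array of records (id, title, authors, abstract, source, year, score, plus optional doi, pdf_url, published_at), alongside "total", "page", "page_size", "sources", and the original "query". Below is a minimal Python sketch of how a response of this shape could be consumed; the file name greek_search.json is hypothetical, and the sketch assumes the payload is stored as one valid JSON document (the hard line wraps inside the abstracts above are display artifacts and would break json.load if saved verbatim).

import json

# Load the response; "greek_search.json" is a hypothetical local copy.
with open("greek_search.json", encoding="utf-8") as f:
    payload = json.load(f)

results = payload["results"]

# Keep open-access arXiv hits and order them by relevance score, then year.
arxiv_hits = [
    r for r in results
    if r["source"] == "arXiv" and r.get("is_open_access")
]
arxiv_hits.sort(key=lambda r: (r["score"], r.get("year", 0)), reverse=True)

for r in arxiv_hits[:5]:
    names = ", ".join(a["name"] for a in r["authors"])
    print(f'{r["score"]:>3}  {r["title"]} ({r.get("year", "n.d.")}) - {names}')

# "total" counts all matches across pages, so the page count is
# ceil(total / page_size); with total=1455819 and page_size=20 this
# response is only the first of many pages.
pages = -(-payload["total"] // payload["page_size"])  # ceiling division
print(f'page {payload["page"]} of {pages} for query {payload["query"]!r}')

Sorting on the record's own "score" field keeps the ranking logic independent of any server-side ordering, which is useful once results from several sources (DOAJ, arXiv, CrossRef) are merged into one list, as they are here.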