Results for "Greek language and literature. Latin language and literature"

Showing 20 of ~2,863,424 results · from DOAJ, CrossRef, Semantic Scholar, arXiv

DOAJ Open Access 2026
I Am at the Service of Eros: A Linguistic-Literary Analysis of Alciphr. I 21

Ismael El Bahraoui Pérez

This study offers a linguistic-literary analysis of the twenty-first letter of the first book of Alciphron's epistolary corpus. The letter is structured around an ekphrasis, around which the letter-writer, as a πεπαιδευμένος, deploys a variety of devices drawn from the rhetorical handbooks (Robb, 1994, pp. 10-25), with particular attention to the use of narrative and of the commonplace as distinctive elements. The epistle, addressed by Euplous to Thalasseros, describes how the latter squandered all his possessions after falling under the spell of a lyre-playing musician; the narrative frame is accordingly marked by the strong presence of the topos of the shipwreck of love, specifically its variant of shipwreck on dry land, which is very frequent in books V and XII of the Palatine Anthology. This examination allows us to offer an overview of those passages of erotic epigrammatic poetry that relate to our letter from the standpoint of amatory topoi.

Greek language and literature. Latin language and literature
DOAJ Open Access 2025
Two Weddings and a Funeral

Michael Edward Stewart

The marriage of Germanus, nephew of Emperor Justin I (r. 518–527), to Matasuintha, former Gothic queen and granddaughter of Theoderic the Great (r. 475–526), in late 549 or early 550, was a significant yet often overlooked moment in the later stages of the Gothic War. Scholars generally interpret the marriage as a pragmatic alliance shaped by immediate strategic concerns – either a political manoeuvre by Justinian or a personal initiative by Germanus following his appointment as commander in Italy. This article revisits that assumption by exploring three related questions. First, did the marriage and military appointment signal a reconciliation between Justinian and Germanus, or a calculated attempt by the emperor to stabilize a deteriorating political situation? Second, how did their relationship evolve in the years leading up to the union, particularly after Theodora’s death in 548? Finally, more speculatively, was Germanus’ earlier decision to marry his daughter to the general John in 545 connected to his own dynastic ambitions?

Ancient history, Greek language and literature. Latin language and literature
arXiv Open Access 2025
Open or Closed LLM for Lesser-Resourced Languages? Lessons from Greek

John Pavlopoulos, Juli Bakagianni, Kanella Pouli et al.

Natural Language Processing (NLP) for lesser-resourced languages faces persistent challenges, including limited datasets, inherited biases from high-resource languages, and the need for domain-specific solutions. This study addresses these gaps for Modern Greek through three key contributions. First, we evaluate the performance of open-source (Llama-70b) and closed-source (GPT-4o mini) large language models (LLMs) on seven core NLP tasks for which datasets are available, revealing task-specific strengths, weaknesses, and parity in their performance. Second, we expand the scope of Greek NLP by reframing Authorship Attribution as a tool to assess potential data usage by LLMs in pre-training, with high 0-shot accuracy suggesting ethical implications for data provenance. Third, we showcase a legal NLP case study, where a Summarize, Translate, and Embed (STE) methodology outperforms the traditional TF-IDF approach for clustering long legal texts. Together, these contributions provide a roadmap to advance NLP in lesser-resourced languages, bridging gaps in model evaluation, task innovation, and real-world impact.
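The TF-IDF baseline that STE is compared against can be sketched in a few lines. This is a generic illustration of TF-IDF weighting plus cosine similarity over tokenised documents, not the authors' pipeline; all data here is invented for the example:

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Compute sparse TF-IDF weight dicts for a list of tokenised documents."""
    n = len(docs)
    df = Counter()                     # document frequency per term
    for doc in docs:
        df.update(set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vec = {t: (tf[t] / len(doc)) * math.log(n / df[t]) for t in tf}
        vectors.append(vec)
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0
```

Clustering long legal texts over such sparse vectors is the baseline the abstract reports STE outperforming.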

en cs.CL, cs.AI
arXiv Open Access 2025
How do language models learn facts? Dynamics, curricula and hallucinations

Nicolas Zucchet, Jörg Bornschein, Stephanie Chan et al.

Large language models accumulate vast knowledge during pre-training, yet the dynamics governing this acquisition remain poorly understood. This work investigates the learning dynamics of language models on a synthetic factual recall task, uncovering three key findings: First, language models learn in three phases, exhibiting a performance plateau before acquiring precise factual knowledge. Mechanistically, this plateau coincides with the formation of attention-based circuits that support recall. Second, the training data distribution significantly impacts learning dynamics, as imbalanced distributions lead to shorter plateaus. Finally, hallucinations emerge simultaneously with knowledge, and integrating new knowledge into the model through fine-tuning is challenging, as it quickly corrupts its existing parametric memories. Our results emphasize the importance of data distribution in knowledge acquisition and suggest novel data scheduling strategies to accelerate neural network training.

en cs.CL, cs.LG
arXiv Open Access 2025
Soft Inductive Bias Approach via Explicit Reasoning Perspectives in Inappropriate Utterance Detection Using Large Language Models

Ju-Young Kim, Ji-Hong Park, Se-Yeon Lee et al.

Recent incidents in certain online games and communities, where anonymity is guaranteed, show that unchecked inappropriate remarks frequently escalate into verbal abuse and even criminal behavior, raising significant social concerns. Consequently, there is a growing need for research on techniques that can detect inappropriate utterances within conversational texts to help build a safer communication environment. Although large-scale language models trained on Korean corpora and chain-of-thought reasoning have recently gained attention, research applying these approaches to inappropriate utterance detection remains limited. In this study, we propose a soft inductive bias approach that explicitly defines reasoning perspectives to guide the inference process, thereby promoting rational decision-making and preventing errors that may arise during reasoning. We fine-tune a Korean large language model using the proposed method and conduct both quantitative performance comparisons and qualitative evaluations across different training strategies. Experimental results show that the Kanana-1.5 model achieves an average accuracy of 87.0046, improving by approximately 3.89 percent over standard supervised learning. These findings indicate that the proposed method goes beyond simple knowledge imitation by large language models and enables more precise and consistent judgments through constrained reasoning perspectives, demonstrating its effectiveness for inappropriate utterance detection.

en cs.CL
S2 Open Access 2024
FREQUENCY OF LATIN PHRASEOLOGICAL UNITS IN THE PERSEUS DIGITAL LIBRARY GREEK AND ROMAN (LATIN) CORPUS AND THE CROATIAN LANGUAGE LIBRARY: TEST PROBE A

Matija Zorić

Education systems worldwide use a large number of Latin sayings and proverbs popularly called Dicta et sententiae (hereafter: D&S), which are taught unsystematically, without sub-categorization. Usually, their importance to the educational area of a particular school is taken into account. This may suit law schools to some extent due to the formulaic nature of legal discourse, while the choice of sayings in high schools and in medical, philosophical, and social-humanistic schools is generally left to the subjective choice of the teacher. The largest Croatian collection of Latin D&S, Latinum in aeternum (LIA; Marević, 2002), contains 18,632 references, and in secondary schools, 100–200 D&S are taught over two to four years of learning Latin. The catalog of the Croatian state graduation exams in the Latin language (2016–2021) contains fewer than 200 sayings. It is therefore imperative to determine the frequency of Latin D&S in the largest possible corpus of Roman (and later Latin) literature and to sub-categorize D&S, thus providing school systems around the world with a much-needed catalog based on D&S frequency as a tool for compiling teaching and examination catalogs. Manual entry of sayings into the corpus search engine and the extensiveness of the LIA collection require too much effort and time for a single scientific paper. Therefore, the corpus representation of a test probe of 1270 D&S from the LIA collection starting with the letter A is investigated here in the Perseus Digital Library Greek and Roman Materials corpus, and also in the corpus of the Croatian language repository (HJR) of the Institute for the Croatian Language and Linguistics. The search results are useful for teaching Latin worldwide and for teaching Latin and Croatian in Croatian schools.
Keywords: frequency of Latin sayings, Latin phraseological units, dicta et sententiae, Latin phraseology, Latin in Croatian
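The core measurement behind such a frequency catalog, counting how often each saying occurs across corpus texts, can be sketched as a case-folded exact-substring count. This is a toy illustration, not the Perseus or HJR search interface, and the sayings and texts are invented stand-ins for LIA data:

```python
def dicta_frequency(sayings, texts):
    """Count case-insensitive occurrences of each saying across all texts."""
    freq = {}
    for saying in sayings:
        needle = saying.lower()
        freq[saying] = sum(text.lower().count(needle) for text in texts)
    return freq
```

Sorting the resulting dictionary by count would yield the kind of frequency-ranked catalog the paper argues schools need.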

S2 Open Access 2024
On some ancient Greek and Latin medical recipes in verse. Their position in the world. Part A

Athanasios Diamandopoulos

In Part A of our article, we examine medical recipes in verse within Greek literature, spanning from the Hellenistic period to the Roman Imperial era. We also briefly touch upon analogous recipes in Classical and Late Latin, as these two literary forms were intertwined for centuries. A comprehensive analysis of Latin literature in this domain remains a necessity. We explore the motivations behind these didactic poems and the metrical patterns employed in their composition. The article presents fragments of these recipes, all translated into English and several retained in their original language as well, arranged chronologically alongside succinct biographical details of their authors. These include Homer, considered their distant forebear, followed by Ovid, Aglaias of Byzantium, Andromachus the Elder, Philo of Tarsus, Damocrates, Nicander, Rufus of Ephesus, Eudemus of Pergamum, Galen, Serenus Sammonicus, The Carmen graecum de herbis, and Marcellus Empiricus. In Part B, we will continue to explore similar recipes from the Middle and Late Byzantine periods. This section will also feature examples from Medieval Latin and Islamic medical literature, illustrating the intercultural context in which these Greek verses stood. A General Discussion and Conclusions will be provided at the end of Part B.

arXiv Open Access 2024
Unforgettable Generalization in Language Models

Eric Zhang, Leshem Choshen, Jacob Andreas

When language models (LMs) are trained to forget (or "unlearn") a skill, how precisely does their behavior change? We study the behavior of transformer LMs in which tasks have been forgotten via fine-tuning on randomized labels. Such LMs learn to generate near-random predictions for individual examples in the "training" set used for forgetting. Across tasks, however, LMs exhibit extreme variability in whether LM predictions change on examples outside the training set. In some tasks (like entailment classification), forgetting generalizes robustly, and causes models to produce uninformative predictions on new task instances; in other tasks (like physical commonsense reasoning and scientific question answering) forgetting affects only the training examples, and models continue to perform the "forgotten" task accurately even for examples very similar to those that appeared in the training set. Dataset difficulty is not predictive of whether a behavior can be forgotten; instead, generalization in forgetting is (weakly) predicted by the confidence of LMs' initial task predictions and the variability of LM representations of training data, with low confidence and low variability both associated with greater generalization. Perhaps most surprisingly, random-label forgetting appears to be somewhat insensitive to the contents of the training set: for example, models trained on science questions with random labels continue to answer other science questions accurately, but begin to produce random labels on entailment classification tasks. Finally, we show that even generalizable forgetting is shallow: linear probes trained on LMs' representations can still perform tasks reliably after forgetting. Our results highlight the difficulty and unpredictability of performing targeted skill removal from models via fine-tuning.

en cs.LG, cs.CL
arXiv Open Access 2024
Automated Literature Review Using NLP Techniques and LLM-Based Retrieval-Augmented Generation

Nurshat Fateh Ali, Md. Mahdi Mohtasim, Shakil Mosharrof et al.

This research presents and compares multiple approaches to automating the generation of literature reviews using several Natural Language Processing (NLP) techniques and retrieval-augmented generation (RAG) with a Large Language Model (LLM). The ever-increasing number of research articles poses a huge challenge for manual literature review and has resulted in increased demand for automation. The primary objective of this research is to develop a system capable of automatically generating literature reviews from only PDF files as input. The effectiveness of several NLP strategies, such as the frequency-based method (spaCy), the transformer model (Simple T5), and retrieval-augmented generation (RAG) with a Large Language Model (GPT-3.5-turbo), is evaluated to meet this objective. The SciTLDR dataset is chosen for the experiment, and three distinct techniques are used to implement three systems for auto-generating literature reviews. ROUGE scores are used to evaluate all three systems. Based on the evaluation, the Large Language Model GPT-3.5-turbo achieved the highest ROUGE-1 score, 0.364. The transformer model comes second and spaCy last. Finally, a graphical user interface is created for the best system, the one based on the large language model.
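The ROUGE-1 score used to rank the three systems is, at its core, unigram-overlap F1 between a candidate summary and a reference. A minimal sketch follows; real evaluations should use a reference implementation such as the `rouge-score` package, which also handles stemming and other variants:

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between a candidate and a reference summary."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())   # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```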

en cs.CL, cs.AI
arXiv Open Access 2024
Are Compressed Language Models Less Subgroup Robust?

Leonidas Gee, Andrea Zugarini, Novi Quadrianto

To reduce the inference cost of large language models, model compression is increasingly used to create smaller scalable models. However, little is known about their robustness to minority subgroups defined by the labels and attributes of a dataset. In this paper, we investigate the effects of 18 different compression methods and settings on the subgroup robustness of BERT language models. We show that worst-group performance does not depend on model size alone, but also on the compression method used. Additionally, we find that model compression does not always worsen the performance on minority subgroups. Altogether, our analysis serves to further research into the subgroup robustness of model compression.

en cs.LG, cs.CL
arXiv Open Access 2024
How Important Is Tokenization in French Medical Masked Language Models?

Yanis Labrak, Adrien Bazoge, Beatrice Daille et al.

Subword tokenization has become the prevailing standard in the field of natural language processing (NLP) over recent years, primarily due to the widespread utilization of pre-trained language models. This shift began with Byte-Pair Encoding (BPE) and was later followed by the adoption of SentencePiece and WordPiece. While subword tokenization consistently outperforms character- and word-level tokenization, the precise factors contributing to its success remain unclear. Key aspects such as the optimal segmentation granularity for diverse tasks and languages, the influence of data sources on tokenizers, and the role of morphological information in Indo-European languages remain insufficiently explored. This is particularly pertinent for biomedical terminology, characterized by specific rules governing morpheme combinations. Despite the agglutinative nature of biomedical terminology, existing language models do not explicitly incorporate this knowledge, leading to inconsistent tokenization strategies for common terms. In this paper, we seek to delve into the complexities of subword tokenization in the French biomedical domain across a variety of NLP tasks and pinpoint areas where further enhancements can be made. We analyze classical tokenization algorithms, including BPE and SentencePiece, and introduce an original tokenization strategy that integrates morpheme-enriched word segmentation into existing tokenization methods.
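The BPE algorithm the abstract analyzes builds its vocabulary by repeatedly merging the most frequent adjacent symbol pair in the training corpus. One training step can be sketched as follows, using toy word counts rather than the paper's corpora (production work uses a library such as Hugging Face `tokenizers`):

```python
from collections import Counter

def most_frequent_pair(words):
    """One BPE training step: find the most frequent adjacent symbol pair.

    `words` maps a tuple of symbols (initially characters) to its corpus count.
    """
    pairs = Counter()
    for symbols, count in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += count
    return pairs.most_common(1)[0][0]

def merge_pair(words, pair):
    """Replace every occurrence of `pair` with the concatenated symbol."""
    merged = {}
    for symbols, count in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = count
    return merged
```

Iterating these two steps until a target vocabulary size is reached yields the merge table that a trained BPE tokenizer applies at inference time.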

en cs.CL, cs.AI
arXiv Open Access 2024
Clinical information extraction for Low-resource languages with Few-shot learning using Pre-trained language models and Prompting

Phillip Richter-Pechanski, Philipp Wiesenbach, Dominic M. Schwab et al.

Automatic extraction of medical information from clinical documents poses several challenges: high costs of required clinical expertise, limited interpretability of model predictions, restricted computational resources, and privacy regulations. Recent advances in domain-adaptation and prompting methods showed promising results with minimal training data using lightweight masked language models, which are suited for well-established interpretability methods. We are the first to present a systematic evaluation of these methods in a low-resource setting, by performing multi-class section classification on German doctor's letters. We conduct extensive class-wise evaluations supported by Shapley values, to validate the quality of our small training data set and to ensure the interpretability of model predictions. We demonstrate that a lightweight, domain-adapted pretrained model, prompted with just 20 shots, outperforms a traditional classification model by 30.5% accuracy. Our results serve as a process-oriented guideline for clinical information extraction projects working in low-resource settings.

en cs.CL, cs.AI
S2 Open Access 2023
The Enlightened Apology of the Latin Language by Marko Faustin Galjuf from Dubrovnik

Teodora Shek Brnardić

In recent Enlightenment studies, a trend has emerged that can be termed the “classical turn”, because it places a focus on the classical heritage as an integral part of eighteenth-century culture. Interest in antiquity encompassed Greek and Roman literature, philosophy, and art, and Enlightenment thinkers were particularly fascinated and inspired by the rationalism, humanism, and civic virtues of the ancient world. Archaeological excavations in Italy supported the development of the neoclassical style, which experienced a true revival with Rome as its centre. Countless translations of classical authors were in line with “the taste of the time”, and improvisations of poetry from contemporary languages into Latin were especially valued. The Piarist from Dubrovnik, Marko Faustin Galjuf (1765-1834), was one of the most renowned Latin improvisers of his time. He began his teaching career in Rome and later became politically and academically engaged in the pro-French Roman and Ligurian Republics. After the fall of Napoleon’s Empire in 1815, Galjuf fell out of favour because of his past. In 1833, he published an apology for the use of the Latin language titled Essay on the Fortune of the Latin Language (Specimen de fortuna Latinitatis), seeking a way to return to the Rome he had never forsaken, now under the rule of Pope Gregory XVI. This paper explores the Enlightenment socio-cultural context of the creation and arguments of this forgotten piece, significant for the history of the cultural patterns of that period. It argues that Galjuf’s intention in writing his apology was enlightened rather than conservative in nature.

1 citation en
S2 Open Access 2023
SPECIFICITY AND FUNCTIONS OF THE LATIN LANGUAGE DISCOURSE IN “A CRAZY GREEK” THOMAS NASHE’S PROSE

L. Fedoriaka

Background. The article explores the specificity of the functioning of Latin-language discourse in the works of the Elizabethan writer Thomas Nashe (1567-1601?). The novel “The Unfortunate Traveller” (1593) and the pamphlet “Pierce Penniless” are the subject of research here, as Nashe quotes Ovid most often in them. A classical humanities education at St John's College, a sincere interest in ancient literature and culture, and a brilliant knowledge of Latin stimulated the use of Latin expressions in his works; these factors also made it possible to determine the peculiarities of Latin-language discourse in Nashe's satirical fiction. Methods. Different methods were used to determine the specificity of the Latin discourse in the two works. Analytical, synthetic, and cultural-historical methods were prioritized for the literary-critical understanding of the texts; textological, stylistic, biographical, and interpretive methods were chosen for the analysis. Results. The author of the article concludes that Nashe's favorite classical writer was Ovid, as sentences from his “Amores” enrich most of the English writer's works. The use of Latin quotation clearly indicates the erudition, encyclopedic knowledge, and inexhaustible creative energy of one of the most intelligent “university wits”. The main goal of this article, however, was to establish the functionality of Ovid's words. Conclusions. The analysis of several fragments from the novel and the pamphlet allows us to state that Ovid's Latin maxims underwent changes in the satirical context of the Elizabethan writer's works: they lose the emotional and pragmatic meaning inherent in the original text and begin to play the role of an intensifier of the English author's satirical imperatives.

S2 Open Access 2023
A new Chinese-language textbook of ancient Greek with a historical outline of teaching Greek and Latin in China

Xavier Gheerbrant, Y. Zeng

In this paper we present the project of a new Chinese-language textbook of ancient Greek, intended for students majoring in philosophy. In the first part of the paper, we provide an introduction to the historical circumstances in which ancient Greek literature and philosophy were originally introduced to China. We draw an outline of the cultural significance of the study of ancient Greek in China, especially from the beginning of the 20th century. This helps explain why the study of ancient Greek was tightly connected to that of ancient Greek philosophy, and hence sheds light on the intended focus of our textbook. In the second part of the paper, we present the textbook itself: its intended audience and general structure, and an overview of the linguistic differences between ancient Greek and modern Chinese. This overview reveals the types of issues faced by native Chinese speakers when learning ancient Greek and shows that in some cases, such as verbal aspect, Chinese has better resources than English for translating Greek sentences. We provide a sample chapter to make the discussion more concrete and to illustrate how the specific resources of the Chinese language allow the translator to render the uses of the subjunctive in ancient Greek.

S2 Open Access 2023
Effective Vocabulary Instruction: Building Academic Vocabulary Knowledge with Greek and Latin Roots

C. Jones

Knowledge of academic words and their meanings is increasingly recognized as an important focus of effective vocabulary instruction. Given that over seventy-six percent of academic words share common morphological roots and that more than ninety percent of discipline-specific words are of Greek or Latin origin, it is important for educators to build student knowledge of roots. This article presents research-based information about this important component of vocabulary instruction, key terms, guidelines

arXiv Open Access 2023
A Comprehensive Review of State-of-The-Art Methods for Java Code Generation from Natural Language Text

Jessica López Espejel, Mahaman Sanoussi Yahaya Alassan, El Mehdi Chouham et al.

Java code generation consists of automatically generating Java code from natural language text. This NLP task helps increase programmers' productivity by providing them with immediate solutions to the simplest and most repetitive tasks. Code generation is challenging because of strict syntactic rules and the need for a deep understanding of the semantic aspects of the programming language. Many works have tried to tackle this task using either RNN-based or Transformer-based models. The latter have achieved remarkable advances in the domain and can be divided into three groups: (1) encoder-only models, (2) decoder-only models, and (3) encoder-decoder models. In this paper, we provide a comprehensive review of the evolution and progress of deep learning models in the Java code generation task. We focus on the most important methods and present their merits and limitations, as well as the objective functions used by the community. In addition, we provide a detailed description of the datasets and evaluation metrics used in the literature. Finally, we discuss the results of different models on the CONCODE dataset and propose some future directions.

arXiv Open Access 2023
Turkish Native Language Identification V2

Ahmet Yavuz Uluslu, Gerold Schneider

This paper presents the first application of Native Language Identification (NLI) for the Turkish language. NLI is the task of automatically identifying an individual's native language (L1) based on their writing or speech in a non-native language (L2). While most NLI research has focused on L2 English, our study extends this scope to L2 Turkish by analyzing a corpus of texts written by native speakers of Albanian, Arabic and Persian. We leverage a cleaned version of the Turkish Learner Corpus and demonstrate the effectiveness of syntactic features, comparing a structural Part-of-Speech n-gram model to a hybrid model that retains function words. Our models achieve promising results, and we analyze the most predictive features to reveal L1-specific transfer effects. We make our data and code publicly available for further study.
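The hybrid representation the abstract describes, POS tags for content words with surface forms retained for function words, can be sketched as below. The tokens, tags, and function-word list are invented for illustration, not drawn from the Turkish Learner Corpus:

```python
def hybrid_tokens(tokens, tags, function_words):
    """Replace content words by their POS tag; retain function words verbatim."""
    return [word if word.lower() in function_words else tag
            for word, tag in zip(tokens, tags)]

def ngrams(seq, n=2):
    """Extract the n-grams fed to an NLI classifier as structural features."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]
```

Counting these n-grams per essay and feeding the counts to a linear classifier is the standard recipe for syntactic NLI features; the purely structural model would call `hybrid_tokens` with an empty function-word set.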

en cs.CL

Page 1 of 143172