Results for "Greek language and literature. Latin language and literature"

Showing 20 of ~2,863,500 results · from DOAJ, Semantic Scholar, CrossRef, arXiv

S2 Open Access 2026
Semantic Similarities and Differences of Terms in the English and Uzbek Languages

Khayitova Alokhon Ilyosbek qizi

This article examines the semantic similarities and differences of terms in the English and Uzbek languages from a comparative linguistic perspective. The study aims to identify how terminological units function, develop, and acquire meaning in two languages that belong to different language families and typological systems. Special attention is given to the processes of term formation, semantic shifts, borrowing, and adaptation in scientific and technological discourse. The research employs comparative, descriptive, and semantic analysis methods to reveal both universal and language-specific features of terminology. The findings indicate that English and Uzbek share a number of semantic parallels in international scientific terminology due to globalization and active borrowing, particularly from Latin and Greek sources via English and Russian. However, significant differences are observed in word-formation models, semantic transparency, polysemy, and the degree of terminological standardization. Uzbek tends to preserve agglutinative morphological patterns and semantic motivation, whereas English terminology often demonstrates higher levels of lexicalization and structural compression. The article concludes that understanding these similarities and differences is essential for accurate translation, effective terminology management, and the development of Uzbek scientific language in the context of global integration.

S2 Open Access 2026
Relative Clauses in Proto-Indo-European

Krishnan J. Ram-Prasad

Although the unattested language of Proto-Indo-European has been studied for over 200 years, the greater part of this literature has focused on its phonology and morphology, with comparatively little known of its syntax. This book aims to redress the balance by reconstructing the syntax of relative clauses. It examines evidence from a wide range of archaic Indo-European languages, analysing them through the lens of generative linguistic theory. It also explains the methodological challenges of syntactic reconstruction and how they may be tackled. Ram-Prasad also alights on a wide range of points of comparative interest, including pronominal morphology, discourse movement and Wackernagel's Law. This book will appeal to classicists interested in understanding the Latin and Greek languages in their Indo-European context, as well as to trained comparative philologists and historical linguists with particular interests in syntax and reconstruction.

S2 Open Access 2026
What was that you said?

B. Rubin, Dominic A. Fitzgerald

In this article the curmudgeonly authors offer a broad-strokes reflection on the misuse of language in the medical literature through the lens of paediatric pulmonology. The paper addresses the cult of conciseness as reflected in the explosion of ambiguous acronyms, the dismantling of the Latin and Greek origins of medical terminology, and the insatiable desire to change taxonomy seemingly for change's sake. The paper serves as a call to arms for seasoned clinicians of a certain age to push back against the tic-toc approach to learning in the literature and at the bedside in modern paediatric respiratory medicine.

S2 Open Access 2025
Semantics of Ancient Words and Expressions in the Poetry of M. Kuzmin

Evgeniya Litinskaya

The article examines the use of untransliterated lexemes, idioms and phraseological combinations, quotations from monuments of classical literature, and reminiscences of ancient literature in the works of various genres written by the poet in different years; the expression of the Greek-Latin language culture in Kuzmin’s poetics is clarified. The Greek and Latin inclusions perform not only structural and intertextual functions, but also enrich the content, filling the central images of the works with “eternal” semantics. Each of the considered expressions (Sine sole sileo, Victori Duci, et coetera, Θάλασσα, Ἄβραξας, Pax Romana, Fides Apostolica…, Orbis pictus, Natura naturans…, Ultima Thule) has a unique meaning: from metaphors of creativity and the image of a spiritual mentor to an erotic allusion, a symbol of secret knowledge, imperial myth, confessional motif and cultural memory. Kuzmin’s sources of Greek and Latin phraseology included epic poems and lyrical texts written by ancient authors (Virgil, Horace), Greek historiography (Herodotus, Xenophon), idiomatic expressions assimilated from Latin in Russian poetic works (A. Fet, K. Balmont), as well as the philosophical thesaurus of early modern European thought (B. Spinoza) based on the Latin lexis. Kuzmin’s interest in antiquity was preserved throughout his creative path, as manifested in various forms and under various circumstances, and the Greek and Latin languages, along with the languages of modern Romance cultures, are the foundation of his language skills.

S2 Open Access 2025
IN MEMORY OF THE SCHOLAR: SCIENTIFIC HERITAGE OF CLASSICAL PHILOLOGIST PAVLENKO LEONID VASILYEVICH

E. Polhovskaya, E. Mazina

The object of research in the article is the scholarly and methodological heritage of the Crimean classical philologist Leonid Vasilyevich Pavlenko. The research presented allows one to assess the scholar’s important contribution to the development of national Hellenistic studies and the study of the Greek language (Ancient and Modern Greek) in Crimea. L.V. Pavlenko’s philological research covers Ancient Greek comedy (in particular, a newly discovered text by Menander), Byzantine Christian literature, and its influence on Slavic literature. His work writing textbooks on the Ancient Greek and Latin languages, as well as textbooks and methodological materials on Byzantine literature, can be called titanic and selfless.

S2 Open Access 2025
ANALIZA LEXICOGRAFICĂ A ÎMPRUMUTURILOR LEXICALE DE ORIGINE FRANCEZĂ ÎN DEX2 ‒ 2016 (I)

Lorena KAIZER-PORUMB

Contemporary Romanian vocabulary is heterogeneous, owing to the many linguistic influences it has undergone over time. While in the early history of the literary Romanian language lexical elements of Slavic, Greek, Neo-Greek, Russian, Hungarian and other origins helped enrich the internal structure of the vocabulary, from the modern period onward the vocabulary was reconfigured as earlier linguistic models were replaced by Romance ones. Against this background, loanwords from scholarly Latin, taken over directly by the other Romance languages, entered Romanian through French, Italian, etc., accelerating the modernisation of the lexicon, a process known in the scholarly literature as re-Latinisation.

S2 Open Access 2025
The Historical Formation and Development of Pedagogical Terminology in Karakalpak and English Languages

Sultanova Dilfuza Kamalovna

This article examines the historical formation and development of pedagogical terminology in Karakalpak and English, emphasizing the linguistic, cultural, and social factors that shaped their evolution. Although the two languages belong to distinct linguistic families, both have developed rich terminological systems that reflect their educational philosophies, national identities, and scientific influences. The study highlights how Karakalpak pedagogical terminology emerged from traditional cultural values, oral literature, and later Soviet and global scientific influence, whereas English terminology evolved through extensive borrowing from Latin, Greek, and French, as well as the contributions of major modern educational theorists. The analysis also compares pedagogically oriented proverbs in both languages, demonstrating how cultural values are transmitted through language. Ultimately, the article argues that despite structural and historical differences, both Karakalpak and English pedagogical terminologies reveal universal tendencies driven by globalization, technological development, and modern educational reforms.

S2 Open Access 2025
Etimologías de los taxones supraespecíficos empleados en el filo Bryozoa

Oscar Reverter-Gil, P. Romero

The literature in Spanish on the meanings of taxa is scarce, making etymological studies valuable. In this work, 1238 names of supraspecific taxa of bryozoans are compiled. For each name, the original reference, the etymology and meaning of the name, and the source from which it was obtained (if different from the original) are included. Most of the suprageneric taxa derive directly from some genus, so they do not have their own etymology. Etymological roots can be classified into different categories. In bryozoans, most of them refer to the anatomy or morphology of zooids and colonies. Certain suffixes, such as –pora, –ella, –cella or –zoon, are also very frequently used to build new names. Other common categories are pre-existing genus names, eponyms, numbers, geography, mythology, plants, and habitats. The vast majority of the names have been constructed from Latin and ancient Greek words, while only seven generic names are based on a different language.

arXiv Open Access 2025
Large language models have learned to use language

Gary Lupyan

Acknowledging that large language models have learned to use language can open doors to breakthrough language science. Achieving these breakthroughs may require abandoning some long-held ideas about how language knowledge is evaluated and reckoning with the difficult fact that we have entered a post-Turing test era.

en cs.CL
arXiv Open Access 2025
Natural Language Generation

Emiel van Miltenburg, Chenghua Lin

This article provides a brief overview of the field of Natural Language Generation. The term Natural Language Generation (NLG), in its broadest definition, refers to the study of systems that verbalize some form of information through natural language. That information could be stored in a large database or knowledge graph (in data-to-text applications), but NLG researchers may also study summarisation (text-to-text) or image captioning (image-to-text), for example. As a subfield of Natural Language Processing, NLG is closely related to other sub-disciplines such as Machine Translation (MT) and Dialog Systems. Some NLG researchers exclude MT from their definition of the field, since there is no content selection involved where the system has to determine what to say. Conversely, dialog systems do not typically fall under the header of Natural Language Generation since NLG is just one component of dialog systems (the others being Natural Language Understanding and Dialog Management). However, with the rise of Large Language Models (LLMs), different subfields of Natural Language Processing have converged on similar methodologies for the production of natural language and the evaluation of automatically generated text.

en cs.CL
arXiv Open Access 2025
Model Merging to Maintain Language-Only Performance in Developmentally Plausible Multimodal Models

Ece Takmaz, Lisa Bylinina, Jakub Dotlacil

State-of-the-art vision-and-language models consist of many parameters and learn from enormous datasets, surpassing the amounts of linguistic data that children are exposed to as they acquire a language. This paper presents our approach to the multimodal track of the BabyLM challenge addressing this discrepancy. We develop language-only and multimodal models in low-resource settings using developmentally plausible datasets, with our multimodal models outperforming previous BabyLM baselines. One finding in the multimodal language model literature is that these models tend to underperform in language-only tasks. Therefore, we focus on maintaining language-only abilities in multimodal models. To this end, we experiment with model merging, where we fuse the parameters of multimodal models with those of language-only models using weighted linear interpolation. Our results corroborate the findings that multimodal models underperform in language-only benchmarks that focus on grammar, and model merging with text-only models can help alleviate this problem to some extent, while maintaining multimodal performance.
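The merging operation the abstract describes, weighted linear interpolation of parameters, can be sketched in a few lines. This is an illustrative toy, not the paper's code: `merge_state_dicts`, the key names, and the flat weight lists standing in for real tensors are all assumptions.

```python
# Weighted linear interpolation of two models' parameters (model merging).
# Uses plain Python lists in place of real tensors for illustration.
def merge_state_dicts(multimodal, text_only, alpha=0.5):
    """Return alpha * multimodal + (1 - alpha) * text_only, per parameter."""
    merged = {}
    for name, w_mm in multimodal.items():
        w_txt = text_only[name]  # assumes both models share the same architecture/keys
        merged[name] = [alpha * a + (1 - alpha) * b for a, b in zip(w_mm, w_txt)]
    return merged

# Toy example: one "layer" with two weights per model.
mm = {"layer.0.weight": [1.0, 2.0]}
txt = {"layer.0.weight": [3.0, 4.0]}
print(merge_state_dicts(mm, txt, alpha=0.5))  # {'layer.0.weight': [2.0, 3.0]}
```

Sweeping `alpha` between 0 and 1 trades multimodal performance against language-only performance, which is the knob the experiments vary.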

en cs.CL, cs.CV
arXiv Open Access 2025
Small Language Models Reshape Higher Education: Courses, Textbooks, and Teaching

Jian Zhang, Jia Shao

While large language models (LLMs) have introduced novel paradigms in science and education, their adoption in higher education is constrained by inherent limitations. These include a tendency to produce inaccuracies and high computational requirements, which compromise the strict demands for accurate and reliable knowledge essential in higher education. Small language models (MiniLMs), by contrast, offer distinct advantages in professional education due to their lightweight nature and precise retrieval capabilities. This research takes "Atmospheric Physics" as an example. We established a specialized corpus and image repository by gathering over 550,000 full-text PDFs from over 130 well-respected international journals in Earth and environmental science. From this collection, we extracted over 100 million high-quality sentence-level corpus entries and more than 3 million high-resolution academic images. Using MiniLMs, these resources were organized into a high-dimensional vector library for precise retrieval and efficient utilization of extensive educational content. We then systematically redesigned the courses, textbooks, and teaching strategies for "Atmospheric Physics" based on MiniLMs. The course is designed as an "interdisciplinary-frontier" system, breaking down traditional boundaries between atmospheric science, space science, hydrology, and remote sensing. Teaching materials are transformed from static, lagging text formats into a dynamic digital resource library powered by MiniLM. For teaching methods, we have designed a question-based learning pathway. This paradigm promotes a shift from passive knowledge transfer to active cognitive development. Consequently, this MiniLM-driven "Atmospheric Physics" course demonstrates a specific avenue for "AI for education".
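The core of the "high-dimensional vector library" is embedding sentences and retrieving the nearest ones to a query by similarity. A toy sketch under stated assumptions: the two-dimensional hand-made vectors stand in for real MiniLM-style embeddings, and `retrieve` is a hypothetical helper, not the authors' system.

```python
import math

# Cosine similarity between two vectors.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Rank a library of (sentence, vector) pairs against a query vector
# and return the top-k sentences.
def retrieve(query_vec, library, k=1):
    ranked = sorted(library, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [sentence for sentence, _ in ranked[:k]]

# Fake 2-D embeddings; a real pipeline would encode sentences with a model.
library = [
    ("Clouds form by condensation of water vapour.", [0.9, 0.1]),
    ("Radiosondes measure humidity profiles.", [0.2, 0.8]),
]
print(retrieve([1.0, 0.0], library))  # ['Clouds form by condensation of water vapour.']
```

Retrieved sentences would then be fed to the small model as grounding context, which is what keeps the generated teaching material tied to the journal corpus.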

en physics.ed-ph, cs.CL
arXiv Open Access 2024
Pragmatic Competence Evaluation of Large Language Models for the Korean Language

Dojun Park, Jiwoo Lee, Hyeyun Jeong et al.

Benchmarks play a significant role in the current evaluation of Large Language Models (LLMs), yet they often overlook the models' abilities to capture the nuances of human language, primarily focusing on evaluating embedded knowledge and technical skills. To address this gap, our study evaluates how well LLMs understand context-dependent expressions from a pragmatic standpoint, specifically in Korean. We use both Multiple-Choice Questions (MCQs) for automatic evaluation and Open-Ended Questions (OEQs) assessed by human experts. Our results show that GPT-4 leads with scores of 81.11 in MCQs and 85.69 in OEQs, closely followed by HyperCLOVA X. Additionally, while few-shot learning generally improves performance, Chain-of-Thought (CoT) prompting tends to encourage literal interpretations, which may limit effective pragmatic inference. Our findings highlight the need for LLMs to better understand and generate language that reflects human communicative norms.

en cs.CL
arXiv Open Access 2024
Part-of-Speech Tagger for Bodo Language using Deep Learning approach

Dhrubajyoti Pathak, Sanjib Narzary, Sukumar Nandi et al.

Language Processing systems such as Part-of-speech tagging, Named entity recognition, Machine translation, Speech recognition, and Language modeling (LM) are well-studied in high-resource languages. Nevertheless, research on these systems for several low-resource languages, including Bodo, Mizo, Nagamese, and others, has either yet to commence or is in its nascent stages. Language models play a vital role in the downstream tasks of modern NLP. Extensive studies are carried out on LMs for high-resource languages. Nevertheless, languages such as Bodo, Rabha, and Mising continue to lack coverage. In this study, we first present BodoBERT, a language model for the Bodo language. To the best of our knowledge, this work is the first such effort to develop a language model for Bodo. Secondly, we present an ensemble DL-based POS tagging model for Bodo. The POS tagging model is based on combinations of BiLSTM with CRF and stacked embedding of BodoBERT with BytePairEmbeddings. We cover several language models in the experiment to see how well they work in POS tagging tasks. The best-performing model achieves an F1 score of 0.8041. A comparative experiment was also conducted on Assamese POS taggers, considering that the language is spoken in the same region as Bodo.

en cs.CL, cs.AI
arXiv Open Access 2024
Morphological evaluation of subwords vocabulary used by BETO language model

Óscar García-Sierra, Ana Fernández-Pampillón Cesteros, Miguel Ortega-Martín

Subword tokenization algorithms used by Large Language Models are significantly more efficient and can independently build the necessary vocabulary of words and subwords without human intervention. However, those subwords do not always align with real morphemes, potentially impacting the models' performance, though it remains uncertain when this might occur. In previous research, we proposed a method to assess the morphological quality of vocabularies, focusing on the overlap between these vocabularies and the morphemes of a given language. Our evaluation method was built on three quality measures, relevance, cohesion, and morphological accuracy, and a procedure for their assessment. By applying this method to vocabularies created by three subword tokenization algorithms, BPE, Wordpiece, and Unigram, we concluded that these vocabularies generally exhibit very low morphological quality. In this article, we apply this evaluation to the tokenizer of BETO, a BERT language model trained on large Spanish corpora. This evaluation, along with our previous results, helped us conclude that its vocabulary has a low morphological quality, and we also found that training the tokenizer in a larger corpus does not improve the morphological quality of the generated vocabulary. Additionally, this evaluation helps clarify the algorithm used by the tokenizer, that is, Wordpiece, given the inconsistencies between the authors' claims and the model's configuration.
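The "overlap between vocabularies and morphemes" that the evaluation builds on can be illustrated with a deliberately simplified metric. This is a hypothetical sketch: the paper's actual measures (relevance, cohesion, morphological accuracy) are more involved, and `vocabulary_morpheme_overlap`, the WordPiece-style `##` normalization, and the tiny Spanish word lists are all illustrative assumptions.

```python
# Simplified morphological-quality check: what fraction of a tokenizer's
# subword vocabulary coincides with attested morphemes of the language.
def vocabulary_morpheme_overlap(vocab, morphemes):
    # Strip WordPiece continuation marks ("##mente" -> "mente") before comparing.
    vocab_set = {v.lstrip("#") for v in vocab}
    morph_set = set(morphemes)
    return len(vocab_set & morph_set) / len(vocab_set)

vocab = ["casa", "##s", "des", "##mente", "xq"]       # toy subword vocabulary
morphemes = ["casa", "s", "des", "mente"]             # toy Spanish morpheme list
# 4 of the 5 normalized subwords are real morphemes -> 0.8
print(vocabulary_morpheme_overlap(vocab, morphemes))
```

A low score on a metric like this is what the abstract means by "very low morphological quality": many learned subwords ("xq" above) correspond to no morpheme at all.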

en cs.CL, cs.AI
arXiv Open Access 2024
Manipulating language models' training data to study syntactic constraint learning: the case of English passivization

Cara Su-Yi Leong, Tal Linzen

Grammatical rules in natural languages are often characterized by exceptions. How do language learners learn these exceptions to otherwise general patterns? Here, we study this question through the case study of English passivization. While passivization is in general quite productive, there are cases where it cannot apply (cf. the following sentence is ungrammatical: *One hour was lasted by the meeting). Using neural network language models as theories of language acquisition, we explore the sources of indirect evidence that a learner can leverage to learn whether a verb can be passivized. We first characterize English speakers' judgments of exceptions to the passive, and confirm that speakers find some verbs more passivizable than others. We then show that a neural network language model's verb passivizability judgments are largely similar to those displayed by humans, suggesting that evidence for these exceptions is available in the linguistic input. Finally, we test two hypotheses as to the source of evidence that language models use to learn these restrictions: frequency (entrenchment) and semantics (affectedness). We do so by training models on versions of the corpus that have had sentences of the types implicated by each hypothesis removed, altered, or introduced. We find support for both hypotheses: entrenchment and affectedness make independent contributions to a verb's passivizability. From a methodological point of view, this study highlights the utility of altering a language model's training data for answering questions where complete control over a learner's input is vital.
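The corpus manipulation at the heart of the method, removing sentences of a targeted type before training, is conceptually simple. A minimal sketch under stated assumptions: `ablate_corpus`, the whitespace tokenization, and the example sentences are illustrative, not the authors' pipeline, which also alters and introduces sentences rather than only removing them.

```python
# Remove from a training corpus every sentence containing a target verb form,
# so a model trained on the result never sees direct evidence for that verb.
def ablate_corpus(sentences, target_forms):
    targets = {t.lower() for t in target_forms}
    kept = []
    for sentence in sentences:
        words = {w.lower().strip(".,!?") for w in sentence.split()}
        if not (targets & words):  # keep only sentences with no target form
            kept.append(sentence)
    return kept

corpus = [
    "The meeting lasted one hour.",
    "She opened the door.",
    "It lasted all day.",
]
print(ablate_corpus(corpus, ["lasted"]))  # ['She opened the door.']
```

Training one model on the full corpus and another on the ablated one, then comparing their passivizability judgments, isolates what the removed evidence contributed, which is why the authors call full control over the learner's input vital.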

en cs.CL
arXiv Open Access 2024
A Survey of Large Language Models for Arabic Language and its Dialects

Malak Mashaabi, Shahad Al-Khalifa, Hend Al-Khalifa

This survey offers a comprehensive overview of Large Language Models (LLMs) designed for Arabic language and its dialects. It covers key architectures, including encoder-only, decoder-only, and encoder-decoder models, along with the datasets used for pre-training, spanning Classical Arabic, Modern Standard Arabic, and Dialectal Arabic. The study also explores monolingual, bilingual, and multilingual LLMs, analyzing their architectures and performance across downstream tasks, such as sentiment analysis, named entity recognition, and question answering. Furthermore, it assesses the openness of Arabic LLMs based on factors such as source code availability, training data, model weights, and documentation. The survey highlights the need for more diverse dialectal datasets and underscores the importance of openness for research reproducibility and transparency. It concludes by identifying key challenges and opportunities for future research and stressing the need for more inclusive and representative models.

en cs.CL, cs.AI

Page 4 of 143,175