Results for "Ural-Altaic languages"

DOAJ Open Access 2025

Proverbs and sayings in the diwan of Shah Ismail Khatai

Ali

Poets of Divan literature used proverbs and idioms in their works to express themselves concisely, to heighten the effect of their words, and to persuade and warn the reader. Divans are therefore important sources of proverbs and idioms, in addition to the rich linguistic, religious and literary material they contain. This study examines the proverbs and idioms in the divan of Shah Ismail Hatâyî (1487-1524), one of the important ruler-poets of 16th-century Turkish literature across the Anatolia, Afghanistan, Azerbaijan, Iraq, Syria and Turkistan fields. The study first gives information about Shah Ismail's life, his works and the studies on them, and then defines proverbs, which are stereotyped sayings, and idioms, which are groups of words generally used with metaphorical meanings to express a situation more effectively. Next, the proverbs and idioms that we identified in the divan of Shah Ismail Khatayi are classified. Finally, the findings from examining these idioms and proverbs, which foreground the characteristics of the society to which the ruler belonged, are listed.

Language and Literature, Ural-Altaic languages


DOAJ Open Access 2025

„Mo südda ligob mo sees.” Kirjakeelest kirjanduskeeleks ["My heart moves within me." From written language to literary language]

Liina Lukas

Although Estonian written languages had existed since at least the first half of the 17th century, for a long time they were not used by native speakers either as writers or as readers. It was only with the arrival of the Moravian Brethren movement that a new attitude toward the written word emerged, making it easier for both writers and readers to adopt Estonian as a written language – previously perceived mainly as an instrument of colonization. The origins of Estonian narrative literature lie in the Moravian movement. On the initiative of the Brethren in Urvaste, the first Estonian-language storybook intended for an Estonian readership was published in 1737: the South Estonian Kolm kaunist Waggausse Eenkojut (“Three Beautiful Examples of Piety”), written by the local pastor Johann Christian Quandt. The book contained three stories: the lives of the shepherd Henning Kuuse, the small craftsman Jörgel, and the maiden Armelle Nikolas. This work introduced into Estonian literature a new kind of narrative scheme – the life story or conversion narrative – in which the central event was an awakening: through the impact of a certain experience, a person re-evaluates their former life and embarks on a new, transformed life in Christ. The protagonists were people of humble origin with whom the Estonian reader could easily identify. The literary-historical value of the collection lies in its recognition of emotional life and in its search for new stylistic means to express it – in other words, in the development of a literary language. The awakening narrative model proved remarkably durable, offering a model for depicting human destiny even outside a strictly religious teleology. In the first half of the 19th century, during a new wave of devotional literature in Estonian and Latvian writing, pietistic religiosity was fused with sentimentalism, and biblical stories were interwoven with traditional (hagiographic) legends and literary narrative plots. 
For example, the fate of Genoveva, daughter of the King of Brabant, is presented in line with the pietistic life-story model, though the heroine’s transformation stems not from an inner awakening but from external circumstances. Sentimentalist stories appealed to the reader’s compassion and touched the heart, often proving more persuasive than the competing popular-educational storybooks that conveyed rationalist teachings and promoted a hierarchical view of the world order. The pietistic-sentimentalist narrative carried a more democratic message: those who follow their hearts attain salvation – at least in the world to come – and before the law of the heart all people are equal.

Other Finnic languages and dialects


arXiv Open Access 2025

Have Object-Oriented Languages Missed a Trick with Class Function and its Subclasses?

Lloyd Allison

Compared to functions in mathematics, functions in programming languages seem to be under-classified. Functional programming languages based on the lambda calculus famously treat functions as first-class values. Object-oriented languages have adopted "lambdas", notably for call-back routines in event-based programming. Typically a programming language has functions, a function has a type, and some functions act on other functions and/or return functions, but there is generally a lack of (i) "class Function" in the OO sense of the word class and particularly (ii) subclasses of Function for functions having specific properties. Some such classes are presented here and programmed in some popular programming languages as an experimental investigation into OO languages missing this opportunity.
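To make the idea of a "class Function" with property-specific subclasses concrete, here is a minimal Python sketch. It is not taken from the paper: the class names `Function` and `InvertibleFunction`, and the choice of invertibility as the example property, are assumptions made purely for illustration.

```python
class Function:
    """Wraps a callable so that functions become instances of a class."""

    def __init__(self, apply):
        self._apply = apply

    def __call__(self, x):
        return self._apply(x)


class InvertibleFunction(Function):
    """Hypothetical subclass: a function that also carries its inverse."""

    def __init__(self, apply, unapply):
        super().__init__(apply)
        self._unapply = unapply

    def inverse(self):
        # Swapping the two directions yields another invertible function.
        return InvertibleFunction(self._unapply, self._apply)


# Illustrative instance: temperature conversion and its inverse.
celsius_to_fahrenheit = InvertibleFunction(
    lambda c: c * 9 / 5 + 32,
    lambda f: (f - 32) * 5 / 9,
)
```

Code that requires an inverse can then ask for an `InvertibleFunction` rather than a bare callable, which is the kind of classification the abstract suggests is usually missing.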

en cs.PL


arXiv Open Access 2025

A substitution lemma for multiple context-free languages

Andrew Duncan, Murray Elder, Lisa Frenkel et al.

We present a new criterion for proving that a language is not multiple context-free, which we call a Substitution Lemma. We apply it to show a sample selection of languages are not multiple context-free, including the word problem of $F_2\times F_2$. Our result is in contrast to Kanazawa et al. [2014, Theory Comput. Syst.] who proved that it was not possible to generalise the standard pumping lemma for context-free languages to multiple context-free languages, and Kanazawa [2019, Inform. and Comput.] who showed a weak variant of generalised Ogden's lemma does not apply to multiple context-free languages. We also record that groups with multiple context-free word problem have decidable rational subset membership problem.

en cs.FL, math.GR


arXiv Open Access 2025

The Relative Monadic Metalanguage

Jack Liell-Cock, Zev Shirazi, Sam Staton

Relative monads provide a controlled view of computation. We generalise the monadic metalanguage to a relative setting and give a complete semantics with strong relative monads. Adopting this perspective, we generalise two existing program calculi from the literature. We provide a linear-non-linear language for graded monads, LNL-RMM, along with a semantic proof that it is a conservative extension of the graded monadic metalanguage. Additionally, we provide a complete semantics for the arrow calculus, showing it is a restricted relative monadic metalanguage. This motivates the introduction of ARMM, a computational lambda calculus-style language for arrows that conservatively extends the arrow calculus.

en cs.PL, math.CT


S2 Open Access 2025

TOPONYMIC ASPECT OF THE HISTORY OF THE DEVELOPMENT OF MINERAL DEPOSITS IN THE SOUTHERN URALS

Yuri S. Kostylev

The article analyzes the official toponymic systems of mineral deposits located in the territory of the modern Chelyabinsk region. It considers the names of the objects of the Ilmen Reserve, the Kochkar gold deposit, the Astafyevsky and Svetlinsky deposits of rock crystal, and various deposits of Bredinsky district. Specialized scientific literature in geology, history and topography was used as a source of material. The volume of this material, which includes names related to the extraction of completely different minerals in a relatively large area over a significant chronological period, and the significant number of analyzed toponyms (about 950 units), makes it possible to systematically present the history of the development of the natural resources of the Southern Urals, as reflected in the language. It is concluded that extralinguistic factors of a natural-geological and, most importantly, socio-economic nature have a decisive influence on the formation and system of the object names. The toponymic system allows reconstructing the social portrait of the nominating team with considerable accuracy. The prospector reflects in the name the characteristics of the object that are essential for the work; the owner of the capital reflects the organizational specifics, but in such systems the share of anthroponymic entities is also high. The Soviet mining engineer points out the general geological features of an object that are important for mining, or simply marks its location by metonymic transfer. At the same time, ideological attitudes are not significant.

en


S2 Open Access 2025

Yakut Reindeer Breeding Vocabulary (a contrastive aspect)

N. I. Danilova, F. N. Diachkovskiy

The paper presents a semantic analysis of the Yakut lexemes denoting reindeer of a certain sex and/or age as compared to the corresponding lexemes in Turkic, Tungusic, and Mongolic languages. The study is based on explanatory, bilingual, and etymological dictionaries of the languages of interest. Descriptive, comparative, and contrastive methods as well as purposive sampling were used. Our research shows that there are two main ways to express a reindeer’s sex and age; one of them is heteronomy, or lexical suppletion, e.g. ньуоҕурхана ‘a four-year-old doe’. This method is characteristic of the respective vocabulary in Tungusic languages. The second method of word formation, known as lexical-syntactic, uses combinations of two or more words: буур таба ‘male deer’, тыһы таба ‘female deer’, etc. These combinations have a common origin in Turkic languages and are widely used in the literary Yakut language. The Yakut language uses composite words including the stems meaning ‘male’ or ‘female’ as their first parts; e.g. тыһы ‘female’, атыыр ‘male’ (тыһы үөҥэс ‘a three-year-old female reindeer’, атыыр таба ‘a male reindeer’). Our analysis has revealed semantic similarities and differences of each lexical unit with the meaning ‘a reindeer of a certain age or sex’ in different Altaic languages. It has turned out that the names of a reindeer’s sex, as well as those used for young animals, are of Turkic origin. We have also found some common Altaic stems. Specified age names of male and female reindeer in Yakut dialects are usually regionally restricted and are often words borrowed from Evenki or Even. Further research may involve other Turkic and Mongolic languages in order to study the reindeer vocabulary in both historical and typological perspectives.

en


S2 Open Access 2024

Text Summarization and Temporal Learning Models Applied to Portuguese Fake News Detection in a Novel Brazilian Corpus Dataset

Gabriel Lino Garcia, P. H. Paiola, D. Jodas et al.

5 citations en Computer Science


S2 Open Access 2024

Prompting as Panacea? A Case Study of In-Context Learning Performance for Qualitative Coding of Classroom Dialog

Ananya Ganesh, Chelsea Chandler, Sidney D'Mello et al.

5 citations en Computer Science


S2 Open Access 2024

Summary of the Visually Grounded Story Generation Challenge

Xudong Hong, Asad Sayeed, Vera Demberg

Recent advancements in vision-and-language models have opened new possibilities for natural language generation, particularly in generating creative stories from visual input. We thus host an open-sourced shared task, Visually Grounded Story Generation (VGSG), to explore whether these models can create coherent, diverse, and visually grounded narratives. This task challenges participants to generate coherent stories based on sequences of images, where characters and events must be grounded in the images provided. The task is structured into two tracks: the Closed track with constraints on fixed visual features and the Open track which allows all kinds of models. We propose the first two-stage model using GPT-4o as the baseline for the Open track that first generates descriptions for the images and then creates a story based on those descriptions. Human and automatic evaluations indicate that: 1) retrieval augmentation helps generate more human-like stories; 2) a large-scale pre-trained LLM improves story quality by a large margin; and 3) traditional automatic metrics cannot capture the overall quality.

5 citations en


S2 Open Access 2024

Iteratively Calibrating Prompts for Unsupervised Diverse Opinion Summarization

Jian Wang, Yuqing Sun, Yanjie Liang et al.

Diverse opinion summarization aims to generate a summary that captures multiple opinions in texts. Although large language models (LLMs) have become the main choice for this task, the performance is highly dependent on prompts. In this paper, we propose a self-evaluation based prompt calibration framework to stimulate an LLM to generate high-quality summaries. It adopts the reinforcement learning mechanism to calibrate prompts for maximizing the reward of the summary. The framework contains three parts. In the prompt construction part, we design a prompt that contains the topic, the task instruction and the key opinion reference. The topic indicates the main focus of the documents, the instruction describes the task in natural language, and the key opinion reference is the explicit constraint on the expected opinions. In the reward part, for each summary, its coverage score and diversity score are used to represent the semantic coverage of the source documents and the inter-opinion differences, respectively. The prompt calibration part selects sentences in the generated summaries to calibrate the prompts for the next iteration. With this framework, we use an LLM with 7B parameters to generate summaries, which outperforms the larger GPT-4 and multiple strong baselines. The ablation studies indicate the effectiveness of the iterative calibration process. We analyze the opinion difference in terms of the tendencies of sentences in summaries and use a Natural Language Inference (NLI)-based method to evaluate the faithfulness of summaries. Experiment results show that our method generates summaries with high opinion difference and faithfulness.

5 citations en Computer Science


S2 Open Access 2023

From Query Tools to Causal Architects: Harnessing Large Language Models for Advanced Causal Discovery from Data

Taiyu Ban, Lyuzhou Chen, Xiangyu Wang et al.

38 citations en Computer Science


S2 Open Access 2024

An Incomplete Loop: Deductive, Inductive, and Abductive Learning in Large Language Models

Emmy Liu, Graham Neubig, Jacob Andreas

4 citations en Computer Science


S2 Open Access 2024

Upaya at ArabicNLU Shared-Task: Arabic Lexical Disambiguation using Large Language Models

P. Rajpoot, A. Jindal, Ankur Parikh

Disambiguating a word’s intended meaning (sense) in a given context is important in Natural Language Understanding (NLU). WSD aims to determine the correct sense of ambiguous words in context. At the same time, LMD (a WSD variation) focuses on disambiguating location mention. Both tasks are vital in Natural Language Processing (NLP) and information retrieval, as they help correctly interpret and extract information from text. Arabic version is further challenging because of its morphological richness, encompassing a complex interplay of roots, stems, and affixes. This paper describes our solutions to both tasks, employing Llama3 and Cohere-based models under Zero-Shot Learning and Re-Ranking, respectively. Both the shared tasks were part of the second Arabic Natural Language Processing Conference co-located with ACL 2024. Overall, we achieved 1st rank in the WSD task (accuracy 78%) and 2nd rank in the LMD task (MRR@1 0.59).
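For readers unfamiliar with the MRR@k metric reported above, the following is a minimal sketch of the standard mean-reciprocal-rank computation truncated at rank k. The function name `mrr_at_k` and the toy data are assumptions for illustration; the shared task's actual scoring script is not shown in the abstract.

```python
def mrr_at_k(ranked_lists, gold_answers, k):
    """Mean reciprocal rank, counting only the top-k candidates per query."""
    total = 0.0
    for candidates, answer in zip(ranked_lists, gold_answers):
        top_k = candidates[:k]
        if answer in top_k:
            # Reciprocal of the 1-based rank of the correct answer.
            total += 1.0 / (top_k.index(answer) + 1)
    return total / len(gold_answers)


# Toy example: two queries, each with two ranked candidates.
ranked = [["a", "b"], ["b", "a"]]
gold = ["a", "a"]
```

With k = 1 the only possible reciprocal rank is 1, so under this definition MRR@1 coincides with the fraction of queries whose top-ranked candidate is correct.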

3 citations en Computer Science


DOAJ Open Access 2024

Impersonal or passive? Some approaches to analysis (based on the Veps and Estonian languages)

Polina Oskolskaia

This article deals with constructions of the form “to be + passive participle” in Veps and Estonian. Depending on the syntactic context, these constructions can be considered either impersonal or passive. Cases where the syntactic properties of the context do not allow us to determine whether a construction is impersonal or passive are the main object of the study. The article proposes two approaches to analysing these cases, using a corpus study in Veps and the analysis of a native speaker survey in Estonian. Analysis of the Veps data shows that 66% of the sample collected cannot be unambiguously attributed to the impersonal or the passive construction. At the same time, there is a correlation between polarity and construction choice: the passive occurs more often in negative contexts and the impersonal occurs more often in affirmative contexts. The results of the Estonian survey show that 88% of constructions are interpreted as passive. Verb tense and stative/dynamic semantics do not correlate with construction type, but there is a relationship between the preverbal position of the nominative argument and the passive construction. It was assumed that in the impersonal construction the argument has a special status and is not a prototypical object but has both object and subject features.

Philology. Linguistics, Finnic. Baltic-Finnic


DOAJ Open Access 2024

Reflections on the term 'stem' based on Turkish grammar resources in Turkey

Mustafa Kemal, Arife Ece

This study analyzed issues surrounding the concept of "stem" in Turkish linguistics. It began by establishing a general framework based on existing definitions of "root" and "stem" in the literature. Subsequently, it analyzed the treatment and examples of "stem" presented in Turkish grammar books published from 1940 to the present. The analysis revealed inconsistencies in the scope and definition of "stem," with terms like "root," "base," "derived root," "closed stem" and "closed form unit" often used interchangeably in examples. These inconsistencies stem not only from varying terminological preferences but also from the challenge of handling words derived from roots that do not exist independently. While some researchers propose adopting the term "base" to address these limitations, modern linguists increasingly favor terms like "form unit" and "word unit."

Language and Literature, Ural-Altaic languages


arXiv Open Access 2024

Multi-Lingual Development & Programming Languages Interoperability: An Empirical Study

Tsvi Cherny-Shahar, Amiram Yehudai

As part of research on a novel in-process multi-programming-language interoperability system, this study investigates the interoperability and usage of multiple programming languages within a large dataset of GitHub projects and Stack Overflow Q&A. It addresses existing multi-lingual development practices and interactions between programming languages, focusing on in-process multi-programming-language interoperability. The research examines a dataset of 414,486 GitHub repositories, 22,156,001 Stack Overflow questions from 2008-2021 and 173 interoperability tools. The paper's contributions include a comprehensive dataset, large-scale analysis, and insights into the prevalence, dominant languages, interoperability tools, and related issues in multi-language programming. The paper presents the research results, shows that C is a central pillar in programming language interoperability, and outlines simple interoperability guidelines. These findings and guidelines contribute to our multi-programming-language interoperability system research, and also lay the groundwork for other systems and tools by suggesting key features for future interoperability tools.

en cs.PL


arXiv Open Access 2024

Navigational hierarchies of regular languages

Thomas Place, Marc Zeitoun

We study the class of star-free languages. A long-standing goal is to classify them by the complexity of their descriptions. The most influential research effort involves concatenation hierarchies, which measure alternations between "complement" and "union plus concatenation". We explore alternative hierarchies that also stratify star-free languages. They are built with an operator $C\mapsto TL(C)$. From an input class $C$, it produces a larger one $TL(C)$, consisting of all languages definable in a variant of unary temporal logic, where temporal modalities depend on $C$. Level $n$ in the navigational hierarchy of basis $C$ is constructed by applying this operator $n$ times to $C$. As bases $G$, we focus on group languages and natural extensions thereof, denoted $G^+$. We prove that the navigational hierarchies of bases $G$ and $G^+$ are strictly intertwined and conduct a thorough investigation of their relationships with concatenation hierarchies. We also look at two problems on classes of languages: membership (decide if a language is in the class) and separation (decide, for two languages $L_1,L_2$, if there is a language $K$ in the class with $L_1\subseteq K$ and $L_2\cap K=\emptyset$). We prove that if separation is decidable for $G$, then so is membership for level two in the navigational hierarchies of bases $G$ and $G^+$. We take a look at the trivial class $ST=\{\emptyset,A^*\}$. For the bases $ST$ and $ST^+$, the levels one are standard variants of unary temporal logic. The levels two correspond to variants of two-variable logic, investigated recently by Krebs, Lodaya, Pandya and Straubing. We solve one of their conjectures. We also prove that for these two bases, level two has decidable separation. Combined with earlier results on the operator $C\mapsto TL(C)$, this implies that level three has decidable membership.
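The membership and separation problems defined in this abstract can be illustrated on a toy scale. The sketch below assumes every language is a finite set of words and the class is an explicit finite list of such sets, which is a drastic simplification: the paper deals with classes of regular languages, where these problems need automata-theoretic algorithms. The function names and the example class are hypothetical.

```python
def is_member(language, language_class):
    """Membership: is the given language one of the languages in the class?"""
    return language in language_class


def find_separator(l1, l2, language_class):
    """Separation: return some K in the class with l1 <= K and l2 & K empty,
    or None when no language of the class separates l1 from l2."""
    for k in language_class:
        if l1 <= k and not (l2 & k):
            return k
    return None


# Toy class of three finite languages over the alphabet {a, b}.
toy_class = [frozenset(), frozenset({"a", "ab"}), frozenset({"b"})]
```

Here `{"a"}` is separated from `{"b"}` by `{"a", "ab"}`, whereas `{"a", "b"}` cannot be separated from `{"ab"}` because no language in the toy class contains both `"a"` and `"b"`.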

en cs.FL


arXiv Open Access 2024

Positional $ω$-regular languages

Antonio Casares, Pierre Ohlmann

In the context of two-player games over graphs, a language $L$ is called positional if, in all games using $L$ as winning objective, the protagonist can play optimally using positional strategies, that is, strategies that do not depend on the history of the play. In this work, we describe the class of parity automata recognising positional languages, providing a complete characterisation of positionality for $ω$-regular languages. As corollaries, we establish decidability of positionality in polynomial time, finite-to-infinite and 1-to-2-players lifts, and show the closure under union of prefix-independent positional objectives, answering a conjecture by Kopczyński in the $ω$-regular case.

en cs.FL, cs.GT


S2 Open Access 2023

Neural Speech Synthesis for Austrian Dialects with Standard German Grapheme-to-Phoneme Conversion and Dialect Embeddings

Lorenz Gutscher, Michael Pucher, Victor García

For languages where extensive audio data and text transcriptions are available, text-to-speech (TTS) systems have showcased the ability to generate speech that closely resembles natural human speech. However, the development of TTS systems for dialects and language varieties poses challenges such as limited data availability and strong regional variations. This paper presents a TTS system tailored for under-resourced language varieties spoken in Austrian regions. The system is built upon the FastSpeech 2 architecture and includes modifications to incorporate dialect embeddings for training and inference. It is demonstrated that employing dialect embeddings and a standard German grapheme-to-phoneme conversion is effective in modeling language varieties and provides means to shift a person’s spoken variety from one to another. This allows for the generation of regional standards for dialect speakers or the generation of dialect speech with the voice of a standard speaker. The findings unveil new possibilities and applications in other multilingual contexts where shared characteristics within the language or dialect embedding space can be leveraged.

3 citations en

