Results for "Slavic languages. Baltic languages. Albanian languages"

DOAJ Open Access 2026

Artistic Features of Eastern Chronotope in Andrei Volos's Novel “Return to Panjrud”

H. F. H. S. Badiyeh, S. V. Burdina, B. V. Kondakov

This article explores the artistic characteristics of the chronotope of the East using Andrei Volos’s novel “Return to Panjrud” as a case study. The originality of this work lies in being the first to determine how time and space are organized within the global socioethnic category of the “East,” along with an analysis of their functions in the fictional universe of the text. It is demonstrated that the chronotope of the East has a complex structure encompassing simpler chronotopes. Special attention is given to the novel’s primary spatial loci: the pit, the home (permanent or temporary), and the city, which collectively form the everyday chronotope. Additionally, it is argued that, beyond the everyday chronotope, the novel presents a chronotope of historical space associated with the Samanid Empire era, a chronotope of legendary space linked to the distant past of Central Asian peoples, and a chronotope of mythopoetic space, all of which are layered over the everyday chronotope and imbue the text with historical-philosophical depth. Ultimately, the authors conclude that key features of the Eastern chronotope in the novel include its depiction as a unique spiritual state characterizing the space of memory, its close connection with symbolic elements found in the mythology, folklore, and literature of Central Asian peoples, and its representation of temporal progression as cyclical movement implying a constant return to the origin.

Slavic languages. Baltic languages. Albanian languages

arXiv Open Access 2026

From Separate Compilation to Sound Language Composition

Federico Bruzzone, Walter Cazzola, Luca Favalli

The development of programming languages involves complex theoretical and practical challenges, particularly when addressing modularity and reusability through language extensions. While language workbenches aim to enable modular development under the constraints of the language extension problem, one critical constraint -- separate compilation -- is often relaxed due to its complexity. However, this relaxation undermines artifact reusability and integration with common dependency systems. A key difficulty under separate compilation arises from managing attribute grammars, as extensions may introduce new attributes that invalidate previously generated abstract syntax tree structures. Existing approaches, such as the use of dynamic maps in the Neverlang workbench, favor flexibility at the cost of compile-time correctness, leading to potential runtime errors due to undefined attributes. This work addresses this issue by introducing nlgcheck, a theoretically sound static analysis tool based on data-flow analysis for the Neverlang language workbench. nlgcheck detects potential runtime errors -- such as undefined attribute accesses -- at compile time, preserving separate compilation while maintaining strong static correctness guarantees. Experimental evaluation using mutation testing on Neverlang-based projects demonstrates that nlgcheck effectively enhances robustness without sacrificing modularity or flexibility and with a level of performance that does not impede its adoption in daily development activities.

en cs.PL, cs.SE

arXiv Open Access 2025

VisCoder2: Building Multi-Language Visualization Coding Agents

Yuansheng Ni, Songcheng Cai, Xiangchao Chen et al.

Large language models (LLMs) have recently enabled coding agents capable of generating, executing, and revising visualization code. However, existing models often fail in practical workflows due to limited language coverage, unreliable execution, and lack of iterative correction mechanisms. Progress has been constrained by narrow datasets and benchmarks that emphasize single-round generation and single-language tasks. To address these challenges, we introduce three complementary resources for advancing visualization coding agents. VisCode-Multi-679K is a large-scale, supervised dataset containing 679K validated and executable visualization samples with multi-turn correction dialogues across 12 programming languages. VisPlotBench is a benchmark for systematic evaluation, featuring executable tasks, rendered outputs, and protocols for both initial generation and multi-round self-debug. Finally, we present VisCoder2, a family of multi-language visualization models trained on VisCode-Multi-679K. Experiments show that VisCoder2 significantly outperforms strong open-source baselines and approaches the performance of proprietary models like GPT-4.1, with further gains from iterative self-debug, reaching 82.4% overall execution pass rate at the 32B scale, particularly in symbolic or compiler-dependent languages.

en cs.SE, cs.AI

arXiv Open Access 2025

Can a domain-specific language improve program structure comprehension of data pipelines? A mixed-methods study

Philip Heltweg, Georg-Daniel Schwarz, Dirk Riehle

In many application domains, domain-specific languages can allow domain experts to contribute to collaborative projects more correctly and efficiently. To do so, they must be able to understand program structure from reading existing source code. With high-quality data becoming an increasingly important resource, the creation of data pipelines is an important application domain for domain-specific languages. We execute a mixed-method study consisting of a controlled experiment and a follow-up descriptive survey among the participants to understand the effects of a domain-specific language on bottom-up program understanding and generate hypotheses for future research. During the experiment, participants need the same time to solve program structure comprehension tasks, but are significantly more correct when using the domain-specific language. In the descriptive survey, participants describe reasons related to the programming language itself, such as a better pipeline overview, more enforced code structure, and a closer alignment to the mental model of a data pipeline. In addition, human factors such as less required programming experience and the ability to reuse experience from other data engineering tools are discussed. Based on these results, domain-specific languages are a promising tool for creating data pipelines that can increase correct understanding of program structure and lower barriers to entry for domain experts. Open questions exist to make more informed implementation decisions for domain-specific languages for data pipelines in the future.

en cs.PL

DOAJ Open Access 2024

Language form compression in modern Russian speech

Elena M. Markova

Compressive word formation results from the active tendency to save speech effort and, accordingly, to save linguistic means, which makes its research relevant. At the lexical level, the principle of economy is realized in constriction, truncation, semantic condensation, i.e. processes based on reduction, minimization of structures. The aim of the study is to identify and fix the manifestations of this tendency in modern Russian speech, to generalize the ways of linguistic compression as mechanisms of its representation, to establish their universality and specificity in the Russian language in the pan-Slavic context. The research material is Internet resources, dictionaries of neologisms, the Russian national corpus, the speech of young people. The material was analyzed with both general scientific (relevant material collection, observation, analysis, systematization, description, interpretation), and linguistic methods (methods of word-formation, component, contextual, comparative analysis; structural-semantic method; method of modeling the derivational basis). Among extralinguistic and linguistic reasons for active compression word production, the author points out the influence of the English language as the most important factor, acting both as a donor of short lexical units and as a translator of derivation methods and mechanisms. Among the compression phenomena in modern speech, the article considers such phenomena as borrowings, collocations, composites, clipping, derivation with zero affixation, univerbation, semantic condensation. It is revealed that form compression not only strengthens its communicative function, but also, due to the pragmatic possibilities of the compression, increases its emotional-evaluative and relational functions. 
The analysis of the main manifestations of form compression in modern Russian speech allows us to conclude that they are universal, but at the same time have some specific features due to the specific national derivational system.

Slavic languages. Baltic languages. Albanian languages

arXiv Open Access 2024

Comparing large language models and human programmers for generating programming code

Wenpin Hou, Zhicheng Ji

We systematically evaluated the performance of seven large language models in generating programming code using various prompt strategies, programming languages, and task difficulties. GPT-4 substantially outperforms other large language models, including Gemini Ultra and Claude 2. The coding performance of GPT-4 varies considerably with different prompt strategies. In most LeetCode and GeeksforGeeks coding contests evaluated in this study, GPT-4 employing the optimal prompt strategy outperforms 85 percent of human participants. Additionally, GPT-4 demonstrates strong capabilities in translating code between different programming languages and in learning from past errors. The computational efficiency of the code generated by GPT-4 is comparable to that of human programmers. These results suggest that GPT-4 has the potential to serve as a reliable assistant in programming code generation and software development.

en cs.SE, cs.AI

DOAJ Open Access 2023

Reparations Withdrawals Of Metallurgical Equipment in Central and Eastern European Countries at Final Stage of Great Patriotic War (1944—1945)

V. V. Zapariy, V. V. Zapariy, N. N. Melnikov

This article is dedicated to the lesser-known aspects of Soviet policy regarding the compulsory seizure of industrial equipment in territories liberated by the Red Army from the fascist bloc states in 1944—1945. Based on an analysis of decrees by the State Defense Committee, the principles of interaction between representatives of industrial People’s Commissariats and military authorities in liberated territories are revealed, with the aim of dismantling and transporting the most promising industrial assets back to the USSR for the needs of restoring the country’s metallurgical complex. The paper provides examples of property disputes among leading economic entities — the industrial People's Commissariats of the USSR — over the right to dismantle industrial facilities for their own benefit. It also sheds light on the activities of Special Assembly Managements under the People’s Commissariat for Construction of the USSR and their authority in the process of seizing industrial assets in territories freed from German control in Eastern Europe. Using the case studies of non-ferrous and ferrous metallurgy enterprises located in territories of fascist bloc countries (Germany, Hungary) liberated by the Red Army, typical approaches to organizing the compulsory dismantling of their equipment are analyzed. The research conducted vividly demonstrates the significance of reparations seizures of industrial equipment in Eastern Europe for the modernization and recovery of the USSR’s metallurgical complex during 1944—1945.

Slavic languages. Baltic languages. Albanian languages

arXiv Open Access 2023

Finer characterization of bounded languages described by GF(2)-grammars

Vladislav Makarov, Marat Movsin

GF(2)-grammars are a somewhat recently introduced grammar family that has some unusual algebraic properties and is closely connected to unambiguous grammars. In "Bounded languages described by GF(2)-grammars", Makarov proved a necessary condition for subsets of $a_1^* a_2^* \cdots a_k^*$ to be described by some GF(2)-grammar. By extending these methods further, we prove an even stronger upper bound for these languages. Moreover, we establish a lower bound that closely matches the proven upper bound. Also, we prove the exact characterization for the special case of linear GF(2)-grammars. Finally, by using the previous result, we show that the class of languages described by linear GF(2)-grammars is not closed under GF(2)-concatenation.

en cs.FL

arXiv Open Access 2023

GenCodeSearchNet: A Benchmark Test Suite for Evaluating Generalization in Programming Language Understanding

Andor Diera, Abdelhalim Dahou, Lukas Galke et al.

Language models can serve as a valuable tool for software developers to increase productivity. Large generative models can be used for code generation and code completion, while smaller encoder-only models are capable of performing code search tasks using natural language queries. These capabilities are heavily influenced by the quality and diversity of the available training data. Source code datasets used for training usually focus on the most popular languages, and testing is mostly conducted on the same distributions, often overlooking low-resource programming languages. Motivated by the NLP generalization taxonomy proposed by Hupkes et al., we propose a new benchmark dataset called GenCodeSearchNet (GeCS), which builds upon existing natural language code search datasets to systematically evaluate the programming language understanding generalization capabilities of language models. As part of the full dataset, we introduce a new, manually curated subset, StatCodeSearch, that focuses on R, a popular but so far underrepresented programming language that is often used by researchers outside the field of computer science. For evaluation and comparison, we collect several baseline results using fine-tuned BERT-style models and GPT-style large language models in a zero-shot setting.

en cs.CL, cs.PL

DOAJ Open Access 2022

Despicable Metal: the Entry of the Idiom into Literature

Konstantin V. Dushenko

The Russian idiom “despicable metal” in the meaning of “gold/money” descended from the phrases “vile metal” (a calque of the French “vil métal”), “worthless metal,” and “despicable gold” (a calque of the French “or méprisable”). At first, they were used in a moralistic and exemplary context, as a sign of condemnation of the desire for enrichment. The idiom “despicable metal” also has counterparts in French and German (“métal méprisable,” “verächtliche Metall”). It entered Russian literature at the turn of the 1830s and 1840s, and among first-rank authors this expression is invariably given in an ironic and parodic manner, even before Goncharov’s Ordinary Story (1847). Nevertheless, the role of Goncharov’s novel in the perception of an idiom new to the Russian language was exceptionally great. “Despicable metal” is one of the cross-cutting motifs of the novel, arising in the context of a fundamental polemic with pseudo-romantic life concepts.

Literature (General), Slavic languages. Baltic languages. Albanian languages

arXiv Open Access 2022

Type-Directed Synthesis of Visualizations from Natural Language Queries

Qiaochu Chen, Shankara Pailoor, Celeste Barnaby et al.

We propose a new technique based on program synthesis for automatically generating visualizations from natural language queries. Our method parses the natural language query into a refinement type specification using the intents-and-slots paradigm and leverages type-directed synthesis to generate a set of visualization programs that are most likely to meet the user's intent. Our refinement type system captures useful hints present in the natural language query and allows the synthesis algorithm to reject visualizations that violate well-established design guidelines for the input data set. We have implemented our ideas in a tool called Graphy and evaluated it on NLVCorpus, which consists of 3 popular datasets and over 700 real-world natural language queries. Our experiments show that Graphy significantly outperforms state-of-the-art natural-language-based visualization tools, including transformer and rule-based ones.

en cs.PL

arXiv Open Access 2022

Decision trees for regular factorial languages

Mikhail Moshkov

In this paper, we study arbitrary regular factorial languages over a finite alphabet $Σ$. For the set of words $L(n)$ of the length $n$ belonging to a regular factorial language $L$, we investigate the depth of decision trees solving the recognition and the membership problems deterministically and nondeterministically. In the case of recognition problem, for a given word from $L(n)$, we should recognize it using queries each of which, for some $ i\in \{1,\ldots ,n\}$, returns the $i$th letter of the word. In the case of membership problem, for a given word over the alphabet $Σ$ of the length $n$, we should recognize if it belongs to the set $L(n)$ using the same queries. For a given problem and type of trees, instead of the minimum depth $h(n)$ of a decision tree of the considered type solving the problem for $L(n)$, we study the smoothed minimum depth $H(n)=\max\{h(m):m\le n\}$. With the growth of $n$, the smoothed minimum depth of decision trees solving the problem of recognition deterministically is either bounded from above by a constant, or grows as a logarithm, or linearly. For other cases (decision trees solving the problem of recognition nondeterministically, and decision trees solving the membership problem deterministically and nondeterministically), with the growth of $n$, the smoothed minimum depth of decision trees is either bounded from above by a constant or grows linearly. As corollaries of the obtained results, we study joint behavior of smoothed minimum depths of decision trees for the considered four cases and describe five complexity classes of regular factorial languages. We also investigate the class of regular factorial languages over the alphabet $\{0,1\}$ each of which is given by one forbidden word.

en cs.FL, cs.CC

DOAJ Open Access 2021

Vilnius Alma Mater – Cultural and Scientific Link of Polish-Lithuanian History

Małgorzata Misiak

The monograph under discussion is an attempt to present Vilnius Alma Mater as a cultural and scientific link of Polish-Lithuanian history. The texts that make up the volume concern Polish-Lithuanian relations from the 16th century to the present day, viewed in several aspects: historical and cultural, literary, linguistic, and educational. The articles collected in the volume are arranged into five specific themes. These are: the heritage of the Polish–Lithuanian Commonwealth; the Grand Duchy of Lithuania in the works of 19th-century artists; the history of Stefan Batory University (1919–1939); the interpretation of the space of Vilnius and the Grand Duchy of Lithuania from the perspective of the 20th and 21st centuries; and the study of phenomena belonging to the cultural and linguistic borderland.

Slavic languages. Baltic languages. Albanian languages

DOAJ Open Access 2021

“It is Sad that Lackeys Have Power...” (Unknown Letters from the Archive of B.V. Tomashevsky)

O. V. Nikitin

The article is devoted to the epistolary heritage of the legendary twentieth-century philologist Professor Boris Viktorovich Tomashevsky (1890–1957) and his relations with leading scholars in the context of the historical, literary, linguistic, and ideological polemics mainly of the 1930s and 1950s. The paper emphasizes the scholar’s contribution to the development of world science in an era of cultural upheavals. Special attention is paid to the publication of previously unknown letters to B. V. Tomashevsky from domestic and foreign philologists: I. L. Andronikov, G. O. Vinokur, D. S. Likhachev, A. A. Reformatsky, B. Unbegaun, M. Vasmer, and R. O. Jakobson. The published letters describe the situation in science of that period: they show the difficulties in working on the publication of Pushkin’s “Complete Works”, reveal the polemics around controversial issues of the theory and practice of the text, express the correspondents’ attitude to ideological pressure on science, describe the difficulties of wartime, etc. B. V. Tomashevsky’s correspondence is also considered in the context of the role of the Soviet linguistic personality in creating a scientific tradition. The relevance of the study is due to the use of archival materials in the paradigm of the humanities to clarify poorly studied facts of the 1930s and 1950s.

Slavic languages. Baltic languages. Albanian languages

DOAJ Open Access 2021

Modern Professional Language of Robotics (Terminological Family “Robot”)

A. S. Zaitseva

The article is devoted to the study of the terminological family with the base term “robot”. The issues concerning the structure of the terminological family, its basic lexical units, the most productive models of term formation and most frequent grammatical forms of terms are considered. Particular emphasis is placed on the specifics of the terminological family. The characteristic features of terminological elements used in the formation of new terms are defined. It is emphasised that the emergence of terminological combinations is a common way of forming new special words. In the course of study, it was found that the terminological family with the base term “robot” belongs to the class of highly deployed families. A five-stage model of the terminological family is presented. About 200 derived terms have been analysed at the five stages of term formation relating to the present state of language development. The scientific hypothesis that the emergence of terms-specifiers of the original concept is the way of forming a special professional terminological family has been proved. The material for research includes the “English-Russian Explanatory Dictionary” by E. M. Proydakov and L. A. Teplitskiy (Moscow, 2019), monographs, scientific publications on robotics in peer-reviewed journals in 2010—2021.

Slavic languages. Baltic languages. Albanian languages

arXiv Open Access 2021

The Resh Programming Language for Multirobot Orchestration

Martin Carroll, Kedar S. Namjoshi, Itai Segall

This paper describes Resh, a new, statically typed, interpreted programming language and associated runtime for orchestrating multirobot systems. The main features of Resh are: (1) It offloads much of the tedious work of programming such systems away from the programmer and into the language runtime; (2) It is based on a small set of temporal and locational operators; and (3) It is not restricted to specific robot types or tasks. The Resh runtime consists of three engines that collaborate to run a Resh program using the available robots in their current environment. This paper describes both Resh and its runtime and gives examples of its use.

en cs.PL, cs.RO

DOAJ Open Access 2020

Christianisation as Cultural Guilt: The Bulgarian Experience

Sirma Danova

This article contextualises the idea of Christianisation as cultural guilt within the Bulgarian context, particularly at the time of the Bulgarian National Revival. This theory has been most radically depicted in the published works of the outstanding revolutionary and poet of the National Revival, Hristo Botev. The origin of this idea is studied through his cultural dialogue with the texts of Georgi Rakovski (another eminent revolutionary and poet from the same period) to prove that the decadent version of Christianisation was known amongst the Bulgarian elite, most of whom were educated at Russian institutions, where they became familiar with the essays of the German historian Johann Christian von Engel. Under his influence, Christianisation is considered to be a cultural and political invasion on the part of the Byzantine Empire. In this context, the motif of the lost brightness of the Bulgarian Middle Ages emerges from the literary works of the National Revival. The valuation of the medieval symbols of the obliterated Bulgarian antiquity culminated in the 1920s and 1930s. During that period, aspects of national identification were sought from magicians, dualists, and anchorites, which ultimately did not yield the desired result for the official nationalism but rather caused a crisis of symbols.

Ethnology. Social and cultural anthropology, Slavic languages. Baltic languages. Albanian languages

arXiv Open Access 2020

Exploring Software Naturalness through Neural Language Models

Luca Buratti, Saurabh Pujar, Mihaela Bornea et al.

The Software Naturalness hypothesis argues that programming languages can be understood through the same techniques used in natural language processing. We explore this hypothesis through the use of a pre-trained transformer-based language model to perform code analysis tasks. Present approaches to code analysis depend heavily on features derived from the Abstract Syntax Tree (AST), while our transformer-based language models work on raw source code. This work is the first to investigate whether such language models can discover AST features automatically. To achieve this, we introduce a sequence labeling task that directly probes the language model's understanding of the AST. Our results show that transformer-based language models achieve high accuracy in the AST tagging task. Furthermore, we evaluate our model on a software vulnerability identification task. Importantly, we show that our approach obtains vulnerability identification results comparable to graph-based approaches that rely heavily on compilers for feature extraction.

en cs.CL, cs.LG

DOAJ Open Access 2019

Manifestations and Specifics of the “Friend – Foe” Binary Oppositions during the Revolution of Dignity (Based on the First Book of the Collection “Maidan. Direct Speech”)

Oxana Kovalyova

The “Friend or Foe” concept is an integral component of the formation of every society. This binary archetype is the object of study of several scientific disciplines and sometimes appears at their interface, which makes it possible to use different methodologies to study its manifestations. This article examines this concept during the Revolution of Dignity in Ukraine (November 2013 – February 2014): its manifestations in the dichotomy of binaries, the erasing of their usual limits, the uniting of former opponents under the banner of dignity, the creation of new “Friend or Foe” binaries, etc. The material for the study is the oral histories of participants in and witnesses of the Revolution of Dignity, gathered in the first book of “Maidan. Direct Speech”. The respondents represent different ages, professions, religions, social strata, etc., thus giving a retrospective picture for research. There are clear “Friend or Foe” oppositions, represented by the negative opposition of Maidan members to the police force, the government, titushkas, and the pro-government press, of Maidan to Anti-Maidan, etc. The oppositions inside the Maidan (for example, between different hundreds or locations) have different connotations: “Friend or Stranger” with a neutral or positive context. Special attention is given to the markers that made it possible to recognize “friends” among the Maidan members. The article text is supplemented by quotes from the aforementioned collection of interviews. The style of speech of the 166 respondents is preserved. This research may be useful for specialists in different humanities disciplines.

Slavic languages. Baltic languages. Albanian languages

arXiv Open Access 2019

Attention-based method for categorizing different types of online harassment language

Christos Karatsalos, Yannis Panagiotakis

In the era of social media and networking platforms, Twitter has been plagued by abuse and harassment toward users, specifically women. Monitoring content that includes sexism and sexual harassment is easier in traditional media than on online social media platforms like Twitter, because of the large amount of user-generated content in these media. The automated detection of content containing sexual or racist harassment is therefore an important research issue and could be the basis for removing that content or flagging it for human evaluation. Previous studies have focused on collecting data about sexism and racism in very broad terms. However, few studies focus on different types of online harassment using natural language processing techniques. In this work, we present a multi-attention-based approach for the detection of different types of harassment in tweets. Our approach is based on Recurrent Neural Networks, and in particular we use a deep, classification-specific multi-attention mechanism. Moreover, we tackle the problem of imbalanced data using a back-translation method. Finally, we present a comparison between different approaches based on Recurrent Neural Networks.

en cs.CL, cs.LG
