Aleksandar Vučo’s text “Ljuskari na prsima” (“Crustaceans on the Chest”) was published in the surrealist almanac “Nemoguće” (“The Impossible,” 1930) accompanied by thirteen unsigned black-and-white drawings, whose authorship was attributed to Marko Ristić by Miodrag B. Protić. This article discusses the differing contemporary interpretations of Vučo’s work in the studies of Milanka Todić, Radovan Vučković, and Božidar Zečević. Scholars who have written about this surrealist screenplay emphasize the “absence of a logical narrative flow,” “alogical structure and discontinuity,” and the “imitation of a dream.” The study considers the nature of the dream in the screenplay, the moment at which the dream begins and ends, and the interrelationship between the dream in Vučo’s text and Ristić’s drawing-collages. Vučo’s reflections on dreams from the questionnaire “The Jaws of Dialectics” published in the “Nemoguće” almanac, as well as oneiric fragments from his novel “The Root of Vision” (1928), are cited to emphasize the importance of dreams in the author’s earlier works. The analysis closely follows the images, characters, and motifs of “Ljuskari na prsima” and their transformations, seeking interpretive grounding in psychoanalysis, the poetics of surrealist dream imitation, and the use of defamiliarization (“ostranenie”) and the grotesque.
Original title in Serbian: "Поетика ониризма надреалистичког 'филма' "Љускари на прсима" (1930) Александра Вуча."
Slavic languages. Baltic languages. Albanian languages
This study examines the lexicon of the Middle Volga region through the lens of inter-dialectal correspondences. It draws upon materials from regional atlases and dictionaries, as well as recordings and observations made during dialectological expeditions. The relevance of this research lies in its ability to diagnose, on the basis of linguistic geography data, a range of specific lexical-semantic characteristics of secondary dialects with northern roots in the Middle Volga area. Phonetic evidence indicates that the northern roots of the Middle Volga dialects connect them not directly to Northern Russian dialects but rather to dialects of the Vladimir-Volga type, known for their transitional nature. The paper presents evidence of parallelism in phonetic and lexical features between the studied dialects and those of the Vladimir-Volga group. It is noted that, at the lexical level, secondary dialects often exhibit loss and semantic transformation of Northern Russian vocabulary. Additionally, for certain lexemes, stable areas with preserved dialectal affiliations are observed among the dialects with okanye that have developed under new living conditions. The conclusion drawn is that, on the one hand, genetic ties in the lexicon of dialects with Northern Russian roots have weakened; on the other hand, a number of common lexical-semantic features have emerged due to the integration of migrant dialects.
Slavic languages. Baltic languages. Albanian languages
This article examines the influence of Mikhail Sholokhov’s short story “The Fate of a Man” on the development of Chinese military fiction. The study identifies three distinct phases in the story’s reception within Chinese literary scholarship: an initial period of ideological appropriation, a subsequent phase of diminished interest, and a more recent stage of nuanced critical analysis. The author argues that Sholokhov’s work profoundly influenced the formation of a new “little man” archetype in Chinese war prose, fostering a re-evaluation of the theme of suffering and a departure from heroic pathos toward a more realistic and psychologically nuanced portrayal of warfare and its aftermath. Through a comparative typological analysis, the article establishes both genetic and typological connections, highlighting the universality of the theme of human resilience as well as its culturally specific interpretations. The analysis demonstrates that the works “The Last Soldier” by Shi Zhongshan, “Hymn to a Hero” by Liu Zhen, and “My Korean War” by Zhang Zeshi exhibit a genetic kinship with Sholokhov’s poetics, particularly in their anti-heroic characterization, use of circular narrative structure, and the aesthetics of war trauma. Furthermore, the study reveals a distinct political cyclicity in the patterns of translation and scholarly engagement, characterized by surges of interest during periods of diplomatic rapprochement (such as the late 1950s, mid-1980s, and early 2000s) followed by declines during times of bilateral tension. The author concludes that this specific pattern of reception vividly illustrates the symbiotic relationship between literary communication and geopolitics within the framework of comparative literature.
Slavic languages. Baltic languages. Albanian languages
This article examines the role of two historical regions, Eastern Galicia and Subcarpathian Ruthenia, in French counter-Soviet strategies from early 1920 to late 1923. The novelty lies in considering these territories not only within the framework of bilateral relations between France and the states in which they were located (Poland for Eastern Galicia and Czechoslovakia for Subcarpathian Ruthenia), but also within the broader context of French conceptions of security architecture, including their relationship with Moscow. The relevance of this study is underscored by the heightened military-political tensions in contemporary Eastern Europe. Based on published French diplomatic and military documents as well as archival materials from both French and Russian repositories, the author concludes that Paris attached significant importance to Eastern Galicia and Subcarpathian Ruthenia as two echelons of an anti-Soviet frontline. It is demonstrated that French authorities recognized the vulnerability of these areas and their potential transformation into a strategic corridor for advances by the Workers' and Peasants' Red Army towards Western Europe. However, it is emphasized that French calculations often diverged from local realities, thereby weakening the so-called “sanitary cordon.”
Slavic languages. Baltic languages. Albanian languages
Compilers play a central role in translating high-level code into executable programs, making their correctness essential for ensuring code safety and reliability. While extensive research has focused on verifying the correctness of compilers for single-language compilation, the correctness of cross-language compilation, which involves the interaction between two languages and their respective compilers, remains largely unexplored. To fill this research gap, we propose CrossLangFuzzer, a novel framework that introduces a universal intermediate representation (IR) for JVM-based languages and automatically generates cross-language test programs with diverse type parameters and complex inheritance structures. After generating the initial IR, CrossLangFuzzer applies three mutation techniques (LangShuffler, FunctionRemoval, and TypeChanger) to enhance program diversity. By evaluating both the original and mutated programs across multiple compiler versions, CrossLangFuzzer successfully uncovered 10 confirmed bugs in the Kotlin compiler, 4 confirmed bugs in the Groovy compiler, 7 confirmed bugs in the Scala 3 compiler, 2 confirmed bugs in the Scala 2 compiler, and 1 confirmed bug in the Java compiler. Among all mutators, TypeChanger is the most effective, detecting 11 of the 24 compiler bugs. Furthermore, we analyze the symptoms and root causes of cross-compilation bugs, examining the respective responsibilities of language compilers when incorrect behavior occurs during cross-language compilation. To the best of our knowledge, this is the first work specifically focused on identifying and diagnosing compiler bugs in cross-language compilation scenarios. Our research helps to understand these challenges and contributes to improving compiler correctness in multi-language environments.
Large Language Models (LLMs) have emerged as a promising alternative to traditional static program analysis methods, such as symbolic execution, offering the ability to reason over code directly without relying on theorem provers or SMT solvers. However, LLMs are also inherently approximate by nature, and therefore face significant challenges in relation to the accuracy and scale of analysis in real-world applications. Such issues often necessitate the use of larger LLMs with higher token limits, but this requires enterprise-grade hardware (GPUs) and thus limits accessibility for many users. In this paper, we propose LLM-based symbolic execution -- a novel approach that enhances LLM inference via a path-based decomposition of the program analysis tasks into smaller (more tractable) subtasks. The core idea is to generalize path constraints using a generic code-based representation that the LLM can directly reason over, and without translation into another (less-expressive) formal language. We implement our approach in the form of AutoBug, an LLM-based symbolic execution engine that is lightweight and language-agnostic, making it a practical tool for analyzing code that is challenging for traditional approaches. We show that AutoBug can improve both the accuracy and scale of LLM-based program analysis, especially for smaller LLMs that can run on consumer-grade hardware.
The article concerns the deconstruction of Russian political discourse in contemporary audiovisual parodies. Its main aim is to identify and analyze the techniques that parodists use to deconstruct selected linguistic devices employed in ritual utterances drawn from Russian political discourse. The material consists of Russian-language audiovisual parodies published on YouTube. The study reveals the intertextual links between the original statements of Russian politicians and the parodic texts, identifies the features of political discourse that are subjected to parody, analyzes the modification techniques, and determines the functions of these transformations. The article illustrates how parodies reflect the ideas hidden beneath the verbal layer of the statements, thereby participating in the construction of political critique.
Slavic languages. Baltic languages. Albanian languages, History (General) and history of Europe
The Constitution of the Republic of Lithuania has established the highest official status of a language, i.e. a state language. Compliance with this status requires an appropriate legal framework and institutions that develop and enforce language policy and legislation.
The article reviews the development of the status of the Lithuanian language from 1918 to the present day, and presents the institutions that supervise the use and correctness of the language.
The article describes the legal regulation of the official language in Lithuania. It reviews the legal acts that establish the requirements for the correctness and use of the language, and presents a hierarchical system of the related legal acts. The institutions responsible for language policy and protection are described.
Slavic languages. Baltic languages. Albanian languages
The paper discusses a constraint on the distribution of homorganic CVC sequences known as Similar Place Avoidance (SPA). Though proposed as a statistical universal, it has been little considered in Slavic and other Indo-European languages. We evaluate the CVC distribution in 100 recorded and reconstructed varieties, of which 18 are Slavic, 44 are non-Slavic Indo-European, and 38 are non-Indo-European. The SPA principle has been formulated as pertaining to CVC sequences of two consonants sharing the same place, but it has also been suggested that coronals are dependent on sonorancy agreement for the constraint to take effect. This dependency is indeed observable but concerns dento-alveolars only, not coronals as a whole class. SPA weakly restricts combinations of dento-alveolar sonorants with palatal sonorants. Combinations of different-place coronal obstruents are disfavored, but this is instead due to sibilancy avoidance (a restriction of the co-occurrence of two sibilants in a CVC sequence, previously unreported). Finally, combinations of palatals (including post-alveolars) are less often subject to an SPA effect, and the Slavic languages virtually lack this kind of restriction.
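The abstract does not say which statistic is used to measure avoidance. One measure often used for co-occurrence restrictions of this kind is the observed/expected (O/E) ratio, where values well below 1 suggest that a pair of places is disfavored. The following is a minimal sketch under that assumption; the `PLACE` mapping, the function name `oe_ratios`, and the toy word list are hypothetical and are not data from the paper.

```python
from collections import Counter

# Hypothetical, simplified place-of-articulation classes (illustration only).
PLACE = {
    "p": "labial", "b": "labial", "m": "labial",
    "t": "coronal", "d": "coronal", "n": "coronal",
    "k": "dorsal", "g": "dorsal",
}

def oe_ratios(cvc_forms):
    """Return observed/expected ratios for each (C1 place, C2 place) pair.

    Expected counts assume that C1 and C2 places combine independently.
    """
    pairs = [(PLACE[form[0]], PLACE[form[2]]) for form in cvc_forms]
    n = len(pairs)
    observed = Counter(pairs)
    c1_totals = Counter(c1 for c1, _ in pairs)
    c2_totals = Counter(c2 for _, c2 in pairs)
    return {
        (a, b): observed[(a, b)] * n / (c1_totals[a] * c2_totals[b])
        for a in c1_totals
        for b in c2_totals
    }

# Toy CVC forms invented for this sketch.
toy_forms = ["pat", "pak", "tap", "tak", "kap", "kat", "pab"]
ratios = oe_ratios(toy_forms)
```

In this toy list the coronal-coronal pair never occurs, so its ratio is 0, while the single labial-labial form gives a ratio below 1.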
This article examines the morphological categories of the noun in two typologically different languages, Russian and Azerbaijani: Russian is an inflectional language, while Azerbaijani is agglutinative. A comparative study of the grammatical categories of the noun in these two languages helps identify their similarities and divergences, which is valuable for compiling manuals, teaching specialized comparative courses, translation, and language teaching.
The study is devoted to the prerequisites and results of the Interregional Marathon of Vepsian and Karelian speech recordings “Listening to my native dialect” (January to October 2023). An important component of the policy of preserving the autochthonous languages of the Republic of Karelia is the enrichment, with new audio samples, of the Baltic-Finnish Speech Corpus developed on the basis of the “Open corpus of Veps and Karelian languages” (VepKar). The speech corpus was developed by the staff of the Institute of Language, Literature and History and the Institute of Applied Mathematical Research. The corpus includes a collection of spoken texts in different dialects of the Karelian and Vepsian languages, provided with transcription, markup, and translation into Russian. The corpus also contains the search filters needed for research (search by language/dialect, place and year of recording, informant and collector, source). Researchers use three main sources to fill the corpus with audio recordings of Karelian and Vepsian speech: audio collections of the Phonogram Archive of the ILLH KRC RAS, audio recordings of broadcasts in the Karelian language, and the authors' own field materials recorded during expeditions. The “Listening to my native dialect” marathon was held in order to replenish the corpus map with new audio fragments and to increase public interest in the dialects of the Karelian and Vepsian languages. Anyone interested could take part, either by registering themselves or by acting as collectors. The purpose of this study is to analyze the results of the marathon, first, as a tool for popularizing and raising the status of the indigenous languages of Karelia and, second, as a tool for documenting these languages, which yielded a collection of recordings of Vepsian and Karelian speech gathered in a linguistic corpus.
We present CLASSLA-Stanza, a pipeline for automatic linguistic annotation of the South Slavic languages, which is based on the Stanza natural language processing pipeline. We describe the main improvements in CLASSLA-Stanza with respect to Stanza, and give a detailed description of the model training process for the latest 2.1 release of the pipeline. We also report performance scores produced by the pipeline for different languages and varieties. CLASSLA-Stanza exhibits consistently high performance across all the supported languages and outperforms or expands its parent pipeline Stanza at all the supported tasks. We also present the pipeline's new functionality enabling efficient processing of web data and the reasons that led to its implementation.
Michal Štefánik, Marek Kadlčík, Piotr Gramacki
et al.
Despite the rapid recent progress in creating accurate and compact in-context learners, most recent work focuses on in-context learning (ICL) for tasks in English. However, the ability to interact with users of languages outside English presents a great potential for broadening the applicability of language technologies to non-English speakers. In this work, we collect the infrastructure necessary for training and evaluation of ICL in a selection of Slavic languages: Czech, Polish, and Russian. We link a diverse set of datasets and cast these into a unified instructional format through a set of transformations and newly-crafted templates written purely in target languages. Using the newly-curated dataset, we evaluate a set of the most recent in-context learners and compare their results to the supervised baselines. Finally, we train, evaluate and publish a set of in-context learning models that we train on the collected resources and compare their performance to previous work. We find that ICL models tuned in English are also able to learn some tasks from non-English contexts, but multilingual instruction fine-tuning consistently improves the ICL ability. We also find that the massive multitask training can be outperformed by single-task training in the target language, uncovering the potential for specializing in-context learners to the language(s) of their application.
This paper describes Adam Mickiewicz University’s (AMU) solution for the 4th Shared Task on SlavNER. The task involves the identification, categorization, and lemmatization of named entities in Slavic languages. Our approach involved exploring the use of foundation models for these tasks. In particular, we used models based on the popular BERT and T5 model architectures. Additionally, we used external datasets to further improve the quality of our models. Our solution obtained promising results, achieving high metric scores in both tasks. We describe our approach and the results of our experiments in detail, showing that the method is effective for NER and lemmatization in Slavic languages. Additionally, our models for lemmatization will be available at: https://huggingface.co/amu-cai.
This paper describes Slav-NER: the 4th Multilingual Named Entity Challenge in Slavic languages. The tasks involve recognizing mentions of named entities in Web documents, normalization of the names, and cross-lingual linking. This version of the Challenge covers three languages and five entity types. It is organized as part of the 9th Slavic Natural Language Processing Workshop, co-located with the EACL 2023 Conference. Seven teams registered and three participated actively in the competition. Performance for the named entity recognition and normalization tasks reached 90% F1 measure, much higher than reported in the first edition of the Challenge, but similar to the results reported in the latest edition. Performance for the entity linking task for individual languages reached the range of 72-80% F1 measure. Detailed evaluation information is available on the Shared Task web page.
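As a reminder of what the F1 figures above summarize, the following minimal sketch computes F1 as the harmonic mean of precision and recall from raw match counts. The function name `f1_score` is hypothetical and the counts are invented for illustration; they are not results from the Challenge, whose official evaluation may match entities under different criteria.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall, computed from raw counts.

    tp: correctly predicted entities, fp: spurious predictions,
    fn: gold entities that were missed.
    """
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Illustrative counts only (not taken from the shared task).
balanced = f1_score(tp=90, fp=10, fn=10)
skewed = f1_score(tp=50, fp=0, fn=50)
```

The second call shows why F1 is reported instead of precision alone: perfect precision with half of the entities missed still yields a score well below 1.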
Nikola Ivačič, Thi-Hanh Tran, Boshko Koloski
et al.
This paper analyzes a Named Entity Recognition task for South-Slavic languages using pre-trained multilingual neural network models. We investigate whether the performance of the models for a target language can be improved by using data from closely related languages. We show that model performance is not substantially influenced when the models are trained with languages other than the target language. While for Slovene the monolingual setting generally performs better, for Croatian and Serbian the results are slightly better in selected cross-lingual settings, but the improvements are not large. The most significant performance improvement is shown for Serbian, which has the smallest corpora. Therefore, fine-tuning with other closely related languages may benefit only the “low resource” languages.
Ethnolects have been defined as varieties linked to particular ethnic minorities by the minorities themselves or by other ethnic communities. The present paper investigates this association between ethnic groups and language varieties in the Greek context. I seek to answer whether there is an association made (by Albanians or Greeks) between Albanian migrants in Greece and a particular variety that is not their L1, i.e., Albanian, and if so, whether this is an Albanian ethnolect of Greek. I show experimentally that, in fact, there is a variety of Greek that is linked with listeners’ perceptions of Albanian migrants. However, that criterion is not enough in itself to designate the variety as an ethnolect as the acquisition of this variety by the second or subsequent generations of migrants is not evidenced. Rather, those generations are undergoing language shift from Albanian to Greek. Therefore, the classification of Albanian Greek as an Albanian ethnolect of Greek is not possible despite the association between the variety and the particular minority in Greece. Classification as an L2 Greek variety or a Mock Albanian Greek (MAG) variety is instead argued.
Multiple types can represent the same concept. For example, lists and trees can both represent sets. Unfortunately, this easily leads to incomplete libraries: some set-operations may only be available on lists, others only on trees. Similarly, subtypes and quotients are commonly used to construct new type abstractions in formal verification. In such cases, one often wishes to reuse operations on the representation type for the new type abstraction, but to no avail: the types are not the same. To address these problems, we present a new framework that transports programs via equivalences. Existing transport frameworks are either designed for dependently typed, constructive proof assistants, use univalence, or are restricted to partial quotient types. Our framework (1) is designed for simple type theory, (2) generalises previous approaches working on partial quotient types, and (3) is based on standard mathematical concepts, particularly Galois connections and equivalences. We introduce the notion of partial Galois connections and equivalences and prove their closure properties under (dependent) function relators, (co)datatypes, and compositions. We formalised the framework in Isabelle/HOL and provide a prototype. This is the extended version of "Transport via Partial Galois Connections and Equivalences", 21st Asian Symposium on Programming Languages and Systems, 2023.
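The motivating problem (an operation that exists on lists but is wanted on sets) can be illustrated informally in Python. The sketch below moves a list operation to `frozenset` through a pair of conversion functions; all names (`to_abs`, `to_rep`, `transport`, `list_insert`) are hypothetical. It is only an untyped analogy: it does not model partial Galois connections, and nothing in it is verified, whereas the framework described above is formalised in Isabelle/HOL.

```python
from typing import Callable, FrozenSet, List

def to_abs(xs: List[int]) -> FrozenSet[int]:
    """Abstraction: view a list as the set of its elements."""
    return frozenset(xs)

def to_rep(s: FrozenSet[int]) -> List[int]:
    """Representation: choose the sorted, duplicate-free list for a set."""
    return sorted(s)

def list_insert(xs: List[int], x: int) -> List[int]:
    """An operation assumed to be available only on the list representation."""
    return xs if x in xs else sorted(xs + [x])

def transport(
    op: Callable[[List[int], int], List[int]]
) -> Callable[[FrozenSet[int], int], FrozenSet[int]]:
    """Lift a list operation to sets by converting in and out."""
    def lifted(s: FrozenSet[int], x: int) -> FrozenSet[int]:
        return to_abs(op(to_rep(s), x))
    return lifted

# The set-level operation is obtained without reimplementing it.
set_insert = transport(list_insert)
result = set_insert(frozenset({3, 1}), 2)
```

The round trip `to_rep(to_abs(xs))` returns `xs` only for sorted, duplicate-free lists, which loosely mirrors why the correspondence between the two types is partial.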