Some Remarks on Marginal Code Languages
Stavros Konstantinidis
A prefix code L satisfies the condition that no word of L is a proper prefix of another word of L. Recently, Ko, Han and Salomaa relaxed this condition by allowing a word of L to be a proper prefix of at most k words of L, for some 'margin' k, thus introducing the class of k-prefix-free languages, as well as the similar classes of k-suffix-free and k-infix-free languages. Here we unify the definitions of these three classes of languages into one uniform definition in two ways: via the method of partial orders and via the method of transducers. Thus, for any known class of code-related languages definable via the transducer method, one gets a marginal version of that class. Building on the techniques of Ko, Han and Salomaa, we discuss the uniform satisfaction and maximality problems for marginal classes of languages.
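The k-prefix-free condition described above can be checked directly for a finite language. The following is only a brute-force sketch of the definition, not the partial-order or transducer method of the paper; the function names `max_prefix_margin` and `is_k_prefix_free` are hypothetical.

```python
from typing import Iterable


def max_prefix_margin(language: Iterable[str]) -> int:
    """Largest number of words of the language that a single word
    of the language is a proper prefix of (0 for a prefix code)."""
    words = set(language)
    return max(
        (sum(1 for w in words if w != u and w.startswith(u)) for u in words),
        default=0,
    )


def is_k_prefix_free(language: Iterable[str], k: int) -> bool:
    """True if every word is a proper prefix of at most k words."""
    return max_prefix_margin(language) <= k


# "a" is a proper prefix of both "ab" and "abc", so the margin is 2.
sample = {"a", "ab", "abc", "b"}
print(max_prefix_margin(sample))
print(is_k_prefix_free(sample, 1), is_k_prefix_free(sample, 2))
```

With k = 0 the check reduces to the ordinary prefix-code condition.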
Ukrainian Indefinite Pronouns and Language Typology
T. Nesset, Yuliia Palii
The present article offers an empirical analysis of Ukrainian indefinite pronouns and adverbs based on data from the GRAC corpus. The proposed analysis has ramifications for Ukrainian linguistics, Slavic linguistics, and language typology. With regard to Ukrainian linguistics, we identify substantial frequency differences and suggest distinguishing between a “core” system including the indefiniteness markers de-, -s’ and bud’-, and three “peripheral” markers, viz. -nebud’, aby-, and kazna. From the perspective of Slavic linguistics, the proposed analysis facilitates comparison with other Slavic languages, such as Polish and Russian. Our analysis pinpoints a number of similarities across Ukrainian, Polish and Russian, but also demonstrates that Ukrainian has a distinct system that merits investigation in its own right. For language typology, the analysis we propose shows how frequency information can be integrated in semantic maps, which arguably makes semantic maps a more powerful tool for cross-linguistic comparison.
Compositional Incrementality Based on Polish Reveal-Type Verbs and Verbal Nouns
Karolina Zuchewicz
This article focuses on the realization of incrementality in Polish verbal and nominal constructions. The object of investigation is clause-embedding reveal-type concepts like ‘prove’, ‘reveal’, or ‘show’. In Slavic languages, incremental relations have traditionally been examined in direct relation to (im)perfectivity, with imperfective verbs enforcing partial affectedness of events and objects, and perfective verbs enforcing their total affectedness. In the present paper, I take a closer look at the incremental output within the reveal-type concept. I investigate whether an incremental event comes with a fixed incremental path that remains intact independently of any morphological or syntactic modifications. My research question is: Is an incremental feature specified in the lexicon as is the aspectual value ‘(im)perfective’, or does it rather arise compositionally? To answer this question, I analyze the impact of the dative argument and the nominalization on the incremental output of clause-embedding reveal-type predicates. I demonstrate that incremental meanings are affected by the properties of an entire construction. Based on that, I propose to distinguish between two types of incrementality: the non-modifiable (im)perfectivity-dependent partial and total integration requirement, and the compositional incrementality that arises as an interplay between lexical semantics, argument structure, and the morphological shape of the respective lexeme.
The Electrification of the Body and of Art in the USSR of the 1920s (L’électrification du corps et de l’art dans l’URSS des années 1920)
Aleksandra Selivanova
The article examines the representations of electricity in both poetic and prose texts, as well as in artistic and scientific experiments in the USSR in the 1920s. For various artists and writers, electrification became a metaphor for the essential transformation of space, time, and the human body, which now interacted with electrified mechanisms. Solomon Nikritin’s and Sergey Luchishkin’s “Projectionist Theater” at the Central Institute of Labor in Moscow; Kliment Redko’s art group “Electroorganism”; the poetry of Alexei Gastev and Velimir Khlebnikov, as well as Andrei Platonov’s prose, all outlined a new reality and new relationships between humans and machines, echoed both in mass culture and in the image of robots in films and plays. The incorporation of electricity into simulators, as well as into the psychotechnical and diagnostic equipment at CIT; Alexei Gastev’s radical manifestos and instructions; the experiments conducted in various Soviet scientific institutes in the 1920s in order to cyclograph the movements of actors, dancers, and musicians, all demonstrated the reality of effective interactions (and even union) between the body and the electrified machine.
Slavic languages. Baltic languages. Albanian languages
Theory of Communication in Gorgias and Plato’s Cratylus
Mantas Adomėnas
This essay attempts to establish a link between the third part of the Περὶ τοῦ μὴ ὄντος that contains Gorgias’ critique of communication and Plato’s theory of communication (as developed in the Cratylus). After analysing the text of, and attempting to reconstruct the original structure of Gorgias’ argument in, Part 3 of the Περὶ τοῦ μὴ ὄντος, the author contends that Plato, in spite of never addressing Gorgias’ work directly, formulates his theory of communication in the Cratylus within the conceptual schemes present in Gorgias’ treatise. Plato implicitly criticises Gorgias’ refutation of the possibility of communication, while retaining, nevertheless, certain features of Gorgias’ argument.
Literature (General), Slavic languages. Baltic languages. Albanian languages
Dynamic Membership for Regular Tree Languages
Antoine Amarilli, Corentin Barloy, Louis Jachiet, et al.
We study the dynamic membership problem for regular tree languages under relabeling updates: we fix an alphabet $\Sigma$ and a regular tree language $L$ over $\Sigma$ (expressed, e.g., as a tree automaton), we are given a tree $T$ with labels in $\Sigma$, and we must maintain the information of whether the tree $T$ belongs to $L$ while handling relabeling updates that change the labels of individual nodes in $T$. Our first contribution is to show that this problem admits an $O(\log n / \log \log n)$ algorithm for any fixed regular tree language, improving over known $O(\log n)$ algorithms. This generalizes the known $O(\log n / \log \log n)$ upper bound over words, and it matches the lower bound of $\Omega(\log n / \log \log n)$ from dynamic membership to some word languages and from the existential marked ancestor problem. Our second contribution is to introduce a class of regular languages, dubbed almost-commutative tree languages, and show that dynamic membership to such languages under relabeling updates can be decided in constant time per update. Almost-commutative languages generalize both commutative languages and finite languages: they are the analogue for trees of the ZG languages enjoying constant-time dynamic membership over words. Our main technical contribution is to show that this class is conditionally optimal when we assume that the alphabet features a neutral letter, i.e., a letter that has no effect on membership to the language. More precisely, we show that any regular tree language with a neutral letter which is not almost-commutative cannot be maintained in constant time under the assumption that the prefix-U1 problem from (Amarilli, Jachiet, Paperman, ICALP'21) also does not admit a constant-time algorithm.
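To make the problem statement concrete, here is a sketch of dynamic membership in the simpler setting of words rather than trees. It uses a segment tree of DFA transition functions, a standard way to get logarithmic time per relabeling update. It is not the paper's $O(\log n / \log \log n)$ algorithm, and the class name `DynamicWord` and the example automaton are assumptions made for illustration.

```python
class DynamicWord:
    """Maintains whether a word is accepted by a fixed DFA under
    single-letter relabeling updates, in O(log n) per update."""

    def __init__(self, word, delta, start, accepting):
        # delta[state][letter] -> next state; states are 0..q-1.
        self.delta, self.start, self.accepting = delta, start, accepting
        self.q = len(delta)
        self.size = 1
        while self.size < len(word):
            self.size *= 2
        identity = tuple(range(self.q))  # padding leaves act as identity
        self.tree = [identity] * (2 * self.size)
        for i, letter in enumerate(word):
            self.tree[self.size + i] = self._leaf(letter)
        for i in range(self.size - 1, 0, -1):
            self.tree[i] = self._compose(self.tree[2 * i], self.tree[2 * i + 1])

    def _leaf(self, letter):
        # State map induced by reading one letter.
        return tuple(self.delta[s][letter] for s in range(self.q))

    def _compose(self, f, g):
        # State map of "read the left part (f), then the right part (g)".
        return tuple(g[f[s]] for s in range(self.q))

    def relabel(self, position, letter):
        i = self.size + position
        self.tree[i] = self._leaf(letter)
        i //= 2
        while i >= 1:
            self.tree[i] = self._compose(self.tree[2 * i], self.tree[2 * i + 1])
            i //= 2

    def member(self):
        return self.tree[1][self.start] in self.accepting


# Example DFA for a*b*: state 0 = only a's so far, 1 = some b seen, 2 = dead.
delta = [{"a": 0, "b": 1}, {"a": 2, "b": 1}, {"a": 2, "b": 2}]
w = DynamicWord("aabb", delta, start=0, accepting={0, 1})
print(w.member())   # the word is "aabb"
w.relabel(2, "a")   # the word becomes "aaab"
print(w.member())
```

Each update recomputes only the transition maps on the path from the changed leaf to the root, which is where the logarithmic cost comes from.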
Measure-Theoretic Aspects of Star-Free and Group Languages
Ryoma Sin'ya, Takao Yuyama
A language $L$ is said to be ${\cal C}$-measurable, where ${\cal C}$ is a class of languages, if there is an infinite sequence of languages in ${\cal C}$ that ``converges'' to $L$. We investigate the properties of ${\cal C}$-measurability in the cases where ${\cal C}$ is SF, the class of all star-free languages, and G, the class of all group languages. It is shown that a language $L$ is SF-measurable if and only if $L$ is GD-measurable, where GD is the class of all generalised definite languages (a more restricted subclass of star-free languages). This means that GD and SF have the same ``measuring power'', even though GD is a very restricted proper subclass of SF. Moreover, we give a purely algebraic characterisation of SF-measurable regular languages, which is a natural extension of Schützenberger's theorem stating the correspondence between star-free languages and aperiodic monoids. We also show the probabilistic independence of star-free and group languages, which is an important application of the former result. Finally, while the measuring powers of star-free and generalised definite languages are equal, we show that the situation is quite the opposite for subclasses of group languages: for any two local subvarieties ${\cal C} \subsetneq {\cal D}$ of group languages, we have $\{L \mid L \text{ is } {\cal C}\text{-measurable}\} \subsetneq \{ L \mid L \text{ is } {\cal D}\text{-measurable}\}$.
A Multilingual Python Programming Language
Saad Ahmed Bazaz, Mirza Omer Beg
All widely used and useful programming languages have a common problem. They restrict entry on the basis of knowledge of the English language. The lack of knowledge of English poses a major hurdle to many newcomers who do not have the resources, in terms of time and money, to learn the English language. Studies show that people learn better in their own language. Therefore, we propose a language transpiler built on top of the Python programming language, called UniversalPython, which allows one to write Python in their own human language. We demonstrate the ability to create an "Urdu Python" with this transpiler. In the future, we aim to scale the language to encapsulate more human languages to increase the availability of programming. The source code for this transpiler is open-source, and available at https://github.com/universalpython/universalpython
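The idea of a keyword-translating transpiler can be sketched with Python's standard `tokenize` module. This is not the UniversalPython implementation; the keyword table below is hypothetical and uses Spanish-like words purely for illustration, and the function name `transpile` is an assumption.

```python
import io
import tokenize

# Hypothetical keyword table, for illustration only.
KEYWORDS = {"si": "if", "sino": "else", "para": "for", "en": "in",
            "imprimir": "print"}


def transpile(source: str, table: dict = KEYWORDS) -> str:
    """Replace localized names with their Python equivalents.

    Working on tokens rather than raw text leaves string literals
    and comments untouched.
    """
    out = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        text = tok.string
        if tok.type == tokenize.NAME and text in table:
            text = table[text]
        out.append((tok.type, text))
    # With (type, string) pairs, untokenize regenerates valid source,
    # although the spacing may differ from the input.
    return tokenize.untokenize(out)


localized = (
    "total = 0\n"
    "para i en range(4):\n"
    "    si i > 1:\n"
    "        total = total + i\n"
)
namespace = {}
exec(transpile(localized), namespace)
print(namespace["total"])
```

A real tool would also have to handle identifiers that collide with translated keywords, right-to-left scripts, and localized digits, none of which this sketch attempts.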
Aspectual Restriction on Sorting in Czech and Slovak
M. Dočekal, Michaela Hulmanová, Aviv Schoenfeld
This article is about the cross-linguistic universality of the so-called Universal Sorter, where a noun N means ‘kind of N’. We discuss two restrictions in two Slavic languages which are absent from English, pertaining to perfective verbs and numerically modified count nouns. We establish, first with introspective judgments (for Czech) and then experimentally (for Slovak), that both restrictions are present in a way which supports our analysis of the first restriction as stemming from Slavic, unlike English, having perfective verbs which force a completive reading of an incremental theme.
The category of definiteness-indefiniteness in Bulgarian and Russian
Damyan Mitev
One of the striking features that distinguishes Bulgarian from Russian and other Slavic languages is the presence of morphological definiteness, expressed through the articulation of forms. The corresponding linguistic function in Russian is usually performed in the course of the semantic-syntactic organization of the utterance by units that may belong to different linguistic levels. A similar expression, different from the morphological one, also exists in Bulgarian. In search of a unified approach to comparative research, the author adopts a broader understanding of the category of definiteness-indefiniteness as an invariant unity of series of categorical semantic features that are opposite in content and function, expressed through certain linguistic means and devices (including ones that differ in nature). In practice, the analysis of linguistic features in the compared languages is carried out with regard to the main components of the categorical opposition: a) the semantics of definiteness and, respectively, indefiniteness; b) the main linguistic means for their expression, and c) the non-basic (additional, peripheral) means and ways of expressing the respective categorical meanings. It is important for the comparative analysis to reveal the type and functional-semantic properties of the individual elements, as well as their interaction.
Relative Clauses in Native Lower Sorbian and the Relativizer how
Andreas Pankau
Native Lower Sorbian, an endangered West Slavic minority language spoken in Germany, possesses a relative clause formation strategy employing the invariant relativizer ak and optional resumption. The focus of this paper lies on the status of ak. In other languages that have them, invariant relativizers are drawn from the set of complementizers, wh-words, or demonstratives. ak seems to differ in that respect because it belongs to none of these categories. In this paper, I argue that ak is not an outlier. Instead, ak is a variant of the manner wh-word kak ‘how’ in its non-manner use as a complementizer. After I show how the complementizer kak differs from the wh-adverb kak and that relative clauses in Native Lower Sorbian feature empty operator movement, I argue that the empty operator sitting in SpecCP triggers a rule partially deleting the complementizer kak. More specifically, the rule elides the initial [k] of kak, reducing it to ak. This makes Native Lower Sorbian similar to Bern German or West Frisian, both of which also feature the partial deletion of a complementizer in the presence of a moved element in SpecCP. Furthermore, Native Lower Sorbian is yet another language where how has a non-manner use.
Comparative linguistics in the teaching of Spanish as a foreign language (ELE): applications and didactic strategies
Carlos Melgar García
The acquisition of Spanish as a foreign language (ELE) is shaped by the student’s native language, which can either aid or hinder learning. Comparative Linguistics helps understand language transfer and reduce interference. This study examines its impact on ELE, identifying frequent errors and suggesting contrastive teaching strategies. Recurrent mistakes in verb conjugation, article usage, and pronunciation are analyzed through comparisons with Romance, Germanic, Slavic, and non-Indo-European languages. Adapted teaching materials are proposed. Results emphasize the need to integrate Comparative Linguistics into ELE to optimize Spanish learning, reduce fossilization, and enhance communicative competence. A contrastive approach promotes more effective learning that is tailored to linguistic diversity.
Between Ethnic and Cultural Identity: The Effect of Turkish Religious Literature on the Lifestyle of the Lithuanian Muslim Community
Gintarė Lukoševičiūtė
This paper examines the increasing presence of Turkish religious literature in Kaunas, Lithuania, home to the only brick mosque in the Baltic States and an active Muslim community with a Turkish imam conducting services. In the globalised context, spiritual texts play a key role in shaping identity and communal lifestyle, and Turkish authors’ literature, due to accessibility in local languages, may be relevant for Lithuanian Muslims. By focusing on two Muslim groups, Lithuanian Tatars and converts, the research investigates how religious translations are transmitted, adapted and integrated into the local community. The analysis focuses on the Islamic religion and communal expressions and explores whether translations influence the identity of Lithuanian Muslims, their spiritual practices, linguistic preferences, historical consciousness or socio-political approaches. Additionally, it provides insights into how Turkish literature serves as both a cultural artefact and an element for identity formation for these two Muslim minority groups.
Prerequisites for the emergence of communicative barriers in the acquisition of the Croatian language by Ukrainians (using the example of an adult audience)
L. Petrovska
The article examines the prerequisites for the emergence of communication barriers, that is, the obstacles to communication that forcibly displaced persons from Ukraine now living in Croatia encounter when learning Croatian. The study was conducted by observing six groups of Ukrainian citizens who attended Croatian language courses at the primary and intermediate levels during 2023–2025. Successful integration of migrants into the local community involves employment and an adequate standard of living, a key prerequisite for which is mastering the language of the country. Although Ukraine and Croatia belong to the same Slavic linguistic and cultural space, mastering Croatian poses as many challenges for many Ukrainians as the process of social adaptation does. Four primary factors that influence the emergence of communicative barriers when Ukrainians communicate in Croatian are analyzed: age, educational and professional background, knowledge of other languages, and the close relationship between the Ukrainian and Croatian languages. The observation shows that each of the analyzed factors has an equally positive and negative impact on the process of learning the language. On the one hand, a higher level of education and proficiency in other foreign languages contribute to faster and more effective mastery of Croatian; on the other hand, we often encounter phenomena opposite to what is expected. Thus, a wide vocabulary in the native language hinders the speaker in building elementary speech structures in a foreign language. In the same way, knowledge of other foreign languages leads the speaker to rely on previously acquired speech experience, which manifests itself in a slowdown in the production of new language structures. Interestingly, bilingual Ukrainians whose dominant language is Russian have a much harder time mastering Croatian than those whose dominant language is Ukrainian, which can be observed at different stages of language learning and at different language levels. Here, perhaps the key factor is the close relationship between the languages in focus: the barriers arise from the discrepancy between the level of understanding and the level of production of language structures. Understanding the prerequisites for the emergence of communicative barriers is a necessary component in choosing a strategy for foreign language teaching, which, among other things, is one of the key factors for successful social adaptation and integration of migrants in crisis situations. At the same time, the primary task of a foreign language teacher is to promote the development of both linguistic and cultural competence.
Cattle colours with a dendrological component as an ethnolinguistic phenomenon
Tsimur Buiko
In Slavic languages, a large number of colour designations are derived from the names of trees; these designations are formed not only morphologically but also lexico-semantically. This mainly concerns the name of the birch tree, which is noticeable primarily in numerous Polish derivatives, some of which can be traced back to Proto-Slavic prototypes. However, a similar phenomenon can be observed in other languages around the world. Generally speaking, these coloratives are of both narrow linguistic (etymological) and ethnolinguistic interest. They reflect the view of the Slavic peoples on the importance of dendroflora in material and spiritual life and help shed light on the worldview of the ancient Slavs.
Problems of transliteration and translation of Kazakh geographical names
Sholpan K. Zharkynbekova, Zhazira Agabekova, Assem Zh. Aksholakova
The article addresses the challenges associated with standardizing and unifying the spelling of toponyms in Kazakhstan. The authors analyze the linguistic variability of toponyms, exploring methods for their transcription into Kazakh, Russian, and English. The study's findings reveal that a majority of the country's geographical names undergo various modifications. The authors identify and scrutinize several types of transformations, including transliteration, phonetic adjustments, morphological changes, lexical transformations, reduction (pollination), translation or calquing, reinterpretation, and renaming (denomination). The study establishes that these modifications adhere to general language laws and are influenced by differences in the typological characteristics of Turkic and Slavic languages. The article argues that the intensification of toponym renaming processes necessitates coordination and control by state administration bodies. This involves systematic organization and standardization of geographical names. The issue of standardizing geographical names in Kazakhstan is particularly pertinent amid ongoing discussions about the country's potential shift to the Latin script.
Provincial Administrative Elite and State Duma: Career and Kinship Connections (1907-1912)
V. A. Lovtsov, S. K. Lyamin
This article focuses on the socio-political life during the terms of the II, III, and IV State Dumas, specifically examining the career and kinship connections within the “Stolypin administration” and the deputy corps. The study covers the period from April 26, 1906, to September 5, 1911, when P. A. Stolypin held the position of Minister of Internal Affairs. It also explores the period from the election of the II Duma on February 20, 1907, to the election of the IV Duma on November 15, 1912, to identify former or future governors or vice-governors among the deputies of the Stolypin era. The geographical scope of the study includes the provinces of European Russia. The authors aim to analyze the relationship between bureaucratic and political elites during the Stolypin era. The article examines the career biographies and factional affiliations of Stolypin administrators connected to the Duma, as well as the kinship ties between the governor corps and Duma deputies. The authors conclude that provincial administrators were predominantly associated with the Union of October 17 and right-wing factions, with most not seeking to pursue parliamentary careers. Typically, these were public figures who had previously held positions in local government and nobility self-government before serving in provincial administration.
Slavic languages. Baltic languages. Albanian languages
Algebraic Language Theory with Effects
Fabian Lenke, Stefan Milius, Henning Urbat, et al.
Regular languages -- the languages accepted by deterministic finite automata -- are known to be precisely the languages recognized by finite monoids. This characterization is the origin of algebraic language theory. In this paper, we generalize the correspondence between automata and monoids to automata with generic computational effects given by a monad, providing the foundations of an effectful algebraic language theory. We show that, under suitable conditions on the monad, a language is computable by an effectful automaton precisely when it is recognizable by (1) an effectful monoid morphism into an effect-free finite monoid, and (2) a monoid morphism into a monad-monoid bialgebra whose carrier is a finitely generated algebra for the monad, the former mode of recognition being conceptually completely new. Our prime application is a novel algebraic approach to languages computed by probabilistic finite automata. Additionally, we derive new algebraic characterizations for nondeterministic probabilistic finite automata and for weighted finite automata over unrestricted semirings, generalizing previous results on weighted algebraic recognition over commutative rings.
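The classical, effect-free correspondence that this abstract starts from can be illustrated as follows: the transition functions of a DFA generate a finite monoid under composition, and the language is the preimage of an accepting subset under the induced morphism. The sketch below covers only this textbook case, not the effectful generalization; the function names and the example automaton are assumptions.

```python
def transition_monoid(delta, alphabet):
    """Finite monoid generated by the letter actions of a DFA.

    delta[state][letter] -> next state; states are 0..q-1.
    Elements are tuples f with f[s] = state reached from s.
    """
    q = len(delta)
    identity = tuple(range(q))
    gens = {a: tuple(delta[s][a] for s in range(q)) for a in alphabet}
    elements = {identity}
    frontier = [identity]
    while frontier:
        f = frontier.pop()
        for g in gens.values():
            h = tuple(g[s] for s in f)  # apply f first, then g
            if h not in elements:
                elements.add(h)
                frontier.append(h)
    return elements, gens, identity


def recognized(word, gens, identity, start, accepting):
    """Evaluate the morphism on the word, then test the image."""
    f = identity
    for a in word:
        f = tuple(gens[a][s] for s in f)
    return f[start] in accepting


# Example: words over {a, b} with an even number of a's.
delta = [{"a": 1, "b": 0}, {"a": 0, "b": 1}]
elements, gens, identity = transition_monoid(delta, "ab")
print(len(elements))  # identity and the swap: the two-element group
print(recognized("abab", gens, identity, start=0, accepting={0}))
```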
Directed Regular and Context-Free Languages
Moses Ganardi, Irmak Saglam, Georg Zetzsche
We study the problem of deciding whether a given language is directed. A language $L$ is \emph{directed} if every pair of words in $L$ has a common (scattered) superword in $L$. Deciding directedness is a fundamental problem in connection with ideal decompositions of downward closed sets. Another motivation is that whether two \emph{directed} context-free languages have the same downward closure can be decided in polynomial time, whereas for general context-free languages, this problem is known to be coNEXP-complete. We show that the directedness problem for regular languages, given as NFAs, belongs to $AC^1$, and thus to polynomial time. Moreover, it is NL-complete for fixed alphabet sizes. Furthermore, we show that for context-free languages, the directedness problem is PSPACE-complete.
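For a finite language, the definition of directedness can be tested by brute force. This sketch only illustrates the definition on explicit word lists; it says nothing about the NFA or context-free algorithms of the paper, and the function names are hypothetical.

```python
from itertools import combinations


def is_scattered_subword(u: str, w: str) -> bool:
    """True if u can be obtained from w by deleting letters."""
    remaining = iter(w)
    return all(letter in remaining for letter in u)


def is_directed(language) -> bool:
    """True if every pair of words has a common scattered
    superword that is itself in the (finite) language."""
    words = set(language)
    return all(
        any(is_scattered_subword(u, w) and is_scattered_subword(v, w)
            for w in words)
        for u, v in combinations(words, 2)
    )


print(is_directed({"a", "b", "ab"}))  # "ab" covers every pair
print(is_directed({"a", "b"}))        # no word contains both letters
```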
On the Impact of Language Selection for Training and Evaluating Programming Language Models
Jonathan Katzy, Maliheh Izadi, Arie van Deursen
The recent advancements in Transformer-based Language Models have demonstrated significant potential in enhancing the multilingual capabilities of these models. The remarkable progress made in this domain not only applies to natural language tasks but also extends to the domain of programming languages. Despite the ability of these models to learn from multiple languages, evaluations typically focus on particular combinations of the same languages. In this study, we evaluate the similarity of programming languages by analyzing their representations using a CodeBERT-based model. Our experiments reveal that token representations in languages such as C++, Python, and Java exhibit proximity to one another, whereas the same tokens in languages such as Mathematica and R display significant dissimilarity. Our findings suggest that this phenomenon can potentially result in performance challenges when dealing with diverse languages. Thus, we recommend using our similarity measure to select a diverse set of programming languages when training and evaluating future models.