Shuffles of Context-Free Languages along Regular Trajectories
Corentin Barloy, Michaël Cadilhac, Kyle Ockerlund
In single-core processors, when multiple processes execute concurrently, they are, in practice, intertwined by a scheduler as a single thread of execution. The language-theoretic operation that corresponds to this is the shuffle of two languages: in general, this is defined as the set of words obtained by interleaving words from the first and second language in an arbitrary fashion. It is well known that regular languages are closed under shuffles, while context-free languages (CFL) are not. Following an established line of research, this paper considers shufflings according to regular "trajectories", that is, subject to scheduling constraints expressed by an automaton. Unsurprisingly, some trajectories, such as "a word from the first language first, then a word from the second", allow for CFLs to be shuffled into CFLs, while some other trajectories do not. This paper provides a robust toolset to show that a given trajectory would always shuffle two nonregular CFLs into a nonCFL. In the case of deterministic CFLs (DCFLs), a salient trichotomy of trajectories depending on how they shuffle DCFLs is provided. These results are based on intricate expressiveness lemmas for CFLs and DCFLs of independent interest, the latter lemma relying on a recent result of Jančar and Šíma (MFCS'2021).
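To make the notion of shuffling along a trajectory concrete, here is a minimal sketch. It assumes a trajectory is given as a word over {0, 1}, where 0 means "take the next letter of the first word" and 1 means "take the next letter of the second word". The function names `shuffle_along` and `shuffle_languages` are illustrative and not taken from the paper, and the sketch works on finite sets of words only.

```python
from itertools import product
from typing import Iterable, Optional, Set


def shuffle_along(u: str, v: str, t: str) -> Optional[str]:
    """Interleave u and v following the trajectory t over {'0', '1'}.

    '0' consumes the next letter of u, '1' the next letter of v.
    Returns None when t does not consume both words exactly.
    """
    if t.count("0") != len(u) or t.count("1") != len(v) or len(t) != len(u) + len(v):
        return None
    iu, iv = iter(u), iter(v)
    return "".join(next(iu) if step == "0" else next(iv) for step in t)


def shuffle_languages(
    first: Iterable[str], second: Iterable[str], trajectories: Iterable[str]
) -> Set[str]:
    """Shuffle two finite languages along a finite set of trajectories."""
    result = set()
    for u, v, t in product(list(first), list(second), list(trajectories)):
        w = shuffle_along(u, v, t)
        if w is not None:
            result.add(w)
    return result
```

Under this encoding, the trajectory "a word from the first language first, then a word from the second" corresponds to the regular set 0*1*, which yields plain concatenation, while the set of all words over {0, 1} yields the unrestricted shuffle.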
On A. V. Anisimov's problem for finding a polynomial algorithm checking inclusion of context-free languages in group languages
Krasimir Yordzhev
The work investigates the problem of whether a context-free language is a subset of a group language. A. V. Anisimov has shown that the problem of determining the unambiguity of finite automata is a special case of this problem. The question of finding a polynomial algorithm that verifies the inclusion of context-free languages in group languages then naturally arises. The article focuses on this open problem. To this end, the paper describes an unconventional way of describing context-free languages, namely a representation by a finite digraph whose arcs are labelled with elements of a specially defined monoid $\mathcal{U}$. We also define a semiring $\mathcal{S}_\mathcal{U}$ whose elements are the members of the set $2^\mathcal{U}$ of all subsets of $\mathcal{U}$, with the operations of product and union of elements of $2^\mathcal{U}$. The described algorithm executes no more than $O(n^3)$ operations in $\mathcal{S}_\mathcal{U}$.
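Making sense of the human-nature relationship in essay writing. A functional view of Estonian essays (Inimese ja looduse suhte mõtestamine esseistikas. Funktsionaalne vaade eesti esseistikale)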
Asko Lõhmus
The problem of how knowledge becomes understanding of complex phenomena, such as social-ecological systems, is of both theoretical and applied interest. This article is built on the premise that published essays serve as an archive of these cognitive processes. An important reason is that essay-writing is practised across different cultural systems (including scientists), providing potential grounds for cultural exchange.
I conduct a functional analysis of 108 essays, selected as a stratified sample from 36 Estonian authors representing three groups (literature; science; other) since the 20th century. The main criteria were that (i) each essay explores the relationship between humans (societies) and nature, along with the interactions between both systems; and (ii) the author has at least three such essays (if more, three were chosen to reflect a diversity of thought). The presence of texts by all three groups of authors reveals three periods of heightened discourse on human-nature relationships: 1967–1980, 1996–2003, and from 2017 onwards.
Structurally, ¾ of the essays represent isolated lines of thought (only 10% form coherent programs), where the authors elaborate on their own ideas while referencing mainly distant (foreign) authors. Direct reference to other Estonian authors or building on one’s own prior work is rare. Thus, essays mostly function as “idea laboratories”, with limited trans-disciplinary co-creation of understanding. Furthermore, while readily incorporating new knowledge and ideas, the essays reveal a relatively stable mixture of three main ethical attitudes towards nature throughout their history.
By distinguishing the key components of social-ecological complexity (Fig. 2), a “frame text” was defined as its most complete (or, secondarily, the most concise) representation. Jaan Kaplinski’s “Ecology and Economics” (1972/1996) and Hando Runnel’s “Caretakers and Saviours” (1988) can be seen as frame texts among Estonian essays on human-nature relationships.
Other Finnic languages and dialects
„Ma olen kõnelenud!”. Varauusaegse Tartu ülikooli (1632–1710) akadeemiliste oratsioonide korpusest
Rahel Toomik
“I have spoken!”
The corpus of academic orations at the University of Tartu during the early modern period (1632–1710)
This article introduces the genre of orations, or academic speeches, at the University of Tartu during the early modern period (1632–1710) and examines their role as a central literary, pedagogical, social, and performative practice in university life. The Tartu oration corpus consists of approximately 230 printed texts, mostly in Latin, authored by both students and other members of the academic community. These speeches were composed on a wide range of topics, and for various purposes and occasions. Alongside a description of the corpus based on bibliographic metadata and a comparative analysis of its paratextual features (title pages, dedications, and congratulatory poems), the article focuses on identifying the key characteristics of the oration genre and distinguishing orations from other academic genres, particularly disputations, which have often received greater scholarly attention and overshadowed orations in historical research. The article also explores the value of orations as sources for intellectual history and considers why academic speeches and the oratory tradition have at times been overlooked or dismissed. It provides an overview of existing research on the Tartu oration corpus, offers new perspectives for understanding the genre, highlights accessibility issues related to bibliographic data, and reflects on how and why distant reading techniques used in the digital humanities could be used to further investigate and elevate the corpus.
Other Finnic languages and dialects
Quantitative Language Automata
Thomas A. Henzinger, Pavol Kebis, Nicolas Mazzocchi et al.
A quantitative word automaton (QWA) defines a function from infinite words to values. For example, every infinite run of a limit-average QWA A obtains a mean payoff, and every word w is assigned the maximal mean payoff obtained by nondeterministic runs of A over w. We introduce quantitative language automata (QLAs) that define functions from language generators (i.e., implementations) to values, where a language generator can be nonprobabilistic, defining a set of infinite words, or probabilistic, defining a probability measure over infinite words. A QLA consists of a QWA and a language aggregator. For example, given a QWA A, the infimum aggregator maps each language L to the greatest lower bound assigned by A to any word in L. For boolean value sets, QWAs capture trace properties, and QLAs capture hyperproperties. For more general value sets, QLAs serve as a specification language for a generalization of hyperproperties, called quantitative hyperproperties. A nonprobabilistic (resp. probabilistic) quantitative hyperproperty assigns a value to each set (resp. distribution) G of traces, e.g., the minimal (resp. expected) average response time exhibited by the traces in G (resp. by traces sampled according to G). We give several examples of quantitative hyperproperties and investigate three paradigmatic problems for QLAs: evaluation, nonemptiness, and universality. In the evaluation problem, given a QLA $\mathbb{A}$ and an implementation G, we ask for the value that $\mathbb{A}$ assigns to G. In the nonemptiness (resp. universality) problem, given a QLA $\mathbb{A}$, a threshold k, and a comparison in $\{>, \geq\}$, we ask whether $\mathbb{A}$ assigns a value meeting the threshold to some (resp. every) language. We provide a comprehensive picture of decidability and complexity for these problems for QLAs with common aggregators as well as their restrictions to omega-regular languages and distributions generated by finite Markov chains.
(Im)politeness research on historical texts: an interaction-based approach on the samples derived
Mustafa, Zeynep
(Im)politeness research is a field of study that is not limited to the politeness and
impoliteness methods used in contemporary languages but is also being enriched by
studies of data obtained from historical texts. Research on the earlier periods of different
languages provides valuable information about the development of the phenomenon of linguistic
politeness. Studies on different periods of Turkic, especially the Old Uyghur Turkic period, are
gradually increasing. Conducting interaction-based analyses that consider the turn structure, in order to
understand how forms of politeness are used in spoken language, is of great importance for data
obtained from historical texts as well as from contemporary languages. However, the lack of
databases built from video or audio recordings creates the need for a dedicated method for
interaction-based politeness studies in historical texts. In response to this need, this
study aims to present a sample research process for interaction-based politeness analysis of
dialogues obtained from historical texts. This process can be summarized as the application of
(im)politeness analysis on top of a preliminary analysis called the Context-Dialogue-Interaction
Analysis trilogy. The trilogy is not limited to politeness research but can also be used to
investigate other similar phenomena in pragmatics. The article presents some sample analyses
drawn from the work called Hamzanâmeler, which belongs to the Old Anatolian Turkish
period.
Language and Literature, Ural-Altaic languages
Nõukogude aja kirjandus- ja kultuurielu. Saateks
Tõnu Tannberg
Literary and cultural life in Soviet times. Foreword
Other Finnic languages and dialects
Lexical outcomes of Karelian-Russian bilingualism in Tver Karelian
Susanna Tavi
This study investigates the language contact between Tver Karelian and Russian, attempting to provide a comprehensive overview of the lexicon of the bilingual code. The methodology combines statistical analyses with a treatment of contact-induced change in terms of the Code-Copying Framework (CCF). Nine interviews with nine people were conducted using the memory walk method. In code copying, correlations were found between different word classes and contact-relatedness. In code alternation, few differences were found between speakers, and one commonality was the use of complex numerals as Russian phrases without adapting them to the Tver Karelian code. The findings confirm that the copies are of a certain kind and appear in certain word classes. Code alternation sequences suggest that, according to the CCF, the discourse rather than the language is mixed. The findings within the CCF have implications for minority language policies, as they support the use of bilingual terminology.
Kokkuvõte. Susanna Tavi: Karjala-vene kakskeelsuse mõju tverikarjala keele sõnavarale. Käesolevas uurimistöös uuritakse tverikarjala ja vene keele kontakte. See uuring püüab anda tervikliku ülevaate kakskeelse koodi sõnavarast. Metoodika sisaldab kombinatsiooni statistilistest analüüsidest ja kontaktidest põhjustatud muutuste käsitlemisest koodikopeerimise raamistiku (Code-Copying Framework = CCF) osas. Üheksa intervjuud üheksa inimesega viidi läbi mälukõnni meetodil. Leiti seoseid erinevate sõnaklasside ja kontaktidega seotuse vahel. Koodivahelduses leiti eri kõnelejate vahel vähe erinevusi ja üheks ühiseks jooneks oli keerukate arvsõnade kasutamine venekeelsete fraasidena, ilma neid tverikarjala koodi sobitamata. Leiud kinnitavad, et koopiad on teatud liiki ja esinevad teatud sõnaklassides. Koodi vaheldumise jadad viitavad sellele, et CCF-i kohaselt on segatud eelkõige diskursus, ja mitte keel. CCF-i leiud avaldavad mõju vähemuskeelte poliitikale, kuna leiud toetavad kakskeelse terminoloogia kasutamist.
Philology. Linguistics, Finnic. Baltic-Finnic
Factor-balanced $S$-adic languages
Léo Poirier, Wolfgang Steiner
A set of words, also called a language, is letter-balanced if the number of occurrences of each letter only depends on the length of the word, up to a constant. Similarly, a language is factor-balanced if the difference of the number of occurrences of any given factor in words of the same length is bounded. The most prominent example of a letter-balanced but not factor-balanced language is given by the Thue-Morse sequence. We establish connections between the two notions, in particular for languages given by substitutions and, more generally, by sequences of substitutions. We show that the two notions essentially coincide when the sequence of substitutions is proper. For the example of Thue-Morse-Sturmian languages, we give a full characterisation of factor-balancedness.
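The two balance notions can be explored numerically on a finite prefix of the Thue-Morse sequence. The sketch below is only an illustration under stated assumptions: the helper names (`thue_morse`, `count_occurrences`, `imbalance`) are hypothetical, occurrences are counted with overlaps, and a finite prefix can suggest but never prove boundedness or unboundedness.

```python
def thue_morse(n: int) -> str:
    """First n letters of the Thue-Morse sequence.

    Letter i is the parity of the number of 1 bits in the binary expansion of i.
    """
    return "".join(str(bin(i).count("1") % 2) for i in range(n))


def count_occurrences(word: str, factor: str) -> int:
    """Number of (possibly overlapping) occurrences of factor in word."""
    return sum(
        word.startswith(factor, i) for i in range(len(word) - len(factor) + 1)
    )


def imbalance(text: str, factor: str, length: int) -> int:
    """Largest difference in the number of occurrences of `factor`
    between any two windows of the given length inside `text`."""
    counts = [
        count_occurrences(text[i:i + length], factor)
        for i in range(len(text) - length + 1)
    ]
    return max(counts) - min(counts)
```

For instance, with `tm = thue_morse(256)`, calling `imbalance(tm, "1", n)` measures the letter imbalance over windows of length `n`, while `imbalance(tm, "00", n)` does the same for a factor of length two; comparing the two for growing `n` gives a rough feel for the distinction drawn in the abstract.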
Relationships Between Bounded Languages, Counter Machines, Finite-Index Grammars, Ambiguity, and Commutative Regularity
Arturo Carpi, Flavio D'Alessandro, Oscar H. Ibarra et al.
It is shown that for every language family that is a trio containing only semilinear languages, all bounded languages in it can be accepted by one-way deterministic reversal-bounded multicounter machines (DCM). This implies that for every semilinear trio (where these properties are effective), it is possible to decide containment, equivalence, and disjointness concerning its bounded languages. A condition is also provided for when the bounded languages in a semilinear trio coincide exactly with those accepted by DCM machines, and it is used to show that many grammar systems of finite index -- such as finite-index matrix grammars and finite-index ETOL -- have identical bounded languages as DCM. Then connections between ambiguity, counting regularity, and commutative regularity are made, as many machines and grammars that are unambiguous can only generate/accept counting regular or commutatively regular languages. Thus, such a system that can generate/accept a non-counting regular or non-commutatively regular language implies the existence of inherently ambiguous languages over that system. In addition, it is shown that every language generated by an unambiguous finite-index matrix grammar has a rational characteristic series in commutative variables, and is counting regular. This result plus the connections are used to demonstrate that finite-index matrix grammars and finite-index ETOL can generate inherently ambiguous languages (over their grammars), as do several machine models. It is also shown that all bounded languages generated by these two grammar systems (those in any semilinear trio) can be generated unambiguously within the systems. Finally, conditions on languages generated by finite-index matrix grammars and finite-index ETOL implying commutative regularity are obtained. In particular, it is shown that every finite-index EDOL language is commutatively regular.
On Dynamic Lifting and Effect Typing in Circuit Description Languages (Extended Version)
Andrea Colledan, Ugo Dal Lago
In the realm of quantum computing, circuit description languages represent a valid alternative to traditional QRAM-style languages. Indeed, they allow for finer control over the output circuit without sacrificing flexibility or modularity. We introduce a generalization of the paradigmatic lambda-calculus Proto-Quipper-M, itself modeling the core features of the quantum circuit description language Quipper. The extension, called Proto-Quipper-K, is meant to capture a very general form of dynamic lifting. This is made possible by the introduction of a rich type and effect system in which not only computations, but also the very types are effectful. The main results we give for the introduced language are the classic type soundness results, namely subject reduction and progress.
Quantitative analysis of character networks in Polish 19th- and 20th-century novels
Marek Kubis
Computer Science
Modal words that remark certainty in Turkmen language
Elanur
Modal words, which are not included in grammars of Turkish, are treated as an
independent word class in grammars of Turkic languages. These words, which grammar books
have evaluated according to the traditional understanding, are among the lexical markers of
modality, a topic that has attracted the attention of many researchers in recent years. In other words,
they are words that add various meanings to a statement produced by the speaker.
Modal words expressing certainty are structures that mark expressions in which the speaker
states his/her point of view confidently and without hesitation, based on what
he/she knows or believes. The more confidence the speaker has in his/her knowledge, the more
precise his/her expressions will be. In this study, certainty is explained within the framework
of epistemic modality, and the modal words that
express certainty in the Turkmen language are highlighted.
Language and Literature, Ural-Altaic languages
On a Chagatai Turkic Yusuf u Züleyha
Mehmet SARIKÖSE
The work that is the subject of this study is a Yusuf u Züleyha mesnevi written in
Chagatai Turkic. The work was copied by the Swedish missionary Gunnar Hermansson between
1929 and 1930 and is registered under the name “The Verses on Yusuf and Zulaikha” in
the Jarring Collection of Lund University Library, Sweden, as Prov. 12. The work was written in the East
Turkestan region, which lies in today's Xinjiang region of China.
The work, which consists of 164 sheets written in 3 notebooks, was donated to
Lund University in 1982. Originating from Yarkend in East Turkestan, the work was copied to
Abdul Kadir Ahund by the Swedish missionary Gunnar Hermansson. In the tag he created for the
work, Gunnar Jarring stated that it was a translation of Molla Câmî's Persian masnavi named
Yusuf u Züleyha. The studied work reflects the characteristics of Chagatai Turkic in
terms of its language features. Judging by the date of copying, the work belongs to the last
period of Chagatai Turkic. A comparative study setting the
phonetic and morphological features of the southern dialects of New Uighur Turkic, of New
Uighur Turkic, and of Chagatai Turkic against the linguistic features of the work is expected to
contribute to researchers working in related fields.
Language and Literature, Ural-Altaic languages
Dynamic Membership for Regular Languages
Antoine Amarilli, Louis Jachiet, Charles Paperman
We study the dynamic membership problem for regular languages: fix a language L, read a word w, build in time O(|w|) a data structure indicating if w is in L, and maintain this structure efficiently under letter substitutions on w. We consider this problem on the unit cost RAM model with logarithmic word length, where the problem always has a solution in O(log |w| / log log |w|) per operation. We show that the problem is in O(log log |w|) for languages in an algebraically-defined, decidable class QSG, and that it is in O(1) for another such class QLZG. We show that languages not in QSG admit a reduction from the prefix problem for a cyclic group, so that they require Ω(log |w| / log log |w|) operations in the worst case; and that QSG languages not in QLZG admit a reduction from the prefix problem for the multiplicative monoid $U_1 = \{0, 1\}$, which we conjecture cannot be maintained in O(1). This yields a conditional trichotomy. We also investigate intermediate cases between O(1) and O(log log |w|). Our results are shown via the dynamic word problem for monoids and semigroups, for which we also give a classification. We thus solve open problems of the paper of Skovbjerg Frandsen, Miltersen, and Skyum [30] on the dynamic word problem, and additionally cover regular languages.
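As a point of comparison for the bounds above, the following is a minimal sketch of a textbook baseline, not the paper's data structure: a balanced tree over the transition functions of a DFA, built in O(|w|) and updated in O(log |w|) per substitution for a fixed automaton. The class name `DynamicMembership` and the encoding of the automaton as tuples are assumptions made for illustration.

```python
class DynamicMembership:
    """Maintain membership of a word in a fixed regular language
    under letter substitutions, using a segment tree of transition maps."""

    def __init__(self, delta, start, accepting, word):
        # delta maps each letter to a tuple t where t[q] is the successor of state q.
        self.delta = delta
        self.start = start
        self.accepting = set(accepting)
        n_states = len(next(iter(delta.values())))
        identity = tuple(range(n_states))
        # Round the number of leaves up to a power of two; padding leaves act as identity.
        self.size = 1
        while self.size < max(1, len(word)):
            self.size *= 2
        self.tree = [identity] * (2 * self.size)
        for i, letter in enumerate(word):
            self.tree[self.size + i] = delta[letter]
        for node in range(self.size - 1, 0, -1):
            self.tree[node] = self._compose(self.tree[2 * node], self.tree[2 * node + 1])

    @staticmethod
    def _compose(f, g):
        """Transition map of 'read the left part (f), then the right part (g)'."""
        return tuple(g[q] for q in f)

    def substitute(self, position, letter):
        """Replace the letter at `position` and refresh the path to the root."""
        node = self.size + position
        self.tree[node] = self.delta[letter]
        node //= 2
        while node >= 1:
            self.tree[node] = self._compose(self.tree[2 * node], self.tree[2 * node + 1])
            node //= 2

    def accepted(self):
        """True if the current word leads from the start state to an accepting state."""
        return self.tree[1][self.start] in self.accepting


# Example automaton (an assumption for illustration): even number of 'a's over {a, b}.
EVEN_A = {"a": (1, 0), "b": (0, 1)}
```

A short usage example: `d = DynamicMembership(EVEN_A, 0, {0}, "abba")` accepts the initial word, and after `d.substitute(1, "a")` the word becomes "aaba", which is rejected. The root of the tree stores the transition map of the whole word, which is why a membership query costs constant time once the updates are done.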
User-Friendly MES Interfaces Recommendations for an AI-Based Chatbot Assistance in Industry 4.0 Shop Floors
Soujanya Mantravadi, A. Jansson, Charles Møller
Transeurasian as a continuum of diffusion
E. Vajda
Intermingling of Turkic, Mongolic, and Tungusic speakers over many centuries left multiple overlapping layers of contact-induced language change in their wake. While the dynamics of pastoralist mobility spread linguistic traits far and wide, it remains unresolved whether contact alone (together with coincidental resemblance) can account for all of the shared features in the families traditionally grouped as “Altaic,” or whether some homologies represent evidence of deeper common ancestry. Without arguing strongly for or against either possibility, this chapter considers how typological parallels may have diffused among pastoral Inner Eurasia’s four autochthonous families—Uralic, Turkic, Mongolic, and Tungusic—and also into Yeniseian, Yukaghir, Chukchi-Kamchatkan, Nivkh, Ainu, Koreanic, and Japonic—families and isolates that interacted less pervasively with steppe and forest pastoralists.
Hiina keele sugulusest ugri keelte ja eriti soome-eesti keelega (1895)
Karl August Hermann
Eesti 19. sajandi väljapaistev keeleteadlane, entsüklopedist ja helilooja Karl August Hermann (1851–1909) toob välja tunnused, mis võiksid osutada ugri, st soome-ugri ning altai keelte, sh eesti ja soome keele sugulusele hiina keelega. Ta vaatleb sõnatüvesid ja -juuri, sõnamoodustust, võimalikke ühiseid tüvesid, sarnaseid lausungeid ning omastava käände ja omadussõnalise täiendi asendit, mis võiksid osutada erinevate kaugete keelte sugulusele. Ta teeb järelduse, et hiina keel on soome-ugri ja altai keeltega suguluses. Hermanni saksakeelne artikkel, mis ilmus aastal 1895, on tõlgitud eesti keelde. Abstract. Karl August Hermann: About the relationship of Chinese with the Ugrian languages and especially with the Finnish-Estonian (1895). Karl August Hermann (1851–1909), an eminent Estonian linguist, encyclopedist and composer in the nineteenth century, identifies features that might indicate the affinity of Ugrian, i.e. Finno-Ugric and Altaic languages, including Estonian and Finnish, with Chinese. He looks at word stems and roots, word formation, possible common word stems, similar utterances, and the position of the genitive and the adjective in relation to the noun that might indicate the affinity of different distant languages. He concludes that Chinese is related to Finno-Ugric and Altaic languages. Hermann’s forgotten article, published in German in 1895, has been translated into Estonian by Urmas Sutrop.
Eessõna
Gerson Klumpp, Valentin Gusev
Foreword
Philology. Linguistics, Finnic. Baltic-Finnic
Opportunities and Challenges for Circuit Board Level Hardware Description Languages
Richard Lin, Björn Hartmann
Board-level hardware description languages (HDLs) are one approach to increasing automation and raising the level of abstraction for designing electronics. These systems borrow programming language concepts like generators and type systems, but must also be designed with human factors in mind to serve existing hardware engineers. In this work, we look at one recent prototype system, and discuss open questions spanning from fundamental models through usable interfaces.