Tekstirobotid ja laiendatud vaim. Kuidas mõista tehisintellekti rolli akadeemilise teksti loomes?
Vivian Puusepp
Abstract. Vivian Puusepp: Chatbots and the extended mind. How to understand the role of artificial intelligence in academic writing? In this article, I discuss two questions: when a learner uses a chatbot's assistance in academic writing, (1) how should the chatbot's role in this process be understood, and (2) if academic writing is treated as an extended cognitive process, does the chatbot extend the human mind? Weighing the chatbot's possible roles shows that it cannot be placed in any category we are accustomed to in the context of academic writing: it is neither a source, a (co-)author, a conversation partner, nor a mere tool. In discussing the second question, I draw on the extended mind approach, which holds that human mental processes sometimes extend beyond the brain and body, incorporating external processes or tools as parts of the cognitive system. I argue that the impact of chatbot use on the human mind depends on the extent to which that use meets the criteria of reliable coupling. The discussion shows that the most crucial criterion for extending the human mind is the criterion of trust, which presupposes the user's ability to critically evaluate the chatbot's output. If the user trusts the chatbot blindly and relies on its output without critical evaluation, then using the chatbot may actually narrow the human mind, hindering the development of the user's writing and thinking skills.
Philology. Linguistics, Finnic. Baltic-Finnic
Eessõna (Foreword)
Ilona Tragel, Eleriin Miilman
Philology. Linguistics, Finnic. Baltic-Finnic
Let's Take Esoteric Programming Languages Seriously
Jeremy Singer, Steve Draper
Esoteric programming languages are challenging to learn, but their unusual features and constraints may serve to improve programming ability. From languages designed to be intentionally obtuse (e.g. INTERCAL) to others targeting artistic expression (e.g. Piet) or exploring the nature of computation (e.g. Fractran), there is rich variety in the realm of esoteric programming languages. This essay examines the counterintuitive appeal of esoteric languages and seeks to analyse reasons for this popularity. We will explore why people are attracted to esoteric languages in terms of (a) program comprehension and construction, as well as (b) language design and implementation. Our assertion is that esoteric languages can improve general PL awareness, while also enabling the esoteric programmer to impress their peers with obscure knowledge. We will also consider pedagogic principles and the use of AI in relation to esoteric languages. Emerging from the specific discussion, we identify a general set of 'good' reasons for designing new programming languages. It may not be possible to be exhaustive on this topic, and it is certain we have not achieved that goal here. However, we believe our most important contribution is to draw attention to the varied and often implicit motivations involved in programming language design.
Extensibility in Programming Languages: An overview
Sebastian Mateos Nicolajsen
Here I explore programming language extensibility, arguing that it is an often overlooked component of conventional language design. This is not a technical account of the components involved; rather, I attempt to provide the kind of overview I myself lacked while investigating programming languages. Thus, read this as an introduction to the magical world of extensibility. Through a literature review, I identify key extensibility themes (Macros, Modules, Types, and Reflection), highlighting diverse strategies for fostering extensibility. The analysis extends to cross-theme properties such as Parametricism and First-class citizen behaviour, which add layers of complexity and highlight the importance of customizability and flexibility in programming language constructs. By outlining these facets of existing programming languages and research, I aim to inspire future language designers to assess and consider the extensibility of their creations critically.
Mix-of-Language-Experts Architecture for Multilingual Programming
Yifan Zong, Yuntian Deng, Pengyu Nie
Large language models (LLMs) have demonstrated impressive capabilities in aiding developers with tasks like code comprehension, generation, and translation. Supporting multilingual programming -- i.e., coding tasks across multiple programming languages -- typically requires either (1) finetuning a single LLM across all programming languages, which is cost-efficient but sacrifices language-specific specialization and performance, or (2) finetuning separate LLMs for each programming language, which allows for specialization but is computationally expensive and storage-intensive due to the duplication of parameters. This paper introduces MoLE (Mix-of-Language-Experts), a novel architecture that balances efficiency and specialization for multilingual programming. MoLE is composed of a base model, a shared LoRA (low-rank adaptation) module, and a collection of language-specific LoRA modules. These modules are jointly optimized during the finetuning process, enabling effective knowledge sharing and specialization across programming languages. During inference, MoLE automatically routes to the language-specific LoRA module corresponding to the programming language of the code token being generated. Our experiments demonstrate that MoLE achieves greater parameter efficiency compared to training separate language-specific LoRAs, while outperforming a single shared LLM finetuned for all programming languages in terms of accuracy.
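The routing described in the abstract can be illustrated with a minimal sketch. It assumes a single linear layer, a language label that is already known for each token, and a fallback to the shared path for languages without an expert; the names `LoRA`, `MoLELinear` and `forward`, and all sizes, are hypothetical and not taken from the paper.

```python
from typing import Dict, List

Matrix = List[List[float]]
Vector = List[float]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def add(a: Vector, b: Vector) -> Vector:
    """Element-wise sum of two vectors of equal length."""
    return [x + y for x, y in zip(a, b)]


class LoRA:
    """Low-rank update delta(x) = B (A x), with A of shape (r, d_in) and B of shape (d_out, r)."""

    def __init__(self, a: Matrix, b: Matrix) -> None:
        self.a = a
        self.b = b

    def delta(self, x: Vector) -> Vector:
        return matvec(self.b, matvec(self.a, x))


class MoLELinear:
    """Hypothetical layer: base weights + shared LoRA + one LoRA per programming language."""

    def __init__(self, base: Matrix, shared: LoRA, experts: Dict[str, LoRA]) -> None:
        self.base = base
        self.shared = shared
        self.experts = experts

    def forward(self, x: Vector, language: str) -> Vector:
        # The base model and the shared LoRA are applied to every token.
        out = add(matvec(self.base, x), self.shared.delta(x))
        # Route to the expert for the token's language.
        # Assumption: unknown languages simply skip the expert path.
        expert = self.experts.get(language)
        if expert is not None:
            out = add(out, expert.delta(x))
        return out


# Toy weights chosen only to make the arithmetic easy to follow.
layer = MoLELinear(
    base=[[1.0, 0.0], [0.0, 1.0]],
    shared=LoRA(a=[[1.0, 0.0]], b=[[1.0], [0.0]]),
    experts={"python": LoRA(a=[[0.0, 1.0]], b=[[0.0], [2.0]])},
)
print(layer.forward([1.0, 2.0], "python"))
print(layer.forward([1.0, 2.0], "ruby"))
```

Only the language-specific modules differ between languages in this sketch, which is the source of the parameter savings the abstract contrasts with training fully separate models.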
Eesti tanka. Levik, autorid, temaatika ja vormilahendused
Heili Hani
"Estonian tanka: Prevalence, authors, themes and forms". The primary aim of the article is to provide an overview of the evolution of the Estonian tanka genre, to delineate the requirements of its classical form and style, and to analyze its spread within Estonian literature. Tanka poems have been published since 1917, with their prevalence expanding notably in the 1960s and persisting to the present day. A total of 76 tanka writers have contributed 1,296 tanka to the collection. In addition, 34 student authors are featured in school almanacs. It is noteworthy that this poetic form has attracted a diverse range of authors, from amateurs to established writers. Over time, both the form and themes of tanka poetry have evolved, encompassing varied approaches to poem construction, from line arrangement to graphic design. At the same time, the subject matter has broadened from the human experience and the experience of nature to almost all aspects of human life, including social critique and expressions of sexuality. Prominent authors in this genre include Minni Nurme, Ain Kaalep, Jaan Kaplinski, and Mats Traat. The concise nature of tanka poetry has led writers to craft more condensed and precise compositions, often leveraging the compounding and wordplay inherent in the Estonian language. The genre allows authors to imbue their work with rich layers of meaning in very few words, resulting in a dense textual experience.
Other Finnic languages and dialects
Context-Free Languages of String Diagrams
Matt Earnshaw, Mario Román
We introduce context-free languages of morphisms in monoidal categories, extending recent work on the categorification of context-free languages, and regular languages of string diagrams. Context-free languages of string diagrams include classical context-free languages of words, trees, and hypergraphs, when instantiated over appropriate monoidal categories. Using a contour-splicing adjunction, we prove a representation theorem for context-free languages of string diagrams: every such language arises as the image under a monoidal functor of a regular language of string diagrams.
Moniverbiset konstruktiot ja oppijansuomen kompleksisuus kielitaidon eri tasoilla
Taina Mylläri
"Verb constructions and complexity across proficiency levels in learner Finnish". Learner language development can be analysed by measuring complexity, accuracy and fluency. Complexity, our focus here, can be defined as the range and sophistication of learner language. Syntactic complexity is typically analysed by quantitatively measuring the length of production units or the amount of subordination rather than by exploring syntactic variation and diversity in learner language. In this article, the development of syntactic complexity in written learner Finnish across the CEFR proficiency levels is studied by exploring changes in the use of non-finite verb constructions. The aim of the study is to bring to light differences in complexity that are not captured by the traditional measures of syntactic complexity.
The data in the study comprise 241 written learner Finnish texts (23,596 words) from the University of Jyväskylä Cefling project corpus and cover all CEFR levels, from A1 to C2. The data are explored both quantitatively and qualitatively. The focus of the study is on the changes in the use of verb constructions containing a finite verb and at least one non-finite verb form, and on how those changes may reflect the development of syntactic complexity. The results show that the constructions studied do not necessarily grow in length but instead become more varied both lexically and structurally as proficiency increases. Such changes are not revealed by quantitative measures of syntactic complexity focusing on the length of production units. Hence, the results support calls to adopt a more qualitative approach to investigating syntactic complexity. They also suggest that, in some languages, syntactic, morphological and lexical complexity cannot always be separated.
Type Theory as a Language Workbench
Jan de Muijnck-Hughes, Guillaume Allais, Edwin Brady
Language Workbenches offer language designers an expressive environment in which to create their DSLs. Similarly, research into mechanised meta-theory has shown how dependently typed languages provide expressive environments to formalise and study DSLs and their meta-theoretical properties. But can we claim that dependently typed languages qualify as language workbenches? We argue yes! We have developed an exemplar DSL called Velo that showcases not only dependently typed techniques to realise and manipulate IRs, but also that dependently typed languages make fine language workbenches. Velo is a simple verified language with well-typed holes and comes with a complete compiler pipeline: parser, elaborator, REPL, evaluator, and compiler passes. Specifically, we describe our design choices for well-typed IRs, including support for well-typed holes, how CSE is achieved in a well-typed setting, and how the mechanised type-soundness proof for Velo is the source of the evaluator.
The Ural Poetic School: Phantom or Reality
Yulia S. Podlubnova
This article problematises the use of the “modern Ural poetry” descriptive construction denoting the poetic space of a particular region and considers Ural poetry in a broader context of Russian-language literature. Ural poetry can be understood differently depending on the researcher’s approach and the research methodology. Thus, it may be regarded as a segment within regional literature or regional poetry, or as a cultural space that exists across territories and borders. What matters is the globalised concept which resolves the opposition between the centre and the province, which is imperative for Russian culture. The analysis of Ural poetry in such a conceptual framework actualises several issues. More particularly, to what extent can regional specificity be decisive for the poetry of the Urals? And what makes up this specificity: poetics, Ural identity, a set of Ural myths, or the structure of a poetic community? How are the boundaries of Ural poetry manifested? What is the degree of its correlation with what is done outside the region? The Ural Poetry School (UPS), a project of poet and literary manager V. O. Kalpidi, gives the most important answers to these questions. The article considers the conceptual basis of the project and those significant theoretical and practical changes that have taken place over more than two decades of its existence, especially after the emergence of poetic projects under the InVersiya brand in the late 2010s which presented an alternative view of the development of Ural poetry. This generational and worldview gap fostered the distinction between modern Ural poetry and the Russian-language poetry of the Urals. A closed poetic space with tangible boundaries, its own structure and logic of development, which claimed to be the Centre as originally conceived by the UPS, has turned into a space that is an organic part of something larger and decentralised.
The article emphasises that the view of the poetry of the Urals as a segment of modern Russian-language poetry makes it possible to revise the existing set of myths and ideas and provides more accurate tools for describing it.
Constructions which are used in the function of +sIz in Northeastern Turkish dialects
Fatma Abtekin, Mehtap Solak Sağlam
The Turkic dialects in Siberia separated from the rest of Turkic at an earlier date. This is the reason why these dialects differ from other Turkic dialects in terms of phonetics, morphology and semantics. The Siberian Turkic dialects are divided into two groups: Southern and Northern. Sakha Turkic, which is used as a written language, is classified as a Northern Siberian dialect and as a border dialect. Khakas, Tuva and Altai constitute the Southern Siberian dialects. Although these dialects show some differences among themselves, they share many morphological similarities. In Common Turkic the suffix +sIz is used to derive a negative adjective from a noun. This suffix is used in all dialects except Northeastern Turkic, with minimal sound differences, to express non-existence. Such a suffix does not exist in Northeastern Turkic. Instead, there are several structures that have the same function. This characteristic of Northeastern Turkic is one of the elements that distinguish the Northeastern group from other Turkic dialects. Furthermore, this feature is an important classification criterion. In this study, the structures that are used in Northeastern Turkic instead of the suffix +sIz are identified, examined and classified with the help of dictionaries and grammars.
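For readers unfamiliar with the notation, the capital I in +sIz stands for a high vowel whose quality follows vowel harmony. The sketch below illustrates this for Turkey Turkish only, which is an assumption made for illustration: the abstract concerns Common Turkic more broadly, and the function name `add_siz` and the example words are not from the source. Loanwords with irregular harmony are ignored.

```python
# Map the last vowel of the noun to the high vowel that realises "I" in +sIz.
# Illustration for Turkey Turkish only; input is expected in lowercase.
HARMONY = {
    "a": "ı", "ı": "ı",  # back unrounded  -> -sız
    "e": "i", "i": "i",  # front unrounded -> -siz
    "o": "u", "u": "u",  # back rounded    -> -suz
    "ö": "ü", "ü": "ü",  # front rounded   -> -süz
}


def add_siz(noun: str) -> str:
    """Attach the privative suffix +sIz ('without ...') to a lowercase noun."""
    for ch in reversed(noun):
        if ch in HARMONY:
            return noun + "s" + HARMONY[ch] + "z"
    raise ValueError(f"no vowel found in {noun!r}")


for word in ["ev", "su", "göz", "akıl"]:
    print(word, "->", add_siz(word))
```

The four variants differ only in the vowel, which is the kind of minimal sound difference the abstract mentions for the dialects that have the suffix.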
Language and Literature, Ural-Altaic languages
Programming Languages and Law: A Research Agenda
James Grimmelmann
If code is law, then the language of law is a programming language. Lawyers and legal scholars can learn about law by studying programming-language theory, and programming-language tools can be usefully applied to legal problems. This article surveys the history of research on programming languages and law and presents ten promising avenues for future efforts. Its goals are to explain how the combination of programming languages and law is distinctive within the broader field of computer science and law, and to demonstrate with concrete examples the remarkable power of programming-language concepts in this new domain.
On the Transferability of Pre-trained Language Models for Low-Resource Programming Languages
Fuxiang Chen, Fatemeh Fard, David Lo, et al.
A recent study by Ahmed and Devanbu reported that fine-tuning multilingual Pre-trained Language Models (PLMs) on a multilingual corpus of code achieves higher performance than fine-tuning on a corpus of code written in just one programming language. However, no analysis was made with respect to fine-tuning monolingual PLMs. Furthermore, some programming languages are inherently different, and code written in one language usually cannot be interchanged with code in another; for example, Ruby and Java code have very different structures. To better understand how monolingual and multilingual PLMs affect different programming languages, we investigate 1) the performance of PLMs on Ruby for two popular Software Engineering tasks: Code Summarization and Code Search, 2) the strategy (to select programming languages) that works well on fine-tuning multilingual PLMs for Ruby, and 3) the performance of the fine-tuned PLMs on Ruby given different code lengths. In this work, we analyze over a hundred pre-trained and fine-tuned models. Our results show that 1) multilingual PLMs have a lower Performance-to-Time Ratio (the BLEU, METEOR, or MRR scores over the fine-tuning duration) as compared to monolingual PLMs, 2) our proposed strategy to select target programming languages to fine-tune multilingual PLMs is effective: it reduces the time to fine-tune yet achieves higher performance in Code Summarization and Code Search tasks, and 3) our proposed strategy consistently shows good performance on different code lengths.
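As a rough illustration of the Performance-to-Time Ratio, the sketch below divides a score by the fine-tuning duration and also shows the standard definition of MRR. The function names, the choice of hours as the time unit, and the sample numbers are assumptions, not values or code from the paper.

```python
from typing import Sequence


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Standard MRR: mean of 1/rank, where rank is the 1-based position
    of the first correct result for each query."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    if any(r < 1 for r in ranks):
        raise ValueError("ranks are 1-based and must be >= 1")
    return sum(1.0 / r for r in ranks) / len(ranks)


def performance_to_time_ratio(score: float, duration_hours: float) -> float:
    """Score (BLEU, METEOR, or MRR) divided by the fine-tuning duration.
    Assumption: duration is measured in hours; the abstract does not state the unit."""
    if duration_hours <= 0:
        raise ValueError("duration_hours must be positive")
    return score / duration_hours


# Hypothetical numbers, for illustration only.
ranks = [1, 2, 4]
mrr = mean_reciprocal_rank(ranks)
print("MRR:", mrr)
print("Performance-to-Time Ratio:", performance_to_time_ratio(mrr, duration_hours=2.0))
```

Under this reading, a model with a slightly lower score can still have the higher ratio if it fine-tunes much faster.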
Did Indo-European Languages Stem From a Trans-Eurasian Original Language?
Xavier Rouard
This interdisciplinary study allowed me to establish, on the basis of linguistic, genetic, archaeological, historical and religious data, that linguistic concordances between Gaulish and Slavic were linked with Neolithic migrations from North-Western India and Pakistan to Iran, Mesopotamia, Anatolia, the Caucasus, the North of the Black Sea, Danubian and Balkan Europe, Gaul and Iberia, where Neolithic farmers contributed to the formation of the megalithic civilisation which developed in Gaul from 5000 BC and brought an archaic language stemming from a Trans-Eurasian original language. This explains the linguistic concordances I established between Gaulish and Dravidian languages (250 common words out of the 500 words I studied, and 160 with Burushaski), as well as with Altaic, Uralic, Kartvelian, Anatolian and Middle-Eastern languages. This also explains the similarities I have found in the organisation of society and religion, which lead certain researchers to suggest, on the basis of the spread of the very ancient haplogroup H2 P-96 from India to Western Europe, that the first Europeans and proto-Dravidians had a very ancient common origin, as the macrohaplogroup F and the haplogroup H may have appeared in India.
Latvian place names and dialects: A relevant source for the exploration of the Vidzeme South Estonian language
Lembit Vaba
Knowledge about the South Estonian language spoken in the parts of Livonia where Latvian prevailed is based on materials collected from the Leivus residing in the villages of Ilzene parish (Lv pagasts) in eastern Vidzeme. Little or no language material has been recorded from the groups of South Estonian speakers who are known to have lived in the parishes bordering Ilzene. The article introduces and analyses the works of Latvian place name and dialect researchers focusing on Lejasciems and Kalnamuiža as well as the Madona municipality (Lv novads) in the southeastern corner of Vidzeme, where South Estonians have historically lived.
Philology. Linguistics, Finnic. Baltic-Finnic
Persons in Linguistics of the Ural-Volga Region: Halil Açıkgöz
Feride Insanovna Tagirova
The work is devoted to the description of the life of the Turkish linguist H. Açıkgöz. He was the first in the history of Turkish scholarship to direct his research interests toward the Finno-Ugric peoples of the Ural-Volga region of Russia, the Mari and the Udmurts. Methods. The work is written in the genre of an essay, many episodes of which draw on the author’s memories of the scholar. Results. The presented material can be used in the compilation of bio-bibliographic indexes and databases in the field of the study of the Turkic and Finno-Ugric languages of the Ural-Volga region. Discussion. As a scholar, H. Açıkgöz had a wide range of research interests and systematic knowledge in various fields: modern linguistics and written monuments, medieval classical poetry, and art in general. In Turkey he was a lecturer at Istanbul University, known as a philologist in the true sense of the word. In Tatarstan he was known as a Turkologist, but few had any idea of the true scope of his research interests. In Mari El and Udmurtia he had not yet gained recognition, except within a narrow circle of scholars. This work is useful in that it sheds light on his activities in the field of Finno-Ugric studies, which still remain in the shadows. A wide circle of scholars, both in Russia and abroad, still do not know that H. Açıkgöz was engaged in the theoretical and practical study of the Finno-Ugric languages, the compilation of a Mari-Turkish dictionary, and the translation of N.I. Isanbaev's dictionary of Tatar and Bashkir borrowings in the Mari language.
COMPARATIVE ANALYSES OF PREPOSITIONS IN JAPANESE AND UZBEK LANGUAGES
Sitorabonu Farxodovna Malikova
In linguistics, the comparison of languages has always been at the center of attention. Although scholars recognize that Japanese and Uzbek belong to the same language family, the Altaic family, grammatical phenomena in the two languages are not the same: the languages show both similarities and differences. Comparing languages that belong to the same family involves studying the phenomena that occur in each of them. The category of case is widely observed in both languages, but several Japanese case markers correspond to a single case in Uzbek, and their scope of application is narrow. This article provides a comparative analysis of the accusative case suffix in Uzbek and its Japanese counterparts, and of the differences between them.
Comparative analysis of suffixes of possessive case in Japanese and Uzbek languages
Sitorabonu Farxodovna Malikova
In linguistics, the comparison of languages has always been at the center of attention. Although scholars recognize that Japanese and Uzbek belong to the same language family, the Altaic family, grammatical phenomena in the two languages are not the same: the languages show both similarities and differences. Comparing languages that belong to the same family involves studying the phenomena that occur in each of them. The category of case is widely observed in both languages, but several Japanese case markers correspond to a single case in Uzbek, and their scope of application is narrow. This article provides a comparative analysis of the accusative case suffix in Uzbek and its Japanese counterparts, and of the differences between them.
The Principles Of Typological Study Of Transposition In The Uzbek And Korean Languages
Nazarova Shakhlo
Like Uzbek, Korean belongs to the Altaic language family, and the sources assume that: a) Korean word forms follow agglutinative schemes based on stem + affix, b) sentences are based on the syntactic scheme “subject + secondary part + predicate”, c) word stress is stable and expiratory in character, etc. Additionally, the following can be distinguished in word structure: 1) 뾐ꭅ덽: 뾐 + ꭅ + 덽; 頝ꃝꍡ겑꽽鲙: 頝 + -ꃝꍡ- + -겑- + -꽽- + -鲙, such that roots and affixes can be joined one after another and separated; 2) according to the function of affix morphemes, «뾐-, -덽-, -ꃝꍡ-» word building, «-겑-, -꽽-, -鲙»; 3) the formation of transpositive and non-transpositive derived words by means of word-forming suffixes; 4) the relative freedom of the transpositive connection between independent words, auxiliary words and morphemes. These features also point to this genetic connection.
Sakha and Dolgan, the North Siberian Turkic languages
B. Pakendorf, Eugénie Stapert
This chapter provides a brief structural overview of the North Siberian Turkic languages Sakha (also known as Yakut) and Dolgan. Both languages are spoken in the northeast of the Russian Federation: Sakha in the Republic of Sakha (Yakutia) and Dolgan on the Taimyr Peninsula. These languages clearly fit the Turkic linguistic profile with vowel harmony, agglutinative morphology, SOV word order, and preposed relative clauses, but owing to contact-induced changes there are considerable differences from other Turkic languages as well. Notable differences are the loss of the Turkic genitive and locative cases and the development of a partitive and a comparative case, as well as a distinction between an immediate and a remote imperative. Like other so-called Altaic languages, Sakha and Dolgan make widespread use of nonfinite verb forms in subordination.