NILE: Formalizing Natural-Language Descriptions of Formal Languages
Tristan Kneisel, Marko Schmellenkamp, Fabian Vehlken
et al.
This paper explores how natural-language descriptions of formal languages can be compared to their formal representations and how semantic differences can be explained. This is motivated by educational scenarios in which learners describe a formal language (presented, e.g., by a finite state automaton, regular expression, pushdown automaton, context-free grammar or in set notation) in natural language, and an educational support system has to (1) judge whether the natural-language description accurately describes the formal language, and (2) explain why a description is not accurate. To address this question, we introduce a representation language for formal languages, Nile, which is designed so that Nile expressions can mirror the syntactic structure of natural-language descriptions of formal languages. Nile is sufficiently expressive to cover a broad variety of formal languages, including all regular languages and fragments of context-free languages typically used in educational contexts. Generating Nile expressions that are syntactically close to natural-language descriptions then makes it possible to explain inaccuracies in the descriptions algorithmically. In experiments on an educational data set, we show that LLMs can translate natural-language descriptions into equivalent, syntactically close Nile expressions with high accuracy, which allows explanations for incorrect natural-language descriptions to be provided algorithmically. Our experiments also show that while natural-language descriptions can also be translated into regular expressions (but not context-free grammars), the resulting expressions are often not syntactically close and are thus not suitable for providing explanations.
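As a rough illustration of task (1) above, the sketch below compares a reference language with a candidate formalization of a learner's description by enumerating all strings up to a fixed length and returning the shortest one on which they disagree. This is not Nile and not the authors' method: regular expressions stand in for both sides, the function name `bounded_difference`, the alphabet and the length bound are assumptions made for illustration, and a bounded search can only find counterexamples, never prove equivalence.

```python
import re
from itertools import product


def bounded_difference(pattern_a, pattern_b, alphabet="ab", max_len=6):
    """Return the shortest string (length <= max_len) accepted by exactly
    one of the two patterns, or None if no such string exists in that range.

    Hypothetical helper: a bounded check only, not an equivalence proof.
    """
    regex_a = re.compile(pattern_a)
    regex_b = re.compile(pattern_b)
    for length in range(max_len + 1):
        for chars in product(alphabet, repeat=length):
            word = "".join(chars)
            in_a = regex_a.fullmatch(word) is not None
            in_b = regex_b.fullmatch(word) is not None
            if in_a != in_b:
                return word  # witness that the two languages differ
    return None


# Reference language: strings over {a, b} that end with 'a'.
reference = r"(a|b)*a"
# Candidate for the (inaccurate) description "strings that contain an a".
candidate = r"(a|b)*a(a|b)*"

witness = bounded_difference(reference, candidate)
```

A returned witness such as a string that contains an `a` but does not end with one could serve as the starting point for an explanation to the learner; the syntactic closeness that the abstract emphasizes is not modelled here.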
Automatic Translation Between Kreol Morisien and English Using the Marian Machine Translation Framework
Z. Boodeea, S. Pudaruth, Nitish Chooramun
et al.
Kreol Morisien is a vibrant and expressive language that reflects the multicultural heritage of Mauritius. There are different Kreol languages: while Kreol Morisien is spoken in Mauritius, Kreol Rodrige is spoken only in Rodrigues, and the two are distinct languages. With only about 1.5 million speakers in the world, Kreol Morisien falls into the category of under-resourced languages. Initially, Kreol Morisien lacked a formalised writing system, with many people using different spellings for the same words. The first step towards standardising written Kreol Morisien came with the publication of the Kreol Morisien orthography in 2011 and the Kreol Morisien grammar in 2012 by the Kreol Morisien Academy. Kreol Morisien gained national standing in 2012, when it was introduced in educational organisations. This was a major breakthrough towards Kreol Morisien being recognised as a national language on the same level as English, French, and other oriental languages. By providing a means for Kreol Morisien speakers to connect with others, a translation system will help to preserve and strengthen the identity of the language and its speakers in an increasingly globalized world. The aim of this paper is to develop a translation system for Kreol Morisien and English. To this end, a dataset consisting of 50,000 parallel Kreol Morisien and English sentences was created, of which 48,000 sentence pairs were used to train the models, 1,000 were used for evaluation and another 1,000 were used for testing. Several machine translation systems, namely statistical machine translation, open-source neural machine translation, a Transformer model with attention mechanism, and Marian machine translation, are trained and evaluated. Our best model, using MarianMT, achieved a BLEU score of 0.62 for the translation of English to Kreol Morisien and a BLEU score of 0.58 for the translation of Kreol Morisien into English. 
To our knowledge, these are the highest BLEU scores that are available in the literature for this language pair. A high-quality translation tool for Kreol Morisien will facilitate its integration into digital platforms. This will make previously inaccessible knowledge more accessible, as the information can now be translated into the mother tongue of most Mauritians with reasonable accuracy.
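The 48,000 / 1,000 / 1,000 partition described in the abstract can be sketched as follows. The abstract does not say how the sentence pairs were assigned to each subset, so the seeded random shuffle, the function name `split_parallel_corpus` and the placeholder sentence pairs are all assumptions made for illustration.

```python
import random


def split_parallel_corpus(pairs, n_train=48_000, n_dev=1_000, n_test=1_000, seed=0):
    """Split a list of (source, target) sentence pairs into train/dev/test.

    Assumption: a seeded shuffle is used so the split is reproducible;
    the paper's actual assignment procedure is not stated in the abstract.
    """
    total = n_train + n_dev + n_test
    if len(pairs) < total:
        raise ValueError(f"need at least {total} pairs, got {len(pairs)}")
    shuffled = list(pairs)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)
    train = shuffled[:n_train]
    dev = shuffled[n_train:n_train + n_dev]
    test = shuffled[n_train + n_dev:total]
    return train, dev, test


# Placeholder data standing in for the real Kreol Morisien-English corpus.
pairs = [(f"kreol sentence {i}", f"english sentence {i}") for i in range(50_000)]
train, dev, test = split_parallel_corpus(pairs)
```

Keeping the evaluation and test subsets disjoint from the training data is what makes reported scores such as BLEU meaningful as estimates of performance on unseen sentences.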
5 citations
en
Computer Science
The role of suggestion in the enrichment of the deconstructive event (a study on the views of Abdullah Al-Ghadami in his book Al-Kateia and Takfir from structural to anatomical)
Mohamad Saeedyan Tabar, Khalil Parvini, Faramarz Mirzaei
et al.
The deconstruction trend, which began independently with the ideas of Jacques Derrida, revolves around the urgency of rebuilding the text after a devastating demolition. Derrida's style in his literary criticism is based on deconstructing the text by destroying the certainty of meaning and the centrality of fixed semantics, rather than being aimed at rebuilding the text. However, with the entry of this movement into the Arab world and its resurgence in the works of Abdullah Al-Ghadami, this trend found new life and was based on rebuilding the text as the most important artistic function of literary criticism. Within this framework, poetic suggestion and inspiration can be considered the main catalyst for the realization of this propaganda. In this research, using the descriptive-analytical approach, the role of suggestion in enriching the deconstructive approach was studied in light of Abdullah Al-Ghadami's practice in the book Al-Kateia and Takfir. The results indicate that this element brings remarkable developments to this approach, including: coherence of interpretation and writing, delving into the problematic, destroying the text and reconstructing it as a literary text, the poeticity of the reader, and changing the speech style. Keywords: Reading, poetic suggestion, deconstructive criticism, Abdullah Al-Ghadami. Extended summary. Introduction: In the last era, in the field of literary criticism, certainty was denied and forcibly replaced with doubt and lack of certainty in meaning. There is no doubt that doubt and the destruction of certainty are not just a challenge to the literary text; rather, what is seen is the collapse of the literary text, its dismantling, and then its arrangement again on the basis of the principles of a new critical approach. 
With the introduction of this idea to the Arab world and its resurgence in the works of Abdullah Al-Ghadhami, emphasis was placed on the aesthetic aspects of the literary text, such as the music of words and the use of metaphors and similes, to rebuild the literary text after its destruction under the deconstructive idea. Meanwhile, there is an essential element that plays an active role in the use of these tools: this element is considered a driving factor, and these tools are the media used to create a new text resulting from deconstructive reading. This force and basic element is nothing but poetic suggestion and inspiration. This internal tool is used to create a unique semantic space in the literary text. Material and methods: In this study, relying on deconstructive critical foundations, the effectiveness of this element in renewing the literary text is studied. At the same time, the achievements brought about by this element in the literary text are discussed with a focus on Al-Ghadhami’s views in the book “Sin and Atonement from Structuralism to Anatomicalism.” What is important in Abdullah Al-Ghadhami’s critical works is his focus on the practical aspect and on application to literary examples in criticism. In this regard, his interest in the factor of poetic inspiration occupies a special place in his critical approach. To achieve this goal, Al-Ghadhami first divided the literary text into smaller components called poetic sentences. Then, by merging these poetic sentences with the reader’s scientific repositories, which may be the result of other texts that he has read before and that are still in his imagination, and also by using emotional experiences and cultural influences, the reader reconstructs the text anew in a context that may be completely different from what the writer first intended. 
By observing this element in small components and linking it to deconstructing meaning, we reveal its importance and the secret of its beauty in generating new meaning. Research findings: Al-Ghadhami mentions several factors that evoke and activate the element of suggestion in reading the text, among them: balance and good choice of words, illusion and failure to dissect the moral intent, separating the signifier from its meaning, polarizing the effect, and quotation. It is also believed that there are artistic characteristics that occur during the deconstruction of the text by relying on the element of suggestion, such as cohesion of interpretation and writing: the relationship between the reader and the text is an existential relationship, because the reader, according to his psychological feeling, casts a shadow over the text, so that a different interpretation and context occurs in every reading, and the reader’s interpretation is what gives the literary text an artistic characteristic. Discussion of results: Involvement in the problem: The literary text always opens the door to multiple explanations, interpretations, and meanings in different contexts, by activating its suggestive energies according to each reader and during each particular reading. It enables the reader to provide multiple explanations, interpretations, and meanings, just as the reader also has his own spiritual states and meanings stored in his mind, which cause the text to become a spacious world for the wandering of contradictions and multiplicity in meaning and truth, and for the destruction of centrality. The text is transformed from a literary work to a literary text: With this view, the way the literary text is interpreted and analyzed differs. So, we liberate the text from the fence of positivistic and arbitrary connotations (as Al-Ghadhami puts it) and place it before a broad horizon of meanings and concepts emerging from different contexts. 
Poetic reading: Al-Ghadhami, as a deconstructive critic, looks at the literary text with the belief that the text itself is not very important. In fact, it is the reader who gives it a new spirit in every reading.
Oriental languages and literatures
Ms № 1467 Of Arabic Script Manuscripts Collection of the Matenadaran as a Newfound Example of the "Collection of Verse Dictionaries"
Ani Avetisyan
The present article examines, within the series of Ottoman Turkish manuscripts in the Matenadaran's Arabic script manuscripts collection, MS No. 1467, an example of the unique type of collection known in Ottoman Turkish manuscripts as the "Collection of Verse Dictionaries", in order to provide its first detailed study. These collections were compiled at the religious-educational institutions called tekke or dergāh, and the medrese. They were compiled as language textbooks, in order to make languages (Arabic, Persian, Ottoman Turkish) easy to learn through the simultaneous use of several verse dictionaries and to support learning languages by heart. The unique copy of the Matenadaran’s "Collection of Verse Dictionaries" includes 3 complete copies of bilingual (Arabic-Ottoman Turkish) and trilingual (Arabic-Persian-Ottoman Turkish) verse dictionaries by writers of the 14th-15th, 17th and 19th centuries: copies of the Arabic-Ottoman Turkish verse dictionaries "Luġat-i Ferişteoġlū" by Ferişteoġlu ʽAbdullaṭīf ibn Melek (whose proper name was ʽAbdullaṭīf ʽİzzeddīn et-Tirevī) and "Ṣubha-i Ṣıbyān" by Bosnalı Ebū̕ l-Fāżl Muḥammed (Meḥmed) ibn Aḥmed er-Rūmī, and also a complete copy of the Arabic-Persian-Ottoman Turkish verse dictionary "Tuḥfe-i Zībā" (also known under the titles "Tuḥfe-i Dürrī", "Tuḥfe-i Ḥayret" or "Tuḥfe-i Se Zebān") by Adanalı Ḫōca Meḥmed Ḥayret (whose proper name was Meḥmed Behāeddīn Ḥayret). The article presents in detail the works included in the collection. At the same time, it touches upon the methodology of writing verse dictionaries in classical Turkish literature, their structural features, and the significance and role of dictionaries in Turkish society, religion, literature and education. The purposes of writing verse dictionaries were in all cases to teach languages, to develop and spread literary speech, and to practise prosody (especially the ʽArūż meter). 
The comprehensive presentation of the collection is sufficient for it to be counted among the manuscripts of the four collections already known in foreign collections as the "Collection of Verse Dictionaries", and to become a source of new research opportunities for local and foreign specialists.
Oriental languages and literatures
Return Literature: A Reading on The Meanings and Connotations of The Title
Ramazan Ömer
Modern Palestinian literature has undergone fundamental transformations through three main phases: the pre-Nakba (Catastrophe) period, the post-Nakba period, and the post-Oslo Accords period. These phases have shaped the structure and themes of Palestinian literature. Resistance literature has been associated with or derived from various concepts such as “Nakba Literature,” “Fighter Literature,” “Diaspora Literature,” and “Literature of Return.” These terms reflect the diversity of issues addressed by Palestinian literature, necessitating a clear distinction between them to define their meanings and roles. This study examines “Literature of Return” as an independent literary concept within Palestinian resistance literature, distinguishing it from other related notions to provide a more precise definition. It explores the fundamental characteristics of “Literature of Return” and its role in emphasizing the right to return as an integral part of Palestinian national identity. The study adopts the literary analysis method, examining various texts and investigating works related to the term “Literature of Return,” and it addresses the different contexts and associations of this term. It also highlights “Literature of Return” as an independent literary genre that focuses exclusively on the issue of returning to the homeland, setting it apart from other literary currents related to the Palestinian cause. One of the key findings of this study is that “Literature of Return” is a distinct and independent literary genre, differing from “Literature of the Returnees” and “Migrant/Diaspora Literature,” which establishes it as a unique and autonomous category within Palestinian literature.
Oriental languages and literatures
The Prayer of Nabû-šuma-ukīn (BM.40474): An Anti-Witchcraft Prayer
Alan Lenzi
In 1999 Irving Finkel published the editio princeps of The Prayer of Nabû-šuma-ukīn and argued that the text provides historical corroboration for the imprisonment of Amēl-Marduk (Evil-Merodach) prior to his brief rule over Babylon (561‑560 BCE). In this study, I evaluate Finkel’s interpretation and argue that The Prayer has nothing to do with Amēl-Marduk. It is, rather, a prayer to combat witchcraft that has plagued the supplicant in the form of gossip, slander, and character assassination.
Oriental languages and literatures, Asian. Oriental
Extensibility in Programming Languages: An overview
Sebastian Mateos Nicolajsen
I here conduct an exploration of programming language extensibility, making an argument for an often overlooked component of conventional language design. This is not a technical detailing of these components; rather, I attempt to provide the kind of overview that I myself lacked during my time investigating programming languages. Thus, read this as an introduction to the magical world of extensibility. Through a literature review, I identify key extensibility themes (Macros, Modules, Types, and Reflection), highlighting diverse strategies for fostering extensibility. The analysis extends to cross-theme properties such as Parametricism and First-class citizen behaviour, introducing layers of complexity by highlighting the importance of customizability and flexibility in programming language constructs. By outlining these facets of existing programming languages and research, I aim to inspire future language designers to assess and consider the extensibility of their creations critically.
The Relative Monadic Metalanguage
Jack Liell-Cock, Zev Shirazi, Sam Staton
Relative monads provide a controlled view of computation. We generalise the monadic metalanguage to a relative setting and give a complete semantics with strong relative monads. Adopting this perspective, we generalise two existing program calculi from the literature. We provide a linear-non-linear language for graded monads, LNL-RMM, along with a semantic proof that it is a conservative extension of the graded monadic metalanguage. Additionally, we provide a complete semantics for the arrow calculus, showing it is a restricted relative monadic metalanguage. This motivates the introduction of ARMM, a computational lambda calculus-style language for arrows that conservatively extends the arrow calculus.
Forget English!: Orientalisms and World Literatures
A. Mufti
Event I. The development of oriental medicine in Russia in the XVIII-XIX centuries. The role of scientists of the Imperial Medico-Surgical (Military Medical) Academy
Galina O. Andreeva, M. M. Odinak, V. N. Tsygan
et al.
The article presents the history of the emergence of traditional oriental medicine in Russia during the XVIII–XIX centuries. The first information about oriental medicine was brought to Russia in the XVIII century by doctors who visited Mongolia and China as members of embassy expeditions. The first decades of the XVIII century can be considered the beginning of a systematic study of oriental treatment methods. This was possible thanks to the many years of effort by the employees of the Russian ecclesiastical mission in Beijing. From 1715 to 1864, this organization served religious, diplomatic and scientific functions. An invaluable contribution to the study of Chinese medicine was made by the leaders of the mission. A major role belongs to Nikita Yakovlevich Bichurin (Father Iakinf), archimandrite of the IX mission. He was fluent in Chinese, studied the primary sources of medical literature, translated significant treatises into Russian, and taught Chinese to the mission staff. The head of the X mission, Pavel Ivanovich Kamensky, compiled a Chinese-Russian medical dictionary, reorganized the mission, and insisted on the need to introduce the position of a doctor among the staff. Starting from 1821, doctors O.P. Voitsekhovsky, P.E. Kirillov, A.A. Tatarinov, S.I. Bazilevsky and P.A. Kornievsky, graduates of the Imperial Medical and Surgical (Military Medical) Academy, worked as physicians of the X–XIV missions. The doctors continued to study the theoretical concepts of Chinese medicine and the philosophical and cultural traditions that underlie healthcare. In addition to medical work, in accordance with the instructions of the Medical Council at the Ministry of Foreign Affairs, they explored epidemiology, healthcare organization and the process of training doctors in China, and analyzed Eastern approaches to the diagnosis and treatment of diseases, the pharmacopoeia, the use of herbal remedies, and methods of prevention and health maintenance. 
The scientific approach, knowledge of the Chinese language, and a long stay in the country allowed them to lay the foundations of Oriental medicine in Russia, acquaint the medical community with the methods of treatment and prevention of diseases adopted in China, introduce acupuncture, moxa, and the use of new types of herbal remedies, and enrich the collections of medicinal plants.
Al-Mujarrabāt (Tried and Tested Practices) in the Exegeses of the Holy Qur'an
Nasreddin Adjdir, Boudaoued Mohammed El Mahdi
This study aims to indicate the name of the experiments so that a person in his life likes to bring righteousness to him as a shepherd for the instinct that broke them, and the mental laws that do not favor anyone indicate attachment to those returns that overflow it with good, from this point of view people stick to experiments that benefit them, but the latter is limited by several legal caveats, especially concerning the corruption of belief or dragged into a greater spoiler, so this study came to look for the extent to which the people of interpretation of the experiments and seize Reliable controls in working out
Oriental languages and literatures, Islam
Building Walls, Social Groups and Empires: A Study of Political Power and Compliance in the Neo-Assyrian Period
Marta Lorenzon, Caroline Wallis
This contribution aims to use social history and social theory to investigate political power and compliance with authority in ancient Western Asia, through the case study of Neo-Assyrian imperial building projects. Our first aim is to discuss the realities of construction work in the Neo-Assyrian Empire, focusing on the building process both through literary sources and archaeological data. Our second goal is to understand the role played by these building sites in the strengthening of local and supra-local political orders, in the consolidation of social group boundaries, and in the construction of political subjectivities of the ancient social actors involved. Our reflection sheds light on the new interpretative possibilities – and challenges – that integrating social theories, archaeological work, and language technology may create.
History of Asia, Oriental languages and literatures
Domain-Specific Tensor Languages
Jean-Philippe Bernardy, Patrik Jansson
The tensor notation used in several areas of mathematics is a useful one, but it is not widely available to the functional programming community. In a practical sense, the (embedded) domain-specific languages (DSLs) that are currently in use for tensor algebra either 1. are array-oriented languages that do not enforce or take advantage of tensor properties and algebraic structure, or 2. follow the categorical structure of tensors but require the programmer to manipulate tensors in an unwieldy point-free notation. A deeper issue is that for tensor calculus, the dominant pedagogical paradigm assumes an audience which is comfortable with notational liberties which programmers cannot afford, or focuses on the applied mathematics of tensors, largely leaving their linguistic aspects (behaviour of variable binding, syntax and semantics, etc.) for the reader to figure out by themselves. This state of affairs is hardly surprising, because, as we highlight, several properties of standard tensor notation are somewhat exotic from the perspective of lambda calculi. We bridge the gap by defining a DSL, embedded in Haskell, whose syntax closely captures the index notation for tensors in wide use in the literature. The semantics of this EDSL is defined in terms of the algebraic structures which define tensors in their full generality. This way, we believe that our EDSL can be used both as a tool for scientific computing and as a vehicle to express and present the theory and applications of tensors.
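To make the index notation mentioned above concrete, the sketch below spells out what two standard expressions denote: the contraction C^i_k = A^i_j B^j_k, where the repeated index j is summed, and the trace A^i_i. It is written in plain Python over nested lists purely for illustration; it is not the authors' Haskell EDSL, and the function names `contract` and `trace` are assumptions.

```python
def contract(a, b):
    """Compute C[i][k] = sum over j of A[i][j] * B[j][k].

    In index notation this is C^i_k = A^i_j B^j_k: the repeated index j
    is bound by an implicit summation, while i and k stay free.
    """
    n_j = len(b)
    if any(len(row) != n_j for row in a):
        raise ValueError("the contracted index must have the same range in A and B")
    n_k = len(b[0]) if b else 0
    return [
        [sum(a[i][j] * b[j][k] for j in range(n_j)) for k in range(n_k)]
        for i in range(len(a))
    ]


def trace(a):
    """Compute A^i_i: contracting both indices of one tensor leaves a scalar."""
    return sum(a[i][i] for i in range(len(a)))


a = [[1, 2], [3, 4]]
b = [[0, 1], [1, 0]]
c = contract(a, b)  # [[2, 1], [4, 3]]
t = trace(a)        # 5
```

The explicit loops show the variable-binding behaviour that the abstract calls exotic: the summed index never appears in the result, and nothing in an array-based encoding like this one distinguishes upper from lower indices.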
The Science of “Balāǧat” and Oriental Classical Literature
S. Rustamiy
Objectives: This article is dedicated to exploring the intricate science of "Balāǧat" and its profound influence on Oriental classical literature. "Balāǧat," an Arabic term, is examined in its role as the art of eloquence and rhetoric that plays a central role in shaping the literary traditions of the Middle East and beyond. The study aims to delve into the historical development of "Balāǧat," seeking to illuminate its significance as a foundational element of classical Arabic, Persian, and other Eastern literary traditions. Methods: The article employs a methodical approach by examining the key principles and techniques of "Balāǧat." It focuses on elements such as metaphors, similes, allegories, and other rhetorical devices, exploring how these tools have been employed by renowned poets and writers to craft masterful works of literature. The investigation extends to understanding the impact of "Balāǧat" across various genres, including poetry, prose, and oratory, with the goal of shedding light on how these rhetorical tools convey complex ideas and emotions with unmatched elegance and sophistication. Results: The article presents the results of its exploration, highlighting the enduring legacy of "Balāǧat" in contemporary literature and its role in shaping the discourse on language, expression, and cultural identity. Through the analysis of selected literary works and critical perspectives, the study seeks to demonstrate the ongoing relevance of "Balāǧat" and its enduring contribution to the rich tapestry of Oriental classical literature. Conclusion: In conclusion, the article underscores the enduring significance of "Balāǧat" as a fundamental aspect of Oriental literary traditions. It emphasizes its lasting impact on the art of expression and the transmission of cultural heritage. 
The multidisciplinary approach taken throughout the study aims to provide a comprehensive understanding of the science of "Balāǧat" and its intricate connection to the captivating world of Oriental classical literature.
The Oriental context of the topos of the garden in Russian "Fin de siècle" poetry
Tatyana A. Dyachenko
The purpose of the study is to identify the features of the functioning of the topos of the garden in the subject-spatial organization of Russian poetic texts of the 1880s-1890s oriented towards the exoticism of the East. The subject of the study is the garden in the oriental context of poems of the designated period. The object is lyrical works of Russian poets of the last two decades of the XIX century, focused on the eastern context. Particular attention is paid to oriental imagery, which is transferred to the sphere of the universal language of poetry, into which poets translate the subjectivist attitude to the transformation of the real world into an ideal one. The specificity of the material and the multidimensional nature of its consideration determined the general philosophical principles of historicism and consistency as a methodological basis. The research is based on the synthesis of historical-genetic, comparative-historical, functional, and intertextual methods. The main conclusions of the study are: 1) in the space of the oriental locus of lyrical works of the 1880s-1890s, the garden most often acts as a culture-specific unit; 2) unlike the poets of the past decades, the authors of the "fin de siècle" era more often refer to Sufi symbolism itself. The novelty of the research lies in the fact that the poetic heritage of the Muslim East in the texts of the "epoch of timelessness", that is, pre-symbolism, as well as the initial stage of the symbolist trend, has not previously been the object of systematic research. According to E.A. Tahodi's fair remark, in the last three decades Russian scholarship has shown growing interest "in the figures of the "second row", in the epochs of transformation of traditional literary paradigms, including pre-symbolism," which remains the least studied page in the history of Russian literature of the XIX century.
The Acoustics of Alien Space in Oriental Notes by F. Werfel and E. Canetti
N. E. Seibel, E. M. Shastina
The material of the study is the travel notes Egyptian Diary by F. Werfel and Voices of Marrakesh by E. Canetti, two Austrian writers of Jewish origin. The task of both authors is defined as a return to national and cultural origins and reconstruction of the ‘oriental myth’, which, in their opinion, serves as the basis of any aesthetic search. It is concluded that the writers are united by the principle of text fragmentation, namely the transition from one space to another and the overcoming of the boundary both locally and in terms of spiritual development. Travel in literature necessarily implies segmentation of the text when changing locations. In most cases, however, verbal communication remains for Werfel and Canetti beyond the music they seek in the East. The dialogue obscures and recodes the deep strata of the ancestral culture, the desire to find which moves the authors. For musically-minded writers, each topos segment has its own acoustic score. The article describes the reasons why both authors refuse to study the language beforehand, preferring to rely on intuition and a sense of language. Since for both of them the sound is stronger than the word, they use similar principles of audiolization of the world. The authors of the article dwell on contrast, the capturing of rhythmic shift, and the use of music-related metaphors and other types of ekphrasis. The dialogue conducted in a comprehensible language (German, French) is opposed to the music of urban streets and the desert in the descriptions by Werfel and Canetti. The incomprehensible is aestheticized and mythologized, while the verbalized is felt as profane, ordinary, flat.
At the Origins of European Oriental Studies: an Unknown Letter by Benjamin Schultze to Georg Jacob Kehr
Kseniia D. Nikolskaia
The Russian Archive of Ancient Acts holds an archive belonging to G. J. Kehr (1692–1740), who stood at the very origins of European Oriental studies. There is very little information about this person. His archive is very large and extremely poorly sorted. Among the letters preserved in it are messages from eminent personalities of those days. Some papers of G. J. Kehr are connected with the South of India (Tranquebar), where Lutheran priests who were part of the so-called Danish Royal Mission were working at that time. Among these papers there is a short letter from B. Schultze (1689–1760). Schultze became the head of the mission in Tranquebar after the death of B. Ziegenbalg (1682–1719), its first organizer. Like Ziegenbalg, Schultze did a lot for the Christianization of the region and for the formation of Oriental studies as a science. He was the first among Europeans to study the Telugu language, published a grammar of this language, and translated the texts of the Bible into it. He studied dakkhinī, a dialect of Hindustani, and published a grammar of this language, outlining its basic rules in Latin. His letter, addressed to Kehr, is obviously a continuation of previous correspondence. Among other things, the message contains some rules for reading Tamil texts. In addition, valuable information is given about the work of missionaries on the translation of Christian literature into Tamil and about the activity of the printing house established in Tranquebar. Finally, the letter mentions the names of people significant for the era (language teachers and translators), who probably formed a circle of acquaintances for both G. J. Kehr and B. Schultze.
ViWOZ: A Multi-Domain Task-Oriented Dialogue Systems Dataset For Low-resource Language
Phi Nguyen Van, Tung Cao Hoang, Dung Nguyen Manh
et al.
Most current task-oriented dialogue (ToD) systems, despite showing interesting results, are designed for a handful of languages such as Chinese and English. Their performance in low-resource languages therefore remains a significant problem, owing to the absence of a standard dataset and evaluation policy. To address this problem, we propose ViWOZ, a fully-annotated Vietnamese task-oriented dialogue dataset. ViWOZ is the first multi-turn, multi-domain task-oriented dataset in Vietnamese, a low-resource language. The dataset consists of a total of 5,000 dialogues, including 60,946 fully annotated utterances. Furthermore, we provide a comprehensive benchmark of both modular and end-to-end models in low-resource language scenarios. With those characteristics, the ViWOZ dataset enables future studies on creating a multilingual task-oriented dialogue system.
Annotating Norwegian Language Varieties on Twitter for Part-of-Speech
Petter Mæhlum, Andre Kåsen, Samia Touileb
et al.
Norwegian Twitter data poses an interesting challenge for Natural Language Processing (NLP) tasks. These texts are difficult for models trained on standardized text in one of the two Norwegian written forms (Bokmål and Nynorsk), as they contain both the typical variation of social media text and a large amount of dialectal variety. In this paper we present a novel Norwegian Twitter dataset annotated with POS-tags. We show that models trained on Universal Dependencies (UD) data perform worse when evaluated against this dataset, and that models trained on Bokmål generally perform better than those trained on Nynorsk. We also see that performance on dialectal tweets is comparable to the written standards for some models. Finally, we perform a detailed analysis of the errors that models commonly make on this data.
Difficulties in teaching the second foreign (English) language to students studying oriental language as their major at NEFU, Yakutia, Russia
A. Ivanova
The Far Eastern Federal District of Russia, including the Republic of Sakha (Yakutia), is located in close proximity to the Asia-Pacific region, which explains the demand for specialists who know oriental languages. One of the oriental languages (Japanese, Chinese or Korean) is studied at the head university of the republic, and English is the language of business communication. In secondary educational institutions of Yakutia, English is the first foreign language, and students study it as a second foreign language at the university. The goal of this study is to identify the main difficulties of learning English as a second foreign language faced by bilingual students studying an oriental language (Japanese, Chinese or Korean) as their major. The study analyzed domestic and foreign literature, the professional educational programs of the North-Eastern Federal University in Yakutsk, and the characteristics of students from the indigenous population of Yakutia. It was substantiated that the more difficulties there are in mastering the subject, the stricter the requirements for mastering the educational material.