Gennaro Chierchia
Results for "Slavic languages. Baltic languages. Albanian languages"
Showing 20 of ~32060 results · from DOAJ, Semantic Scholar, arXiv
V. V. Zhura, I. Yu. Markovina
This study presents a comparative analysis of the genre characteristics of medical discharge documents within Russian and British linguistic culture. The relevance of this research stems from the insufficient exploration of written medical discourse genres, as well as the lack of data regarding variations in the representation of documented templates for outpatient and inpatient medical discharge records and their sociocultural specifics. The aim of this study is to conduct a comparative analysis of medical discharge documents produced in Russian and British linguistic cultures to identify their distinctive features, intragenre variations within a single linguistic culture, and intercultural similarities and differences. Methodological approaches employed include document analysis combined with comparative, stylistic, and interpretative analysis procedures. The study establishes the implications of the key genre-forming parameters for the examined type of documents. The analysis reveals that, within the domestic linguistic culture, intragenre variability is associated with modifications to such parameters as chronotope, producent-recipient orientation, compositional-structural characteristics, and linguistic features. Inter-cultural variability manifests itself in compositional-structural and formal-content-related differences, objective modality, and the intended audience of documents across the compared linguistic cultures. The findings contribute to the theory of intercultural professional communication and the practice of specialized translation.
Jay Lee
It is well-known that abstract interpreters can be systematically derived from their concrete counterparts using a "recipe," but developing sound static analyzers remains a time-consuming task. Reducing the effort required and mechanizing the process of developing analyzers continues to be a significant challenge. Is it possible to automatically retarget an existing abstract interpreter for a new language? We propose a novel technique to automatically derive abstract interpreters for various languages from an existing abstract interpreter. By leveraging partial evaluation, we specialize an abstract interpreter for a source language. The specialization is performed using the semantics of target languages written in the source language. Our approach eliminates the need to develop analyzers for new targets from scratch. We show that our method can effectively retarget an abstract interpreter for one language into a correct analyzer for another language.
N. Nikolaeva, A. Yermoshin, Anastasiya S. Volskaya
The article undertakes a conceptual analysis of the challenges associated with translating works from the Corpus Areopagiticum, a collection of theological treatises attributed to Dionysius the Areopagite from the 1st century. However, these works are unequivocally associated with early medieval Eastern Christian mystical-theological thought, presumably from the turn of the 5th—6th centuries. These texts first appeared in the Slavic Orthodox area in 1370, and subsequent translations emerged at the end of the 17th century, in the 18th and 19th centuries, and, most recently, in contemporary times. The authors introduce a set of criteria that facilitate the differentiation of the analyzed texts into distinct types of text transmission, namely transposition, retelling, and translation. These criteria are founded on factors such as the dominant translation strategy, the approach to the source language, and the textual tradition. The primary research methodology involves a diachronic analysis of linguistic material, employing comparative, stylistic, and textual analysis within the theolinguistic paradigm. The hypothesis posited in the article is substantiated based on empirical evidence. Moreover, the article draws conclusions regarding the impact of general linguistic changes on the nature of translations. This includes shifts in the role and status of the Church Slavonic language, the conditions contributing to the formation of a new literary language, and the inevitable influence of broader cultural and civilizational factors. The paper also explores the tradition of translating otherness, a practice that persists in contemporary times.
Samson Dodzi Fenuku
Formal education relies heavily on effective teaching methods and styles, which are particularly crucial in language education. The transfer of linguistic skills between Russian, a language from the Slavic branch of the Indo-European family, and Ewe, a language of the Niger-Congo family, presents a complex challenge due to their structural and cultural differences. This study examines the complexities of that transfer, focusing on the interplay between teaching methodologies, psychological factors, and philosophical perspectives. The primary objectives were to evaluate the effectiveness of various teaching methods, analyze the psychological influences on language education, and understand how philosophical viewpoints shape language teaching practices. Utilizing a mixed-methods approach, the research combined quantitative survey data from 60 participants with qualitative insights from interviews with 15 educators. The findings highlighted a preference for interactive and technology-assisted methods, with a strong emphasis on communicative techniques. The psychological assessment indicated high levels of motivation and adaptability among learners, favouring interactive and collaborative learning environments. Thematic analysis of philosophical perspectives revealed a diverse spectrum of ideologies that inform teaching practices. The study concludes that a pedagogical approach embracing interactivity, technological integration, and philosophical diversity is essential for effective language education. It calls for professional development that equips educators to employ a range of teaching methods and to use technology effectively. The research underscores the need for a learner-centred, adaptable educational paradigm that caters for diverse learner needs and preferences, advocating a more inclusive approach to language teaching.
Matúš Pikuliak, Andrea Hrckova, Stefan Oresko et al.
We present GEST -- a new manually created dataset designed to measure gender-stereotypical reasoning in language models and machine translation systems. GEST contains samples for 16 gender stereotypes about men and women (e.g., Women are beautiful, Men are leaders) that are compatible with the English language and 9 Slavic languages. The definition of said stereotypes was informed by gender experts. We used GEST to evaluate English and Slavic masked LMs, English generative LMs, and machine translation systems. We discovered significant and consistent amounts of gender-stereotypical reasoning in almost all the evaluated models and languages. Our experiments confirm the previously postulated hypothesis that the larger the model, the more stereotypical it usually is.
Peter Rupnik, Taja Kuzman, Nikola Ljubesic
Automatic discrimination between Bosnian, Croatian, Montenegrin and Serbian is a hard task due to the mutual intelligibility of these South-Slavic languages. In this paper, we introduce the BENCHić-lang benchmark for discriminating between these four languages. The benchmark consists of two datasets from different domains - a Twitter and a news dataset - selected with the aim of fostering cross-dataset evaluation of different modelling approaches. We experiment with the baseline SVM models, based on character n-grams, which perform nicely in-dataset, but do not generalize well in cross-dataset experiments. Thus, we introduce another approach, exploiting only web-crawled data and the weak supervision signal coming from the respective country/language top-level domains. The resulting simple Naive Bayes model, based on less than a thousand word features extracted from web data, outperforms the baseline models in the cross-dataset scenario and achieves good levels of generalization across datasets.
E. V. Bodnaruk, T. N. Astakhova
The purpose of this article is a comparative analysis of evidential markers in German-language texts of scientific and popular science discourse. The novelty of the study lies in the fact that, to date, no scientific works have compared the means of expressing evidentiality in these types of discourse. Evidential markers contain a link to the source of the information. Researchers note that indicating the source of information increases the degree of reliability of the reported information. One of the main characteristics of scientific and popular science discourse is intertextuality, which is expressed with the help of evidential markers that vary depending on the discourse. The material of the study comprised 5 texts (299 mentions of the source of information) of scientific discourse and 28 texts (281 mentions of the source of information) of popular science discourse in German, dedicated to the problems of the Arctic. The study found that statements with the evidential meanings “direct evidentiality” and “citation” are more common in scientific discourse than in popular science discourse. At the same time, full citations are less common in scientific texts than in popular science ones. The meaning “rumors” and fragmentary quoting are rather rare in both discourses. The lowest frequency was found for the meaning “inferentiality”, recorded only in the texts of popular science discourse.
Wojciech Siemaszkiewicz
John Leo Mish (1909–1983) was the Chief of the Slavic and Baltic Division from 1956 through 1976. He was also the Chief of the Oriental (later Middle East) Division at the same time. He was a renowned linguist, fluent in 34 languages, including Manchu. He was born in Upper Silesia and, during World War II, found himself in Austria, Greece, and finally India. He emigrated to the USA in 1946, where he continued his scholarly career.
Gabriel Orlanski, Kefan Xiao, Xavier Garcia et al.
Current benchmarks for evaluating neural code models focus on only a small subset of programming languages, excluding many popular languages such as Go or Rust. To ameliorate this issue, we present the BabelCode framework for execution-based evaluation of any benchmark in any language. BabelCode enables new investigations into the qualitative performance of models' memory, runtime, and individual test case results. Additionally, we present a new code translation dataset called Translating Python Programming Puzzles (TP3) from the Python Programming Puzzles (Schuster et al. 2021) benchmark that involves translating expert-level Python functions to any language. With both BabelCode and the TP3 benchmark, we investigate whether balancing the distributions of 14 languages in a training dataset improves a large language model's performance on low-resource languages. Training a model on a balanced corpus results in, on average, 12.34% higher $pass@k$ across all tasks and languages compared to the baseline. We find that this strategy achieves 66.48% better $pass@k$ on low-resource languages at the cost of only a 12.94% decrease on high-resource languages. In our three translation tasks, this strategy yields, on average, 30.77% better low-resource $pass@k$ while having 19.58% worse high-resource $pass@k$.
Manuel Delgado, Jaume Usó i Cubertorer
There is a one-to-one and onto correspondence between the class of numerical semigroups of depth $n$, where $n$ is an integer, and a certain language over the alphabet $\{1,\ldots,n\}$ which we call a Kunz language of depth $n$. The Kunz language associated with the numerical semigroups of depth $2$ is the regular language $\{1,2\}^*2\{1,2\}^*$. We prove that Kunz languages associated with numerical semigroups of larger depth are context-sensitive but not regular.
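As a small illustration of the depth-2 case stated in this abstract, the regular language $\{1,2\}^*2\{1,2\}^*$ is the set of words over the alphabet {1, 2} that contain at least one 2. The sketch below only checks membership in that regular language; the function name `in_kunz_depth_2` is a hypothetical helper, not taken from the paper, and the code does not construct the correspondence with numerical semigroups.

```python
import re

# {1,2}* 2 {1,2}*: words over the alphabet {1, 2} containing at least one 2.
# Hypothetical helper for illustration; not the authors' implementation.
KUNZ_DEPTH_2 = re.compile(r"[12]*2[12]*")


def in_kunz_depth_2(word: str) -> bool:
    """Return True if `word` belongs to the regular language {1,2}*2{1,2}*."""
    return KUNZ_DEPTH_2.fullmatch(word) is not None


if __name__ == "__main__":
    for w in ["2", "121", "111", "", "123"]:
        print(repr(w), in_kunz_depth_2(w))
```

Words such as "2" and "121" are accepted, while "111" (no 2), the empty word, and "123" (a letter outside the alphabet) are rejected.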
Mikhail Arkhipov, M. Trofimova, Yuri Kuratov et al.
Our paper addresses the problem of multilingual named entity recognition on the material of 4 languages: Russian, Bulgarian, Czech and Polish. We solve this task using the BERT model. We use a multilingual model covering a hundred languages as the base for transfer to the mentioned Slavic languages. Unsupervised pre-training of the BERT model on these 4 languages allows us to significantly outperform baseline neural approaches and multilingual BERT. Additional improvement is achieved by extending BERT with a word-level CRF layer. Our system was submitted to the BSNLP 2019 Shared Task on Multilingual Named Entity Recognition and demonstrated top performance in the multilingual setting for two competition metrics. We open-sourced the NER models and the BERT model pre-trained on the four Slavic languages.
Albina F. Noskova
At the request of the editorial board of the journal Slavic World in the Third Millennium, Albina Fedorovna Noskova (born 1936), Doctor of Historical Sciences and chief researcher of the Institute of Slavic Studies of the Russian Academy of Sciences, recounts her life and career path in science. She graduated from the Department of Southern and Western Slavs of the History Faculty of Moscow State University in 1959 and then studied at the graduate school of the Institute from 1961 to 1964. Albina Fedorovna is a recognised specialist in both the modern history of Poland and the problems in the history of Soviet-Polish relations. The principal lines of her investigations included the history of Poland and other Eastern European countries during and after World War II, the problems of Slavic-German relations, and the policy of Moscow in Eastern Europe. A. F. Noskova is the author of several hundred academic works, as well as the organiser of and a participant in many international projects and conferences. Albina Fedorovna discusses her childhood, her parents and teachers, her studies at the Department of Southern and Western Slavs of the History Faculty of Moscow State University, and her work in archives and at the Institute of Slavic Studies, as well as her business trips abroad.
Tomoyuki Yamakami
We study the computational complexity of finite intersections and finite unions of deterministic context-free (dcf) languages. Earlier, Wotschke [J. Comput. System Sci. 16 (1978) 456--461] demonstrated that intersections of $(d+1)$ dcf languages are in general more powerful than intersections of $d$ dcf languages for any positive integer $d$ based on the separation result of the intersection hierarchy of Liu and Weiner [Math. Systems Theory 7 (1973) 185--192]. The argument of Liu and Weiner, however, works only on bounded languages of particular forms, and therefore Wotschke's result is not directly extendable to other non-bounded languages. To deal with a wide range of languages for the non-membership to the intersection hierarchy, we circumvent the specialization of their proof techniques and devise a new and practical technical tool: two pumping lemmas for finite unions of dcf languages. Since the family of dcf languages is closed under complementation and also under intersection with regular languages, these pumping lemmas help us establish the non-membership relation of languages formed by finite intersections of target languages. We also concern ourselves with a relationship to deterministic limited automata of Hibbard [Inf. Control 11 (1967) 196--238] in this regard.
Giovanna D'Agostino, Davide Martincigh, Alberto Policriti
Ordering the collection of states of a given automaton starting from an order of the underlying alphabet is a natural move towards a computational treatment of the language accepted by the automaton. Along this path, Wheeler graphs have been recently introduced as an extension/adaptation of the Burrows-Wheeler Transform (the now famous BWT, originally defined on strings) to graphs. These graphs constitute an important data structure for languages, since they allow a very efficient storage mechanism for the transition function of an automaton, while providing fast support for all sorts of substring queries. This is possible as a consequence of a property -- the so-called path coherence -- valid on Wheeler graphs and consisting in an ordering on nodes that "propagates" to (collections of) strings. By looking at a Wheeler graph as an automaton, the ordering on strings corresponds to the co-lexicographic order of the words entering each state. This leads naturally to consider the class of regular languages accepted by Wheeler automata, i.e. the Wheeler languages. It has been shown that, as opposed to the general case, the classic determinization by powerset construction is polynomial on Wheeler languages. As a consequence, most of the classical problems turn out to be "easy" -- that is, solvable in polynomial time -- on Wheeler languages. Moreover, deciding whether a DFA is Wheeler and deciding whether a DFA accepts a Wheeler language are both polynomial. Our contribution here is to put an upper bound to easy problems. For instance, whenever we generalize by switching to general NFAs or by not fixing an order of the underlying alphabet, the above-mentioned problems become "hard" -- that is, NP-complete or even PSPACE-complete.
Peter D. Mosses
When a new programming language appears, the syntax and intended behaviour of its programs need to be specified. The behaviour of each language construct can be concisely specified by translating it to fundamental constructs (funcons), compositionally. In contrast to the informal explanations commonly found in reference manuals, such formal specifications of translations to funcons can be precise and complete. They are also easy to write and read, and to update when the language evolves. The PLanCompS project has developed a large collection of funcons. Each funcon is defined independently, using a modular variant of structural operational semantics. The definitions are available online, along with tools for generating funcon interpreters from them. This paper introduces and motivates funcons. It illustrates translation of language constructs to funcons, and funcon definition. It also relates funcons to the notation used in some previous language specification frameworks, including monadic semantics and action semantics.
Alessandro Niero
Review of the book: Vladislav Chodasevič, Non è tempo di essere, a cura di Caterina Graziadei, Milano, Bompiani, 2019.
Paweł Lachowicz
The fall of imperial authority and the decline of the Byzantine state at the end of the 12th century have their cause not only in foreign policy but also, to a large extent, in the family policy of the Komnenoi emperors. The “clan” system introduced during Alexios I’s reign and continued by his successors connected the aristocratic elites with the imperial family by blood ties. In the 12th century, the composition of this group, linked by a complicated marriage network, underwent a significant transformation, which could be one of the most important factors of the later crisis. The purpose of this paper is twofold. First: distinguishing two groups of aristocrats within the Komnenos “clan”, i.e. the “core” Komnenos family and affine families. Second: determining their approximate number during the 12th century. The relatively large amount of data about the aristocratic elites of that period allows for a statistical approach. The written sources and sigillography of 12th-century Byzantium are rich in information about high-ranking persons. In addition, the Komnenos era has been thoroughly described in prosopographical works. This allows for counting the number of aristocrats and thus obtaining reliable results. Such an approach is not free from estimation and probability. However, the amount of information is sufficient to show the overall trends visible in the composition of the elites associated with the Komnenoi. The result of this study is a table that shows the tendency of the weakening of the Komnenos family in the face of a constantly growing group of affine aristocratic families. This sheds new light on the progressive collapse of imperial authority after the death of Manuel I Komnenos, the key role of the destructive actions of Andronikos I, and the weakness of the Angelos dynasty.
Yunsha Du
N. Yu. Pilyugina, E. S. Sheremetyeva
The article examines the specific text functions of the combination tak ili inache ‘anyway’. The novelty of the study lies in the fact that this combination is described for the first time in terms of its functioning in the text, and the types of contexts characteristic of it are analyzed. The analysis revealed that the combination tak ili inache functions as a text bond with an alternative-concessive meaning. The relevance of the study is determined by a multidimensional approach to describing the textual nature of tak ili inache: this phenomenon is analyzed in terms of its semantic, syntactic and communicative-pragmatic features. The treatment of the combination tak ili inache in explanatory dictionaries and dictionaries of functional parts of speech is presented; the concept of a text bond is described in relation to adjacent terms. It is established that the text bond tak ili inache is a formal means of organizing text constructions; the structure of these constructions is described, especially the implementation of the left and right components. Special attention is paid to the explication of the communicative and pragmatic potential of the bond. Based on an analysis of the propositional content of the left part of structures with tak ili inache, the contextual modifications of the studied structures are distinguished. The ability of the bond to create typical text constructions characteristic of journalistic and scientific texts is noted.