Minuska: Towards a Formally Verified Programming Language Framework
Jan Tušil, Jan Obdržálek
Programming language frameworks allow us to generate language tools (e.g., interpreters) just from a formal description of the syntax and semantics of a programming language. As these frameworks tend to be quite complex, an issue arises whether we can trust the generated tools. To address this issue, we introduce a practical formal programming language framework called Minuska, which always generates a provably correct interpreter given a valid language definition. This is achieved by (1) defining a language MinusLang for expressing programming language definitions and giving it formal semantics and (2) using the Coq proof assistant to implement an interpreter parametric in a MinusLang definition and to prove it correct. Minuska provides strong correctness guarantees and can support nontrivial languages while performing well. This is the extended version of the SEFM24 paper of the same name.
The Equivalence Problem of E-Pattern Languages with Regular Constraints is Undecidable
Dirk Nowotka, Max Wiedenhöft
Patterns are words with terminals and variables. The language of a pattern is the set of words obtained by uniformly substituting all variables with words that contain only terminals. Regular constraints restrict valid substitutions of variables by associating with each variable a regular language representable by, e.g., finite automata. Pattern languages with regular constraints contain only words in which each variable is substituted according to a set of regular constraints. We consider the membership, inclusion, and equivalence problems for erasing and non-erasing pattern languages with regular constraints. Our main result shows that the erasing equivalence problem, one of the most prominent open problems in the realm of patterns, becomes undecidable if regular constraints are allowed in addition to variable equality.
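To make the definitions above concrete, the following is a minimal sketch of the membership problem only. It is a brute-force backtracking check and not a technique from the paper. The function name `matches`, the convention that upper-case symbols are variables, and the use of Python regular expressions to encode the regular constraints are all assumptions made for illustration.

```python
import re
from typing import Dict, List


def matches(pattern: List[str], constraints: Dict[str, str], word: str,
            erasing: bool = True) -> bool:
    """Check whether `word` belongs to the language of `pattern`.

    Upper-case symbols are variables, all other symbols are terminals
    (an illustrative convention). `constraints` maps a variable to a regex
    that its substitution must fully match; unconstrained variables may
    take any word. With erasing=False, variables must be non-empty.
    """
    def solve(i: int, pos: int, subst: Dict[str, str]) -> bool:
        if i == len(pattern):
            return pos == len(word)
        sym = pattern[i]
        if not sym.isupper():  # terminal: must appear literally
            return word.startswith(sym, pos) and solve(i + 1, pos + len(sym), subst)
        if sym in subst:  # bound variable: substitution is uniform
            val = subst[sym]
            return word.startswith(val, pos) and solve(i + 1, pos + len(val), subst)
        min_len = 0 if erasing else 1
        rx = constraints.get(sym)
        for end in range(pos + min_len, len(word) + 1):
            val = word[pos:end]
            if rx is not None and re.fullmatch(rx, val) is None:
                continue  # candidate violates the regular constraint
            subst[sym] = val
            if solve(i + 1, end, subst):
                return True
            del subst[sym]  # backtrack
        return False

    return solve(0, 0, {})


# Pattern X a X where X is constrained to the regular language b*.
print(matches(["X", "a", "X"], {"X": "b*"}, "bbabb"))           # True
print(matches(["X", "a", "X"], {"X": "b*"}, "a"))               # True (X erased)
print(matches(["X", "a", "X"], {"X": "b*"}, "a", erasing=False))  # False
```

Membership is decidable by exhaustive search like this because the word is finite. The undecidability result above concerns equivalence of two such languages, which no search of this kind can settle.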
«Grei skuring»? A comedian's imitation of Polish-accented Norwegian
Anne Golden, Lars Anders Kulbrandstad
In this article we present Danuta, the stage character of the comedian Lene Kongsvik Johansen, as she appears as part of a stand-up show. Danuta is a cleaner from Poland who tells of her experiences in the homes she has worked in while reflecting on Norwegians' behaviour. We analyse Danuta's language and performance, in particular Kongsvik Johansen's imitation of a second-language-accented Norwegian spoken by Poles, including the choice of words, and to some extent the content of the narrative. What we want to find out is to what extent Danuta's communicative behaviour has something in common with authentic second-language Norwegian spoken by Poles while they are learners. The data for this comparison come from several sources: analysis of a recording of a Polish learner, earlier studies of Poles' second-language Norwegian, and a discussion of Johansen's imitation with four Polish linguists. We find that Johansen's imitation shares many features with Polish-accented second-language Norwegian, although these are not applied consistently. We also ask whether such humour, with its imitation of second-language-accented Norwegian and its humorous narratives, can be perceived as condescending and is therefore problematic. Here we find that Danuta's narrative is as harsh towards Norwegians' ways as it is towards Poles, and that the stereotypical notions she brings up point to Norwegians just as often. In this discussion we use analytical concepts from a couple of humour theories, and we refer to the views of Polish and Norwegian linguists. In addition, we point to some stand-up comedians' assessments of the limits of humour as expressed in the media and in interviews.
North Germanic. Scandinavian
Scandinavism through Dutch and Flemish eyes
Tim van Gerven
This article analyses the reception of Scandinavism in the Dutch and Flemish press from the start of the nineteenth century up to the end of World War I. It demonstrates that increasing knowledge of the pan-Scandinavian movement occurred in tandem with a growing interest in Scandinavian culture more generally and affected the Dutch language in its definition of both ‘Scandinavia’ and ‘Scandinavians’. Press coverage of Scandinavism was mainly dictated by the movement’s newsworthiness, and Dutch understanding of the movement as a political unification project, a more modest cultural programme, or both developed in step with current events. Pan-national activism in Scandinavia was also deemed of interest because of its significance for similar initiatives in the Dutch-speaking world. Scandinavian nation-building processes became an inspiration for burgeoning Greater Netherlandism, as well as for the Flemish and Frisian national movements. Overall, the article contributes to the study of Scandinavism in particular by exploring how the movement was received outside its borders, and to the study of pan-nationalisms more generally by applying a comparative and transnational perspective, thus amplifying the central but often neglected role played by pan-national identity cultivation in the European nation-building discourse.
Top-Down Versus Bottom-Up Approaches to Aspect: The Case of the Dutch Prepositional Progressive
Maarten Bogaards
Progressive constructions in Germanic are usually studied exclusively as progressive constructions. I characterize this as a top-down approach to aspect, which, I argue, harbors the risk of overlooking relevant language-specific structures that are similar in form and meaning. This paper, therefore, advocates taking a bottom-up approach. Based on a case study of the prepositional progressive in Dutch (aan het-progressive), I claim that this approach is of added empirical and theoretical value. Drawing on construction-based theories, the relevant patterns, dubbed situational constructions, are analyzed in terms of horizontal constructional links.
100 years/100 books: a celebration of the Czech-Norwegian diplomatic relations
Adéla Ficová
Germanic languages. Scandinavian languages, History of Northern Europe. Scandinavia
Enumerating Regular Languages with Bounded Delay
Antoine Amarilli, Mikaël Monet
We study the task, for a given language $L$, of enumerating the (generally infinite) sequence of its words, without repetitions, while bounding the delay between two consecutive words. To allow for delay bounds that do not depend on the current word length, we assume a model where we produce each word by editing the preceding word with a small edit script, rather than writing out the word from scratch. In particular, this witnesses that the language is orderable, i.e., we can write its words as an infinite sequence such that the Levenshtein edit distance between any two consecutive words is bounded by a value that depends only on the language. For instance, $(a+b)^*$ is orderable (with a variant of the Gray code), but $a^* + b^*$ is not. We characterize which regular languages are enumerable in this sense, and show that this can be decided in PTIME in an input deterministic finite automaton (DFA) for the language. In fact, we show that, given a DFA $A$, we can compute in PTIME automata $A_1, \ldots, A_t$ such that $L(A)$ is partitioned as $L(A_1) \sqcup \ldots \sqcup L(A_t)$ and every $L(A_i)$ is orderable in this sense. Further, we show that the value of $t$ obtained is optimal, i.e., we cannot partition $L(A)$ into fewer than $t$ orderable languages. In the case where $L(A)$ is orderable (i.e., $t=1$), we show that the ordering can be produced by a bounded-delay algorithm: specifically, the algorithm runs in a suitable pointer machine model, and produces a sequence of bounded-length edit scripts to visit the words of $L(A)$ without repetitions, with bounded delay (exponential in $|A|$) between consecutive scripts. In fact, we show that we can achieve this while only allowing the edit operations push and pop at the beginning and end of the word, which implies that the word can in fact be maintained in a double-ended queue.
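The orderability of $(a+b)^*$ mentioned above can be illustrated with a small sketch. It is one possible Gray-code variant, not necessarily the one the authors use, and unlike their algorithm it allows substitutions anywhere in the word rather than only push and pop at the ends. The names `gray_order` and `edit_distance` are illustrative. Within each length the words follow a reflected Gray code shifted to start where the previous length ended, and moving to the next length appends a single `a`.

```python
from itertools import islice
from typing import Iterator


def gray_order() -> Iterator[str]:
    """Yield every word over {a, b} exactly once, one edit apart each time."""
    start = 0  # bit mask of the first word of the current length (1 = 'b')
    n = 0
    while True:
        code = start
        for i in range(1 << n):
            # Reflected Gray code XOR a constant is still a Gray code,
            # so consecutive words differ in exactly one position.
            code = (i ^ (i >> 1)) ^ start
            yield "".join("b" if (code >> k) & 1 else "a" for k in range(n))
        # Appending 'a' adds a zero bit, so the mask value is unchanged.
        start = code
        n += 1


def edit_distance(u: str, v: str) -> int:
    """Standard dynamic-programming Levenshtein distance."""
    prev = list(range(len(v) + 1))
    for i, cu in enumerate(u, 1):
        cur = [i]
        for j, cv in enumerate(v, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (cu != cv)))
        prev = cur
    return prev[-1]


words = list(islice(gray_order(), 15))  # all words of length 0 to 3
print(words[:7])  # ['', 'a', 'b', 'ba', 'aa', 'ab', 'bb']
print(all(edit_distance(x, y) == 1 for x, y in zip(words, words[1:])))  # True
```

No such ordering exists for $a^* + b^*$: moving between a long word of $a$'s and a long word of $b$'s costs an unbounded number of edits, and any enumeration has to make that move infinitely often.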
For a Better Dictionary: Revisiting Ecolexicography as a New Paradigm
Xiqin Liu, Jing Lyu, Dongping Zheng
Driven by practical conundrums that users often face in maximizing (e-)dictionaries as a companion resource, this article revisits and redefines ecolexicography as a new paradigm that situates compilers and users in a relational dynamic. Drawing insights from ecolinguistics and cognitive studies, it appeals for rethinking the compiler–user relationship and placing dictionaries in a distributed cognitive system. A multidimensional framework of ecolexicography is proposed, consisting of a micro-level and a macro-level. To the micro-level, both symbolic and cognitive dimensions are added: (1) the dictionary can be symbolically viewed as a semantic and semiotic ecology; (2) dialogicality should be highlighted as an essential aspect of e-dictionary compilation/design, and distributed cognition can be emancipatory for rethinking dictionary use. The macro-level concerns the obligations of lexicographers as committed to three interrelated ecologies or ecosystems: language, socio-culture and nature. Transdisciplinary in nature, ecolexicography involves a holistic, systematic and integrative methodology to nourish lexicographical practice and research. Corpus-based Frame Analysis is introduced to identify ecologically destructive frames and ideologies so that the dictionary discourse could be reframed. The study upgrades our understanding of the ontological, epistemological and methodological aspects related to ecolexicography, serving as a call for philosophical reflections on metalexicography. It is also expected to create an opportunity for lexicographers to examine problems with (e-)dictionaries in a new light and dialogue about how to find solutions.
Philology. Linguistics, Languages and literature of Eastern Asia, Africa, Oceania
Foreign Language Use in German and Turkish Magazine Advertisements: A Comparative Analysis
İrem Atasoy
Advertisements today are heavily influenced by the use of foreign languages as a persuasive tool. This is driven by globalization and the expansion of international brands and multinational advertising agencies. Advertisers frequently use foreign languages to attract attention and to evoke associations and a sense of internationality. Foreign words and phrases in advertisements are defined as linguistic codes that can be easily distinguished and remembered by individuals because of their distinctive phonological and morphological features. Such foreign words and phrases are emphasized by using various visual codes (e.g., font, point size, uppercase/lowercase letters, and color) as typographic components. The use of foreign language in advertisements is one of the most analyzed subjects in the field of advertising language. This study seeks to examine the use of foreign languages in German and Turkish magazine advertisements. Adopting linguistic and semiotic approaches, which treat language as a sign and advertisement as a message system, this paper focuses on the growing role of foreign languages in German and Turkish advertisements. The corpus consists of the first three online issues of Elle magazine published in 2021 in Germany and Turkey. The data are first classified quantitatively and qualitatively, then analyzed in three steps applying linguistic and semiotic criteria. The first step involves the linguistic analysis of foreign words and phrases. The second step undertakes a semiotic analysis of the typographic elements employed as visual codes of foreign language usage in advertisements. The third step is devoted to a discussion and comparison of the results of the foregoing analysis.
German literature, Germanic languages. Scandinavian languages
Yiddish Causal-Noncausal Alternation in Areal Perspective
E. Luchina
Languages differ in the way they code causal-noncausal alternations, in which an event is presented as either having an external causer or happening by itself. Some languages make no distinction between the two situations, while others make a morphosyntactic distinction. Yiddish, a Germanic language, differs from other genealogically close Germanic varieties: Yiddish codes causal-noncausal alternation similarly to the coterritorial Slavic languages with which it was in contact, for instance Polish and Russian. The two tendencies that make Yiddish similar to the Slavic languages in this respect are the rise of anticausative marking (direct calquing) and the development of a causative (preference for overt marking).
Auxiliary Selection in Yiddish Dialects
L. Schäfer
The variation of the two past tense auxiliaries (HAVE and BE) is a well-studied phenomenon in European languages, especially in the West Germanic varieties. So far, however, the situation in Eastern Yiddish has not been examined. This paper focuses on auxiliary selection in these Yiddish dialects based on data from the Language and Culture Archive of Ashkenazic Jewry, which were collected in the 1960s. Like most of the current works on this topic, the following analysis uses and discusses Sorace’s (1993, 2000) Auxiliary Selection Hierarchy, which makes it possible to examine the Yiddish structures in light of historical and diatopic evidence from other Germanic varieties, particularly German and Dutch. The main focus is on intransitive verbs that show a high degree of variation: state verbs, controlled and uncontrolled motional process verbs, and change-of-state verbs. However, the Auxiliary Selection Hierarchy also has weaknesses, as is demonstrated in the following.
On abstraction in the OMG hierarchy: systems, models, and descriptions
A. Prinz, T. Xanthopoulou, Terje Gjøsæter et al.
Dark Riders: Disease, Sexual Violence, and Gender Performance in the Old English Mære and Old Norse Mara
Caroline R. Batten
Three early medieval Germanic languages (Old English, Old High German, and Old Norse) contain related words for a supernatural female being: mære in Old English, mara in Old Norse, and mahr in Old High German. Their shared root is the Indo-European *mer, to crush or oppress. The term has a long life: a creature called a mara also appears in folk narratives collected in Scandinavia and Hungary in the nineteenth and twentieth centuries. Catharina Raudvere and Eva Pócs, in their respective studies of this later material, conclude that the folkloric mara is a nocturnal female being who crushes men in their beds and livestock in their stables by mounting them, and whose attacks have an implicit erotic element. The natures of the much older Old English mære and Old Norse mara, however, have never been the subject of systematic examination. Indeed, much scholarship makes assumptions about the identities and behavior of these beings without such an investigation. Most dictionaries translate mære and mara as “nightmare” without elaboration, or define
Parallel Hyperedge Replacement String Languages
Graham Campbell
There are many open questions surrounding the characterisation of groups with context-sensitive word problem. Only in 2018 was it shown that all finitely generated virtually Abelian groups have multiple context-free word problems, and it is a long-standing open question as to where to place the word problems of hyperbolic groups in the formal language hierarchy. In this paper, we introduce a new language class called the parallel hyperedge replacement string languages, show that it contains all multiple context-free and ET0L languages, and lay down the foundations for future work that may be able to place the word problems of many hyperbolic groups in this class.
Benchmarking the Status of Default Pseudorandom Number Generators in Common Programming Languages
Nils van den Honert, Diederick Vermetten, Anna V. Kononova
The ever-increasing need for random numbers is clear in many areas of computer science, from neural networks to optimization. As such, most common programming languages provide easy access to Pseudorandom Number Generators. However, these generators are not all made equal, and empirical verification has previously shown some to be flawed in key ways. Because of the constant changes in programming languages, we perform the same empirical benchmarking using large batteries of statistical tests on a wide array of PRNGs, and identify that while some languages have improved significantly over the years, there are still cases where the default PRNG fails to deliver sufficiently random results.
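As a rough illustration of what a single statistical test on a default PRNG looks like, the sketch below runs a chi-square frequency test on Python's `random.Random`. It is a stand-in for the large test batteries the abstract refers to, not a reproduction of them. The function name `chi_square_uniform`, the bin count, the sample sizes, and the seed are arbitrary choices made for this example.

```python
import random
from typing import Callable


def chi_square_uniform(next_int: Callable[[], int], bins: int, samples: int) -> float:
    """Chi-square statistic of `samples` draws (reduced mod `bins`) against uniform."""
    counts = [0] * bins
    for _ in range(samples):
        counts[next_int() % bins] += 1
    expected = samples / bins
    return sum((c - expected) ** 2 / expected for c in counts)


# Default generator of the standard library, seeded for repeatability.
rng = random.Random(2024)
stat_default = chi_square_uniform(lambda: rng.randrange(10), bins=10, samples=10_000)

# A deliberately broken "generator" that always returns 0.
stat_stuck = chi_square_uniform(lambda: 0, bins=10, samples=1_000)

print(stat_default)  # compare against a chi-square table with 9 degrees of freedom
print(stat_stuck)    # 9000.0
```

With 10 bins the statistic has 9 degrees of freedom, so values far above the usual critical values (about 16.9 at the 5% level) indicate non-uniform output. Passing one frequency test says little on its own, which is why batteries combine many different tests.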
DWUG: A large Resource of Diachronic Word Usage Graphs in Four Languages
Dominik Schlechtweg, Nina Tahmasebi, Simon Hengchen et al.
Word meaning is notoriously difficult to capture, both synchronically and diachronically. In this paper, we describe the creation of the largest resource of graded, contextualized, diachronic word meaning annotation in four different languages, based on 100,000 human semantic proximity judgments. We thoroughly describe the multi-round incremental annotation process, the choice of a clustering algorithm to group usages into senses, and possible diachronic and synchronic uses for this dataset.
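The idea of turning pairwise proximity judgments into sense clusters can be sketched as follows. This is a deliberately simple stand-in, not the clustering algorithm chosen in the paper: it keeps an edge between two usages when the median judgment reaches a threshold and takes connected components as senses. The 1 to 4 judgment scale, the threshold of 2.5, the usage identifiers, and the name `cluster_usages` are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, Iterable, List, Set, Tuple

Judgment = Tuple[str, str, int]  # (usage id, usage id, proximity score)


def cluster_usages(judgments: Iterable[Judgment], threshold: float = 2.5) -> List[Set[str]]:
    """Group usages into clusters via connected components of 'close' pairs."""
    by_pair: Dict[Tuple[str, ...], List[int]] = defaultdict(list)
    nodes: Set[str] = set()
    for u, v, score in judgments:
        nodes.update((u, v))
        by_pair[tuple(sorted((u, v)))].append(score)

    parent = {n: n for n in nodes}

    def find(x: str) -> str:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (u, v), scores in by_pair.items():
        if median(scores) >= threshold:  # annotators judge the pair as close
            parent[find(u)] = find(v)

    groups: Dict[str, Set[str]] = defaultdict(set)
    for n in nodes:
        groups[find(n)].add(n)
    return sorted(groups.values(), key=lambda g: sorted(g))


# Hypothetical judgments; u1..u5 are made-up usage identifiers.
judgments = [("u1", "u2", 4), ("u1", "u2", 3), ("u2", "u3", 4),
             ("u3", "u4", 1), ("u4", "u5", 3), ("u1", "u5", 2)]
clusters = cluster_usages(judgments)
print(len(clusters))  # 2
```

Connected components merge clusters through a single strong edge and ignore low judgments between usages that end up in the same cluster, which is one reason a more careful clustering method is needed for real annotation data.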
Mapping Studies in Portuguese/German Contrastive Linguistics: Bibliographic Data in Brazil
Flaviana da Silva Sipriano, Rebeca Santos de Souza, Rogéria Costa Pereira
This paper reports on the development, methodology, and results of an undergraduate research project (Projeto de Iniciação Científica) that proposes building a database of the bibliography of Portuguese/German contrastive linguistics. Documentary and bibliographic in nature, the collection was carried out in Brazilian databases of publications in Portuguese, and the references were compiled in the Zotero software. The searches were conducted between August and December 2019 using the keywords (Língua Alemã; Língua Portuguesa; Alemão; Português; Linguística; Linguística Contrastiva) as well as the major areas of linguistics (Morphology; Phonetics; Phonology; Semantics; Pragmatics; Syntax). A total of 48 studies were identified, published between 1972 and 2019 and classified as either White Literature or Grey Literature. The data point to a prevalence of articles published in the journal Pandaemonium Germanicum, of the University of São Paulo (USP), especially in the 1990s, as well as a large number of studies in the area of semantics.
German literature, Germanic languages. Scandinavian languages
Literature Lessons with YouTube – Educational videos in inclusive contexts
Juliane Dube
In the digital age, video platforms have become an integral part of (in)formal educational contexts. With the growing supply of educational videos, German-language didactics is also called upon to engage with this media format within subject-specific classroom research (cf. Dube 2019, Dube / Hußmann 2019). This applies all the more in view of the unequally distributed educational opportunities in dealing with digital media. The article deals with educational videos and, on the basis of two selected case studies, with the qualitative-quantitative description of the accompanying individual action strategies for building genre-typological knowledge. The contrast is intended, on the one hand, to illustrate the range of learning processes initiated and, on the other, to provide pointers for designing learning environments with educational videos in the context of a development- and acquisition-oriented German subject didactics.
Education, Communication. Mass media
LSDC - A comprehensive dataset for Low Saxon Dialect Classification
Jan B. Siewert, Yves Scherrer, Martijn Wieling et al.
Computer Science, Geography
ThingML+: Augmenting Model-Driven Software Engineering for the Internet of Things with Machine Learning
Armin Moin, Stephan Rössler, Stephan Günnemann
In this paper, we present the current position of the research project ML-Quadrat, which aims to extend the methodology, modeling language, and tool support of ThingML, an open-source modeling tool for IoT/CPS, to address the Machine Learning needs of IoT applications. Currently, ThingML offers a modeling language and tool support for modeling the components of the system, their communication interfaces, and their behaviors. The latter is done through state machines. However, we argue that in many cases IoT/CPS services involve system components and physical processes whose behaviors are not understood well enough to be modeled using state machines. Hence, quite often a data-driven approach that enables inference based on the observed data, e.g., using Machine Learning, is preferred. To this end, ML-Quadrat integrates the necessary Machine Learning concepts into ThingML both at the modeling level (syntax and semantics of the modeling language) and at the code generator level. We plan to support two target platforms for code generation regarding Stream Processing and Complex Event Processing, namely Apache SAMOA and Apama.