Molham Aref, Paolo Guagliardo, George Kastrinis
et al.
From the moment of their inception, languages for relational data have been described as sublanguages embedded in a host programming language. Rel is a new relational language whose key design goal is to go beyond this paradigm with features that allow for programming in the large, making it possible to fully describe end-to-end application semantics. With the new approach we can model the semantics of entire enterprise applications relationally, which helps significantly reduce architectural complexity and avoid the well-known impedance mismatch problem. This paradigm shift is enabled by 50 years of database research, making it possible to revisit the sublanguage/host language paradigm, starting from the fundamental principles. We present the main features of Rel: those that give it the power to express traditional query language operations and those that are designed to grow the language and allow programming in the large.
We prove that all standard subregular language classes are linearly separable when represented by their deciding predicates. This establishes finite observability and guarantees learnability with simple linear models. Synthetic experiments confirm perfect separability under noise-free conditions, while real-data experiments on English morphology show that learned features align with well-known linguistic constraints. These results demonstrate that the subregular hierarchy provides a rigorous and interpretable foundation for modeling natural language structure. Our code used in real-data experiments is available at https://github.com/UTokyo-HayashiLab/subregular.
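The separability claim can be illustrated with a minimal sketch for one subregular class. It assumes a strictly 2-local language over a hypothetical alphabet, defined by a hypothetical set of forbidden bigrams. Each string is mapped to a binary vector of deciding predicates ("does this bigram occur?"), and a hand-set linear model then separates members from non-members. The alphabet, the forbidden set, the weights, and all function names are illustrative assumptions and are not taken from the paper or its repository.

```python
from itertools import product

ALPHABET = "ab"
# Boundary symbols let bigrams also constrain the first and last letter.
LEFT, RIGHT = "<", ">"
# Hypothetical strictly 2-local grammar: these bigrams must not occur.
FORBIDDEN = {"bb", "<b"}

# One deciding predicate per possible bigram over the padded alphabet.
FACTORS = sorted(x + y for x, y in product(ALPHABET + LEFT, ALPHABET + RIGHT))


def features(word):
    """Binary vector: 1 if the bigram occurs in the padded word, else 0."""
    padded = LEFT + word + RIGHT
    present = {padded[i:i + 2] for i in range(len(padded) - 1)}
    return [1 if f in present else 0 for f in FACTORS]


# Hand-set linear model: each forbidden bigram costs 1, the bias is 0.5,
# so the score is positive exactly when no forbidden bigram is present.
WEIGHTS = [-1.0 if f in FORBIDDEN else 0.0 for f in FACTORS]
BIAS = 0.5


def linear_accepts(word):
    score = sum(w * x for w, x in zip(WEIGHTS, features(word))) + BIAS
    return score > 0


def in_language(word):
    """Reference membership test, written without the feature vector."""
    padded = LEFT + word + RIGHT
    return not any(f in padded for f in FORBIDDEN)


def all_words(max_len):
    for n in range(max_len + 1):
        for letters in product(ALPHABET, repeat=n):
            yield "".join(letters)


# Exhaustive check on all words up to length 6.
agree = all(linear_accepts(w) == in_language(w) for w in all_words(6))
print(agree)
```

The weights here are fixed by hand to make the separating hyperplane explicit; a learned linear model, as in the experiments described above, would have to find an equivalent separator from data.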
Aleksander Boruch-Gruszecki, Yangtian Zi, Zixuan Wu
et al.
Large language models (LLMs) already excel at writing code in high-resource languages such as Python and JavaScript, yet stumble on low-resource languages that remain essential to science and engineering. Besides the obvious shortage of pre-training data, post-training itself is a bottleneck: every new language seems to require new datasets, test harnesses, and reinforcement-learning (RL) infrastructure. We introduce Agnostics, a language-agnostic post-training pipeline that eliminates this per-language engineering. The key idea is to judge code solely by its externally observable behavior, so a single verifier can test solutions written in any language. Concretely, we (i) use an LLM to rewrite existing unit-test datasets into an I/O format, (ii) supply a short configuration that tells the verifier how to compile and run a target language, and (iii) apply reinforcement learning with verifiable rewards (RLVR) in a robust code execution environment. Applied to five low-resource languages--Lua, Julia, R, OCaml, and Fortran--Agnostics (1) improves Qwen-3 4B to performance that rivals other 16B-70B open-weight models; (2) scales cleanly to larger and diverse model families (Qwen-3 8B, DeepSeek Coder 6.7B Instruct, Phi 4 Mini); and (3) for ${\le} 16$B parameter models, sets new state-of-the-art pass@1 results on MultiPL-E and a new multi-language version of LiveCodeBench that we introduce. We release the language-agnostic training datasets (Ag-MBPP-X, Ag-Codeforces-X, Ag-LiveCodeBench-X), training code, and ready-to-use configurations, making RL post-training in any programming language as simple as editing a short YAML file.
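The idea of judging code only by its externally observable behavior can be sketched as follows. This is a minimal illustration, not the Agnostics implementation: the configuration dictionary, the function names, and the binary reward are assumptions made for the example, and a production verifier would run inside a sandbox. Only the Python entry is exercised; the Lua entry is shown to indicate how a second language would be added and assumes a `lua` interpreter on the path.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical per-language configuration: file suffix and run command.
LANG_CONFIG = {
    "python": {"suffix": ".py", "run": [sys.executable, "{src}"]},
    "lua": {"suffix": ".lua", "run": ["lua", "{src}"]},
}


def run_solution(lang, source, stdin_text, timeout=5.0):
    """Run `source` in `lang`, feed it stdin, and return stdout (None on failure)."""
    cfg = LANG_CONFIG[lang]
    with tempfile.TemporaryDirectory() as workdir:
        src = os.path.join(workdir, "main" + cfg["suffix"])
        with open(src, "w", encoding="utf-8") as fh:
            fh.write(source)
        cmd = [part.replace("{src}", src) for part in cfg["run"]]
        try:
            proc = subprocess.run(
                cmd, input=stdin_text, capture_output=True,
                text=True, timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            return None
    return proc.stdout if proc.returncode == 0 else None


def reward(lang, source, tests):
    """Return 1.0 if every (stdin, expected stdout) pair matches, else 0.0."""
    for stdin_text, expected in tests:
        out = run_solution(lang, source, stdin_text)
        # Compare token by token so trailing whitespace does not matter.
        if out is None or out.split() != expected.split():
            return 0.0
    return 1.0


# Usage: the same I/O tests would apply unchanged to any configured language.
io_tests = [("1 2 3\n", "6\n"), ("10 -4\n", "6")]
good = "import sys\nprint(sum(int(n) for n in sys.stdin.read().split()))\n"
print(reward("python", good, io_tests))
```

Because the tests are plain input/output pairs, supporting another language in this sketch only requires a new entry in `LANG_CONFIG`, which mirrors the short configuration file described above.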
It is crucial to consider two foundational principles of phonosemantics: the principle of non-arbitrariness (motivation) and the principle of the arbitrariness of the linguistic sign. The former principle suggests a pervasive interrelation among real-world phenomena and objects. Numerous instances in the history of science demonstrate the discovery of connections between seemingly unrelated phenomena. In contrast, the principle of the arbitrariness of the linguistic sign asserts the independence of the signifier from the signified, clashing with the overarching principle of hierarchization. According to the principle of hierarchization, each element in a ‘higher’ system can act as an autonomous ‘lower’ system. As such, a word becomes an independent ‘lower’ system, possessing a substrate (the signified and the signifier) and a structural relationship between them. Stripping a word of these connections removes its structural integrity; without structure, it ceases to be a system. In declaring the principle of the arbitrariness of the linguistic sign, Ferdinand de Saussure, the pioneer of systemic linguistics, therefore challenged the very concept of systemicity. While embracing the system of arbitrariness of the linguistic sign, we acknowledge that not all words are motivated. There are many words whose motivation cannot be determined by current etymological research. Thus, we recognize two perspectives on this issue: natural and conventional. Fundamentally, a linguistic sign is arbitrary, yet in contemporary synchronic analysis it manifests a dual nature: both arbitrary and motivated. It is important to discern which principle dominates in each instance of nomination. In any specific nominative act, a certain characteristic of the denoted object is selected as the basis of the nomination, and in this critical moment, the nomination is motivated rather than arbitrary.
Often, the selection of this particular characteristic might be random, highlighting the nomination’s arbitrariness, or its lack of motivation. Phonetic symbolism embodies a regular, non-arbitrary, phonetically driven connection between the phonemes of a word and the non-acoustic attribute of the denotatum that forms the basis of its nomination. Phonetic semantics is a natural, spontaneous connection between the phonemes of a word and the non-sound characteristic of the denotatum that serves as the basis for nomination. The research considers the phenomenon in two aspects of its manifestation: on the one hand, phonetic semantics has a statistical character; on the other hand, it is a psychophysiological process based on synaesthesia, syntenemia and kinematics. The research material is drawn from Germanic and Slavic languages, which illustrate and support the findings. The chronology of scientific facts about the functioning of phonetic semantics leads to the conclusion that this linguistic phenomenon developed at the early stages of the formation of languages and remains in constant flux. Regular, dynamic processes significantly affect the relationship between the occurrence and meaning of lexical units over time, as evidenced by transformations in the lexical stock of languages belonging to different systems.
Farid Arfi, Hélène Coullon, Frédéric Loulergue
et al.
We give an overview of the decentralized reconfiguration language Concerto-D through its Maude formalization. Concerto-D extends the previously published Concerto language. It improves on related work in two respects: the decentralized coordination of numerous local reconfiguration plans, which avoids a single point of failure in unstable networks such as those found in edge computing or cyber-physical systems (CPS); and a mechanized formal semantics of the language in Maude, which offers guarantees on the executability of the semantics. Throughout the paper, the Concerto-D language and its semantics are exemplified with a reconfiguration extracted from a real case study on a CPS. We rely on the Maude formal specification language, which is based on rewriting logic and is consequently well suited to describing a concurrent model.
This article examines how a selection of teachers provide bimodal, bilingual instruction for deaf and hard-of-hearing pupils who have Norwegian Sign Language as one of their languages. The data come from three municipal schools that offer sign language instruction. The schools have organized this provision in different ways. Some place signing and speaking pupils in separate departments, while others have them in the same class. In this article, the analysis is based primarily on individual interviews with twelve teachers, supplemented by field notes from observations of the teachers' instruction. The analyses show that the teachers describe some pedagogical practices that are distinctive in the sense that they sustain bilingual, bimodal instruction between Norwegian and Norwegian Sign Language, differentiate teaching materials and activities according to the pupils' linguistic and cultural standpoint, and facilitate communicative and emotional support for individual pupils and the class community. The teachers aim to establish inclusive learning environments in which both signing and speaking pupils have the opportunity to experience academic mastery, a sense of belonging to the community, and recognition of their own and one another's distinctiveness.
We present Scallop, a language which combines the benefits of deep learning and logical reasoning. Scallop enables users to write a wide range of neurosymbolic applications and train them in a data- and compute-efficient manner. It achieves these goals through three key features: 1) a flexible symbolic representation that is based on the relational data model; 2) a declarative logic programming language that is based on Datalog and supports recursion, aggregation, and negation; and 3) a framework for automatic and efficient differentiable reasoning that is based on the theory of provenance semirings. We evaluate Scallop on a suite of eight neurosymbolic applications from the literature. Our evaluation demonstrates that Scallop is capable of expressing algorithmic reasoning in diverse and challenging AI tasks, provides a succinct interface for machine learning programmers to integrate logical domain knowledge, and yields solutions that are comparable or superior to state-of-the-art models in terms of accuracy. Furthermore, Scallop's solutions outperform these models in aspects such as runtime and data efficiency, interpretability, and generalizability.
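The role of provenance semirings can be illustrated with a minimal sketch: a Datalog-style reachability program in which every fact carries a tag, conjunction in a rule body combines tags with one semiring operation, and alternative derivations combine with the other. This is not Scallop's syntax or API, and it is not differentiable. The function name, the example graph, and the choice of a max-times semiring over probabilities are assumptions for illustration. The naive fixpoint loop also assumes that repeatedly combining derivations eventually stops changing the tags, which holds for the two semirings used here.

```python
def evaluate_paths(edges, plus, times, zero):
    """Tagged evaluation of:
         path(x, y) :- edge(x, y).
         path(x, z) :- path(x, y), edge(y, z).
    `edges` maps (src, dst) to a semiring tag."""
    path = dict(edges)
    changed = True
    while changed:
        changed = False
        for (x, y), tag_xy in list(path.items()):
            for (y2, z), tag_yz in edges.items():
                if y != y2:
                    continue
                # Conjunction of body atoms: semiring multiplication.
                candidate = times(tag_xy, tag_yz)
                old = path.get((x, z), zero)
                # Alternative derivations of one fact: semiring addition.
                new = plus(old, candidate)
                if new != old:
                    path[(x, z)] = new
                    changed = True
    return path


# Hypothetical probabilistic facts, for example produced by a neural model.
edges = {("a", "b"): 0.9, ("b", "c"): 0.8, ("a", "c"): 0.5}

# Max-times semiring: the tag of a fact is the score of its best derivation.
best = evaluate_paths(edges, plus=max, times=lambda p, q: p * q, zero=0.0)
print(best[("a", "c")])  # best of the direct edge and the route via b

# Boolean semiring: the same program reduces to ordinary reachability.
facts = {edge: True for edge in edges}
reach = evaluate_paths(
    facts, plus=lambda p, q: p or q, times=lambda p, q: p and q, zero=False
)
print(reach[("a", "c")])
```

Swapping the semiring changes what the program computes without changing the rules, which is the property that lets a single logic program support different kinds of reasoning.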
Polyglots frequently experience an acceleration in successive language learning: each new language from a familiar language group takes less time to learn than the previous one. The phenomenon is widely known but, it seems, has never been thoroughly examined. This article presents a simple mathematical model based on the author’s own data, collected over three years of independent language study, which describes how much faster one learns languages from the same group. The number of hours spent on a new language as a function of the number of previously known languages is described by a simple exponential function with two parameters: the “starting time” and the “half-life”. According to the author’s hypotheses, these parameters may provide a numerical measure of certain aspects of language that are difficult to quantify otherwise. The “starting time” could be a measure of propinquity between the learner and the language group, whereas the “half-life” could be a measure of propinquity between the languages of a given group. Additionally, three different approaches to keeping track of time spent on language activity, as used by different polyglots, are reviewed. These approaches are important for collecting data to be used in studies of successive language learning acceleration. At the end of the article, an idealized algorithm for conducting such a study is presented, and particular attention is drawn to the various parameters that must be controlled in order to carry out this kind of research appropriately. This particular study did not manage to satisfy all of the criteria mentioned, so the reliability of the claims made in this article is debatable, and additional validation is required. Furthermore, the validity of the model has to be confirmed by other researchers and polyglots.
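One plausible reading of the two-parameter model is $T(n) = T_0 \cdot 2^{-n/h}$, where $T_0$ is the “starting time”, $h$ is the “half-life”, and $n$ is the number of previously known languages in the group. The abstract does not give the exact formula, so this form, the function names, and the numbers below are assumptions for illustration only and are not the author’s data.

```python
import math


def hours_for_next_language(n_known, starting_time, half_life):
    """Hours needed for the next language, assuming T(n) = T0 * 2**(-n / h)."""
    return starting_time * 0.5 ** (n_known / half_life)


def half_life_from_two_points(n1, hours1, n2, hours2):
    """Solve the assumed model for h given two (n, hours) observations."""
    return (n2 - n1) * math.log(2) / math.log(hours1 / hours2)


# Hypothetical parameters: 400 hours for the first language, half-life of 2.
for n in range(5):
    print(n, hours_for_next_language(n, starting_time=400.0, half_life=2.0))

# Recover the half-life from two hypothetical observations.
print(half_life_from_two_points(0, 400.0, 2, 200.0))
```

Under this reading, a half-life of 2 means that every two additional languages from the group halve the time required for the next one.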
In this paper the author analyses the forms of space in Pierre Bourdieu's field theory, looking in particular at the way they relate to one another and at the spatial aspects of the literary field in his book The Rules of Art: Genesis and Structure of the Literary Field. The first of these forms is what Bourdieu calls the "social space" and "social field", each of these referring to a structure of positions that exists objectively yet does not exist (primarily) in physical space or real (physical) interactions between social agents. In order to make this structure intelligible, Bourdieu creates various spatial schemata that range from simple diagrams to sophisticated visualisations based on multiple correspondence analysis. The relationship between this structure and physical/geographical space is a complicated matter, since relations in the social space or social fields will not necessarily coincide with actual spatial distances or proximities. Nevertheless, Bourdieu demonstrates – especially in the last period of his career – that it is necessary to study the relations among agents and the objectified forms of capital as they play out within physical/geographical space. The last part of the paper deals with the complicated relations of the three spatial aspects of Bourdieu's field theory as they are applied to the literary field. In this respect, the most interesting part of The Rules of Art is the "Prologue". In the rest of the book, spaces that are characteristically physical, such as salons and cafés, give way to non-spatial aspects of the literary field, above all literary texts, which Bourdieu conceives of as a privileged form of "position-taking" on the part of social agents.
Germanic languages. Scandinavian languages, History of Northern Europe. Scandinavia
Language-integrated query based on comprehension syntax is a powerful technique for safe database programming, and provides a basis for advanced techniques such as query shredding or query flattening that allow efficient programming with complex nested collections. However, the foundations of these techniques are lacking: although SQL, the most widely-used database query language, supports heterogeneous queries that mix set and multiset semantics, these important capabilities are not supported by known correctness results or implementations that assume homogeneous collections. In this paper we study language-integrated query for a heterogeneous query language $NRC_λ(Set,Bag)$ that combines set and multiset constructs. We show how to normalize and translate queries to SQL, and develop a novel approach to querying heterogeneous nested collections, based on the insight that ``local'' query subexpressions that calculate nested subcollections can be ``lifted'' to the top level analogously to lambda-lifting for local function definitions.
Addressing the set of so-called “speech plays” (Sprechstücke), the first four works written for the theatre, still in the 1960s, by Peter Handke, winner of the 2019 Nobel Prize in Literature, this essay seeks to find in them more than the alienated self-referential play that many have censured as part of a supposed postmodernism. Relating the author's project to a distinctly Austrian modernist tradition and to the concerns of a possible political philosophy of language, we investigate this dramaturgy as a critical operation directed at a coercive functioning of language, and finally relate this interest to its position in the history of theatre, proposing its key role in understanding the contemporary stage.
German literature, Germanic languages. Scandinavian languages
This paper contextualizes the contributions on architecture in rowohlts deutsche enzyklopädie in the years 1955-1962. It explains the influence of Hans Sedlmayr on this publication.
Germanic languages. Scandinavian languages, History of Northern Europe. Scandinavia
The NORINT corpus (Universitetet i Oslo, 2020) is a relatively new learner corpus developed at Institutt for lingvistiske og nordiske studier (ILN), the Department of Linguistics and Scandinavian Studies at the University of Oslo (UiO). The NORINT corpus contains spoken and written Norwegian learner language produced by adult international students with Norwegian proficiency at or above level B1 according to the Common European Framework of Reference for Languages (CEFR) (Council of Europe, 2001). In this article, we describe the data in the corpus, how it has been transcribed and annotated, and how to use the user-friendly search tool in which it is housed. In addition, we show how the possibilities offered by the NORINT corpus can be used in research. Finally, we compare the NORINT corpus with ASK – Norsk andrespråkskorpus (ASK) (Universitetet i Bergen, 2020) in order to discuss the possibilities and limitations of the NORINT corpus.
The use of adaptive workflow management for in situ visualization and analysis has been a growing trend in large-scale scientific simulations. However, coordinating adaptive workflows with traditional procedural programming languages can be difficult because system flow is determined by unpredictable scientific phenomena, which often appear in an unknown order and can evade event handling. This makes the implementation of adaptive workflows tedious and error-prone. Recently, reactive and declarative programming paradigms have been recognized as well-suited solutions to similar problems in other domains. However, there is a dearth of research on adapting these approaches to in situ visualization and analysis. With this paper, we present a language design and runtime system for developing adaptive systems through a declarative and reactive programming paradigm. We illustrate how an adaptive workflow programming system is implemented using our approach and demonstrate it with a use case from a combustion simulation.
Karolina Zaczynska, Nils Feldhus, Robert Schwarzenberg
et al.
Pre-trained transformer language models (TLMs) have recently refashioned natural language processing (NLP): Most state-of-the-art NLP models now operate on top of TLMs to benefit from contextualization and knowledge induction. To explain their success, the scientific community conducted numerous analyses. Besides other methods, syntactic agreement tests were utilized to analyse TLMs. Most of the studies were conducted for the English language, however. In this work, we analyse German TLMs. To this end, we design numerous agreement tasks, some of which consider peculiarities of the German language. Our experimental results show that state-of-the-art German TLMs generally perform well on agreement tasks, but we also identify and discuss syntactic structures that push them to their limits.
Walter Benjamin wrote extensively about cities and architecture throughout his work. Whether in travel journals, in essays devoted to particular urban formations, in autobiographical texts, in his theory of the flâneur, or in his aesthetic theory, cities always occupied a central place. Benjamin's ideal city is not classical or “beautiful” but is instead marked by the circulation of people, by “interpenetration” (Durchdringung), and by “porosity”. This perforated city metamorphoses into his own philosophy of history, since for him time, too, is porous.
German literature, Germanic languages. Scandinavian languages
When programming resource-scarce embedded smart devices, the designer often requires both the low-level system programming features of a language such as C and the higher-level capabilities typical of a language like Java. The choice of a particular language typically implies trade-offs between conflicting design goals such as performance, costs, and overheads. The large variety of languages, virtual machines, and translators provides the designer with a dense trade-off space, ranging from minimalistic to rich, full-fledged approaches, but once a choice is made it is often difficult for the designer to revise it. In this work we propose a system of lightweight, modular extensions as a method to flexibly reshape the target programming language as needed, adding only those application-layer features that match the current design goals. In so doing, complexity is made transparent but not hidden: while the programmer can benefit from higher-level constructs, the designer can deal with modular building blocks, each characterized by a certain algorithmic complexity and therefore each accountable for a given share of the overhead. As a result, the designer is given finer control over the amount of resources consumed by the run-time executive of the chosen programming language.
We introduce a new application for inductive logic programming: learning the semantics of programming languages from example evaluations. In this short paper, we explore a simplified task in this domain using the Metagol meta-interpretive learning system. We highlight the challenging aspects of this scenario, including abstracting over function symbols, nonterminating examples, and learning non-observed predicates, and propose extensions to Metagol that help overcome these challenges and may prove useful in other domains.