Results for "Germanic languages. Scandinavian languages"

arXiv Open Access 2026

A cartesian closed fibration of higher-order regular languages

Paul-André Melliès, Vincent Moreau

We explain how to construct in two different ways a cartesian closed fibration of higher-order regular languages in the sense of Salvati. In the first construction, we use fibrational techniques to derive the cartesian closed fibration from the various categories of regular languages of $λ$-terms associated to finite sets of ground states. In the second construction, we take advantage of the recent notion of profinite $λ$-calculus to define the cartesian closed fibration by a change-of-base from the fibration of clopen subsets over the category of Stone spaces, using an elegant idea coming from Hermida. We illustrate the expressive power of the cartesian closed fibration by generalizing the notion of Brzozowski derivative to higher-order regular languages, using an Isbell-like adjunction in the sense of Melliès and Zeilberger.

en cs.LO, cs.FL


arXiv Open Access 2026

YASA: Scalable Multi-Language Taint Analysis on the Unified AST at Ant Group

Yayi Wang, Shenao Wang, Jian Zhao et al.

Modern enterprises increasingly adopt diverse technology stacks with various programming languages, posing significant challenges for static application security testing (SAST). Existing taint analysis tools are predominantly designed for single languages, requiring substantial engineering effort that scales with language diversity. While multi-language tools like CodeQL, Joern, and WALA attempt to address these challenges, they face limitations in intermediate representation design, analysis precision, and extensibility, which make them difficult to scale effectively for large-scale industrial applications at Ant Group. To bridge this gap, we present YASA (Yet Another Static Analyzer), a unified multi-language static taint analysis framework designed for industrial-scale deployment. Specifically, YASA introduces the Unified Abstract Syntax Tree (UAST) that provides a unified abstraction for compatibility across diverse programming languages. Building on the UAST, YASA performs points-to analysis and taint propagation, leveraging a unified semantic model to manage language-agnostic constructs, while incorporating language-specific semantic models to handle other unique language features. When compared to 6 single- and 2 multi-language static analyzers on an industry-standard benchmark, YASA consistently outperformed all baselines across Java, JavaScript, Python, and Go. In real-world deployment within Ant Group, YASA analyzed over 100 million lines of code across 7.3K internal applications. It identified 314 previously unknown taint paths, with 92 of them confirmed as 0-day vulnerabilities. All vulnerabilities were responsibly reported, with 76 already patched by internal development teams, demonstrating YASA's practical effectiveness for securing large-scale industrial software systems.

en cs.SE, cs.CR


DOAJ Open Access 2025

Eckhard Meineke. 2023. Studien zum genderneutralen Maskulinum. Heidelberg: Winter. 358 S.

Martin Neef

Germanic languages. Scandinavian languages


arXiv Open Access 2025

SLMFix: Leveraging Small Language Models for Error Fixing with Reinforcement Learning

David Jiahao Fu, Aryan Gupta, Aaron Councilman et al.

Recent advancements in large language models (LLMs) have shown impressive capabilities in code generation across many programming languages. However, even state-of-the-art LLMs generate programs that contain syntactic errors and fail to complete the given tasks, especially for low-resource programming languages (LRPLs). In addition, high training cost makes finetuning LLMs unaffordable with constrained computational resources, further undermining the effectiveness of LLMs for code generation. In this work, we propose SLMFix, a novel code generation pipeline that leverages a small language model (SLM) finetuned using reinforcement learning (RL) techniques to fix syntactic errors in LLM-generated programs, improving their quality for domain-specific languages (DSLs). Specifically, we applied RL to the SLM for the program repair task, using a reward calculated from both a static validator and a static semantic similarity metric. Our experimental results demonstrate the effectiveness and generalizability of our approach across multiple DSLs, achieving a pass rate of more than 95% on the static validator. Notably, SLMFix brings substantial improvement to the base model and outperforms the supervised finetuning approach even for 7B models on an LRPL, showing the potential of our approach as an alternative to traditional finetuning approaches.

en cs.SE, cs.AI


arXiv Open Access 2025

Characterization and Decidability of FC-Definable Regular Languages

Sam M. Thompson, Nicole Schweikardt, Dominik D. Freydenberger

FC is a first-order logic that reasons over all factors of a finite word using concatenation, and can define non-regular languages like that of all squares (ww). In this paper, we establish that there are regular languages that are not FC-definable. Moreover, we give a decidable characterization of the FC-definable regular languages in terms of algebra, automata, and regular expressions. The last of these is natural and concise: star-free generalized regular expressions extended with the Kleene star of terminal words.

en cs.LO, cs.FL


DOAJ Open Access 2024

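Nynorskkompetanse og normrettleiingskompetanse hos master- og lektorstudentar i nordisk [Nynorsk competence and norm-guidance competence among master's and teacher-education students of Nordic studies]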

Agnes Wigestrand Hoftun, Stig Jarle Helset

Good linguistic norm competence in society is necessary to pass our written languages on to coming generations, and current and future teachers of Norwegian are central to the implementation of the linguistic norms. This article presents results from a quantitative study of 33 master's and teacher-education (lektor) students of Nordic studies, the purpose of which was to map the students' own competence in Nynorsk and their competence in giving spelling guidance on Nynorsk texts. The students were first asked to correct an authentic pupil text without access to aids, and at the same time to mark where they would have wanted to use a dictionary. They then had to decide whether a selection of pre-listed word forms were inside or outside the current orthographic standard. Finally, they answered a questionnaire in which we asked them to assess both their own competence and their norm-guidance competence in Nynorsk. The results reveal that the majority of the participants do not master central parts of Nynorsk orthography, and that most participants have difficulty giving adequate spelling guidance on Nynorsk texts.

Philology. Linguistics, Germanic languages. Scandinavian languages


DOAJ Open Access 2023

Der Computer schreibt (mit). Digitales Schreiben mit Word, Whatsapp, ChatGPT & Co. als Koaktivität von Mensch und Maschine [The Computer (Co-)Creates. Digital Writing with Word, WhatsApp, ChatGPT & Co. as a Co-Activity Between Man and Machine]

Torsten Steinhoff

The subject of this paper is the development of an alternative theoretical perspective on digital writing. To this end, it is first shown that writing research has so far interpreted the computer in an anthropocentric and instrumentalist manner, as a tool controlled by humans for the production of texts. Leaving this aside, a theoretical approach is elaborated that draws on terms and concepts from literary studies, communication studies, and sociology. According to this approach, the computer is to be understood as a "hardware-software ensemble" that possesses a medial "stubbornness" and is "co-active" as a "participant" at different "levels of activity" in writing practices by emitting "suggestions of use" that influence all facets of writing. At the end of the paper, three impulses for further research discussion are given.

Education, Communication. Mass media


arXiv Open Access 2023

Can Large Language Models Transform Natural Language Intent into Formal Method Postconditions?

Madeline Endres, Sarah Fakhoury, Saikat Chakraborty et al.

Informal natural language that describes code functionality, such as code comments or function documentation, may contain substantial information about a program's intent. However, there is typically no guarantee that a program's implementation and natural language documentation are aligned. In the case of a conflict, leveraging information in code-adjacent natural language has the potential to enhance fault localization, debugging, and code trustworthiness. In practice, however, this information is often underutilized due to the inherent ambiguity of natural language, which makes natural language intent challenging to check programmatically. The emergent abilities of Large Language Models (LLMs) have the potential to facilitate the translation of natural language intent to programmatically checkable assertions. However, it is unclear if LLMs can correctly translate informal natural language specifications into formal specifications that match programmer intent. Additionally, it is unclear if such translation could be useful in practice. In this paper, we describe nl2postcond, the problem of leveraging LLMs for transforming informal natural language to formal method postconditions, expressed as program assertions. We introduce and validate metrics to measure and compare different nl2postcond approaches, using the correctness and discriminative power of generated postconditions. We then use qualitative and quantitative methods to assess the quality of nl2postcond postconditions, finding that they are generally correct and able to discriminate incorrect code. Finally, we find that nl2postcond via LLMs has the potential to be helpful in practice; nl2postcond-generated postconditions were able to catch 64 real-world historical bugs from Defects4J.

en cs.SE, cs.AI


arXiv Open Access 2023

Proceedings of the 16th International Conference on Automata and Formal Languages

Zsolt Gazdag, Szabolcs Iván, Gergely Kovásznai

The 16th International Conference on Automata and Formal Languages (AFL 2023) was held in Eger, September 5-7, 2023. It was organized by the Eszterházy Károly Catholic University of Eger, Hungary, and the University of Szeged, Hungary. Topics of interest covered the theory and applications of automata and formal languages and related areas. This volume contains the texts of the 3 invited presentations and the 18 papers selected by the International Program Committee from a total of 23 submissions. We would like to thank everybody who submitted a paper to the conference.

en cs.FL


arXiv Open Access 2023

Semantics of Attack-Defense Trees for Dynamic Countermeasures and a New Hierarchy of Star-free Languages

Thomas Brihaye, Sophie Pinchinat, Alexandre Terefenko

We present a mathematical setting for attack-defense trees, a classic graphical model to specify attacks and countermeasures. We equip attack-defense trees with a (trace) language semantics that allows an original dynamic interpretation of countermeasures. Interestingly, the expressiveness of attack-defense trees coincides with star-free languages, and the nested countermeasures impact the expressiveness of attack-defense trees. With an adequate notion of countermeasure-depth, we exhibit a strict hierarchy of the star-free languages that does not coincide with the classic one. Additionally, driven by the use of attack-defense trees in practice, we address the decision problems of trace membership and of non-emptiness, and study their computational complexities parameterized by the countermeasure-depth.

en cs.FL


arXiv Open Access 2023

Can Programming Languages Boost Each Other via Instruction Tuning?

Daoguang Zan, Ailun Yu, Bo Shen et al.

Once human programmers have mastered one programming language, they find it easier to learn a new one. In this report, we focus on exploring whether programming languages can boost each other during the instruction fine-tuning phase of code large language models. We conduct extensive experiments on 8 popular programming languages (Python, JavaScript, TypeScript, C, C++, Java, Go, HTML) on StarCoder. Results demonstrate that programming languages can significantly improve each other. For example, CodeM-Python 15B trained on Python is able to increase Java by an absolute 17.95% pass@1 on HumanEval-X. More surprisingly, we found that CodeM-HTML 7B trained on the HTML corpus can improve Java by an absolute 15.24% pass@1. Our training data is released at https://github.com/NL2Code/CodeM.

en cs.CL, cs.AI


DOAJ Open Access 2022

Zwischen Verstehen und Verweisen : (Post-)Migrationsgesellschaftliche Perspektiven auf die Vermittlung von Deutsch als Fremdsprache

Constantin Wagner

This paper attempts to reconstruct different paradigms, attitudes and experiences that underlie the understanding and action of non-professional teachers of German as a foreign language. The analysis shows that some of these teachers explain the problems of their students via a national-cultural affiliation, while other attempts to understand the students' challenges can also be observed. Participant observation of an academic congress on German as a foreign language makes clear that competing explanatory approaches are also present in specialist discourse. In this respect, the divergent categories of explanations observed in the field may appear to be based less on teachers' lack of disciplinary socialisation than on different migration and diversity-related attitudes and views that originate from different social milieus.

Germanic languages. Scandinavian languages, History of Northern Europe. Scandinavia


arXiv Open Access 2022

LAGC Semantics of Concurrent Programming Languages

Crystal Chang Din, Reiner Hähnle, Ludovic Henrio et al.

Formal, mathematically rigorous programming language semantics are the essential prerequisite for the design of logics and calculi that permit automated reasoning about concurrent programs. We propose a novel modular semantics designed to align smoothly with program logics used in deductive verification and formal specification of concurrent programs. Our semantics separates local evaluation of expressions and statements performed in an abstract, symbolic environment from their composition into global computations, at which point they are concretised. This makes incremental addition of new language concepts possible, without the need to revise the framework. The basis is a generalisation of the notion of a program trace as a sequence of evolving states that we enrich with event descriptors and trailing continuation markers. This allows scheduling constraints to be postponed from the level of local evaluation to the global composition stage, where well-formedness predicates over the event structure declaratively characterise a wide range of concurrency models. We also illustrate how a sound program logic and calculus can be defined for this semantics.

en cs.PL


arXiv Open Access 2022

On Families of Full Trios Containing Counter Machine Languages

Oscar H. Ibarra, Ian McQuillan

We look at nondeterministic finite automata augmented with multiple reversal-bounded counters where, during an accepting computation, the behavior of the counters is specified by some fixed pattern. These patterns can serve as a useful "bridge" to other important automata and grammar models in the theoretical computer science literature, thereby helping in their study. Various pattern behaviors are considered, together with characterizations and comparisons. For example, one such pattern defines exactly the smallest full trio containing all the bounded semilinear languages. Another pattern defines the smallest full trio containing all the bounded context-free languages. The "bridging" to other families is then applied, e.g. to certain Turing machine restrictions, as well as other families. Certain general decidability properties are also studied using this framework.

en cs.FL


DOAJ Open Access 2021

Verstehen sichtbar machen – Texterschließung durch digitale Annotationswerkzeuge kollaborativ anbahnen [Visualising Understanding – Initiating Text Analysis collaboratively with Digital Annotation Tools]

Nicola König

The subject of this article is the annotation of texts as a neglected practice in German studies. Although the development of text analysis competency can be regarded as a central objective of the curriculum of German studies, autonomous reading of complex literary and non-literary texts as well as communicating about them poses major challenges to learners. In this context, annotation is to be understood as a reading strategy initiating active collaborative text analysis that encourages questioning and the use of analytical concepts. The two digital tools at the centre of this article, Voyant for text analysis and CATMA for collaborative annotation, are tools used in Digital Humanities. An attempt is made here to transfer these methods to the classroom and more specifically to Kafka's The Metamorphosis (Die Verwandlung), and to apply them as reading strategies. The article describes the premises inherent in the subject as well as the didactics of German teaching, the curricular framework as well as the methodological setting in order to discuss its implementation in the classroom.

Education, Communication. Mass media


arXiv Open Access 2020

Four-valued monitorability of $ω$-regular languages

Zhe Chen, Yunyun Chen, Robert M. Hierons et al.

Runtime Verification (RV) is a lightweight formal technique in which program or system execution is monitored and analyzed, to check whether certain properties are satisfied or violated after a finite number of steps. The use of RV has led to interest in deciding whether a property is monitorable: whether it is always possible for the satisfaction or violation of the property to be determined after a finite future continuation. However, classical two-valued monitorability suffers from two inherent limitations. First, a property can only be evaluated as monitorable or non-monitorable; no information is available regarding whether only one verdict (satisfaction or violation) can be detected. Second, monitorability is defined at the language-level and does not tell us whether satisfaction or violation can be detected starting from the current monitor state during system execution. To address these limitations, this paper proposes a new notion of four-valued monitorability for $ω$-languages and applies it at the state-level. Four-valued monitorability is more informative than two-valued monitorability as a property can be evaluated as a four-valued result, denoting that only satisfaction, only violation, or both are active for a monitorable property. We can also compute state-level weak monitorability, i.e., whether satisfaction or violation can be detected starting from a given state in a monitor, which enables state-level optimizations of monitoring algorithms. Based on a new six-valued semantics, we propose procedures for computing four-valued monitorability of $ω$-regular languages, both at the language-level and at the state-level. We have developed a new tool that implements the proposed procedure for computing monitorability of LTL formulas.

en cs.FL, cs.LO


arXiv Open Access 2020

Language-Integrated Updatable Views (Extended version)

Rudi Horn, Simon Fowler, James Cheney

Relational lenses are a modern approach to the view update problem in relational databases. As introduced by Bohannon et al. (2006), relational lenses allow the definition of updatable views by the composition of lenses performing individual transformations. Horn et al. (2018) provided the first implementation of incremental relational lenses, which demonstrated that relational lenses can be implemented efficiently by propagating changes to the database rather than replacing the entire database state. However, neither approach proposes a concrete language design; consequently, it is unclear how to integrate lenses into a general-purpose programming language, or how to check that lenses satisfy the well-formedness conditions needed for predictable behaviour. In this paper, we propose the first full account of relational lenses in a functional programming language, by extending the Links web programming language. We provide support for higher-order predicates, and provide the first account of typechecking relational lenses which is amenable to implementation. We prove the soundness of our typing rules, and illustrate our approach by implementing a curation interface for a scientific database application.

en cs.PL


DOAJ Open Access 2019

Computerspiele im Videoclip rezensieren und reflektieren [Reviewing and reflecting on computer games in video clips]

Andreas Seidler

This article is devoted to the media format of video reviews of computer games, as used on professional game-criticism platforms. It examines the practical and reflective steps that learners must carry out in order to produce a computer game review in this form themselves, as well as the connections to the specific educational mandate of German-language teaching (Deutschunterricht) that become apparent in such a project.

Education, Communication. Mass media


DOAJ Open Access 2019

Die behandeling van die funksie dekodering in verskillende tipes woordeboeke [The Treatment of the Function Decoding in Different Types of Dictionaries]

Anna Nel Otto, Jadé Blume

Dictionaries are especially consulted for the function of decoding. This article provides a systematic description of the influence that this function has on the dictionary structures and data types in different types of dictionaries. In this discussion attention is paid to structures which appear in both printed and online dictionaries. Although the most important data type for decoding is meaning explanations/translation equivalents in multilingual dictionaries, this article focuses especially on the role of data types such as pronunciation guidance, collocations, labels, illustrations and etymological data. In printed dictionaries there is a resemblance in terms of frame structures (at least a lemma list and usage guidance), data distribution structure and access structure, while differences are more on the level of the macrostructure (quantity of lemmata and different ordering) and microstructure (indicator types and quantity of data).

Philology. Linguistics, Languages and literature of Eastern Asia, Africa, Oceania


arXiv Open Access 2019

Subjective Assessment of Text Complexity: A Dataset for German Language

Babak Naderi, Salar Mohtaj, Kaspar Ensikat et al.

This paper presents TextComplexityDE, a dataset consisting of 1000 German sentences taken from 23 Wikipedia articles in 3 different article genres, to be used for developing text-complexity predictor models and automatic text simplification in German. The dataset includes subjective assessments of different text-complexity aspects provided by learners of German at levels A and B. In addition, it contains manual simplifications of 250 of those sentences provided by native speakers, and subjective assessments of the simplified sentences by participants from the target group. The subjective ratings were collected using both laboratory studies and a crowdsourcing approach.

en cs.CL
