arXiv Open Access 2026

NILE: Formalizing Natural-Language Descriptions of Formal Languages

Tristan Kneisel, Marko Schmellenkamp, Fabian Vehlken, Thomas Zeume

Abstract

This paper explores how natural-language descriptions of formal languages can be compared to their formal representations and how semantic differences between them can be explained. The motivation comes from educational scenarios in which learners describe a formal language (presented, e.g., by a finite state automaton, regular expression, pushdown automaton, context-free grammar, or in set notation) in natural language, and an educational support system has to (1) judge whether the natural-language description accurately describes the formal language and (2) explain why a description is not accurate. To address this question, we introduce Nile, a representation language for formal languages that is designed so that Nile expressions can mirror the syntactic structure of natural-language descriptions of formal languages. Nile is sufficiently expressive to cover a broad variety of formal languages, including all regular languages and fragments of context-free languages typically used in educational contexts. Generating Nile expressions that are syntactically close to natural-language descriptions then makes it possible to explain inaccuracies in the descriptions algorithmically. In experiments on an educational data set, we show that LLMs can translate natural-language descriptions into equivalent, syntactically close Nile expressions with high accuracy, which in turn allows explanations for incorrect natural-language descriptions to be generated algorithmically. Our experiments also show that, while natural-language descriptions can be translated into regular expressions (but not context-free grammars), the resulting expressions are often not syntactically close and are thus not suitable for providing explanations.
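As a rough illustration of task (1) for the regular case, the sketch below checks whether two deterministic finite automata accept the same language and, if not, returns a shortest word on which they disagree. It is not the method of the paper and does not use Nile. It is a standard product-automaton search, and the dictionary-based DFA encoding, the function name `distinguishing_word`, and the two example automata are assumptions made purely for illustration. Both automata are assumed to be complete, i.e. to have a transition for every state and symbol.

```python
from collections import deque


def distinguishing_word(dfa_a, dfa_b, alphabet):
    """Return a shortest word accepted by exactly one of the two DFAs,
    or None if they accept the same language.

    Each DFA is a dict with keys "start", "accept" (a set of states) and
    "delta" (a dict mapping (state, symbol) to the next state).
    This encoding is hypothetical and chosen only for this sketch.
    """
    start = (dfa_a["start"], dfa_b["start"])
    queue = deque([(start, "")])
    seen = {start}
    while queue:
        (p, q), word = queue.popleft()
        # The pair (p, q) witnesses a difference if exactly one side accepts.
        if (p in dfa_a["accept"]) != (q in dfa_b["accept"]):
            return word
        for symbol in alphabet:
            nxt = (dfa_a["delta"][(p, symbol)], dfa_b["delta"][(q, symbol)])
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, word + symbol))
    return None


# Target language: words over {a, b} with an even number of a's.
target = {
    "start": 0,
    "accept": {0},
    "delta": {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1},
}

# A learner's description, formalised here by hand as "words containing no a".
learner = {
    "start": 0,
    "accept": {0},
    "delta": {(0, "a"): 1, (0, "b"): 0, (1, "a"): 1, (1, "b"): 1},
}

witness = distinguishing_word(target, learner, "ab")
print(witness)
```

A witness word of this kind only shows that a description is inaccurate. The abstract's point is that explaining why it is inaccurate additionally requires a formal expression that stays syntactically close to the learner's wording.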

Authors (4)

Tristan Kneisel
Marko Schmellenkamp
Fabian Vehlken
Thomas Zeume

Citation Format

Kneisel, T., Schmellenkamp, M., Vehlken, F., Zeume, T. (2026). NILE: Formalizing Natural-Language Descriptions of Formal Languages. https://arxiv.org/abs/2602.19743

Journal Information
Year of Publication: 2026
Language: en
Source Database: arXiv
Access: Open Access