arXiv Open Access 2018

Learning Unions of k-Testable Languages

Alexis Linard Colin de la Higuera Frits Vaandrager

Lihat Sumber

Abstrak

A classical problem in grammatical inference is to identify a language from a set of examples. In this paper, we address the problem of identifying a union of languages from examples that belong to several different unknown languages. Indeed, decomposing a language into smaller pieces that are easier to represent should make learning easier than aiming for a too generalized language. In particular, we consider k-testable languages in the strict sense (k-TSS). These are defined by a set of allowed prefixes, infixes (sub-strings) and suffixes that words in the language may contain. We establish a Galois connection between the lattice of all languages over alphabet Σ, and the lattice of k-TSS languages over Σ. We also define a simple metric on k-TSS languages. The Galois connection and the metric allow us to derive an efficient algorithm to learn the union of k-TSS languages. We evaluate our algorithm on an industrial dataset and thus demonstrate the relevance of our approach.

Topik & Kata Kunci

cs.FL

Penulis (3)

Alexis Linard

Colin de la Higuera

Frits Vaandrager

Format Sitasi

APA MLA BibTeX

Linard, A., Higuera, C.d.l., Vaandrager, F. (2018). Learning Unions of k-Testable Languages. https://arxiv.org/abs/1812.08269

Akses Cepat

Lihat di Sumber

Informasi Jurnal

Tahun Terbit: 2018
Bahasa: en
Sumber Database: arXiv
Akses: Open Access ✓