arXiv Open Access 2022

RePair Grammars are the Smallest Grammars for Fibonacci Words

Takuya Mieno Shunsuke Inenaga Takashi Horiyama
Lihat Sumber

Abstrak

Grammar-based compression is a loss-less data compression scheme that represents a given string $w$ by a context-free grammar that generates only $w$. While computing the smallest grammar which generates a given string $w$ is NP-hard in general, a number of polynomial-time grammar-based compressors which work well in practice have been proposed. RePair, proposed by Larsson and Moffat in 1999, is a grammar-based compressor which recursively replaces all possible occurrences of a most frequently occurring bigrams in the string. Since there can be multiple choices of the most frequent bigrams to replace, different implementations of RePair can result in different grammars. In this paper, we show that the smallest grammars generating the Fibonacci words $F_k$ can be completely characterized by RePair, where $F_k$ denotes the $k$-th Fibonacci word. Namely, all grammars for $F_k$ generated by any implementation of RePair are the smallest grammars for $F_k$, and no other grammars can be the smallest for $F_k$. To the best of our knowledge, Fibonacci words are the first non-trivial infinite family of strings for which RePair is optimal.

Topik & Kata Kunci

Penulis (3)

T

Takuya Mieno

S

Shunsuke Inenaga

T

Takashi Horiyama

Format Sitasi

Mieno, T., Inenaga, S., Horiyama, T. (2022). RePair Grammars are the Smallest Grammars for Fibonacci Words. https://arxiv.org/abs/2202.08447

Akses Cepat

Lihat di Sumber
Informasi Jurnal
Tahun Terbit
2022
Bahasa
en
Sumber Database
arXiv
Akses
Open Access ✓