arXiv Open Access 2025

Can Test-time Computation Mitigate Reproduction Bias in Neural Symbolic Regression?

Shun Sato, Issei Sato

Abstract

Mathematical expressions play a central role in scientific discovery. Symbolic regression aims to automatically discover such expressions from given numerical data. Recently, neural symbolic regression (NSR) methods that involve Transformers pre-trained on synthetic datasets have gained attention for their fast inference, but they often perform poorly, especially with many input variables. In this study, we analyze NSR from both theoretical and empirical perspectives and show that (1) ordinary token-by-token generation is ill-suited for NSR, as Transformers cannot compositionally generate tokens while validating numerical consistency, and (2) the search space of NSR methods is greatly restricted due to reproduction bias, where the majority of generated expressions are merely copied from the training data. We further examine whether tailored test-time strategies can reduce reproduction bias and show that providing additional information at test time effectively mitigates it. These findings contribute to a deeper understanding of the limitations of NSR approaches and provide guidance for designing more robust and generalizable methods. Code is available at https://github.com/Shun-0922/Mem-Bias-NSR.
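
To make the notion of reproduction bias concrete, the following is a minimal sketch of one way the share of copied expressions could be measured. It is not taken from the paper or its repository: the function names, the prefix-token representation of expressions, and the choice to mask numeric constants before comparison are all assumptions made for illustration.

```python
from typing import Iterable, Sequence, Tuple

# Hypothetical placeholder token for numeric constants (an assumption,
# not a convention confirmed by the paper).
CONSTANT_PLACEHOLDER = "<C>"


def _is_number(token: str) -> bool:
    """Return True if the token parses as a floating-point literal."""
    try:
        float(token)
        return True
    except ValueError:
        return False


def skeleton(tokens: Sequence[str]) -> Tuple[str, ...]:
    """Mask numeric constants so expressions differing only in constants match."""
    return tuple(CONSTANT_PLACEHOLDER if _is_number(t) else t for t in tokens)


def reproduction_rate(
    generated: Iterable[Sequence[str]],
    training: Iterable[Sequence[str]],
) -> float:
    """Fraction of generated expressions whose skeleton occurs in the training set."""
    training_skeletons = {skeleton(expr) for expr in training}
    generated_list = list(generated)
    if not generated_list:
        return 0.0
    copied = sum(skeleton(expr) in training_skeletons for expr in generated_list)
    return copied / len(generated_list)
```

Under these assumptions, a rate close to 1.0 would indicate that almost every generated expression already appears in the training data up to its constants. The comparison is purely syntactic, so algebraically equivalent expressions written with a different token order are counted as distinct.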

Topics & Keywords

cs.LG

Citation Format

Sato, S., & Sato, I. (2025). Can Test-time Computation Mitigate Reproduction Bias in Neural Symbolic Regression? https://arxiv.org/abs/2505.22081

Journal Information

Publication Year: 2025
Language: en
Source Database: arXiv
Access: Open Access