arXiv Open Access 2025

NoLBERT: A No Lookahead(back) Foundational Language Model

Ali Kakhbod, Peiyao Li

Abstract

We present NoLBERT, a lightweight, timestamped foundational language model for empirical research -- particularly for forecasting in economics, finance, and the social sciences. By pretraining exclusively on text from 1976 to 1995, NoLBERT avoids both lookback and lookahead biases (information leakage) that can undermine econometric inference. It exceeds domain-specific baselines on NLP benchmarks while maintaining temporal consistency. Applied to patent texts, NoLBERT enables the construction of firm-level innovation networks and shows that gains in innovation centrality predict higher long-run profit growth.

Authors (2)

Ali Kakhbod
Peiyao Li
Citation Format

Kakhbod, A., Li, P. (2025). NoLBERT: A No Lookahead(back) Foundational Language Model. https://arxiv.org/abs/2509.01110

Journal Information

Year Published: 2025
Language: en
Source Database: arXiv
Access: Open Access ✓