NoLBERT: A No Lookahead(back) Foundational Language Model
Peiyao Li · Ali Kakhbod
Abstract
We present NoLBERT, a lightweight, timestamped foundational language model for empirical research, particularly for forecasting in economics, finance, and the social sciences. By pretraining exclusively on text from 1976 to 1995, NoLBERT avoids both lookback and lookahead biases (information leakage) that can undermine econometric inference. It outperforms domain-specific baselines on NLP benchmarks while maintaining temporal consistency. Applied to patent texts, NoLBERT enables the construction of firm-level innovation networks and shows that gains in innovation centrality predict higher long-run profit growth.
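As a minimal sketch of how a timestamped BERT-style encoder like NoLBERT might be applied to patent text for the network construction described above: the snippet below loads a model via Hugging Face transformers, mean-pools its final hidden states into patent embeddings, and computes a cosine similarity that could serve as an edge weight in a firm-level innovation network. The hub identifier "nolbert/nolbert-base" is an assumption for illustration, not a confirmed release name, and this is not the authors' exact pipeline.

```python
# Hypothetical usage sketch; the checkpoint name is an assumption.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "nolbert/nolbert-base"  # placeholder; substitute the released checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModel.from_pretrained(MODEL_ID)
model.eval()

def embed(text: str) -> torch.Tensor:
    """Mean-pool the final hidden states into a single text embedding."""
    inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state    # (1, seq_len, dim)
    mask = inputs["attention_mask"].unsqueeze(-1)     # (1, seq_len, 1)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)

# Pairwise cosine similarity between patent embeddings can serve as an
# edge weight when building a firm-level innovation network.
a = embed("A method for reducing information leakage in model pretraining.")
b = embed("A system for temporal filtering of training corpora.")
print(torch.nn.functional.cosine_similarity(a, b).item())
```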