Token-token correlations predict the scaling of the test loss with the number of input tokens

Francesco Cagnetta ⋅ Matthieu Wyart

Abstract
