Poster in Workshop: Heavy Tails in ML: Structure, Stability, Dynamics

Associative Memories with Heavy-Tailed Data

Vivien Cabannes ⋅ Elvis Dohmatob ⋅ Alberto Bietti

Keywords: associative memory ⋅ Zipf data ⋅ optimization-based algorithm ⋅ mechanistic interpretability ⋅ scaling law


Abstract

Learning arguably involves the discovery and memorization of abstract rules. But how do associative memories appear in transformer architectures optimized with gradient descent algorithms? We derive precise scaling laws for a simple input-output associative memory model with respect to parameter size, and discuss the statistical efficiency of different estimators, including optimization-based algorithms. We provide extensive numerical experiments to validate and interpret theoretical results, including fine-grained visualizations of the stored memory associations.
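
To make the setting concrete, here is a minimal sketch of one way an input-output associative memory over Zipf-distributed inputs could be set up. It is an illustration under stated assumptions, not the authors' implementation: the Gaussian embeddings, the frequency-weighted sum of outer products, the argmax decoding rule, and every function name and parameter value below are hypothetical choices that the abstract does not specify.

```python
import numpy as np

def zipf_probs(n, alpha=1.0):
    """Zipf-like distribution over n items: p(x) proportional to rank^(-alpha)."""
    ranks = np.arange(1, n + 1, dtype=float)
    weights = ranks ** (-alpha)
    return weights / weights.sum()

def build_memory(targets, probs, in_emb, out_emb):
    """Assumed estimator: W = sum_x p(x) * u_{f(x)} e_x^T, a (d, d) matrix."""
    # out_emb[targets] has shape (N, d); weight each row by p(x) before summing.
    return (out_emb[targets] * probs[:, None]).T @ in_emb

def recall(W, in_emb, out_emb):
    """Decode each input x as argmax_y u_y^T W e_x."""
    scores = out_emb @ W @ in_emb.T  # shape (M, N)
    return scores.argmax(axis=0)

def weighted_error(pred, targets, probs):
    """Probability mass of the inputs whose association is recalled incorrectly."""
    return float(probs[pred != targets].sum())

# Illustrative sizes only: N inputs, M outputs, embedding dimension d.
rng = np.random.default_rng(0)
N, M, d = 200, 10, 32
probs = zipf_probs(N)
targets = rng.integers(0, M, size=N)               # hypothetical rule x -> f(x)
in_emb = rng.standard_normal((N, d)) / np.sqrt(d)  # input embeddings e_x
out_emb = rng.standard_normal((M, d)) / np.sqrt(d) # output embeddings u_y
W = build_memory(targets, probs, in_emb, out_emb)
pred = recall(W, in_emb, out_emb)
err = weighted_error(pred, targets, probs)
```

In this sketch the only learned object is the d × d matrix W, so varying d is one way to probe how recall error depends on parameter size. When the embeddings are orthonormal, which requires d to be at least N, the cross terms vanish and every association is recalled exactly.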
