Poster

Analyzing Similarity Metrics for Data Selection for Language Model Pretraining

Dylan Sam ⋅ Ayan Chakrabarti ⋅ Afshin Rostamizadeh ⋅ Srikumar Ramalingam ⋅ Gui Citovsky ⋅ Sanjiv Kumar
2025 Poster

