
Workshop: Data Centric AI

SCIMAT: Science and Mathematics Dataset


In this work, we announce a comprehensive, well-curated, open-source dataset with millions of samples covering pre-college and college-level problems in mathematics and science. We show a preliminary set of results obtained with transformer architectures using character-to-character encoding. The dataset identifies some challenging problems and invites research on better architecture search.
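
To make the character-to-character setup concrete, here is a minimal sketch of how a question and its answer could be turned into integer sequences for a sequence-to-sequence transformer. It is an illustration under stated assumptions, not the authors' pipeline: the special-token ids, the function names, the padding lengths, and the sample problem are all hypothetical and are not taken from the dataset.

```python
# Hypothetical sketch of character-level encoding for a seq2seq model.
# Special-token ids, names, and the sample problem are assumptions.
PAD, SOS, EOS = 0, 1, 2


def build_vocab(texts):
    """Map every distinct character to an id, reserving 0-2 for specials."""
    chars = sorted(set("".join(texts)))
    return {ch: i + 3 for i, ch in enumerate(chars)}


def encode(text, vocab, max_len):
    """Wrap text in SOS/EOS and right-pad with PAD up to max_len."""
    ids = [SOS] + [vocab[ch] for ch in text] + [EOS]
    if len(ids) > max_len:
        raise ValueError("text longer than max_len")
    return ids + [PAD] * (max_len - len(ids))


def decode(ids, vocab):
    """Invert encode, dropping the special tokens."""
    inverse = {i: ch for ch, i in vocab.items()}
    return "".join(inverse[i] for i in ids if i in inverse)


# Illustrative problem, not a sample from the dataset.
question = "Solve 2*x + 3 = 7 for x."
answer = "2"

vocab = build_vocab([question, answer])
src = encode(question, vocab, max_len=32)  # encoder input
tgt = encode(answer, vocab, max_len=4)     # decoder target
```

Working at the character level keeps the vocabulary small and avoids out-of-vocabulary symbols in mathematical notation, at the cost of longer sequences than a word-level or subword tokenization would produce.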