
Lightning Talk
Workshop: Data Centric AI

Data-Centric AI Requires Rethinking Data Notion


The transition towards data-centric AI requires revisiting the notion of data from both mathematical and implementational standpoints in order to obtain unified data-centric machine learning packages. To this end, this work proposes unifying principles offered by the categorical and cochain notions of data, and discusses the importance of these principles in the transition to data-centric AI. In the categorical notion, the objects of interest are often data invariants: structural properties that are preserved under a particular type of morphism. In the cochain notion, data can be viewed as a function defined on a discrete domain of interest and acted upon via operators. Although these two notions are almost orthogonal, together they provide a unifying way to view data, which ultimately affects how machine learning packages are developed, implemented, and used by practitioners.
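
As a rough illustration of the two notions, the following minimal sketch treats data as a function on the vertices of a small graph (a 0-cochain), applies a coboundary operator to it, and checks that some structural properties survive a relabeling of the vertices (one simple kind of morphism). The graph, the values, and all function names are hypothetical and chosen only for illustration; they are not taken from the talk or from any particular package.

```python
from typing import Dict, List, Tuple

Vertex = str
Edge = Tuple[Vertex, Vertex]  # oriented edge (source, target)


def coboundary(f: Dict[Vertex, float], edges: List[Edge]) -> Dict[Edge, float]:
    """Operator acting on data: maps a 0-cochain (values on vertices)
    to a 1-cochain (values on oriented edges) via (df)(u, v) = f(v) - f(u)."""
    return {(u, v): f[v] - f[u] for (u, v) in edges}


def relabel(edges: List[Edge], f: Dict[Vertex, float], mapping: Dict[Vertex, Vertex]):
    """Apply a vertex relabeling (a graph isomorphism) to both the domain and the data."""
    new_edges = [(mapping[u], mapping[v]) for (u, v) in edges]
    new_f = {mapping[v]: value for v, value in f.items()}
    return new_edges, new_f


def degree_sequence(edges: List[Edge]) -> List[int]:
    """A structural invariant: the sorted list of vertex degrees."""
    degree: Dict[Vertex, int] = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return sorted(degree.values())


# Hypothetical toy data: a triangle with one value per vertex.
edges = [("a", "b"), ("b", "c"), ("a", "c")]
f = {"a": 1.0, "b": 4.0, "c": 6.0}

# Cochain view: data is a function on a discrete domain, acted on by an operator.
df = coboundary(f, edges)

# Categorical view: properties preserved under the relabeling morphism.
mapping = {"a": "x", "b": "y", "c": "z"}
new_edges, new_f = relabel(edges, f, mapping)
new_df = coboundary(new_f, new_edges)

same_degrees = degree_sequence(edges) == degree_sequence(new_edges)
same_edge_values = sorted(df.values()) == sorted(new_df.values())
```

Here the degree sequence and the multiset of edge values are unchanged by the relabeling, which is the sense in which they behave as invariants, while `coboundary` plays the role of an operator acting on the data. A real package built on these ideas would presumably work with richer domains and operators than this toy graph.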