Moderator: Rediet Abebe
If data is power, this keynote asks what methodologies and frameworks, beyond measuring bias and fairness in ML, might best serve communities that are, otherwise, written off as inevitable ‘data gaps?’ To address this question, the talk applies design justice principles articulated in 2020 by scholar Costanza-Chock to the case of community-based organizations (CBOs) serving marginalized Black and Latinx communities in North Carolina. These CBOs, part of an 8-month study of community healthcare work, have become pivotal conduits for COVID-19 health information and equitable vaccine access. As such, they create and collect the so-called ‘sparse data’ of marginalized groups often missing from healthcare analyses. How might health equity—a cornerstone of social justice—be better served by equipping CBOs to collect community-level data and set the agendas for what to share and learn from the people that they serve? The talk will open with an analysis of the limits of ML models that prioritize the efficiencies of scale over attention to just and inclusive sampling. It will then examine how undertheorized investments in measuring bias and fairness in data and decision-making systems distract us from considering the value of collecting data with rather than for communities. Outlining an early learning theory proposed by Russian psychologist Lev Vygotsky (1978), the presentation will argue that focusing on the demands of collecting community members’ data and observing the social interactions that are computationally hard to measure but qualitatively invaluable to see are necessary to advance socially-just ML. The talk will conclude with recommendations for how to reorient computer science and machine learning to a more explicit theory and practice of data power-sharing.