Poster
The Dollar Street Dataset: Images Representing the Geographic and Socioeconomic Diversity of the World
William Gaviria Rojas · Sudnya Diamos · Keertan Kini · David Kanter · Vijay Janapa Reddi · Cody Coleman

Tue Nov 29 02:00 PM -- 04:00 PM (PST) @ Hall J #1023

It is crucial that image datasets for computer vision are representative and contain accurate demographic information to ensure their robustness and fairness, especially for smaller subpopulations. To address this issue, we present Dollar Street, a supervised dataset that contains 38,479 images of everyday household items from homes around the world. This dataset was manually curated and fully labeled, including tags for objects (e.g. “toilet,” “toothbrush,” “stove”) and demographic data such as region, country and home monthly income. This dataset includes images from homes with no internet access and incomes as low as $26.99 per month, visually capturing valuable socioeconomic diversity of traditionally under-represented populations. All images and data are licensed under CC-BY, permitting their use in academic and commercial work. Moreover, we show that this dataset can improve the performance of classification tasks for images of household items from lower income homes, addressing a critical need for datasets that combat bias.

Author Information

William Gaviria Rojas (Coactive AI)
Sudnya Diamos (Georgia Institute of Technology)
Keertan Kini (Stanford University)
David Kanter (MLCommons)
Vijay Janapa Reddi (Harvard University)
Cody Coleman (Stanford University)

Cody is a computer science Ph.D. candidate at Stanford University, advised by Professors Matei Zaharia and Peter Bailis and supported by a National Science Foundation Fellowship. As a member of the Stanford DAWN Project, Cody’s research is focused on democratizing machine learning through tools and infrastructure that enable more than the most well-funded teams to create innovative and impactful systems; this includes reducing the cost of producing state-of-the-art models and creating novel abstractions that simplify machine learning development and deployment. Prior to joining Stanford, he completed his B.S. and M.Eng. in electrical engineering and computer science at the Massachusetts Institute of Technology.

More from the Same Authors