It is crucial that image datasets for computer vision are representative and contain accurate demographic information to ensure their robustness and fairness, especially for smaller subpopulations. To address this issue, we present Dollar Street, a supervised dataset that contains 38,479 images of everyday household items from homes around the world. This dataset was manually curated and fully labeled, including tags for objects (e.g. “toilet,” “toothbrush,” “stove”) and demographic data such as region, country, and monthly household income. It includes images from homes with no internet access and incomes as low as $26.99 per month, visually capturing the socioeconomic diversity of traditionally under-represented populations. All images and data are licensed under CC-BY, permitting their use in academic and commercial work. Moreover, we show that this dataset can improve the performance of classification tasks for images of household items from lower-income homes, addressing a critical need for datasets that combat bias.
Author Information
William Gaviria Rojas (Coactive AI)
Sudnya Diamos (Georgia Institute of Technology)
Keertan Kini (Stanford University)
David Kanter (MLCommons)
Vijay Janapa Reddi (Harvard University)
Cody Coleman (Stanford University)
Cody is a computer science Ph.D. candidate at Stanford University, is advised by Professors Matei Zaharia and Peter Bailis and is supported by a National Science Foundation Fellowship. As a member of the Stanford DAWN Project, Cody’s research is focused on democratizing machine learning through tools and infrastructure that enable more than the most well-funded teams to create innovative and impactful systems; this includes reducing the cost of producing state-of-the-art models and creating novel abstractions that simplify machine learning development and deployment. Prior to joining Stanford, he completed his B.S. and M.Eng. in electrical engineering and computer science at the Massachusetts Institute of Technology.
More from the Same Authors
- 2021 : MLPerf Tiny Benchmark »
  Colby Banbury · Vijay Janapa Reddi · Peter Torelli · Nat Jeffries · Csaba Kiraly · Jeremy Holleman · Pietro Montino · David Kanter · Pete Warden · Danilo Pau · Urmish Thakker · antonio torrini · jay cordaro · Giuseppe Di Guglielmo · Javier Duarte · Honson Tran · Nhan Tran · niu wenxu · xu xuesong
- 2021 : The People’s Speech: A Large-Scale Diverse English Speech Recognition Dataset for Commercial Usage »
  Daniel Galvez · Greg Diamos · Juan Torres · Juan Cerón · Keith Achorn · Anjali Gopi · David Kanter · Max Lam · Mark Mazumder · Vijay Janapa Reddi
- 2021 : Multilingual Spoken Words Corpus »
  Mark Mazumder · Sharad Chitlangia · Colby Banbury · Yiping Kang · Juan Ciro · Keith Achorn · Daniel Galvez · Mark Sabini · Peter Mattson · David Kanter · Greg Diamos · Pete Warden · Josh Meyer · Vijay Janapa Reddi
- 2022 : Multi-Agent Reinforcement Learning for Microprocessor Design Space Exploration »
  Srivatsan Krishnan · Natasha Jaques · Shayegan Omidshafiei · Dan Zhang · Izzeddin Gur · Vijay Janapa Reddi · Aleksandra Faust
- 2023 Poster: DataPerf: Benchmarks for Data-Centric AI Development »
  Mark Mazumder · Colby Banbury · Xiaozhe Yao · Bojan Karlaš · William Gaviria Rojas · Sudnya Diamos · Greg Diamos · Lynn He · Alicia Parrish · Hannah Rose Kirk · Jessica Quaye · Charvi Rastogi · Douwe Kiela · David Jurado · David Kanter · Rafael Mosquera · Will Cukierski · Juan Ciro · Lora Aroyo · Bilge Acun · Lingjiao Chen · Mehul Raje · Max Bartolo · Evan Sabri Eyuboglu · Amirata Ghorbani · Emmett Goodman · Addison Howard · Oana Inel · Tariq Kane · Christine Kirkpatrick · D. Sculley · Tzu-Sheng Kuo · Jonas Mueller · Tristan Thrush · Joaquin Vanschoren · Margaret Warren · Adina Williams · Serena Yeung · Newsha Ardalani · Praveen Paritosh · Ce Zhang · James Zou · Carole-Jean Wu · Cody Coleman · Andrew Ng · Peter Mattson · Vijay Janapa Reddi
- 2022 : Panel »
  Mayee Chen · Alexander Ratner · Robert Nowak · Cody Coleman · Ramya Korlakai Vinayak
- 2021 : Q&A Lightning Talks - Responsibility and Ethics »
  Vijay Janapa Reddi · Cody Coleman
- 2021 : Q&A Lightning Talks - Challenge Problems and Theory »
  Cody Coleman · Vijay Janapa Reddi
- 2021 : Lightning Talks - Challenge Problems and Theory »
  Vijay Janapa Reddi · Carole-Jean Wu
- 2021 : Q&A Lightning Talk - Benchmarks and Challenges »
  Cody Coleman · Vijay Janapa Reddi
- 2021 : Lightning Talks - Benchmarks and Challenges »
  Vijay Janapa Reddi · Cody Coleman
- 2021 Workshop: Data Centric AI »
  Andrew Ng · Lora Aroyo · Greg Diamos · Cody Coleman · Vijay Janapa Reddi · Joaquin Vanschoren · Carole-Jean Wu · Sharon Zhou · Lynn He