Dataset and Benchmark Track 2

Joaquin Vanschoren · Serena Yeung

2021 Datasets and Benchmarks

Abstract

The Datasets and Benchmarks track serves as a novel venue for high-quality publications, talks, and posters on highly valuable machine learning datasets and benchmarks, as well as a forum for discussions on how to improve dataset development. Datasets and benchmarks are crucial for the development of machine learning methods, but they also require their own publishing and reviewing guidelines. For instance, datasets often cannot be reviewed in a double-blind fashion, so full anonymization will not be required. On the other hand, they do require additional specific checks, such as a proper description of how the data was collected, whether the datasets show intrinsic bias, and whether they will remain accessible.

Schedule

Timezone: America/Los_Angeles

8:00 AM

A Large-Scale Database for Graph Representation Learning

Scott Freitas · Yuxiao Dong · Joshua Neil · Duen Horng Chau

8:10 AM

WRENCH: A Comprehensive Benchmark for Weak Supervision

Jieyu Zhang · Yue Yu · · Yujing Wang · Yaming Yang · Mao Yang · Alexander Ratner

8:20 AM

ATOM3D: Tasks on Molecules in Three Dimensions

Raphael Townshend · Martin Vögele · Patricia Suriana · Alex Derry · Alexander Powers · Yianni Laloudakis · Sidhika Balachandar · Bowen Jing · Brandon Anderson · Stephan Eismann · Risi Kondor · Russ Altman · Ron Dror

8:30 AM

Reduced, Reused and Recycled: The Life of a Dataset in Machine Learning Research

Bernard Koch · Emily Denton · Alex Hanna · Jacob G Foster

8:40 AM

Joint Q&A
