
The Cells Out of Sample (COOS) dataset and benchmarks for measuring out-of-sample generalization of image classifiers
Alex Lu · Amy Lu · Wiebke Schormann · Marzyeh Ghassemi · David Andrews · Alan Moses

Wed Dec 11 05:00 PM -- 07:00 PM (PST) @ East Exhibition Hall B + C #124

Understanding whether classifiers generalize to out-of-sample datasets is a central problem in machine learning. Microscopy images provide a standardized way to measure the generalization capacity of image classifiers, as we can image the same classes of objects under increasingly divergent but controlled factors of variation. We created a public dataset of 132,209 images of mouse cells, COOS-7 (Cells Out Of Sample 7-Class). COOS-7 provides a classification setting in which four test datasets have increasing degrees of covariate shift: some images are random subsets of the training data, while others are from experiments reproduced months later and imaged by different instruments. We benchmarked a range of classification models using different representations, including transferred neural network features, end-to-end classification with a supervised deep CNN, and features from a self-supervised CNN. While most classifiers performed well on test datasets similar to the training dataset, all classifiers failed to generalize their performance to datasets with greater covariate shifts. These baselines highlight the challenges of covariate shifts in image data and establish metrics for improving the generalization capacity of image classifiers.
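
The evaluation pattern described in the abstract (one trained classifier scored separately on several test sets with different degrees of covariate shift) can be illustrated with a minimal sketch. Everything below is hypothetical and for illustration only: the one-dimensional toy features, the threshold classifier, the test-set names, and the helper functions `accuracy`, `accuracy_by_test_set`, and `generalization_gap` are assumptions, not the COOS-7 data, models, or the authors' benchmark code.

```python
from typing import Callable, Dict, List, Tuple

# A toy example is a (feature, label) pair. Real COOS-7 inputs are images;
# a single float stands in for an image representation here (assumption).
Example = Tuple[float, int]


def accuracy(predict: Callable[[float], int], examples: List[Example]) -> float:
    """Fraction of examples whose predicted label matches the true label."""
    if not examples:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(1 for x, y in examples if predict(x) == y)
    return correct / len(examples)


def accuracy_by_test_set(
    predict: Callable[[float], int],
    test_sets: Dict[str, List[Example]],
) -> Dict[str, float]:
    """Score the same classifier on every named test set."""
    return {name: accuracy(predict, examples) for name, examples in test_sets.items()}


def generalization_gap(scores: Dict[str, float], reference: str) -> Dict[str, float]:
    """Accuracy drop of each test set relative to a reference test set."""
    return {name: scores[reference] - score for name, score in scores.items()}


def threshold_classifier(x: float) -> int:
    """Hypothetical stand-in for a trained model: class 1 if x >= 0.5."""
    return 1 if x >= 0.5 else 0


# Hand-made toy test sets (not real data). In the "shifted" set the
# class-0 features have drifted upward, mimicking a covariate shift.
toy_test_sets = {
    "similar_to_training": [(0.1, 0), (0.2, 0), (0.8, 1), (0.9, 1)],
    "shifted": [(0.4, 0), (0.6, 0), (0.7, 0), (0.9, 1)],
}

scores = accuracy_by_test_set(threshold_classifier, toy_test_sets)
gaps = generalization_gap(scores, reference="similar_to_training")

for name in toy_test_sets:
    print(f"{name}: accuracy={scores[name]:.2f}, gap={gaps[name]:.2f}")
```

Reporting per-test-set accuracy alongside the gap to the least-shifted test set makes a drop under covariate shift visible even when accuracy pooled over all test sets looks acceptable.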

Author Information

Alex Lu (University of Toronto)

I'm a PhD student at the University of Toronto. I'm part of the Computer Science Department, and I research computational biology under Alan Moses. I focus on unsupervised machine learning techniques for the analysis of microscopy images. I believe that microscopy images contain rich information about biology, but they're underused because analysis of these images has traditionally been subjective and time-consuming, requiring biologists to look at each image manually. This approach is incompatible with current technologies, where robots can take tens of thousands of images in a single experiment. I develop ways for computers to "look" at these images, automatically discovering interesting biology for us. In some cases, the computer can identify patterns that are too complex for us to identify by eye, or organize its findings systematically to yield novel biological insights. This allows us to discover new biology from microscopy images in an objective and systematic way.

Amy Lu (University of Toronto/Stanford University)
Wiebke Schormann (Sunnybrook Research Institute)
Marzyeh Ghassemi (University of Toronto, Vector Institute)
David Andrews (Sunnybrook Research Institute)
Alan Moses (University of Toronto)

More from the Same Authors

  • 2022 : Dissecting In-the-Wild Stress from Multimodal Sensor Data
    Sujay Nagaraj · Thomas Hartvigsen · Adrian Boch · Luca Foschini · Marzyeh Ghassemi · Sarah Goodday · Stephen Friend · Anna Goldenberg
  • 2021 Poster: Learning Optimal Predictive Checklists
    Haoran Zhang · Quaid Morris · Berk Ustun · Marzyeh Ghassemi
  • 2021 Poster: Characterizing Generalization under Out-Of-Distribution Shifts in Deep Metric Learning
    Timo Milbich · Karsten Roth · Samarth Sinha · Ludwig Schmidt · Marzyeh Ghassemi · Bjorn Ommer
  • 2021 Poster: Medical Dead-ends and Learning to Identify High-Risk States and Treatments
    Mehdi Fatemi · Taylor Killian · Jayakumar Subramanian · Marzyeh Ghassemi
  • 2020 : Policy Panel
    Roya Pakzad · Dia Kayyali · Marzyeh Ghassemi · Shakir Mohamed · Mohammad Norouzi · Ted Pedersen · Anver Emon · Abubakar Abid · Darren Byler · Samhaa R. El-Beltagy · Nayel Shafei · Mona Diab
  • 2020 : Welcome
    Marzyeh Ghassemi
  • 2019 : Coffee Break and Poster Session
    Rameswar Panda · Prasanna Sattigeri · Kush Varshney · Karthikeyan Natesan Ramamurthy · Harvineet Singh · Vishwali Mhasawade · Shalmali Joshi · Laleh Seyyed-Kalantari · Matthew McDermott · Gal Yona · James Atwood · Hansa Srinivasan · Yonatan Halpern · D. Sculley · Behrouz Babaki · Margarida Carvalho · Josie Williams · Narges Razavian · Haoran Zhang · Amy Lu · Irene Y Chen · Xiaojie Mao · Angela Zhou · Nathan Kallus
  • 2019 : Cell
    Anne Carpenter · Jian Zhou · Maria Chikina · Alexander Tong · Ben Lengerich · Aly Abdelkareem · Gokcen Eraslan · Stephen Ra · Daniel Burkhardt · Frederick A Matsen IV · Alan Moses · Zhenghao Chen · Marzieh Haghighi · Alex Lu · Geoffrey Schau · Jeff Nivala · Miriam Shiffman · Hannes Harbrecht · Levi Masengo Wa Umba · Joshua Weinstein