Semi-supervised Learning from Uncurated Echocardiogram Images with Fix-A-Step
Zhe Huang · Mary-Joy Sidhom · Benjamin Wessler · Michael Hughes

Semi-supervised learning (SSL) promises gains in accuracy compared to training classifiers on small labeled datasets by also training on many unlabeled images. Unfortunately, modern deep SSL often makes accuracy worse when given uncurated unlabeled sets. In realistic applications like medical imaging, unlabeled sets are often uncurated and thus may differ from the labeled set in the classes they represent. Recent remedies suggest filtering approaches that detect out-of-distribution (OOD) unlabeled examples and then discard or downweight them. Instead, we view all unlabeled examples as potentially helpful. We introduce a procedure called Fix-A-Step that can improve held-out accuracy of common deep SSL methods despite lack of curation. Our first key insight is that unlabeled data, even OOD, can usefully inform augmentations of labeled data. Our second innovation is to modify gradient descent updates so that following the multi-task SSL loss does not hurt labeled-set accuracy. Though our method is simpler than alternatives, we show consistent accuracy gains on common CIFAR-10 benchmarks across all levels of contamination. We further suggest a new medically-focused robust SSL benchmark called Heart2Heart, where the core task is recognizing the view type of ultrasound images of the heart. On Heart2Heart, Fix-A-Step can learn from 353,500 truly uncurated unlabeled images to deliver gains that generalize across hospitals.
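
The abstract does not spell out how the gradient descent update is modified. As a rough illustration only, the sketch below shows one way such a safeguard could work: the unlabeled-loss gradient is included in the step only when it does not point against the labeled-loss gradient. The function name `safeguarded_step`, the inner-product test, and the parameters `lr` and `unlabeled_weight` are assumptions made for this example, not details confirmed by the text above.

```python
def dot(u, v):
    """Inner product of two equal-length lists of floats."""
    return sum(a * b for a, b in zip(u, v))


def safeguarded_step(params, grad_labeled, grad_unlabeled,
                     lr=0.1, unlabeled_weight=1.0):
    """One gradient descent step with a hypothetical safeguard.

    Assumption for illustration: the unlabeled-loss gradient is used
    only when its inner product with the labeled-loss gradient is
    positive, i.e. when following it should not increase the labeled loss
    to first order. Otherwise the step uses the labeled gradient alone.
    """
    if dot(grad_labeled, grad_unlabeled) > 0:
        # Gradients agree: follow the combined multi-task direction.
        direction = [gl + unlabeled_weight * gu
                     for gl, gu in zip(grad_labeled, grad_unlabeled)]
    else:
        # Gradients conflict: fall back to the labeled-loss direction.
        direction = list(grad_labeled)
    return [p - lr * d for p, d in zip(params, direction)]


# Agreeing gradients: both terms contribute to the update.
agree = safeguarded_step([1.0, 1.0], [1.0, 0.0], [1.0, 1.0], lr=0.5)

# Conflicting gradients: the unlabeled term is ignored for this step.
conflict = safeguarded_step([1.0, 1.0], [1.0, 0.0], [-1.0, 1.0], lr=0.5)
```

In a deep learning framework the same check would be applied to the flattened parameter gradients of the two loss terms before the optimizer step.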

Author Information

Zhe Huang (Tufts University)
Mary-Joy Sidhom (Tufts University)
Benjamin Wessler (Tufts Medical Center)
Michael Hughes (Tufts University)

More from the Same Authors