
Assembling Existing Labels from Public Datasets to Diagnose Novel Diseases: COVID-19 in Late 2019
Zengle Zhu · Mintong Kang · Alan Yuille · Zongwei Zhou

The success of deep learning relies heavily on the availability of large annotated datasets, but for novel diseases neither sizable data nor annotations are easily accessible. This paper uses the classification of COVID-19 in late 2019 as an example to demonstrate the effectiveness of a novel strategy, named "Label-Assemble". To facilitate the diagnosis of novel diseases, we propose to assemble existing labels from public datasets. Although novel diseases are not among the existing labels, we discover that learning from alternative labels can dramatically improve the diagnosis of the novel disease, because these labels better define its classification boundary. This discovery has the potential to accelerate the development cycle of computer-aided diagnosis for novel diseases, in which positive labels are hard to collect, yet negative labels are usually available and relatively easy to assemble. Label-Assemble achieves 99.3% accuracy on the COVIDx-CXR2 dataset while using only 3% of the annotated COVID-19 images, significantly exceeding the previous state of the art (96.3% accuracy). We further investigate how the assembling strategy should be implemented, showing that assembling pathologically related labels, supplemented by semi-supervised learning, is preferred.

Author Information

Zengle Zhu (Tongji University)
Mintong Kang (University of Illinois at Urbana-Champaign)
Alan Yuille (Johns Hopkins University)
Zongwei Zhou (Johns Hopkins University)
