Poster in Workshop: Medical Imaging meets NeurIPS

From Competition to Collaboration: Making Toy Datasets on Kaggle Clinically Useful for Chest X-Ray Diagnosis Using Federated Learning

Pranav Kulkarni · Adway Kanhere · Paul Yi · Vishwa Parekh


Abstract:

Chest X-ray (CXR) datasets hosted on Kaggle, though useful from a data science competition standpoint, have limited clinical utility because of their narrow focus on diagnosing one specific disease. In real-world clinical use, multiple diseases need to be considered since they can co-exist in the same patient. In this work, we demonstrate how federated learning (FL) can be used to make these toy CXR datasets from Kaggle clinically useful. Specifically, we train a single FL classification model (‘global’) using two separate CXR datasets – one annotated for the presence of pneumonia and the other for the presence of pneumothorax (two common and life-threatening conditions) – capable of diagnosing both. We compare the performance of the global FL model with models trained separately on each dataset (‘baseline’) for two different model architectures. On a standard, naive 3-layer CNN architecture, the global FL model achieved AUROCs of 0.84 and 0.81 for pneumonia and pneumothorax, respectively, compared to 0.85 and 0.82, respectively, for the baseline models (p > 0.05). Similarly, on a pretrained DenseNet121 architecture, the global FL model achieved AUROCs of 0.88 and 0.91 for pneumonia and pneumothorax, respectively, compared to 0.89 and 0.91, respectively, for the baseline models (p > 0.05). Our results suggest that FL can be used to create global ‘meta’ models that make toy datasets from Kaggle clinically useful, a step towards bridging the gap from bench to bedside.
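The abstract does not say how the two sites' updates are combined into the global model. As a rough illustration only, the sketch below assumes plain federated averaging (FedAvg), where each site's parameters are weighted by its local sample count. The function name `fed_avg`, the flattened parameter lists, and the sample counts are hypothetical and are not taken from the authors' implementation.

```python
from typing import List, Sequence


def fed_avg(client_weights: Sequence[Sequence[float]],
            client_sizes: Sequence[int]) -> List[float]:
    """Average flattened client parameters, weighted by local sample count.

    Assumption: plain FedAvg; the abstract does not name the aggregation rule.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    n_params = len(client_weights[0])
    if any(len(w) != n_params for w in client_weights):
        raise ValueError("all clients must share the same parameter shape")
    # Each global parameter is the sample-weighted mean of the client values.
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


# Hypothetical single round with two sites (toy values, not real model weights):
pneumonia_site = [1.0, 2.0]      # parameters after local training on the pneumonia data
pneumothorax_site = [3.0, 4.0]   # parameters after local training on the pneumothorax data
global_weights = fed_avg([pneumonia_site, pneumothorax_site], [1, 3])
print(global_weights)
```

In a full training loop, the averaged parameters would be sent back to both sites for the next round of local training. How a single model comes to diagnose both conditions when each site labels only one of them is a design decision the abstract leaves open, so it is not modeled here.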
