Unpredictable ML model behavior on unseen data, especially in the health domain, raises serious safety concerns because mistakes can be fatal. In this paper, we explore the feasibility of using state-of-the-art out-of-distribution detectors for reliable and trustworthy diagnostic predictions. We select publicly available deep learning models relating to various health conditions (e.g., skin cancer, lung sound, and Parkinson's disease) using various input data types (e.g., image, audio, and motion data). We demonstrate that these models produce unreasonable predictions on out-of-distribution datasets. We show that Mahalanobis distance-based and Gram matrix-based out-of-distribution detection methods are able to detect out-of-distribution data with high accuracy for health models that operate on different modalities. We then translate the out-of-distribution score into a human-interpretable confidence score to investigate its effect on users' interaction with health ML applications. Our user study shows that the confidence score helped participants trust only results with a high score when making a medical decision and disregard results with a low score. Through this work, we demonstrate that dataset shift is a critical piece of information for high-stakes ML applications, such as medical diagnosis and healthcare, to provide reliable and trustworthy predictions to users.
Author Information
Chunjong Park (University of Washington)
Anas Awadalla (Department of Computer Science, University of Washington)
Tadayoshi Kohno (University of Washington)
Shwetak Patel (University of Washington)
More from the Same Authors
- 2022 Poster: GLOBEM Dataset: Multi-Year Datasets for Longitudinal Human Behavior Modeling Generalization
  Xuhai Xu · Han Zhang · Yasaman Sefidgar · Yiyi Ren · Xin Liu · Woosuk Seo · Jennifer Brown · Kevin Kuehn · Mike Merrill · Paula Nurius · Shwetak Patel · Tim Althoff · Margaret Morris · Eve Riskin · Jennifer Mankoff · Anind Dey
- 2020 Poster: Multi-Task Temporal Shift Attention Networks for On-Device Contactless Vitals Measurement
  Xin Liu · Josh Fromm · Shwetak Patel · Daniel McDuff
- 2020 Oral: Multi-Task Temporal Shift Attention Networks for On-Device Contactless Vitals Measurement
  Xin Liu · Josh Fromm · Shwetak Patel · Daniel McDuff
- 2018 Poster: Heterogeneous Bitwidth Binarization in Convolutional Neural Networks
  Joshua Fromm · Shwetak Patel · Matthai Philipose