Workshop: Trustworthy and Socially Responsible Machine Learning

COVID-Net Biochem: An Explainability-driven Framework to Building Machine Learning Models for Predicting Survival and Kidney Injury of COVID-19 Patients from Clinical and Biochemistry Data

Hossein Aboutalebi · Maya Pavlova · Mohammad Javad Shafiee · Adrian Florea · Andrew Hryniowski · Alexander Wong


A major challenge faced during the pandemic has been the prediction of survival and the risk of additional injuries in individual patients, which requires significant clinical expertise and additional resources to avoid further complications. In this study we propose COVID-Net Biochem, an explainability-driven framework for building machine learning models to predict patient survival and the chance of developing kidney injury during hospitalization from clinical and biochemistry data in a transparent and systematic manner. In the first "clinician-guided initial design" phase, we prepared a benchmark dataset of carefully selected clinical and biochemistry data based on clinician assessment, curated from a cohort of 1366 patients at Stony Brook University. A collection of machine learning models, spanning a diversity of gradient-based boosting tree architectures and deep transformer architectures, was designed and trained specifically for survival and kidney injury prediction based on the carefully selected clinical and biochemical markers. In the second "explainability-driven design refinement" phase, we harnessed explainability methods not only to gain a deeper understanding of the decision-making process of the individual models, but also to quantify the overall impact of the individual clinical and biochemical markers and so identify potential biases. These explainability outcomes were further analyzed by a clinician with over eight years of experience to gain a deeper understanding of the clinical validity of the decisions made. The explainability-driven insights, alongside the associated clinical feedback, were then leveraged to guide and revise the training policies and architectural design in an iterative manner, improving not just prediction performance but also the clinical validity and trustworthiness of the final machine learning models.
Using the proposed explainability-driven framework, we achieved 97.4% accuracy in survival prediction and 96.7% accuracy in predicting kidney injury complications, with the models made available in an open-source manner. While not a production-ready solution, the ultimate goal of this study is to act as a catalyst for clinical scientists, machine learning researchers, and citizen scientists to develop innovative and trustworthy clinical decision support solutions that help clinicians around the world manage the continuing pandemic.
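
The second phase described above, in which explainability is used to measure how much each marker influences a trained model, can be illustrated with a minimal sketch. This is not the authors' implementation: it swaps in a nearest-centroid classifier for the gradient boosting and transformer models, uses permutation importance as a stand-in for the unspecified explainability methods, and runs on synthetic data. The marker names and all function names are hypothetical.

```python
import random

def make_data(n, rng):
    """Synthetic cohort (assumption): feature 0 drives the label, feature 1 is noise."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        informative = rng.gauss(2.0 * label, 0.5)
        noise = rng.gauss(0.0, 1.0)
        X.append([informative, noise])
        y.append(label)
    return X, y

def fit_centroids(X, y):
    """Nearest-centroid classifier: store one mean vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def predict(centroids, x):
    """Return the class whose centroid is closest in squared Euclidean distance."""
    return min(
        centroids,
        key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
    )

def accuracy(centroids, X, y):
    return sum(predict(centroids, x) == t for x, t in zip(X, y)) / len(y)

def permutation_importance(centroids, X, y, rng, repeats=5):
    """Mean drop in accuracy when one feature column is shuffled."""
    base = accuracy(centroids, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(repeats):
            col = [x[j] for x in X]
            rng.shuffle(col)
            X_perm = [x[:j] + [v] + x[j + 1:] for x, v in zip(X, col)]
            drops.append(base - accuracy(centroids, X_perm, y))
        importances.append(sum(drops) / repeats)
    return importances

rng = random.Random(0)
X_train, y_train = make_data(400, rng)
X_test, y_test = make_data(200, rng)
model = fit_centroids(X_train, y_train)

names = ["informative_marker", "noise_marker"]  # hypothetical marker names
scores = permutation_importance(model, X_test, y_test, rng)
for name, score in zip(names, scores):
    print(f"{name}: accuracy drop {score:.3f}")
```

In a refinement loop of the kind the abstract describes, a reviewer would inspect such scores, question markers whose influence is clinically implausible, and then revise the feature set or training policy before retraining.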
