Exact multi-label classification is the task of assigning each datapoint a set of class labels such that the assigned set exactly matches the ground truth. Optimizing for exact multi-label classification is important in domains where missing a single label can be especially costly, such as object detection for autonomous vehicles or symptom classification for disease diagnosis. Recurrent Classifier Chains (RCCs), a recurrent neural network extension of ensemble-based classifier chains, are the state-of-the-art exact multi-label classification method for maximizing subset accuracy. However, RCCs iteratively predict classes in an unprincipled order, and therefore condition class probabilities indiscriminately. These disadvantages make RCCs prone to predicting inaccurate label sets. In this work we propose Recurrent Bayesian Classifier Chains (RBCCs), which learn a Bayesian network of class dependencies and leverage this network to condition the prediction of child nodes only on their parents. By conditioning predictions in this way, we perform principled and non-noisy class prediction. We demonstrate the effectiveness of our RBCC method on a variety of real-world multi-label datasets, where we routinely outperform the state-of-the-art methods for exact multi-label classification.
Walter Gerych (Worcester Polytechnic Institute)
Thomas Hartvigsen (Worcester Polytechnic Institute)
Luke Buquicchio (Worcester Polytechnic Institute)
Emmanuel Agu (Worcester Polytechnic Institute)
Elke A. Rundensteiner (Worcester Polytechnic Institute)
Dr. E. Rundensteiner, chaired Professor in Computer Science, is the founding Director of the interdisciplinary Data Science program at Worcester Polytechnic Institute (WPI). An internationally recognized expert in big data analytics, she has research interests spanning data science, machine learning, big data analytics, data management, and cloud computing. With an h-index of 60, she has authored well over 400 publications, numerous patents, and software systems released to the public domain. Her work has been supported by government agencies including ARL, DARPA, NSF, NIH, DOE, and FDA, and by industry including HP, IBM, Verizon Labs, GTE, NEC, AMADEUS, CRA, MITRE Corporation, and others. She has been the recipient of numerous honors and awards, including the WPI Chairman's Exemplary Faculty Prize, the WPI Board of Trustees' Outstanding Research and Creative Scholarship award, the Sigma Xi Outstanding Senior Faculty Researcher award, and the NSF Young Investigator award. She holds leadership positions in the big data field, having served as Associate Editor of IEEE Transactions on Knowledge and Data Engineering and the VLDB Journal and as area chair for premier professional big data conferences, including ACM SIGMOD, VLDB, IEEE ICDE, and others.