

Poster in Affinity Workshop: Women in Machine Learning

Model Understanding and Debugging at The Level of Subpopulation

Jun Yuan


Abstract:

Understanding machine learning (ML) models is of paramount importance when they inform decisions with societal impact, such as transport control, financial activities, and medical diagnosis. While local explanation techniques are popular for interpreting an ML model's prediction on a single instance, they do not scale to understanding the model's behavior across a whole dataset. In this workshop, I present two of our recently published papers on model understanding and debugging at the level of subpopulations.

The first approach is an interactive visualization widget embedded in the Jupyter notebook environment that guides users to explore subpopulations in which the local explanations (e.g., LIME, SHAP) tend to follow the same patterns. Based on interactive clustering, users can select and create subpopulations for inspection in a user interface rendered in a notebook cell. The widget supports flexible input and output. Besides conventional click or brush selection, users can create a subpopulation by calling a Python function that selects instances in the interactive user interface. Users can also export intermediate analysis results as a DataFrame in Python for further inspection.

In the second approach, we introduce an error analysis tool that helps people semantically understand errors in NLP models. The tool automatically discovers semantically grounded subpopulations with high error rates within a human-in-the-loop workflow. It enables model developers to learn more about their model's errors through the discovered subpopulations, validate the sources of those errors through interactive analysis, and then test hypotheses about the errors by defining custom subpopulations. With the help of these tools, I believe model developers can gain a better understanding of their models' behavior, especially anomalous and erroneous behavior, and derive actionable insights to further improve their models.
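The idea of testing a hypothesis about model errors by defining a custom subpopulation can be illustrated with a minimal sketch. This is not the tool described above: the toy records, the `error_rate` and `subpopulation_report` helpers, and the two predicates are all hypothetical and chosen only for illustration. Each subpopulation is assumed to be a plain Python predicate over a record, and its error rate is compared against the overall rate.

```python
from typing import Callable, Dict, List, Optional, Tuple

Record = Dict[str, object]

# Hypothetical toy data: text, gold label, and model prediction (1 = positive).
records: List[Record] = [
    {"text": "the movie was not good", "label": 0, "pred": 1},
    {"text": "not bad at all", "label": 1, "pred": 0},
    {"text": "a wonderful film", "label": 1, "pred": 1},
    {"text": "terrible acting", "label": 0, "pred": 0},
    {"text": "i did not like it", "label": 0, "pred": 0},
    {"text": "great soundtrack and great cast", "label": 1, "pred": 1},
]


def error_rate(rows: List[Record]) -> Optional[float]:
    """Fraction of rows where the prediction differs from the label."""
    if not rows:
        return None  # an empty subpopulation has no defined error rate
    errors = sum(1 for r in rows if r["label"] != r["pred"])
    return errors / len(rows)


def subpopulation_report(
    rows: List[Record],
    subpopulations: Dict[str, Callable[[Record], bool]],
) -> Dict[str, Tuple[int, Optional[float]]]:
    """Map each subpopulation name to (size, error rate)."""
    report = {}
    for name, belongs in subpopulations.items():
        members = [r for r in rows if belongs(r)]
        report[name] = (len(members), error_rate(members))
    return report


# Hypothetical custom subpopulations expressing two hypotheses about errors.
subpopulations = {
    "contains_negation": lambda r: "not" in str(r["text"]).split(),
    "short_text": lambda r: len(str(r["text"]).split()) <= 3,
}

overall = error_rate(records)
report = subpopulation_report(records, subpopulations)

for name, (size, rate) in report.items():
    print(f"{name}: n={size}, error_rate={rate}, overall={overall}")
```

A subpopulation whose error rate is well above the overall rate is a candidate explanation for the model's errors. On real data the comparison would also need to account for subpopulation size, since small groups give noisy rates.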
