Understanding Model Bias Requires Systematic Probing Across Tasks
Abstract
There is a growing body of literature exposing the social biases of LLMs. However, these works often focus on a specific protected group, a specific prompt type, and a specific decision task. Given the large and complex input-output space of LLMs, case-by-case analyses alone may not paint a full picture of the systematic biases of these models. In this paper, we argue for broad and systematic bias probing. We propose to do so by comparing the distribution of outputs over a wide range of prompts, multiple protected attributes, and different realistic decision-making settings within the same application domain. We demonstrate this approach for three personalized healthcare advice-seeking settings. We argue that studying the complex patterns of bias across tasks helps us better anticipate how the behaviors (specifically, the biased behaviors) of LLMs might generalize to new tasks.
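To make the kind of probing described above concrete, the following is a minimal sketch (not the paper's own code) of comparing a model's output distributions across values of a protected attribute within a single decision-making setting. The function names, the choice of Jensen-Shannon divergence, and the assumption that the model returns a categorical decision are all illustrative assumptions.

from collections import Counter
import math

def js_divergence(p: Counter, q: Counter) -> float:
    """Jensen-Shannon divergence between two categorical output distributions."""
    keys = set(p) | set(q)
    pn = {k: p.get(k, 0) / max(sum(p.values()), 1) for k in keys}
    qn = {k: q.get(k, 0) / max(sum(q.values()), 1) for k in keys}
    m = {k: 0.5 * (pn[k] + qn[k]) for k in keys}

    def kl(a, b):
        return sum(a[k] * math.log2(a[k] / b[k]) for k in keys if a[k] > 0)

    return 0.5 * kl(pn, m) + 0.5 * kl(qn, m)

def probe_setting(query_model, prompt_templates, attribute_values, reference):
    """Aggregate model decisions per attribute value over many prompts, then
    score each group's output distribution against the reference group."""
    outputs = {v: Counter() for v in attribute_values}
    for template in prompt_templates:
        for value in attribute_values:
            # query_model stands in for the LLM call; here it is assumed to
            # return a categorical decision (e.g. "see a doctor" vs. "home care").
            outputs[value][query_model(template.format(attribute=value))] += 1
    return {v: js_divergence(outputs[v], outputs[reference])
            for v in attribute_values if v != reference}

Repeating such a probe across several settings in the same application domain yields the kind of cross-task bias pattern the paper argues for studying.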