

Oral Poster

Questioning the Survey Responses of Large Language Models

Ricardo Dominguez-Olmedo · Moritz Hardt · Celestine Mendler-Dünner

East Exhibit Hall A-C #2708
Thu 12 Dec 11 a.m. PST — 2 p.m. PST
 
Oral presentation: Oral Session 3B: Natural Language Processing
Thu 12 Dec 10 a.m. PST — 11 a.m. PST

Abstract:

Surveys have recently gained popularity as a tool to study large language models. By comparing models’ survey responses to those of different human reference populations, researchers aim to infer the demographics, political opinions, or values best represented by current language models. In this work, we critically examine language models' survey responses on the basis of the well-established American Community Survey by the U.S. Census Bureau. Evaluating 43 different language models using de facto standard prompting methodologies, we establish two dominant patterns. First, models' responses are governed by ordering and labeling biases, for example, towards survey responses labeled with the letter “A”. Second, when adjusting for these systematic biases through randomized answer ordering, models across the board trend towards uniformly random survey responses, irrespective of model size or training data. As a result, models consistently appear to better represent subgroups whose aggregate statistics are closest to uniform for the survey under consideration, leading to potentially misguided conclusions about model alignment.
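The bias adjustment mentioned in the abstract can be illustrated with a minimal sketch: present the same survey question many times with the answer choices randomly permuted, then aggregate responses by choice rather than by letter label, so that a preference for a particular label (e.g. “A”) cancels out. The question text, the `query_model` stub, and the trial count below are illustrative assumptions for exposition, not the authors' evaluation code, which compares next-token log-probabilities of the answer labels.

```python
import random
from collections import Counter

# Illustrative multiple-choice survey question (not an actual ACS item).
QUESTION = "Which category best describes this person's employment status?"
CHOICES = ["Employed", "Unemployed", "Not in the labor force"]


def build_prompt(question, choices):
    """Format a multiple-choice prompt with letter labels A, B, C, ...
    Returns the prompt string and a map from label back to choice text."""
    labels = [chr(ord("A") + i) for i in range(len(choices))]
    lines = [question] + [f"{label}. {choice}" for label, choice in zip(labels, choices)]
    lines.append("Answer:")
    return "\n".join(lines), dict(zip(labels, choices))


def query_model(prompt, labels):
    """Placeholder for a language-model call; returns one of the letter labels.
    A real evaluation would score the next-token probability of each label."""
    return random.choice(labels)


def evaluate_with_randomized_ordering(question, choices, n_trials=1000, seed=0):
    """Average responses over random permutations of the answer choices so that
    ordering/labeling biases do not systematically favor any particular choice."""
    rng = random.Random(seed)
    counts = Counter()
    labels = [chr(ord("A") + i) for i in range(len(choices))]
    for _ in range(n_trials):
        permuted = rng.sample(choices, k=len(choices))
        prompt, label_to_choice = build_prompt(question, permuted)
        letter = query_model(prompt, labels)
        counts[label_to_choice[letter]] += 1
    total = sum(counts.values())
    return {choice: counts[choice] / total for choice in choices}


if __name__ == "__main__":
    print(evaluate_with_randomized_ordering(QUESTION, CHOICES))
```

With the random stub above, the aggregated distribution is close to uniform over the choices, which is the reference point the paper compares model responses against after debiasing.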
