Psychometrically Grounded Evaluation of LLM Personas
Jiaqi Xiong
Abstract
Research on large language model (LLM) personas has accelerated from prompt-level role play to activation-level steering and fine-grained behavioral evaluation. This review focuses narrowly on one problem: how to evaluate LLM personas with psychometric rigor. We synthesize recent work on measurement validity (e.g., social desirability, response formats), reliability and external validity (role identification, behavioral trajectories, memory/persona grounding), and cross-cultural fairness. Building on these findings, we outline a practical evaluation recipe that integrates validated survey constructs with behavioral tasks and activation-level diagnostics.
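The abstract's "evaluation recipe" combines validated survey constructs with reliability checks. As a minimal, hypothetical sketch of what such a pipeline could look like, the snippet below administers Likert-scale items to a persona-prompted model with repeated sampling and reverse-keyed items, then computes a crude internal-consistency estimate. The item texts, the `ask_model` callable, the reverse-key set, and the decision to treat repeated samples as separate administrations are all illustrative assumptions, not details drawn from the paper.

```python
# Hypothetical sketch of a survey-based persona evaluation step (not the paper's code).
from statistics import mean, pvariance
from typing import Callable, Sequence

LIKERT = {"1": 1, "2": 2, "3": 3, "4": 4, "5": 5}  # 1 = strongly disagree ... 5 = strongly agree

def administer_scale(
    ask_model: Callable[[str], str],
    persona_prompt: str,
    items: Sequence[str],
    reverse_keyed: set[int],
    n_repeats: int = 5,
) -> list[list[int]]:
    """Ask each item n_repeats times; return a responses[item][repeat] matrix.

    Repeating items under a fixed persona prompt gives a rough stability signal;
    reverse-keyed items help flag acquiescence / social-desirability artifacts.
    """
    responses = []
    for i, item in enumerate(items):
        prompt = (
            f"{persona_prompt}\n"
            "Rate the statement on a 1-5 scale (1 = strongly disagree, 5 = strongly agree). "
            f"Answer with a single digit.\nStatement: {item}"
        )
        scores = []
        for _ in range(n_repeats):
            raw = ask_model(prompt).strip()[:1]
            score = LIKERT.get(raw)
            if score is None:
                continue  # non-compliant answer; a fuller pipeline would log refusal rates
            scores.append(6 - score if i in reverse_keyed else score)
        responses.append(scores)
    return responses

def cronbach_alpha(item_scores: list[list[int]]) -> float:
    """Internal-consistency estimate, treating repeated samples as administrations."""
    k = len(item_scores)
    n = min(len(s) for s in item_scores)  # align repeats across items
    matrix = [[item_scores[i][j] for i in range(k)] for j in range(n)]
    item_vars = [pvariance([row[i] for row in matrix]) for i in range(k)]
    total_var = pvariance([sum(row) for row in matrix])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var) if total_var else float("nan")

if __name__ == "__main__":
    # Dummy model stub so the sketch runs without an API; swap in a real LLM call.
    import random
    random.seed(0)
    dummy = lambda prompt: random.choice(["3", "4", "4", "5"])
    items = [
        "I see myself as someone who is talkative.",
        "I see myself as someone who is reserved.",  # reverse-keyed
        "I see myself as someone who is full of energy.",
    ]
    scores = administer_scale(dummy, "You are playing an extroverted teacher.", items, reverse_keyed={1})
    print("Per-item means:", [round(mean(s), 2) for s in scores])
    print("Cronbach's alpha:", round(cronbach_alpha(scores), 2))
```

In a full recipe along the lines the abstract describes, these survey scores would be compared against behavioral-task outcomes and activation-level diagnostics rather than interpreted on their own.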