From Caregiver–Child Dialogue to Persona-Consistent LLMs: The Resonance Corpus and Evaluation Protocols
Abstract
We present the Resonance Corpus, a large-scale collection of naturalistic Chinese caregiver–child conversations, and recast it as a benchmark for persona modeling and cognitive alignment in large language models (LLMs). The corpus contains approximately 60,000 dialogues; a subset of 6,840 includes utterance-level annotations for cognitive level and analogical reasoning. These annotations support three evaluation tracks: (i) persona consistency, which tests whether an LLM can sustain a caregiver or child persona across multi-turn exchanges; (ii) cognitive alignment, which quantifies whether a simulated caregiver calibrates linguistic complexity to the child’s level; and (iii) natural persona patterns, which cover topic adaptability under child-initiated shifts and analogical alignment in explanations. The benchmark links cognitive–pragmatic theory to measurable persona behavior, with the aim of supporting robust, human-aligned conversational agents.