MAUVE: Measuring the Gap Between Neural Text and Human Text using Divergence Frontiers
Krishna Pillutla · Swabha Swayamdipta · Rowan Zellers · John Thickstun · Sean Welleck · Yejin Choi · Zaid Harchaoui

Tue Dec 07 08:30 AM -- 10:00 AM (PST)

As major progress is made in open-ended text generation, measuring how close machine-generated text is to human language remains a critical open problem. We introduce Mauve, a comparison measure for open-ended text generation, which directly compares the learnt distribution from a text generation model to the distribution of human-written text using divergence frontiers. Mauve scales up to modern text generation models by computing information divergences in a quantized embedding space. Through an extensive empirical study on three open-ended generation tasks, we find that Mauve identifies known properties of generated text, scales naturally with model size, and correlates with human judgments, with fewer restrictions than existing distributional evaluation metrics.
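
To make the idea of a divergence frontier over quantized distributions concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes both text samples have already been embedded and clustered, so each distribution is a histogram over the same cluster ids. The function names, the mixture grid, and the scaling constant `scale=5.0` are illustrative assumptions.

```python
import math


def histogram(labels, num_bins):
    """Turn a list of cluster ids into a normalized histogram."""
    counts = [0] * num_bins
    for label in labels:
        counts[label] += 1
    total = float(len(labels))
    return [c / total for c in counts]


def kl(p, q):
    """KL(p || q) for discrete distributions; bins with p_i == 0 add 0."""
    value = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return max(0.0, value)  # guard against tiny negative rounding error


def divergence_frontier(p, q, num_points=25, scale=5.0):
    """Points (exp(-c*KL(q||r)), exp(-c*KL(p||r))) for mixtures r of p and q."""
    points = []
    for k in range(1, num_points + 1):
        lam = k / (num_points + 1)  # mixture weight strictly inside (0, 1)
        r = [lam * pi + (1 - lam) * qi for pi, qi in zip(p, q)]
        points.append((math.exp(-scale * kl(q, r)),
                       math.exp(-scale * kl(p, r))))
    return points


def area_under_frontier(points):
    """Trapezoidal area under the frontier, closed with (0, 1) and (1, 0)."""
    pts = sorted(points + [(0.0, 1.0), (1.0, 0.0)],
                 key=lambda xy: (xy[0], -xy[1]))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical cluster assignments for human and model text.
human = histogram([0, 0, 1, 1, 2, 2, 3, 3], num_bins=4)
model = histogram([0, 0, 0, 0, 1, 1, 2, 3], num_bins=4)

same_score = area_under_frontier(divergence_frontier(human, human))
diff_score = area_under_frontier(divergence_frontier(human, model))
disjoint_score = area_under_frontier(divergence_frontier([1.0, 0.0], [0.0, 1.0]))
```

In this sketch, identical histograms give an area of 1, and the area shrinks as the two histograms move apart. That is the sense in which a larger score indicates model text closer to human text.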

Author Information

Krishna Pillutla (University of Washington)
Swabha Swayamdipta (Allen Institute for AI)

I'm a Postdoctoral Investigator at the Allen Institute for AI. My research focuses on studying biases in datasets and models, with the aim of achieving robust generalization. Good biases, such as structural inductive biases, help language understanding. But biases can also be undesirable, e.g., spurious correlations commonly found in crowd-sourced, large-scale datasets due to annotation artifacts, or social prejudices of human annotators and task designers.

Rowan Zellers (University of Washington)
John Thickstun (University of Washington)
Sean Welleck (University of Washington)
Yejin Choi (University of Washington)
Zaid Harchaoui (University of Washington)
