

Poster in Workshop: CtrlGen: Controllable Generative Modeling in Language and Vision

Robust Text Generation using Sequence-to-Sequence Pre-Training

Nishtha Madaan · Srikanta Bedathur


Abstract:

Large Transformer-based models have shown strong performance on sequence-to-sequence tasks such as machine translation and text summarization. While these models perform well on the task they were originally trained on, it is hard to adapt them to a new but related task. We propose CASPer, a framework to perturb the input-output behavior of an original pre-trained sequence-to-sequence model. CASPer learns a perturbation parameter at test time to modify the behavior of the pre-trained model and generate samples that have target characteristics. We apply this framework to a pre-trained text summarization model to alter a given input text such that the generated text has a changed sentiment or other attributes. In experiments, we show that CASPer effectively generates controlled text that preserves the original content, is fluent and diverse, and follows the steering provided by the attribute model. We also show that the text generated by CASPer can be used for effective data augmentation on a downstream task.
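To illustrate the test-time perturbation idea described above, the following is a minimal, self-contained toy sketch, not the paper's implementation: the tiny frozen encoder/decoder, the linear attribute classifier, the loss weighting, and all variable names are illustrative assumptions standing in for a real pre-trained seq2seq model and attribute model.

```python
# Toy sketch: learn an additive perturbation on frozen encoder states at test time,
# steered by an attribute classifier while staying close to the original content.
# All modules and hyperparameters here are assumptions, not the paper's actual setup.

import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
VOCAB, HID = 100, 32

encoder = nn.Embedding(VOCAB, HID)        # stands in for a frozen pre-trained encoder
decoder = nn.Linear(HID, VOCAB)           # stands in for a frozen pre-trained decoder head
attribute_clf = nn.Linear(HID, 2)         # attribute model (e.g. sentiment) used for steering
for p in list(encoder.parameters()) + list(decoder.parameters()):
    p.requires_grad_(False)               # pre-trained weights stay fixed

input_ids = torch.randint(0, VOCAB, (1, 12))       # toy input sequence
enc = encoder(input_ids)                           # frozen encoder states, shape (1, 12, HID)

delta = torch.zeros_like(enc, requires_grad=True)  # perturbation parameter learned at test time
opt = torch.optim.Adam([delta], lr=0.05)
target = torch.tensor([1])                         # desired attribute label

for step in range(50):                             # optimize only delta, not the model
    steered = enc + delta
    attr_logits = attribute_clf(steered.mean(dim=1))
    attr_loss = F.cross_entropy(attr_logits, target)   # push output toward the target attribute
    content_loss = delta.pow(2).mean()                  # keep the perturbed states near the input
    loss = attr_loss + 0.1 * content_loss
    opt.zero_grad()
    loss.backward()
    opt.step()

tokens = decoder(enc + delta).argmax(dim=-1)       # decode from the perturbed states
print(tokens)
```

In this toy version the perturbation is a single additive tensor on the encoder states and the trade-off weight 0.1 is arbitrary; the actual framework may parameterize and balance the perturbation differently.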
