Skip to yearly menu bar Skip to main content


Back-Modality: Leveraging Modal Transformation for Data Augmentation

Zhi Li · Yifan Liu · Yin Zhang

Great Hall & Hall B1+B2 (level 1) #500
[ ]
[ Paper [ Poster [ OpenReview
Wed 13 Dec 3 p.m. PST — 5 p.m. PST


We introduce Back-Modality, a novel data augmentation schema predicated on modal transformation. Data from an initial modality undergoes transformation to an intermediate modality, followed by a reverse transformation. This framework serves dual roles. On one hand, it operates as a general data augmentation strategy. On the other hand, it allows for other augmentation techniques, suitable for the intermediate modality, to enhance the initial modality. For instance, data augmentation methods applicable to pure text can be employed to augment images, thereby facilitating the cross-modality of data augmentation techniques. To validate the viability and efficacy of our framework, we proffer three instantiations of Back-Modality: back-captioning, back-imagination, and back-speech. Comprehensive evaluations across tasks such as image classification, sentiment classification, and textual entailment demonstrate that our methods consistently enhance performance under data-scarce circumstances.

Chat is not available.