Review Networks for Caption Generation
Zhilin Yang · Ye Yuan · Yuexin Wu · William Cohen · Russ Salakhutdinov

Mon Dec 05 09:00 AM -- 12:30 PM (PST) @ Area 5+6+7+8 #74

We propose a novel extension of the encoder-decoder framework, called a review network. The review network is generic and can enhance any existing encoder-decoder model; in this paper, we consider RNN decoders with both CNN and RNN encoders. The review network performs a number of review steps with an attention mechanism over the encoder hidden states and outputs a thought vector after each review step; the thought vectors are then used as the input to the attention mechanism in the decoder. We show that conventional encoder-decoders are a special case of our framework. Empirically, we show that our framework improves over state-of-the-art encoder-decoder systems on the tasks of image captioning and source code captioning.
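The data flow described above can be illustrated with a minimal NumPy sketch. The abstract does not specify the form of attention or the recurrent unit, so the sketch assumes dot-product attention and a plain tanh update; the names `attend`, `review`, `W`, and `U` are hypothetical, the weights are random rather than trained, and none of this should be read as the authors' implementation.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attend(query, states):
    # Assumed dot-product attention: score each row of `states` against
    # `query`, then return the attention-weighted average of the rows.
    weights = softmax(states @ query)
    return weights @ states

def review(encoder_states, num_steps, W, U):
    # Run `num_steps` review steps; each step attends over ALL encoder
    # hidden states and emits one thought vector.
    thought = np.zeros(encoder_states.shape[1])
    thoughts = []
    for _ in range(num_steps):
        context = attend(thought, encoder_states)
        # Assumed simple tanh recurrence (a stand-in for whatever
        # recurrent unit a real model would use).
        thought = np.tanh(W @ context + U @ thought)
        thoughts.append(thought)
    return np.stack(thoughts)

rng = np.random.default_rng(0)
H = rng.normal(size=(5, 4))        # 5 encoder hidden states, dimension 4
W = rng.normal(size=(4, 4)) * 0.5  # illustrative, untrained weights
U = rng.normal(size=(4, 4)) * 0.5

F = review(H, num_steps=3, W=W, U=U)  # one thought vector per review step

# The decoder attends over the thought vectors rather than over H directly.
decoder_state = rng.normal(size=4)
decoder_context = attend(decoder_state, F)
```

In this sketch the number of thought vectors is set by `num_steps` and is independent of the number of encoder states, so the decoder's attention input has a fixed size regardless of input length.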

Author Information

Zhilin Yang (Carnegie Mellon University)
Ye Yuan (Carnegie Mellon University)
Yuexin Wu (Carnegie Mellon University)
William Cohen (Carnegie Mellon University)
Russ Salakhutdinov (University of Toronto)

