While standard recurrent neural networks explicitly impose a chain structure on different forms of data, they do not have an explicit bias towards recursive self-instantiation where the extent of recursion is dynamic. Given diverse and even growing data modalities (e.g., logic, algorithmic input and output, music, code, images, and language) that can be expressed in sequences and may benefit from more architectural flexibility, we propose the self-instantiated recurrent unit (Self-IRU) with a novel inductive bias towards dynamic soft recursion. On one hand, the Self-IRU is characterized by recursive self-instantiation via its gating functions, i.e., the gating mechanisms of the Self-IRU are controlled by instances of the Self-IRU itself, which are repeatedly invoked in a recursive fashion. On the other hand, the extent of the Self-IRU recursion is controlled by gates whose values are between 0 and 1 and may vary across the temporal dimension of sequences, enabling a dynamic soft recursion depth at each time step. The architectural flexibility and effectiveness of our proposed approach are demonstrated across multiple data modalities. For example, the Self-IRU achieves state-of-the-art performance on the logical inference dataset [Bowman et al., 2014], even when compared with competitive models that have access to ground-truth syntactic information.
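To make the two ideas in the abstract concrete, the following is a minimal scalar sketch, not the authors' formulation. It assumes a simple gated cell whose forget and output gates are computed by a lower-level call to the same cell (recursive self-instantiation), and it assumes mixing gates `alpha` and `beta` in (0, 1) that blend the recursive signal with a plain linear one (soft recursion). The names `cell_step`, `run_sequence`, and `PARAMS`, the weight values, and the exact update equations are all hypothetical choices made for illustration.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


# Hypothetical scalar weights, shared by every recursion level (an assumption).
PARAMS = {"f": 0.5, "o": -0.3, "z": 0.8, "alpha": 0.2, "beta": -0.4}


def cell_step(x, c_prev, params, depth):
    """One time step of a toy cell whose gates are driven by the same cell.

    depth is the maximum recursion depth; alpha and beta softly decide,
    per time step, how much of the recursive signal is actually used.
    """
    # Non-recursive gate pre-activations (the base case when depth == 0).
    f_pre = params["f"] * x
    o_pre = params["o"] * x
    if depth > 0:
        # Recursive self-instantiation: a lower-level instance of this
        # same cell produces the signal that controls the gates.
        h_rec, _ = cell_step(x, c_prev, params, depth - 1)
        # Soft recursion gates in (0, 1); they depend on the input, so
        # the effective depth can differ at every time step.
        alpha = sigmoid(params["alpha"] * x)
        beta = sigmoid(params["beta"] * x)
        f_pre = alpha * h_rec + (1.0 - alpha) * f_pre
        o_pre = beta * h_rec + (1.0 - beta) * o_pre
    f = sigmoid(f_pre)                  # forget gate
    o = sigmoid(o_pre)                  # output gate
    z = math.tanh(params["z"] * x)      # candidate state
    c = f * c_prev + (1.0 - f) * z      # convex update of the cell state
    return o * c, c


def run_sequence(xs, params, depth):
    """Apply the toy cell along a sequence of scalars, starting from c = 0."""
    c, hs = 0.0, []
    for x in xs:
        h, c = cell_step(x, c, params, depth)
        hs.append(h)
    return hs


if __name__ == "__main__":
    xs = [0.5, -1.0, 2.0]
    print(run_sequence(xs, PARAMS, depth=0))  # plain gated cell
    print(run_sequence(xs, PARAMS, depth=2))  # gates driven by nested instances
```

In this sketch, a mixing gate close to 0 effectively switches the recursion off for that time step, while a value close to 1 lets the nested instance fully determine the gate. That is one way to read the "dynamic soft recursion depth" described above; the paper itself should be consulted for the actual equations.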
Author Information
Aston Zhang (AWS)
Yi Tay (NTU, Singapore)
Yikang Shen (Mila, University of Montreal, MSR Montreal)
Alvin Chan (Nanyang Technological University)
SHUAI ZHANG (University of New South Wales)
More from the Same Authors
- 2022: Planning with Large Language Models for Code Generation
  Shun Zhang · Zhenfang Chen · Yikang Shen · Mingyu Ding · Josh Tenenbaum · Chuang Gan
- 2022: Hyper-Decision Transformer for Efficient Online Policy Adaptation
  Mengdi Xu · Yuchen Lu · Yikang Shen · Shun Zhang · DING ZHAO · Chuang Gan
- 2023 Poster: Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision
  Zhiqing Sun · Yikang Shen · Qinhong Zhou · Hongxin Zhang · Zhenfang Chen · David Cox · Yiming Yang · Chuang Gan
- 2023 Poster: Adaptive Online Replanning with Diffusion Models
  Siyuan Zhou · Yilun Du · Shun Zhang · Mengdi Xu · Yikang Shen · Wei Xiao · Dit-Yan Yeung · Chuang Gan
- 2021 Poster: Deep Extrapolation for Attribute-Enhanced Generation
  Alvin Chan · Ali Madani · Ben Krause · Nikhil Naik
- 2021 Poster: G-PATE: Scalable Differentially Private Data Generator via Private Aggregation of Teacher Discriminators
  Yunhui Long · Boxin Wang · Zhuolin Yang · Bhavya Kailkhura · Aston Zhang · Carl Gunter · Bo Li
- 2019 Poster: Ordered Memory
  Yikang Shen · Shawn Tan · Arian Hosseini · Zhouhan Lin · Alessandro Sordoni · Aaron Courville
- 2019 Poster: Compositional De-Attention Networks
  Yi Tay · Anh Tuan Luu · Aston Zhang · Shuohang Wang · Siu Cheung Hui
- 2019 Poster: Quaternion Knowledge Graph Embeddings
  SHUAI ZHANG · Yi Tay · Lina Yao · Qi Liu
- 2018 Poster: Densely Connected Attention Propagation for Reading Comprehension
  Yi Tay · Anh Tuan Luu · Siu Cheung Hui · Jian Su
- 2018 Poster: Recurrently Controlled Recurrent Networks
  Yi Tay · Anh Tuan Luu · Siu Cheung Hui