
Gated Recurrent Convolution Neural Network for OCR
Jianfeng Wang · Xiaolin Hu

Mon Dec 04 06:30 PM -- 10:30 PM (PST) @ Pacific Ballroom #121

Optical Character Recognition (OCR) aims to recognize text in natural images. Inspired by a recently proposed model for general image classification, the Recurrent Convolution Neural Network (RCNN), we propose a new architecture named Gated RCNN (GRCNN) for solving this problem. Its critical component, the Gated Recurrent Convolution Layer (GRCL), is constructed by adding a gate to the Recurrent Convolution Layer (RCL), the critical component of RCNN. The gate controls the context modulation in RCL and balances the feed-forward information and the recurrent information. In addition, an efficient Bidirectional Long Short-Term Memory (BLSTM) is built for sequence modeling. The GRCNN is combined with the BLSTM to recognize text in natural images, and the entire GRCNN-BLSTM model can be trained end-to-end. Experiments show that the proposed model outperforms existing methods on several benchmark datasets, including IIIT-5K, Street View Text (SVT), and ICDAR.
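
To make the gating idea concrete, here is a minimal sketch of a gated recurrent convolution on a 1D signal in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the layer's equations, so the sigmoid gate, the ReLU activation, the way the gate multiplies the recurrent term, and all names (`grcl_1d`, `conv1d_same`, the weight arguments) are hypothetical choices. Normalization and 2D feature maps are omitted.

```python
import math


def conv1d_same(signal, kernel):
    """Zero-padded 1D cross-correlation; output has the same length as the input."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(signal):
                acc += w * signal[j]
        out.append(acc)
    return out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def relu(v):
    return max(0.0, v)


def grcl_1d(u, w_ff, w_rec, wg_ff, wg_rec, steps=3):
    """Assumed form of a gated recurrent convolution layer on a 1D input u.

    w_ff / w_rec   : feed-forward and recurrent kernels of the layer state
    wg_ff / wg_rec : feed-forward and recurrent kernels of the gate
    """
    ff = conv1d_same(u, w_ff)        # feed-forward term, fixed across steps
    gate_ff = conv1d_same(u, wg_ff)  # feed-forward part of the gate
    x = [relu(v) for v in ff]        # initial state: feed-forward only
    for _ in range(steps):
        rec = conv1d_same(x, w_rec)          # recurrent (context) term
        gate_rec = conv1d_same(x, wg_rec)    # recurrent part of the gate
        # The gate lies in (0, 1) and scales how much context is let through.
        gate = [sigmoid(a + b) for a, b in zip(gate_ff, gate_rec)]
        x = [relu(f + g * r) for f, g, r in zip(ff, gate, rec)]
    return x
```

In this sketch, a gate near 0 reduces the layer to its feed-forward response, while a gate near 1 recovers an ungated recurrent convolution, which is one way to read the abstract's statement that the gate balances feed-forward and recurrent information.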

Author Information

Jianfeng Wang (Beijing University of Posts and Telecommunications)
Xiaolin Hu (Tsinghua University)
