Keywords: [ ENLSP-Main ]
Transformers have displaced recurrent neural networks (RNN) for language modelling due to their effectiveness, and their scalability on ubiquitous GPUs.But in resource constrained systems, both training and inference with transformer language models are challenging due to their computational and memory requirements. RNN language models are a potential alternative, but there is still a need to bridge the gap between what RNNs are capable of in terms of efficiency and performance, and the requirements of resource constrained applications. The memory and computational requirements arising from propagating the activations of all the neurons at every time step to every connected neuron together with the sequential dependence of activations make RNNs harder to train efficiently. We propose a solution inspired by biological neuron dynamics, by making the communication between RNN units sparse and discrete along the forward direction. We show that this makes the backward pass with backpropagation through time (BPTT) computationally sparse and efficient as well. We base our model on gated recurrent unit (GRU), extending it to have its units emit discrete events for communication triggered by a threshold, so that no information needs to be communicated to other units in the absence of events. Our model achieves efficiency without compromising task performance, demonstrating competitive performance compared to state-of-the-art recurrent network models in language modelling. The dynamic activity sparsity mechanism also makes our model well suited for energy-efficient neuromorphic hardware.