We consider neural networks with rational activation functions. The choice of the nonlinear activation function in deep learning architectures is crucial and heavily impacts the performance of a neural network. We establish optimal bounds in terms of network complexity and prove that rational neural networks approximate smooth functions more efficiently than ReLU networks, requiring exponentially smaller depth. The flexibility and smoothness of rational activation functions make them an attractive alternative to ReLU, as we demonstrate with numerical experiments.
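To make the idea concrete, here is a minimal sketch of a rational activation function: a ratio of two low-degree polynomials applied elementwise. The coefficients below are purely illustrative placeholders (in practice such coefficients would be initialized to mimic a known activation and then trained along with the network weights); the function name and values are assumptions, not taken from the talk.

```python
import numpy as np

def rational_activation(x, p, q):
    """Evaluate r(x) = P(x)/Q(x) elementwise.

    p: numerator coefficients, ordered low degree to high
    q: denominator coefficients, ordered low degree to high
    """
    num = np.polynomial.polynomial.polyval(x, p)
    den = np.polynomial.polynomial.polyval(x, q)
    return num / den

# A type (3, 2) rational: degree-3 numerator over degree-2 denominator.
# Hypothetical coefficients for illustration only.
p = np.array([0.02, 0.5, 1.6, 1.2])  # P(x) = 0.02 + 0.5x + 1.6x^2 + 1.2x^3
q = np.array([1.0, 0.0, 2.4])        # Q(x) = 1 + 2.4x^2 (no real roots)

x = np.linspace(-2.0, 2.0, 5)
y = rational_activation(x, p, q)
```

Because the coefficients are trainable parameters, each layer can learn its own activation shape, which is the source of the flexibility mentioned in the abstract.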
Nicolas Boullé (University of Oxford)
Yuji Nakatsukasa (University of Oxford)
Alex Townsend (Cornell University)
Prof. Alex Townsend is the Goenka Family Tenure-Track Assistant Professor at Cornell University in the Mathematics Department. His research is in Applied Mathematics and focuses on spectral methods, low-rank techniques, fast transforms, and theoretical aspects of deep learning. Prior to Cornell, he was an Applied Math instructor at MIT (2014-2016) and a DPhil student at the University of Oxford (2010-2014). He was awarded an NSF CAREER award in 2021, a SIGEST paper award in 2019, the SIAG/LA Early Career Prize in applicable linear algebra in 2018, and the Leslie Fox Prize in numerical analysis in 2015.