Linear Discriminant Analysis (LDA) is one of the most common methods for dimensionality reduction in pattern recognition and statistics. It is a supervised method that seeks the most discriminative subspace of reduced dimension, which can then be paired with a linear classifier for classification. In this work, we present an iterative optimization method called Proxy Matrix Optimization (PMO), which uses automatic differentiation and stochastic gradient descent (SGD) on the Grassmann manifold to arrive at the optimal projection matrix. We show that PMO outperforms the prevailing manifold optimization methods.
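
To make the idea of optimizing a projection matrix on the Grassmann manifold concrete, here is a minimal Python sketch. It is not the PMO algorithm itself. It assumes the trace-ratio form of the LDA objective, and it swaps in a hand-derived gradient for automatic differentiation, full-batch ascent with step halving for SGD, and a QR-based retraction to keep the columns orthonormal. All function names (`scatter_matrices`, `trace_ratio`, `euclidean_grad`, `grassmann_ascent`) are hypothetical.

```python
import numpy as np


def scatter_matrices(X, y):
    """Return the within-class (Sw) and between-class (Sb) scatter matrices."""
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))
    Sb = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        centered = Xc - mean_c
        Sw += centered.T @ centered
        diff = (mean_c - mean_all)[:, None]
        Sb += Xc.shape[0] * (diff @ diff.T)
    return Sw, Sb


def trace_ratio(W, Sw, Sb):
    """Assumed LDA objective: tr(W^T Sb W) / tr(W^T Sw W)."""
    return np.trace(W.T @ Sb @ W) / np.trace(W.T @ Sw @ W)


def euclidean_grad(W, Sw, Sb):
    """Hand-derived gradient of the trace ratio (Sw and Sb are symmetric)."""
    num = np.trace(W.T @ Sb @ W)
    den = np.trace(W.T @ Sw @ W)
    return 2.0 * (den * (Sb @ W) - num * (Sw @ W)) / den**2


def grassmann_ascent(Sw, Sb, k, steps=200, lr=1.0, seed=0):
    """Gradient ascent over d x k matrices with orthonormal columns."""
    rng = np.random.default_rng(seed)
    # Random orthonormal starting point.
    W, _ = np.linalg.qr(rng.standard_normal((Sw.shape[0], k)))
    for _ in range(steps):
        G = euclidean_grad(W, Sw, Sb)
        # Project the gradient onto the tangent space at W.
        R = G - W @ (W.T @ G)
        current = trace_ratio(W, Sw, Sb)
        step = lr
        # Halve the step until the objective no longer decreases.
        while step > 1e-12:
            # QR retraction: map the updated point back to orthonormal columns.
            W_new, _ = np.linalg.qr(W + step * R)
            if trace_ratio(W_new, Sw, Sb) >= current:
                W = W_new
                break
            step *= 0.5
    return W
```

Given a data matrix `X` (one sample per row) and integer labels `y`, calling `Sw, Sb = scatter_matrices(X, y)` followed by `W = grassmann_ascent(Sw, Sb, k=2)` returns a projection matrix with orthonormal columns, and `X @ W` gives the reduced-dimension features. The trace ratio is unchanged when `W` is multiplied on the right by an orthogonal matrix, which is why the search can be posed over subspaces rather than over individual matrices.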