First-order methods for stochastic optimization are widely used, and variance reduction for these algorithms has become an important research topic. We exploit convexity and L-smoothness to improve the noisy estimates returned by the stochastic gradient oracle. Our method, named the COCO denoiser, is the joint maximum likelihood estimator of multiple function gradients from their noisy observations, subject to co-coercivity constraints between them. The resulting estimate is the solution of a convex Quadratically Constrained Quadratic Problem. Although this problem is expensive to solve with interior point methods, we exploit its structure to apply an accelerated first-order algorithm, the Fast Dual Proximal Gradient method. Besides characterizing the proposed estimator analytically, we show empirically that increasing the number and proximity of the queried points leads to better gradient estimates. We also apply COCO in stochastic settings by plugging it into existing algorithms such as SGD, Adam, or STRSAGA, and it outperforms their vanilla versions, even in scenarios where our modelling assumptions are mismatched.
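To make the co-coercivity constraints mentioned above concrete: for a convex, L-smooth function f, any two gradients satisfy <∇f(x) - ∇f(y), x - y> >= (1/L) ||∇f(x) - ∇f(y)||^2. The sketch below only checks which pairs of noisy gradient estimates violate this inequality. It is not the COCO denoiser itself, which solves a constrained estimation problem. The function name `cocoercivity_violations`, the tolerance, and the toy data are assumptions made for illustration.

```python
from itertools import combinations


def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def sub(u, v):
    """Componentwise difference u - v."""
    return [a - b for a, b in zip(u, v)]


def cocoercivity_violations(points, grads, L, tol=1e-12):
    """Return index pairs (i, j) whose gradient estimates violate
    <g_i - g_j, x_i - x_j> >= (1/L) * ||g_i - g_j||^2.

    Hypothetical helper for illustration; not taken from the paper.
    """
    violations = []
    for i, j in combinations(range(len(points)), 2):
        dg = sub(grads[i], grads[j])
        dx = sub(points[i], points[j])
        if dot(dg, dx) < dot(dg, dg) / L - tol:
            violations.append((i, j))
    return violations


# Toy example (assumed): f(x) = 0.5 * ||x||^2, so grad f(x) = x and L = 1.
points = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
true_grads = [list(p) for p in points]
# Hand-picked "noisy" observations; the second gradient is corrupted.
noisy_grads = [[0.0, 0.0], [-0.5, 0.0], [0.0, 2.0]]

print(cocoercivity_violations(points, true_grads, L=1.0))
print(cocoercivity_violations(points, noisy_grads, L=1.0))
```

The exact gradients satisfy every pairwise constraint, while the corrupted observation breaks the constraints it takes part in. A denoiser in the spirit of the abstract would look for the estimates closest to the observations that make this list empty.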
Author Information
Manuel Madeira (Instituto Superior Técnico)
Renato Negrinho (Carnegie Mellon University)
Joao Xavier (Instituto Superior Técnico)
Pedro Aguiar (Instituto Superior Técnico)
More from the Same Authors
- 2022: A Multi-Token Coordinate Descent Method for Vertical Federated Learning
  Pedro Valdeira · Yuejie Chi · Claudia Soares · Joao Xavier
- 2021: Poster Session 1 (gather.town)
  Hamed Jalali · Robert Hönig · Maximus Mutschler · Manuel Madeira · Abdurakhmon Sadiev · Egor Shulgin · Alasdair Paren · Pascal Esser · Simon Roburin · Julius Kunze · Agnieszka Słowik · Frederik Benzing · Futong Liu · Hongyi Li · Ryotaro Mitsuboshi · Grigory Malinovsky · Jayadev Naram · Zhize Li · Igor Sokolov · Sharan Vaswani
- 2020 Poster: Sparse and Continuous Attention Mechanisms
  André Martins · António Farinhas · Marcos Treviso · Vlad Niculae · Pedro Aguiar · Mario Figueiredo
- 2020 Spotlight: Sparse and Continuous Attention Mechanisms
  André Martins · António Farinhas · Marcos Treviso · Vlad Niculae · Pedro Aguiar · Mario Figueiredo
- 2019 Poster: Towards modular and programmable architecture search
  Renato Negrinho · Matthew Gormley · Geoffrey Gordon · Darshan Patil · Nghia Le · Daniel Ferreira
- 2018 Poster: Learning Beam Search Policies via Imitation Learning
  Renato Negrinho · Matthew Gormley · Geoffrey Gordon
- 2014 Poster: Orbit Regularization
  Renato Negrinho · Andre Martins