Workshop: OPT 2023: Optimization for Machine Learning
Poster Session 1
Egor Shulgin · Mingzhen He · Hanmin Li · Thibault Lahire · Eric Zelikman · Damien Scieur · Rajat Vadiraj Dwaraknath · Gene Li · Zhanhong Jiang · Rahul Jain · Zihan Zhou · Tianyue Zhang · Ilyas Fatkhullin · Frederik Kunstner · Utkarsh Singhal · Bruno Loureiro · Krishna C Kalagarla · Kai Liu · Michal Derezinski · Ross Clarke · Dimitri Papadimitriou · Mo Zhou · Jörg Franke · Chandler Smith · Darshan Chakrabarti · Trang H. Tran · Mokhwa Lee · Wei Kuang · Vincent Roulet · John Lazarsfeld · Donghyun Oh · Yihe Deng · Fu Wang · Junchi YANG · Dániel Rácz · Jeffrey Flanigan · Aaron Mishkin · Luca Scharr · Robert Gower · Chaoyue Liu · Yushen Huang · Nicholas Recker
Posters in this session
Towards a Better Theoretical Understanding of Independent Subnetwork Training
Revisiting Random Weight Perturbation for Efficiently Improving Generalization
Det-CGD: Compressed Gradient Descent with Matrix Stepsizes for Non-Convex Optimization
Non-Uniform Sampling and Adaptive Optimizers in Deep Learning
Self-Taught Optimizer (STOP): Recursively Self-Improving Code Generation
Adaptive Quasi-Newton and Anderson Acceleration Framework with Explicit Global (Accelerated) Convergence Rates
On Optimization Formulations of Finite Horizon MDPs
Dueling Optimization with a Monotone Adversary
A Predicting Clipping Asynchronous Stochastic Gradient Descent Method in Distributed Learning
Average-Constrained Policy Optimization
Model-Free, Regret-Optimal Best Policy Identification in Online CMDPs
Vanilla Thompson Sampling Revisited
Stochastic Optimization under Hidden Convexity
Why Adam Outperforms Gradient Descent on Language Models: A Heavy-Tailed Class Imbalance Problem
DynaLay: An Introspective Approach to Dynamic Layer Selection for Deep Networks
How to Guess a Gradient
Escaping mediocrity: how two-layer networks learn hard generalized linear models
Safe Posterior Sampling for Constrained MDPs with Bounded Constraint Violation
Nesterov Meets Robust Multitask Learning Twice
Stochastic Variance-Reduced Newton: Accelerating Finite-Sum Minimization with Large Batches
Adam through a Second-Order Lens
On the convergence of warped proximal iterations for solving nonmonotone inclusions and applications
Multi-head CLIP: Improving CLIP with Diverse Representations and Flat Minima
New Horizons in Parameter Regularization: A Constraint Approach
Riemannian Optimization for Euclidean Distance Geometry
Efficient Learning in Polyhedral Games via Best Response Oracles
Stochastic FISTA Step Search Algorithm for Convex Optimization
Almost multisecant BFGS quasi-Newton method
Statistical Inference of Adaptive Inexact Stochastic Newton Method
On the Interplay Between Stepsize Tuning and Progressive Sharpening
Decentralized Learning Dynamics in the Gossip Model
Pruning Neural Networks with Velocity-Constrained Optimization
Risk Bounds of Accelerated SGD for Overparameterized Linear Regression
DIRECT Optimisation with Bayesian Insights: Assessing Reliability Under Fixed Computational Budgets
Parameter-Agnostic Optimization under Relaxed Smoothness
Optimization dependent generalization bound for ReLU networks based on sensitivity in the tangent bundle
Understanding the Role of Optimization in Double Descent
Level Set Teleportation: the Good, the Bad, and the Ugly
Cup Curriculum: Curriculum Learning on Model Capacity
Variance Reduced Model Based Methods: New rates and adaptive step sizes
SGD batch saturation for training wide neural networks
Follow the flow: Proximal flow inspired multi-step methods
The Sharp Power Law of Local Search on Expanders