Optimization lies at the heart of many exciting developments in machine learning, statistics and signal processing. As models become more complex and datasets get larger, finding efficient, reliable and provable methods is one of the primary goals in these fields.
In the last few decades, much effort has been devoted to the development of first-order methods. These methods have a low per-iteration cost, achieve optimal complexity, are easy to implement, and have proven effective for most machine learning applications. First-order methods, however, have significant limitations: (1) they require fine hyper-parameter tuning, (2) they do not incorporate curvature information and are therefore sensitive to ill-conditioning, and (3) they are often unable to fully exploit the power of distributed computing architectures.
Higher-order methods, such as Newton, quasi-Newton and adaptive gradient descent methods, are extensively used in many scientific and engineering domains. At least in theory, these methods possess several nice features: they exploit local curvature information to mitigate the effects of ill-conditioning, they avoid or diminish the need for hyper-parameter tuning, and they have enough concurrency to take advantage of distributed computing environments. Researchers have even developed stochastic versions of higher-order methods that achieve speed and scalability by incorporating curvature information in an economical and judicious manner. Nevertheless, higher-order methods are often “undervalued.”
This workshop will attempt to shed light on this statement. Topics of interest include, but are not limited to, second-order methods, adaptive gradient descent methods, regularization techniques, and techniques based on higher-order derivatives.
Fri 8:00 a.m. - 8:30 a.m. | Opening Remarks
Opening remarks for the workshop by the organizers
Anastasios Kyrillidis · Albert Berahas · Fred Roosta · Michael Mahoney
Fri 8:30 a.m. - 9:15 a.m. | Economical use of second-order information in training machine learning models (Plenary talk)
Stochastic gradient descent (SGD) and variants such as Adagrad and Adam are extensively used today to train modern machine learning models. In this talk we will discuss ways to economically use second-order information to modify both the step size (learning rate) used in SGD and the direction taken by SGD. Our methods adaptively control the batch sizes used to compute gradient and Hessian approximations and ensure that the steps taken decrease the loss function with high probability, assuming that the latter is self-concordant, as is true for many problems in empirical risk minimization. For such cases we prove that our basic algorithm is globally linearly convergent. A slightly modified version of our method is presented for training deep learning models. Numerical results will be presented that show that it exhibits excellent performance without the need for learning rate tuning. If there is time, additional ways to efficiently make use of second-order information will be presented.
Donald Goldfarb
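The adaptive batch-size idea described in the abstract above can be illustrated with a small sketch: grow the sample until the estimated variance of the averaged gradient is small relative to its norm (the classical "norm test"). This is a generic heuristic written under our own assumptions, not the speaker's algorithm; `grad_fn` (a per-example gradient), `data`, and the threshold `theta` are hypothetical names.

```python
import numpy as np

def adaptive_batch_gradient(grad_fn, data, batch_size, theta=0.5, rng=None):
    """Grow the mini-batch until the estimated variance of the averaged gradient is at most
    (theta * ||g||)^2 (the 'norm test'). grad_fn(sample) returns one per-example gradient;
    data is any indexable collection. Illustrative heuristic only."""
    rng = rng or np.random.default_rng()
    while True:
        b = min(max(batch_size, 2), len(data))
        idx = rng.choice(len(data), size=b, replace=False)
        per_sample = np.stack([grad_fn(data[i]) for i in idx])
        g = per_sample.mean(axis=0)
        var_of_mean = per_sample.var(axis=0, ddof=1).sum() / b   # variance of the batch mean
        if var_of_mean <= (theta * np.linalg.norm(g)) ** 2 or b == len(data):
            return g, b              # batch is informative enough (or we used all the data)
        batch_size = 2 * b           # variance too high: double the batch and retry
```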
Fri 9:00 a.m. - 9:45 a.m. | Spotlight talks (Spotlight talks from paper submissions)
How does mini-batching affect Curvature information for second order deep learning optimization? Diego Granziol (Oxford); Stephen Roberts (Oxford); Xingchen Wan (Oxford University); Stefan Zohren (University of Oxford); Binxin Ru (University of Oxford); Michael A. Osborne (University of Oxford); Andrew Wilson (NYU); sebastien ehrhardt (Oxford); Dmitry P Vetrov (Higher School of Economics); Timur Garipov (Samsung AI Center in Moscow)
Acceleration through Spectral Modeling. Fabian Pedregosa (Google); Damien Scieur (Princeton University)
Using better models in stochastic optimization. Hilal Asi (Stanford University); John Duchi (Stanford University)
Diego Granziol · Fabian Pedregosa · Hilal Asi
Fri 9:45 a.m. - 10:30 a.m. | Poster Session
Eduard Gorbunov · Alexandre d'Aspremont · Lingxiao Wang · Liwei Wang · Boris Ginsburg · Alessio Quaglino · Camille Castera · Saurabh Adya · Diego Granziol · Rudrajit Das · Raghu Bollapragada · Fabian Pedregosa · Martin Takac · Majid Jahani · Sai Praneeth Karimireddy · Hilal Asi · Balint Daroczy · Leonard Adolphs · Aditya Rawal · Nicolas Brandt · Minhan Li · Giuseppe Ughi · Orlando Romero · Ivan Skorokhodov · Damien Scieur · Kiwook Bae · Konstantin Mishchenko · Rohan Anil · Vatsal Sharan · Aditya Balu · Chao Chen · Zhewei Yao · Tolga Ergen · Paul Grigas · Chris Junchi Li · Jimmy Ba · Stephen J Roberts · Sharan Vaswani · Armin Eftekhari · Chhavi Sharma
Fri 10:30 a.m. - 11:15 a.m. | Adaptive gradient methods: efficient implementation and generalization (Plenary talk)
Adaptive gradient methods have had a transformative impact in deep learning. We will describe recent theoretical and experimental advances in their understanding, including low-memory adaptive preconditioning, and insights into their generalization ability.
Elad Hazan
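For readers unfamiliar with adaptive preconditioning, the following is a minimal diagonal AdaGrad-style update, the textbook starting point for the low-memory methods mentioned in the abstract above. It is an illustration only, not the method presented in the talk; the step size and quadratic test problem are arbitrary choices.

```python
import numpy as np

def adagrad_step(w, g, accum, lr=0.1, eps=1e-8):
    """One diagonal AdaGrad update: scale each coordinate of the gradient by the inverse
    square root of its accumulated squared gradients (a per-coordinate preconditioner)."""
    accum = accum + g ** 2
    w = w - lr * g / (np.sqrt(accum) + eps)
    return w, accum

# Usage on an ill-conditioned quadratic f(w) = 0.5 * (w1^2 + 100 * w2^2):
w, accum = np.array([1.0, 1.0]), np.zeros(2)
for _ in range(200):
    g = np.array([1.0, 100.0]) * w       # exact gradient of the quadratic
    w, accum = adagrad_step(w, g, accum)
```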
Fri 11:15 a.m. - 12:00 p.m. | Spotlight talks (Spotlight talks from paper submissions)
Symmetric Multisecant quasi-Newton methods. Damien Scieur (Samsung AI Research Montreal); Thomas Pumir (Princeton University); Nicolas Boumal (Princeton University)
Stochastic Newton Method and its Cubic Regularization via Majorization-Minimization. Konstantin Mishchenko (King Abdullah University of Science & Technology (KAUST)); Peter Richtarik (KAUST); Dmitry Koralev (KAUST)
Full Matrix Preconditioning Made Practical. Rohan Anil (Google); Vineet Gupta (Google); Tomer Koren (Google); Kevin Regan (Google); Yoram Singer (Princeton)
Damien Scieur · Konstantin Mishchenko · Rohan Anil
Fri 12:00 p.m. - 2:00 p.m. | Lunch break
Fri 2:00 p.m. - 2:45 p.m. | K-FAC: Extensions, improvements, and applications (Plenary talk)
Second order optimization methods have the potential to be much faster than first order methods in the deterministic case, or pre-asymptotically in the stochastic case. However, traditional second order methods have proven ineffective or impractical for neural network training, due in part to the extremely high dimension of the parameter space. Kronecker-factored Approximate Curvature (K-FAC) is a second-order optimization method based on a tractable approximation to the Gauss-Newton/Fisher matrix that exploits the special structure present in neural network training objectives. This approximation is neither low-rank nor diagonal, but instead involves Kronecker products, which allows for efficient estimation, storage and inversion of the curvature matrix. In this talk I will introduce the basic K-FAC method for standard MLPs and then present some more recent work in this direction, including extensions to CNNs and RNNs, both of which require new approximations to the Fisher. For these I will provide mathematical intuitions and empirical results which speak to their efficacy in neural network optimization. Time permitting, I will also discuss some recent results on large-batch optimization with K-FAC, and the use of adaptive adjustment methods that can eliminate the need for costly hyperparameter tuning.
James Martens
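A small numerical sketch of why the Kronecker structure mentioned in the abstract above helps: if a layer's curvature block is approximated as A ⊗ G with symmetric positive definite factors, applying its inverse only requires solves with the small factors. This is a toy illustration of the algebra, not the K-FAC algorithm itself; the dimensions and random matrices below are arbitrary.

```python
import numpy as np

# If F ≈ A ⊗ G, then F^{-1} vec(V) = vec(G^{-1} V A^{-1}), so the full (m*n x m*n)
# matrix never needs to be formed or inverted.
rng = np.random.default_rng(0)
m, n = 20, 30                          # illustrative layer input/output dimensions
A = rng.standard_normal((m, m))
A = A @ A.T + m * np.eye(m)            # symmetric positive definite Kronecker factor
G = rng.standard_normal((n, n))
G = G @ G.T + n * np.eye(n)
V = rng.standard_normal((n, m))        # a gradient reshaped to matrix form

precond_fast = np.linalg.solve(G, V) @ np.linalg.inv(A)   # only O(m^3 + n^3) work

F = np.kron(A, G)                      # full 600 x 600 block, built here only for checking
precond_slow = np.linalg.solve(F, V.reshape(-1, order="F")).reshape((n, m), order="F")
assert np.allclose(precond_fast, precond_slow)
```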
Fri 2:45 p.m. - 3:30 p.m. | Spotlight talks (Spotlight talks from paper submissions)
Hessian-Aware trace-Weighted Quantization. Zhen Dong (UC Berkeley); Zhewei Yao (University of California, Berkeley); Amir Gholami (UC Berkeley); Yaohui Cai (Peking University); Daiyaan Arfeen (UC Berkeley); Michael Mahoney (University of California, Berkeley); Kurt Keutzer (UC Berkeley)
New Methods for Regularization Path Optimization via Differential Equations. Paul Grigas (UC Berkeley); Heyuan Liu (University of California, Berkeley)
Ellipsoidal Trust Region Methods for Neural Nets. Leonard Adolphs (ETHZ); Jonas Kohler (ETHZ)
Sub-sampled Newton Methods Under Interpolation. Si Yi Meng (University of British Columbia); Sharan Vaswani (Mila, Université de Montréal); Issam Laradji (University of British Columbia); Mark Schmidt (University of British Columbia); Simon Lacoste-Julien (Mila, Université de Montréal)
Paul Grigas · Zhewei Yao · Aurelien Lucchi · Si Yi Meng
Fri 3:30 p.m. - 4:15 p.m. | Poster Session (same as above)
An Accelerated Method for Derivative-Free Smooth Stochastic Convex Optimization. Eduard Gorbunov (Moscow Institute of Physics and Technology); Pavel Dvurechenskii (WIAS Germany); Alexander Gasnikov (Moscow Institute of Physics and Technology)
Fast Bregman Gradient Methods for Low-Rank Minimization Problems. Radu-Alexandru Dragomir (Université Toulouse 1); Jérôme Bolte (Université Toulouse 1); Alexandre d'Aspremont (Ecole Normale Superieure)
Gluster: Variance Reduced Mini-Batch SGD with Gradient Clustering. Fartash Faghri (University of Toronto); David Duvenaud (University of Toronto); David Fleet (University of Toronto); Jimmy Ba (University of Toronto)
Neural Policy Gradient Methods: Global Optimality and Rates of Convergence. Lingxiao Wang (Northwestern University); Qi Cai (Northwestern University); Zhuoran Yang (Princeton University); Zhaoran Wang (Northwestern University)
A Gram-Gauss-Newton Method Learning Overparameterized Deep Neural Networks for Regression Problems. Tianle Cai (Peking University); Ruiqi Gao (Peking University); Jikai Hou (Peking University); Siyu Chen (Peking University); Dong Wang (Peking University); Di He (Peking University); Zhihua Zhang (Peking University); Liwei Wang (Peking University)
Stochastic Gradient Methods with Layerwise Adaptive Moments for Training of Deep Networks. Boris Ginsburg (NVIDIA); Oleksii Hrinchuk (NVIDIA); Jason Li (NVIDIA); Vitaly Lavrukhin (NVIDIA); Ryan Leary (NVIDIA); Oleksii Kuchaiev (NVIDIA); Jonathan Cohen (NVIDIA); Huyen Nguyen (NVIDIA); Yang Zhang (NVIDIA)
Accelerating Neural ODEs with Spectral Elements. Alessio Quaglino (NNAISENSE SA); Marco Gallieri (NNAISENSE); Jonathan Masci (NNAISENSE); Jan Koutnik (NNAISENSE)
An Inertial Newton Algorithm for Deep Learning. Camille Castera (CNRS, IRIT); Jérôme Bolte (Université Toulouse 1); Cédric Févotte (CNRS, IRIT); Edouard Pauwels (Toulouse 3 University)
Nonlinear Conjugate Gradients for Scaling Synchronous Distributed DNN Training. Saurabh Adya (Apple); Vinay Palakkode (Apple Inc.); Oncel Tuzel (Apple Inc.)
On the Convergence of a Biased Version of Stochastic Gradient Descent. Rudrajit Das (University of Texas at Austin); Jiong Zhang (UT-Austin); Inderjit S. Dhillon (UT Austin & Amazon)
Adaptive Sampling Quasi-Newton Methods for Derivative-Free Stochastic Optimization. Raghu Bollapragada (Argonne National Laboratory); Stefan Wild (Argonne National Laboratory)
Accelerating Distributed Stochastic L-BFGS by sampled 2nd-Order Information. Jie Liu (Lehigh University); Yu Rong (Tencent AI Lab); Martin Takac (Lehigh University); Junzhou Huang (Tencent AI Lab)
Grow Your Samples and Optimize Better via Distributed Newton CG and Accumulating Strategy. Majid Jahani (Lehigh University); Xi He (Lehigh University); Chenxin Ma (Lehigh University); Aryan Mokhtari (UT Austin); Dheevatsa Mudigere (Intel Labs); Alejandro Ribeiro (University of Pennsylvania); Martin Takac (Lehigh University)
Global linear convergence of trust-region Newton's method without strong-convexity or smoothness. Sai Praneeth Karimireddy (EPFL); Sebastian Stich (EPFL); Martin Jaggi (EPFL)
FD-Net with Auxiliary Time Steps: Fast Prediction of PDEs using Hessian-Free Trust-Region Methods. Nur Sila Gulgec (Lehigh University); Zheng Shi (Lehigh University); Neil Deshmukh (MIT BeaverWorks - Medlytics); Shamim Pakzad (Lehigh University); Martin Takac (Lehigh University)
Tangent space separability in feedforward neural networks. Bálint Daróczy (Institute for Computer Science and Control, Hungarian Academy of Sciences); Rita Aleksziev (Institute for Computer Science and Control, Hungarian Academy of Sciences); Andras Benczur (Hungarian Academy of Sciences)
Closing the K-FAC Generalisation Gap Using Stochastic Weight Averaging. Xingchen Wan (University of Oxford); Diego Granziol (Oxford); Stefan Zohren (University of Oxford); Stephen Roberts (Oxford)
Learned First-Order Preconditioning. Aditya Rawal (Uber AI Labs); Rui Wang (Uber AI); Theodore Moskovitz (Gatsby Computational Neuroscience Unit); Sanyam Kapoor (Uber); Janice Lan (Uber AI); Jason Yosinski (Uber AI Labs); Thomas Miconi (Uber AI Labs)
Iterative Hessian Sketch in Input Sparsity Time. Charlie Dickens (University of Warwick); Graham Cormode (University of Warwick)
Nonlinear matrix recovery. Florentin Goyens (University of Oxford); Coralia Cartis (Oxford University); Armin Eftekhari (EPFL)
Making Variance Reduction more Effective for Deep Networks. Nicolas Brandt (EPFL); Farnood Salehi (EPFL); Patrick Thiran (EPFL)
Novel and Efficient Approximations for Zero-One Loss of Linear Classifiers. Hiva Ghanbari (Lehigh University); Minhan Li (Lehigh University); Katya Scheinberg (Lehigh)
A Model-Based Derivative-Free Approach to Black-Box Adversarial Examples: BOBYQA. Giuseppe Ughi (University of Oxford)
Distributed Accelerated Inexact Proximal Gradient Method via System of Coupled Ordinary Differential Equations. Chhavi Sharma (IIT Bombay); Vishnu Narayanan (IIT Bombay); Balamurugan Palaniappan (IIT Bombay)
Finite-Time Convergence of Continuous-Time Optimization Algorithms via Differential Inclusions. Orlando Romero (Rensselaer Polytechnic Institute); Mouhacine Benosman (MERL)
Loss Landscape Sightseeing by Multi-Point Optimization. Ivan Skorokhodov (MIPT); Mikhail Burtsev (NI)
Does Adam optimizer keep close to the optimal point? Kiwook Bae (KAIST)*; Heechang Ryu (KAIST); Hayong Shin (KAIST)
Memory-Sample Tradeoffs for Linear Regression with Small Error. Vatsal Sharan (Stanford University); Aaron Sidford (Stanford); Gregory Valiant (Stanford University)
On the Higher-order Moments in Adam. Zhanhong Jiang (Johnson Controls International); Aditya Balu (Iowa State University); Sin Yong Tan (Iowa State University); Young M Lee (Johnson Controls International); Chinmay Hegde (Iowa State University); Soumik Sarkar (Iowa State University)
h-matrix approximation for Gauss-Newton Hessian. Chao Chen (UT Austin)
Random Projections for Learning Non-convex Models. Tolga Ergen (Stanford University); Emmanuel Candes (Stanford University); Mert Pilanci (Stanford)
Hessian-Aware Zeroth-Order Optimization. Haishan Ye (HKUST); Zhichao Huang (HKUST); Cong Fang (Peking University); Chris Junchi Li (Tencent); Tong Zhang (HKUST)
Higher-Order Accelerated Methods for Faster Non-Smooth Optimization. Brian Bullins (TTIC)
Fri 4:15 p.m. - 5:00 p.m. | Analysis of line search methods for various gradient approximation schemes for noisy derivative-free optimization (Plenary talk)
We develop convergence analysis of a modified line search method for objective functions whose value is computed with noise and whose gradient estimates are not directly available. The noise is assumed to be bounded in absolute value, without any additional assumptions. In this case, gradient approximations can be constructed via interpolation or sample average approximation of smoothing gradients, and thus they are always inexact and possibly random. We extend a framework based on stochastic methods, originally developed to analyze a standard line search method with exact function values and random gradients, to the case of noisy functions. We introduce a condition on the gradient which, when satisfied with sufficiently large probability at each iteration, guarantees convergence properties of the line search method. We derive expected complexity bounds for convex, strongly convex and nonconvex functions. We motivate these results with several recent papers related to policy optimization.
Katya Scheinberg
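A minimal sketch of the kind of noise-aware step-size rule the abstract above analyzes: a backtracking line search whose Armijo condition is relaxed by twice the noise bound. This is an illustrative variant written under our own assumptions, not the exact method from the talk; `f`, `g`, `d`, `eps_f`, and the constants are placeholders.

```python
import numpy as np

def noisy_backtracking(f, g, x, d, eps_f, alpha0=1.0, c1=1e-4, tau=0.5, max_backtracks=30):
    """Backtracking line search with an Armijo condition relaxed by 2*eps_f to tolerate
    bounded noise in function values. f returns a noisy function value, g is an (inexact)
    gradient at x, d is a descent direction, eps_f bounds |noise|. Illustrative only."""
    fx = f(x)
    slope = float(np.dot(g, d))          # estimated directional derivative (should be < 0)
    alpha = alpha0
    for _ in range(max_backtracks):
        if f(x + alpha * d) <= fx + c1 * alpha * slope + 2.0 * eps_f:
            return alpha                 # sufficient decrease holds up to the noise level
        alpha *= tau                     # otherwise shrink the step and try again
    return alpha                         # fall back to the smallest trial step
```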
Fri 5:00 p.m. - 5:45 p.m. | Second-order methods for nonconvex optimization with complexity guarantees (Plenary talk)
We consider problems of smooth nonconvex optimization: unconstrained, bound-constrained, and with general equality constraints. We show that algorithms for these problems that are widely used in practice can be modified slightly in ways that guarantee convergence to approximate first- and second-order optimal points, with complexity guarantees that depend on the desired accuracy. The methods we discuss are constructed from Newton's method, the conjugate gradient method, the log-barrier method, and augmented Lagrangians. (In some cases, special structure of the objective function makes for only a weak dependence on the accuracy parameter.) Our methods require Hessian information only in the form of Hessian-vector products, so the Hessian need not be evaluated and stored explicitly. This talk describes joint work with Clement Royer, Yue Xie, and Michael O'Neill.
Stephen Wright
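As a companion to the abstract above, here is a generic Hessian-free Newton-CG sketch in which curvature enters only through Hessian-vector products, approximated here by finite differences of the gradient. It illustrates the "Hessian-vector products only" point and is not the speakers' specific algorithms; `grad_fn` and all tolerances are assumptions, and the negative-curvature case is simply truncated rather than exploited.

```python
import numpy as np

def newton_cg_step(grad_fn, x, cg_iters=50, tol=1e-6, fd_eps=1e-6):
    """Compute an inexact Newton direction p ~ -H(x)^{-1} grad f(x) by conjugate gradients,
    with Hessian-vector products approximated by finite differences of the gradient, so the
    Hessian is never formed or stored. Generic sketch, not a specific published algorithm."""
    g = grad_fn(x)

    def hvp(v):
        # H(x) v is approximated by (grad f(x + eps*v) - grad f(x)) / eps
        return (grad_fn(x + fd_eps * v) - g) / fd_eps

    p = np.zeros_like(x)
    r = -g                               # residual of H p = -g at p = 0
    d = r.copy()
    rs = float(r @ r)
    for _ in range(cg_iters):
        Hd = hvp(d)
        dHd = float(d @ Hd)
        if dHd <= 0:                     # negative curvature detected: stop here
            break
        alpha = rs / dHd
        p += alpha * d
        r -= alpha * Hd
        rs_new = float(r @ r)
        if np.sqrt(rs_new) < tol:
            break
        d = r + (rs_new / rs) * d
        rs = rs_new
    return p
```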
Fri 5:45 p.m. - 6:00 p.m. | Final remarks
Final remarks for the workshop
Anastasios Kyrillidis · Albert Berahas · Fred Roosta · Michael Mahoney
Author Information
Anastasios Kyrillidis (Rice University)
Albert Berahas (Lehigh University)
Fred Roosta (University of Queensland)
Michael Mahoney (UC Berkeley)