What is a Good Metric to Study Generalization of Minimax Learners?
Asuman Ozdaglar · Sarath Pattathil · Jiawei Zhang · Kaiqing Zhang

Thu Dec 01 02:00 PM -- 04:00 PM (PST) @ Hall J #617

Minimax optimization has served as the backbone of many machine learning problems. Although the convergence behavior of optimization algorithms has been extensively studied in minimax settings, their generalization guarantees, i.e., how the model trained on empirical data performs on the unseen testing data, have been relatively under-explored. A fundamental question remains elusive: What is a good metric to study generalization of minimax learners? In this paper, we aim to answer this question by first showing that primal risk, a universal metric to study generalization in minimization problems, fails in simple examples of minimax problems. Furthermore, another popular metric, the primal-dual risk, also fails to characterize the generalization behavior for minimax problems with nonconvexity, due to non-existence of saddle points. We thus propose a new metric to study generalization of minimax learners: the primal gap, to circumvent these issues. Next, we derive generalization bounds for the primal gap in nonconvex-concave settings. As byproducts of our analysis, we also solve two open questions: establishing generalization bounds for primal risk and primal-dual risk in this setting, and in the strong sense, i.e., without assuming that the maximization and expectation can be interchanged. Finally, we leverage this new metric to compare the generalization behavior of two popular algorithms - gradient descent-ascent (GDA) and gradient descent-max (GDMax) in minimax optimization.
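The three metrics named in the abstract can be written down as follows. This is a sketch in commonly used notation, not necessarily the paper's own: the symbols $f$, $\xi$, $F$, $F_S$, $r$, $r_S$, $\Delta^{\mathrm{pd}}$, and $\Delta$ are assumptions introduced here for illustration, with $x$ the minimization variable, $y$ the maximization variable, and $S = \{\xi_1, \dots, \xi_n\}$ the training sample.

```latex
% Notation assumed for illustration; the abstract itself does not fix symbols.
\begin{align*}
F(x,y) &= \mathbb{E}_{\xi}\bigl[f(x,y;\xi)\bigr],
  & F_S(x,y) &= \frac{1}{n}\sum_{i=1}^{n} f(x,y;\xi_i)
  && \text{(population and empirical objectives)} \\
r(x) &= \max_{y} F(x,y),
  & r_S(x) &= \max_{y} F_S(x,y)
  && \text{(primal risk)} \\
\Delta^{\mathrm{pd}}(x,y) &= \max_{y'} F(x,y') - \min_{x'} F(x',y)
  &&&& \text{(primal-dual risk)} \\
\Delta(x) &= r(x) - \min_{x'} r(x')
  &&&& \text{(primal gap)}
\end{align*}
```

Under these assumed definitions, the generalization error of a metric is the difference between its population and empirical versions, for example $r(x) - r_S(x)$ for the primal risk. The primal gap is nonnegative and measures suboptimality relative to the best achievable primal risk, so it remains meaningful when no saddle point exists.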

Author Information

Asuman Ozdaglar (Massachusetts Institute of Technology)

Asu Ozdaglar received the B.S. degree in electrical engineering from the Middle East Technical University, Ankara, Turkey, in 1996, and the S.M. and the Ph.D. degrees in electrical engineering and computer science from the Massachusetts Institute of Technology, Cambridge, in 1998 and 2003, respectively. She is currently a professor in the Electrical Engineering and Computer Science Department at the Massachusetts Institute of Technology. She is also the director of the Laboratory for Information and Decision Systems. Her research expertise includes optimization theory, with emphasis on nonlinear programming and convex analysis; game theory, with applications in communication, social, and economic networks; distributed optimization and control; and network analysis, with special emphasis on contagious processes, systemic risk, and dynamic control. Professor Ozdaglar is the recipient of a Microsoft fellowship, the MIT Graduate Student Council Teaching award, the NSF Career award, the 2008 Donald P. Eckman award of the American Automatic Control Council, the Class of 1943 Career Development Chair, the inaugural Steven and Renee Innovation Fellowship, and the 2014 Spira teaching award. She served on the Board of Governors of the Control Systems Society in 2010 and was an associate editor for IEEE Transactions on Automatic Control. She is currently the area co-editor for a new area of the journal Operations Research, entitled "Games, Information and Networks." She is the co-author of the book entitled "Convex Analysis and Optimization" (Athena Scientific, 2003).

Sarath Pattathil (MIT)
Jiawei Zhang (The Chinese University of Hong Kong, Shenzhen)
Kaiqing Zhang (Massachusetts Institute of Technology)
