Poster
Scalable Primal-Dual Actor-Critic Method for Safe Multi-Agent RL with General Utilities
Donghao Ying · Yunkai Zhang · Yuhao Ding · Alec Koppel · Javad Lavaei
We investigate safe multi-agent reinforcement learning, where agents seek to collectively maximize the sum of their local objectives while each satisfies its own safety constraints. The objective and constraints are described by general utilities, i.e., nonlinear functions of the long-term state-action occupancy measure, which encompass broader decision-making goals such as risk, exploration, or imitation. The exponential growth of the state-action space with the number of agents makes global observability challenging, and the global coupling induced by the agents' safety constraints exacerbates the difficulty. To tackle this issue, we propose a primal-dual method that uses shadow rewards and $\kappa$-hop neighbor truncation under a form of correlation decay property, where $\kappa$ is the communication radius. In the exact setting, our algorithm converges to a first-order stationary point (FOSP) at the rate of $\mathcal{O}\left(T^{-2/3}\right)$. In the sample-based setting, we demonstrate that, with high probability, our algorithm requires $\widetilde{\mathcal{O}}\left(\epsilon^{-3.5}\right)$ samples to achieve an $\epsilon$-FOSP with an approximation error of $\mathcal{O}(\phi_0^{2\kappa})$, where $\phi_0\in (0,1)$. Finally, we demonstrate the effectiveness of our method through extensive numerical experiments.
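For readers unfamiliar with general utilities, the setup described above can be sketched as follows. This is an illustrative formulation under stated assumptions, not the paper's exact notation: the number of agents $n$, the policy parameter $\theta$, the discount factor $\gamma$, the initial distribution $\rho$, the local utilities $f_i$ and $g_i$, the multipliers $\mu_i$, and the constraint form $g_i \ge 0$ are all names and conventions introduced here for illustration.

```latex
% Assumed discounted state-action occupancy measure of policy \pi_\theta
\begin{align*}
\lambda^{\pi_\theta}(s,a)
  &= \sum_{k=0}^{\infty} \gamma^{k}\,
     \Pr\left(s^{k}=s,\ a^{k}=a \mid \pi_\theta,\ s^{0}\sim\rho\right).
\end{align*}

% Safe multi-agent problem with general utilities (assumed form)
\begin{align*}
\max_{\theta}\ \sum_{i=1}^{n} f_i\left(\lambda^{\pi_\theta}\right)
\quad \text{s.t.} \quad
g_i\left(\lambda^{\pi_\theta}\right) \ge 0, \qquad i = 1,\dots,n.
\end{align*}

% Lagrangian targeted by a primal-dual scheme
% (ascent in \theta, projected descent in \mu \ge 0)
\begin{align*}
\mathcal{L}(\theta,\mu)
  &= \sum_{i=1}^{n}\left[
       f_i\left(\lambda^{\pi_\theta}\right)
       + \mu_i\, g_i\left(\lambda^{\pi_\theta}\right)
     \right],
\qquad \mu_i \ge 0.
\end{align*}

% Shadow reward: by the chain rule, the gradient of a general utility equals
% the policy gradient of a standard MDP whose reward is r_{f_i}
\begin{align*}
r_{f_i} &= \nabla_{\lambda} f_i\left(\lambda^{\pi_\theta}\right),
&
\nabla_\theta f_i\left(\lambda^{\pi_\theta}\right)
  &= \left(\nabla_\theta \lambda^{\pi_\theta}\right)^{\top} r_{f_i}.
\end{align*}
```

When $f_i$ is linear in the occupancy measure, $r_{f_i}$ reduces to an ordinary reward function and the problem becomes a standard constrained cumulative-reward problem. Since $\phi_0 \in (0,1)$, the truncation error $\mathcal{O}(\phi_0^{2\kappa})$ decays exponentially as the communication radius $\kappa$ grows.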
Author Information
Donghao Ying (UC Berkeley)
Yunkai Zhang (UC Berkeley)
Yuhao Ding (UC Berkeley)
Alec Koppel (U.S. Army Research Laboratory)
Javad Lavaei (University of California, Berkeley)
More from the Same Authors
-
2022 : Posterior Coreset Construction with Kernelized Stein Discrepancy for Model-Based Reinforcement Learning »
Souradip Chakraborty · Amrit Bedi · Alec Koppel · Furong Huang · Pratap Tokekar · Dinesh Manocha -
2023 : Insight Miner: A Large-scale Multimodal Model for Insight Mining from Time Series »
Yunkai Zhang · Yawen Zhang · Ming Zheng · Kezhen Chen · Chongyang Gao · Ruian Ge · Siyuan Teng · Amine Jelloul · Jinmeng Rao · Xiaoyuan Guo · Chiang-Wei Fang · Zeyu Zheng · Jie Yang -
2023 : Model Robustness and Active Learning with Missing-Not-At-Random Outcomes »
Alan Mishler · Mohsen Ghassemi · Alec Koppel · Sumitra Ganesh -
2023 Poster: Algorithmic Regularization in Tensor Optimization: Towards a Lifted Approach in Matrix Sensing »
Ziye Ma · Javad Lavaei · Somayeh Sojoudi -
2023 Poster: Geometric Analysis of Matrix Sensing over Graphs »
Haixiang Zhang · Ying Chen · Javad Lavaei -
2023 Poster: Tempo Adaptation in Non-stationary Reinforcement Learning »
Hyunin Lee · Yuhao Ding · Jongmin Lee · Ming Jin · Javad Lavaei · Somayeh Sojoudi -
2023 Poster: No-Regret Learning in Dynamic Competition with Reference Effects Under Logit Demand »
Mengzi Amy Guo · Donghao Ying · Javad Lavaei · Zuo-Jun Shen -
2022 : Mind Your Step: Continuous Conditional GANs with Generator Regularization »
Yunkai Zhang · Yufeng Zheng · Amber Ma · Siyuan Teng · Zeyu Zheng -
2021 Poster: Stochastic $L^\natural$-convex Function Minimization »
Haixiang Zhang · Zeyu Zheng · Javad Lavaei -
2021 Poster: General Low-rank Matrix Optimization: Geometric Analysis and Sharper Bounds »
Haixiang Zhang · Yingjie Bi · Javad Lavaei -
2020 Poster: Variational Policy Gradient Method for Reinforcement Learning with General Utilities »
Junyu Zhang · Alec Koppel · Amrit Singh Bedi · Csaba Szepesvari · Mengdi Wang -
2020 Spotlight: Variational Policy Gradient Method for Reinforcement Learning with General Utilities »
Junyu Zhang · Alec Koppel · Amrit Singh Bedi · Csaba Szepesvari · Mengdi Wang -
2019 : Poster and Coffee Break 1 »
Aaron Sidford · Aditya Mahajan · Alejandro Ribeiro · Alex Lewandowski · Ali H Sayed · Ambuj Tewari · Angelika Steger · Anima Anandkumar · Asier Mujika · Hilbert J Kappen · Bolei Zhou · Byron Boots · Chelsea Finn · Chen-Yu Wei · Chi Jin · Ching-An Cheng · Christina Yu · Clement Gehring · Craig Boutilier · Dahua Lin · Daniel McNamee · Daniel Russo · David Brandfonbrener · Denny Zhou · Devesh Jha · Diego Romeres · Doina Precup · Dominik Thalmeier · Eduard Gorbunov · Elad Hazan · Elena Smirnova · Elvis Dohmatob · Emma Brunskill · Enrique Munoz de Cote · Ethan Waldie · Florian Meier · Florian Schaefer · Ge Liu · Gergely Neu · Haim Kaplan · Hao Sun · Hengshuai Yao · Jalaj Bhandari · James A Preiss · Jayakumar Subramanian · Jiajin Li · Jieping Ye · Jimmy Smith · Joan Bas Serrano · Joan Bruna · John Langford · Jonathan Lee · Jose A. Arjona-Medina · Kaiqing Zhang · Karan Singh · Yuping Luo · Zafarali Ahmed · Zaiwei Chen · Zhaoran Wang · Zhizhong Li · Zhuoran Yang · Ziping Xu · Ziyang Tang · Yi Mao · David Brandfonbrener · Shirli Di-Castro · Riashat Islam · Zuyue Fu · Abhishek Naik · Saurabh Kumar · Benjamin Petit · Angeliki Kamoutsi · Simone Totaro · Arvind Raghunathan · Rui Wu · Donghwan Lee · Dongsheng Ding · Alec Koppel · Hao Sun · Christian Tjandraatmadja · Mahdi Karami · Jincheng Mei · Chenjun Xiao · Junfeng Wen · Zichen Zhang · Ross Goroshin · Mohammad Pezeshki · Jiaqi Zhai · Philip Amortila · Shuo Huang · Mariya Vasileva · El houcine Bergou · Adel Ahmadyan · Haoran Sun · Sheng Zhang · Lukas Gruber · Yuanhao Wang · Tetiana Parshakova -
2018 Poster: How Much Restricted Isometry is Needed In Nonconvex Matrix Recovery? »
Richard Zhang · Cedric Josz · Somayeh Sojoudi · Javad Lavaei -
2018 Spotlight: How Much Restricted Isometry is Needed In Nonconvex Matrix Recovery? »
Richard Zhang · Cedric Josz · Somayeh Sojoudi · Javad Lavaei -
2018 Poster: A theory on the absence of spurious solutions for nonconvex and nonsmooth optimization »
Cedric Josz · Yi Ouyang · Richard Zhang · Javad Lavaei · Somayeh Sojoudi