Timezone: »

Non-Linear Coordination Graphs
Yipeng Kang · Tonghan Wang · Qianlan Yang · Xiaoran Wu · Chongjie Zhang

Tue Nov 29 09:00 AM -- 11:00 AM (PST) @ Hall J #308

Value decomposition multi-agent reinforcement learning methods learn the global value function as a mixing of each agent's individual utility functions. Coordination graphs (CGs) represent a higher-order decomposition by incorporating pairwise payoff functions and thus is supposed to have a more powerful representational capacity. However, CGs decompose the global value function linearly over local value functions, severely limiting the complexity of the value function class that can be represented. In this paper, we propose the first non-linear coordination graph by extending CG value decomposition beyond the linear case. One major challenge is to conduct greedy action selections in this new function class to which commonly adopted DCOP algorithms are no longer applicable. We study how to solve this problem when mixing networks with LeakyReLU activation are used. An enumeration method with a global optimality guarantee is proposed and motivates an efficient iterative optimization method with a local optimality guarantee. We find that our method can achieve superior performance on challenging multi-agent coordination tasks like MACO.

Author Information

Yipeng Kang (Tsinghua University)
Tonghan Wang (Tsinghua University)

Tonghan Wang is currently a Master student working with Prof. Chongjie Zhang at Institute for Interdisciplinary Information Sciences, Tsinghua University, headed by Prof. Andrew Yao. His primary research goal is to develop innovative models and methods to enable effective multi-agent cooperation, allowing a group of individuals to explore, communicate, and accomplish tasks of higher complexity. His research interests include multi-agent learning, reasoning under uncertainty, reinforcement learning, and representation learning in multi-agent systems.

Qianlan Yang (University of Illinois, Urbana-champaign)
Xiaoran Wu (Tsinghua University)
Chongjie Zhang (Tsinghua University)

More from the Same Authors