Timezone: »

 
Normative disagreement as a challenge for Cooperative AI
Julian Stastny · Maxime Riché · Aleksandr Lyzhov · Johannes Treutlein · Allan Dafoe · Jesse Clifton

Cooperation in settings where agents have both common and conflicting interests (mixed-motive environments) has recently received considerable attention in multi-agent learning. However, the mixed-motive environments typically studied have a single cooperative outcome on which all agents can agree. Many real-world multi-agent environments are instead bargaining problems (BPs): they have several Pareto-optimal payoff profiles over which agents have conflicting preferences. We argue that typical cooperation-inducing learning algorithms fail to cooperate in BPs when there is room for \textit{normative disagreement} resulting in the existence of multiple competing cooperative equilibria, and illustrate this problem empirically. To remedy the issue, we introduce the notion of \textit{norm-adaptive} policies. Norm-adaptive policies are capable of behaving according to different norms in different circumstances, creating opportunities for resolving normative disagreement. We develop a class of norm-adaptive policies and show in experiments that these significantly increase cooperation. However, norm-adaptiveness cannot address residual bargaining failure arising from a fundamental tradeoff between exploitability and cooperative robustness.

Author Information

Julian Stastny (University of Tuebingen)
Maxime Riché (Center on Long-Term Risk)
Aleksandr Lyzhov (New York University)
Johannes Treutlein (University of Toronto)
Allan Dafoe (Centre for the Governance of AI)
Jesse Clifton

More from the Same Authors