NeurIPS Poster Online Convex Optimization with Hard Constraints: Towards the Best of Two Worlds and Beyond

Hengquan Guo · Xin Liu · Honghao Wei · Lei Ying

Hall J (level 1) #824

Keywords: [ hard constraints ] [ best of two worlds and beyond ] [ online convex optimization ] [ cumulative absolute violation ]


Abstract: This paper considers online convex optimization with hard constraints and analyzes achievable regret and cumulative hard constraint violation (violation for short). The problem distinguishes itself from online convex optimization with soft constraints, where a violation at one round can be compensated/cancelled by a conservative decision at a different round. We propose a RECtified Online Optimization algorithm (RECOO) and consider two settings: fixed constraints and adversarial constraints. Both settings have been considered in the literature. Compared with existing results, RECOO achieves the best of two worlds and beyond. For the fixed-constraints setting, RECOO achieves $O(\sqrt{T})$ regret and $O(1)$ violation, where $T$ is the learning horizon. The best known results in this case are $O(\sqrt{T})$ regret and $O(T^{1/4})$ violation. For the adversarial-constraints setting, it guarantees $O(\sqrt{T})$ regret and $O(T^{3/4})$ violation, which match the best existing results. When the loss functions are strongly convex, RECOO can guarantee $O(\log T)$ regret and $O(1)$ violation for fixed constraints, and $O(\log T)$ regret and $O(\sqrt{T\log T})$ violation for adversarial constraints. Both these results are order-wise better than the existing bounds. The regret and violation bounds mentioned above use the best fixed decision in hindsight as the baseline. This paper further considers a dynamic baseline where the comparator sequence is time-varying. This paper shows that RECOO not only improves the existing results in the fixed-constraints setting but also, for the first time, guarantees dynamic regret and violation bounds in the adversarial-constraints setting. Our experiment results confirm that RECOO outperforms several existing algorithms for both fixed and adversarial constraints.
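
The distinction between soft and hard constraint violation can be illustrated with a small sketch. It is not the RECOO algorithm; it only contrasts the two metrics under an assumed definition: soft violation sums the signed constraint values $g(x_t)$, while hard violation sums only their positive parts $\max(g(x_t), 0)$. The constraint $g(x) = x - 1 \le 0$, the decision sequence, and the function names are hypothetical and chosen purely for illustration.

```python
# Illustrative only: contrasts soft vs. hard cumulative constraint violation.
# The definitions below are an assumption (signed sum vs. sum of positive parts);
# the constraint g and the decisions are made up for this example.

def g(x: float) -> float:
    """Hypothetical constraint function; the decision x is feasible when g(x) <= 0."""
    return x - 1.0

def soft_violation(decisions: list[float]) -> float:
    """Signed sum of constraint values: infeasible rounds can be cancelled by conservative ones."""
    return sum(g(x) for x in decisions)

def hard_violation(decisions: list[float]) -> float:
    """Sum of positive parts: every infeasible round counts and nothing is cancelled."""
    return sum(max(g(x), 0.0) for x in decisions)

# Rounds 1 and 3 violate the constraint; rounds 2 and 4 are conservative.
decisions = [1.5, 0.5, 1.25, 0.75]

print("soft violation:", soft_violation(decisions))
print("hard violation:", hard_violation(decisions))
```

In this toy sequence the conservative rounds offset the infeasible ones in the soft metric, whereas the hard metric still records every round in which the constraint was broken. That is why a bound on hard violation is the stronger guarantee.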
