Skip to yearly menu bar Skip to main content


Automatic Clipping: Differentially Private Deep Learning Made Easier and Stronger

Zhiqi Bu · Yu-Xiang Wang · Sheng Zha · George Karypis

Great Hall & Hall B1+B2 (level 1) #1613
[ ] [ Project Page ]
[ Paper [ Poster [ OpenReview
Wed 13 Dec 3 p.m. PST — 5 p.m. PST

Abstract: Per-example gradient clipping is a key algorithmic step that enables practical differential private (DP) training for deep learning models. The choice of clipping threshold $R$, however, is vital for achieving high accuracy under DP. We propose an easy-to-use replacement, called automatic clipping, that eliminates the need to tune $R$ for any DP optimizers, including DP-SGD, DP-Adam, DP-LAMB and many others.The automatic variants are as private and computationally efficient as existing DP optimizers, but require no DP-specific hyperparameters and thus make DP training as amenable as the standard non-private training. We give a rigorous convergence analysis of automatic DP-SGD in the non-convex setting, showing that it can enjoy an asymptotic convergence rate that matches the standard SGD, under a symmetric gradient noise assumption of the per-sample gradients (commonly used in the non-DP literature). We demonstrate on various language and vision tasks that automatic clipping outperforms or matches the state-of-the-art, and can be easily employed with minimal changes to existing codebases.

Chat is not available.