NeurIPS Poster Covariance-Aware Private Mean Estimation Without Private Covariance Estimation

Poster

Covariance-Aware Private Mean Estimation Without Private Covariance Estimation

Gavin Brown · Marco Gaboardi · Adam Smith · Jonathan Ullman · Lydia Zakynthinou

[ Abstract ]

[ OpenReview]

Abstract: We present two sample-efficient differentially private mean estimators for

d

$d$ -dimensional (sub)Gaussian distributions with unknown covariance. Informally, given

n ≳ d / α^{2}

$n \gtrsim d/\alpha^2$ samples from such a distribution with mean

μ

$\mu$ and covariance

Σ

$\Sigma$ , our estimators output

\tilde{μ}

$\tilde\mu$ such that

‖ \tilde{μ} - μ ‖_{Σ} \leq α

$\| \tilde\mu - \mu \|_{\Sigma} \leq \alpha$ , where

‖ \cdot ‖_{Σ}

$\| \cdot \|_{\Sigma}$ is the \emph{Mahalanobis distance}. All previous estimators with the same guarantee either require strong a priori bounds on the covariance matrix or require

Ω (d^{3 / 2})

$\Omega(d^{3/2})$ samples. Each of our estimators is based on a simple, general approach to designing differentially private mechanisms, but with novel technical steps to make the estimator private and sample-efficient. Our first estimator samples a point with approximately maximum Tukey depth using the exponential mechanism, but restricted to the set of points of large Tukey depth. Proving that this mechanism is private requires a novel analysis. Our second estimator perturbs the empirical mean of the data set with noise calibrated to the empirical covariance. Only the mean is released, however; the covariance is only used internally. Its sample complexity guarantees hold more generally for subgaussian distributions, albeit with a slightly worse dependence on the privacy parameter. For both estimators, careful preprocessing of the data is required to satisfy differential privacy.

Chat is not available.