Workshop: Trustworthy and Socially Responsible Machine Learning

Provable Re-Identification Privacy

Zachary Izzo · Jinsung Yoon · Sercan Arik · James Zou


In applications involving sensitive data, such as finance and healthcare, the need to preserve data privacy can be a significant barrier to machine learning model development. Differential privacy (DP) has emerged as one canonical standard for provable privacy. However, DP's strong theoretical guarantees often come at the cost of a large drop in utility for machine learning, and the guarantees themselves can be difficult to interpret. As a result, standard DP has encountered deployment challenges in practice. In this work, we propose a different privacy notion, re-identification privacy (RIP), to address these challenges. RIP guarantees are easily interpretable in terms of the success rate of membership inference attacks. We give a precise characterization of the relationship between RIP and DP, and show that RIP can be achieved using less randomness than is required to guarantee DP, leading to a smaller drop in utility. Our theoretical results also give rise to a simple algorithm for guaranteeing RIP, which can be used as a wrapper around any algorithm with a continuous output, including parametric model training.
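
The abstract does not spell out the wrapper algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the wrapper perturbs a continuous output with additive Gaussian noise; the names `noisy_wrapper`, `noise_scale`, and `attack_success_rate` are hypothetical, and the noise scale here is not calibrated to any RIP or DP guarantee. The second helper shows the quantity in which RIP guarantees are stated: the success rate of a membership inference attack.

```python
import numpy as np


def noisy_wrapper(algorithm, noise_scale, rng=None):
    """Wrap `algorithm` so that noise is added to its continuous output.

    Assumption: additive Gaussian noise with a user-chosen `noise_scale`.
    This scale is illustrative and carries no formal privacy guarantee.
    """
    rng = np.random.default_rng() if rng is None else rng

    def wrapped(data):
        # Run the underlying algorithm, then perturb its output elementwise.
        output = np.asarray(algorithm(data), dtype=float)
        return output + rng.normal(0.0, noise_scale, size=output.shape)

    return wrapped


def attack_success_rate(guesses, is_member):
    """Fraction of membership guesses that match the true membership labels."""
    guesses = np.asarray(guesses, dtype=bool)
    is_member = np.asarray(is_member, dtype=bool)
    return float(np.mean(guesses == is_member))


# Toy usage: release a noisy column-wise mean of a small dataset.
rng = np.random.default_rng(0)
data = rng.normal(size=(100, 3))
private_mean = noisy_wrapper(lambda d: d.mean(axis=0), noise_scale=0.1, rng=rng)
released = private_mean(data)
```

In this sketch, any function returning a NumPy-compatible array (for example, trained model parameters) could stand in for the column-wise mean. An attacker whose success rate stays close to what guessing alone would achieve learns little about who was in the training set, which is the interpretation the abstract attributes to RIP.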
