Timezone: »

Differential Privacy for Growing Databases
Rachel Cummings · Sara Krehbiel · Kevin A Lai · Uthaipon Tantipongpipat

Wed Dec 05 07:45 AM -- 09:45 AM (PST) @ Room 517 AB #146

The large majority of differentially private algorithms focus on the static setting, where queries are made on an unchanging database. This is unsuitable for the myriad applications involving databases that grow over time. To address this gap in the literature, we consider the dynamic setting, in which new data arrive over time. Previous results in this setting have been limited to answering a single non-adaptive query repeatedly as the database grows. In contrast, we provide tools for richer and more adaptive analysis of growing databases. Our first contribution is a novel modification of the private multiplicative weights algorithm, which provides accurate analysis of exponentially many adaptive linear queries (an expressive query class including all counting queries) for a static database. Our modification maintains the accuracy guarantee of the static setting even as the database grows without bound. Our second contribution is a set of general results which show that many other private and accurate algorithms can be immediately extended to the dynamic setting by rerunning them at appropriate points of data growth with minimal loss of accuracy, even when data growth is unbounded.

Author Information

Rachel Cummings (Georgia Tech)
Sara Krehbiel (University of Richmond)
Kevin A Lai (Georgia Tech)
Tao (Uthaipon) Tantipongpipat (Georgia Tech)

Graduating PhD student in machine learning theory and optimization. Strong background in mathematics and algorithmic foundations of data science with hands-on implementations on real-world datasets. Strive for impact and efficiency while attentive to details. Enjoy public speaking and experienced in leading research projects. Published many theoretical results in academic conferences and developed several optimized algorithms for public use. My research includes • Approximation algorithms in optimal design in statistics, as known as design of experiments (DoE) using combinatorial optimization. Diversity or representative sampling. • Differential privacy – theory of privacy in growing database; its deployment in deep learning models such as RNNs, LSTMs, autoencoders, and GANs; and its application in private synthetic data generation. • Fairness in machine learning – fair principle component analysis (fair PCA) using convex optimization and randomized rounding to obtain low-rank solution to semi-definite programming Other Interests: model compressions; privacy and security in machine learning; fair and explainable/interpretable machine learning

More from the Same Authors