Timezone: »
Principal components analysis (PCA) is a standard tool for identifying good low-dimensional approximations to data sets in high dimension. Many current data sets of interest contain private or sensitive information about individuals. Algorithms which operate on such data should be sensitive to the privacy risks in publishing their outputs. Differential privacy is a framework for developing tradeoffs between privacy and the utility of these outputs. In this paper we investigate the theory and empirical performance of differentially private approximations to PCA and propose a new method which explicitly optimizes the utility of the output. We demonstrate that on real data, there this a large performance gap between the existing methods and our method. We show that the sample complexity for the two procedures differs in the scaling with the data dimension, and that our method is nearly optimal in terms of this scaling.
Author Information
Kamalika Chaudhuri (UCSD)
Anand D Sarwate (Rutgers, The State University of New Jersey)
Kaushik Sinha (The Ohio State University)
More from the Same Authors
-
2022 : The Interpolated MVU Mechanism For Communication-efficient Private Federated Learning »
Chuan Guo · Kamalika Chaudhuri · Pierre STOCK · Mike Rabbat -
2022 : Forgetting Data from Pre-trained GANs »
Zhifeng Kong · Kamalika Chaudhuri -
2022 : Panel Discussion »
Kamalika Chaudhuri · Been Kim · Dorsa Sadigh · Huan Zhang · Linyi Li -
2022 : Invited Talk: Kamalika Chaudhuri »
Kamalika Chaudhuri -
2021 Workshop: Privacy in Machine Learning (PriML) 2021 »
Yu-Xiang Wang · Borja Balle · Giovanni Cherubin · Kamalika Chaudhuri · Antti Honkela · Jonathan Lebensold · Casey Meehan · Mi Jung Park · Adrian Weller · Yuqing Zhu -
2021 Poster: Understanding Instance-based Interpretability of Variational Auto-Encoders »
Zhifeng Kong · Kamalika Chaudhuri -
2021 Poster: Consistent Non-Parametric Methods for Maximizing Robustness »
Robi Bhattacharjee · Kamalika Chaudhuri -
2020 Workshop: Privacy Preserving Machine Learning - PriML and PPML Joint Edition »
Borja Balle · James Bell · Aurélien Bellet · Kamalika Chaudhuri · Adria Gascon · Antti Honkela · Antti Koskela · Casey Meehan · Olga Ohrimenko · Mi Jung Park · Mariana Raykova · Mary Anne Smart · Yu-Xiang Wang · Adrian Weller -
2020 Poster: A Closer Look at Accuracy vs. Robustness »
Yao-Yuan Yang · Cyrus Rashtchian · Hongyang Zhang · Russ Salakhutdinov · Kamalika Chaudhuri -
2019 : Audrey Durand, Douwe Kiela, Kamalika Chaudhuri moderated by Yann Dauphin »
Audrey Durand · Kamalika Chaudhuri · Yann Dauphin · Orhan Firat · Dilan Gorur · Douwe Kiela -
2019 : Kamalika Chaudhuri - A Three Sample Test to Detect Data Copying in Generative Models »
Kamalika Chaudhuri -
2019 : Poster Session »
Clement Canonne · Kwang-Sung Jun · Seth Neel · Di Wang · Giuseppe Vietri · Liwei Song · Jonathan Lebensold · Huanyu Zhang · Lovedeep Gondara · Ang Li · FatemehSadat Mireshghallah · Jinshuo Dong · Anand D Sarwate · Antti Koskela · Joonas Jälkö · Matt Kusner · Dingfan Chen · Mi Jung Park · Ashwin Machanavajjhala · Jayashree Kalpathy-Cramer · · Vitaly Feldman · Andrew Tomkins · Hai Phan · Hossein Esfandiari · Mimansa Jaiswal · Mrinank Sharma · Jeff Druce · Casey Meehan · Zhengli Zhao · Hsiang Hsu · Davis Railsback · Abraham Flaxman · · Julius Adebayo · Aleksandra Korolova · Jiaming Xu · Naoise Holohan · Samyadeep Basu · Matthew Joseph · My Thai · Xiaoqian Yang · Ellen Vitercik · Michael Hutchinson · Chenghong Wang · Gregory Yauney · Yuchao Tao · Chao Jin · Si Kai Lee · Audra McMillan · Rauf Izmailov · Jiayi Guo · Siddharth Swaroop · Tribhuvanesh Orekondy · Hadi Esmaeilzadeh · Kevin Procopio · Alkis Polyzotis · Jafar Mohammadi · Nitin Agrawal -
2019 Workshop: Privacy in Machine Learning (PriML) »
Borja Balle · Kamalika Chaudhuri · Antti Honkela · Antti Koskela · Casey Meehan · Mi Jung Park · Mary Anne Smart · Mary Anne Smart · Adrian Weller -
2019 Poster: The Label Complexity of Active Learning from Observational Data »
Songbai Yan · Kamalika Chaudhuri · Tara Javidi -
2019 Poster: Capacity Bounded Differential Privacy »
Kamalika Chaudhuri · Jacob Imola · Ashwin Machanavajjhala -
2018 : Invited talk 3: Challenges in the Privacy-Preserving Analysis of Structured Data »
Kamalika Chaudhuri -
2018 : Plenary Talk 2 »
Kamalika Chaudhuri -
2018 Workshop: Workshop on Security in Machine Learning »
Nicolas Papernot · Jacob Steinhardt · Matt Fredrikson · Kamalika Chaudhuri · Florian Tramer -
2017 : Analyzing Robustness of Nearest Neighbors to Adversarial Examples »
Kamalika Chaudhuri -
2017 Poster: Renyi Differential Privacy Mechanisms for Posterior Sampling »
Joseph Geumlek · Shuang Song · Kamalika Chaudhuri -
2017 Poster: Approximation and Convergence Properties of Generative Adversarial Learning »
Shuang Liu · Olivier Bousquet · Kamalika Chaudhuri -
2017 Spotlight: Approximation and Convergence Properties of Generative Adversarial Learning »
Shuang Liu · Olivier Bousquet · Kamalika Chaudhuri -
2017 Tutorial: Differentially Private Machine Learning: Theory, Algorithms and Applications »
Kamalika Chaudhuri · Anand D Sarwate -
2016 Poster: Active Learning from Imperfect Labelers »
Songbai Yan · Kamalika Chaudhuri · Tara Javidi -
2015 : Kamalika Chaudhuri »
Kamalika Chaudhuri -
2015 Workshop: Non-convex Optimization for Machine Learning: Theory and Practice »
Anima Anandkumar · Niranjan Uma Naresh · Kamalika Chaudhuri · Percy Liang · Sewoong Oh -
2015 Poster: Active Learning from Weak and Strong Labelers »
Chicheng Zhang · Kamalika Chaudhuri -
2015 Poster: Spectral Learning of Large Structured HMMs for Comparative Epigenomics »
Chicheng Zhang · Jimin Song · Kamalika Chaudhuri · Kevin Chen -
2015 Poster: Convergence Rates of Active Learning for Maximum Likelihood Estimation »
Kamalika Chaudhuri · Sham Kakade · Praneeth Netrapalli · Sujay Sanghavi -
2014 Poster: Beyond Disagreement-Based Agnostic Active Learning »
Chicheng Zhang · Kamalika Chaudhuri -
2014 Poster: Rates of Convergence for Nearest Neighbor Classification »
Kamalika Chaudhuri · Sanjoy Dasgupta -
2014 Spotlight: Beyond Disagreement-Based Agnostic Active Learning »
Chicheng Zhang · Kamalika Chaudhuri -
2014 Spotlight: Rates of Convergence for Nearest Neighbor Classification »
Kamalika Chaudhuri · Sanjoy Dasgupta -
2014 Poster: The Large Margin Mechanism for Differentially Private Maximization »
Kamalika Chaudhuri · Daniel Hsu · Shuang Song -
2013 Poster: Auditing: Active Learning with Outcome-Dependent Query Costs »
Sivan Sabato · Anand D Sarwate · Nati Srebro -
2013 Poster: A Stability-based Validation Procedure for Differentially Private Machine Learning »
Kamalika Chaudhuri · Staal A Vinterbo -
2011 Poster: Spectral Methods for Learning Multivariate Latent Tree Structure »
Anima Anandkumar · Kamalika Chaudhuri · Daniel Hsu · Sham M Kakade · Le Song · Tong Zhang -
2010 Poster: Rates of convergence for the cluster tree »
Kamalika Chaudhuri · Sanjoy Dasgupta -
2009 Poster: A Parameter-free Hedging Algorithm »
Kamalika Chaudhuri · Yoav Freund · Daniel Hsu -
2009 Poster: Semi-supervised Learning using Sparse Eigenfunction Bases »
Kaushik Sinha · Mikhail Belkin -
2008 Poster: Privacy-preserving logistic regression »
Kamalika Chaudhuri · Claire Monteleoni -
2007 Spotlight: The Value of Labeled and Unlabeled Examples when the Model is Imperfect »
Kaushik Sinha · Mikhail Belkin -
2007 Poster: The Value of Labeled and Unlabeled Examples when the Model is Imperfect »
Kaushik Sinha · Mikhail Belkin