Timezone: »
Poster
Label Robust and Differentially Private Linear Regression: Computational and Statistical Efficiency
Xiyang Liu · Prateek Jain · Weihao Kong · Sewoong Oh · Arun Suggala
We study the canonical problem of linear regression under $(\varepsilon,\delta)$-differential privacy when the datapoints are sampled i.i.d.~from a distribution and a fraction of response variables are adversarially corrupted. We provide the first provably efficient -- both computationally and statistically -- method for this problem, assuming standard assumptions on the data distribution. Our algorithm is a variant of the popular differentially private stochastic gradient descent (DP-SGD) algorithm with two key innovations: a full-batch gradient descent to improve sample complexity and a novel adaptive clipping to guarantee robustness. Our method requires only linear time in input size, and still matches the information theoretical optimal sample complexity up to a data distribution dependent condition number factor. Interestingly, the same algorithm, when applied to a setting where there is no adversarial corruption, still improves upon the existing state-of-the-art and achieves a near optimal sample complexity.
Author Information
Xiyang Liu (University of Washington)
Prateek Jain (Google Research)
Weihao Kong (Google Research)
Sewoong Oh (University of Washington)
Arun Suggala (Google)
More from the Same Authors
-
2021 Spotlight: Near-optimal Offline and Streaming Algorithms for Learning Non-Linear Dynamical Systems »
Suhas Kowshik · Dheeraj Nagaraj · Prateek Jain · Praneeth Netrapalli -
2021 Spotlight: Differentially Private Model Personalization »
Prateek Jain · John Rush · Adam Smith · Shuang Song · Abhradeep Guha Thakurta -
2021 : Gradient flows on graphons: existence, convergence, continuity equations »
Sewoong Oh · Soumik Pal · Raghav Somani · Raghav Tripathi -
2022 : MET: Masked Encoding for Tabular Data »
Kushal Majmundar · Sachin Goyal · Praneeth Netrapalli · Prateek Jain -
2022 : Learning an Invertible Output Mapping Can Mitigate Simplicity Bias in Neural Networks »
Sravanti Addepalli · Anshul Nasery · Venkatesh Babu R · Praneeth Netrapalli · Prateek Jain -
2022 : Few-shot Backdoor Attacks via Neural Tangent Kernels »
Jonathan Hayase · Sewoong Oh -
2023 : DPZero: Dimension-Independent and Differentially Private Zeroth-Order Optimization »
Liang Zhang · Kiran Thekumparampil · Sewoong Oh · Niao He -
2023 : One-shot Empirical Privacy Estimation for Federated Learning »
Galen Andrew · Peter Kairouz · Sewoong Oh · Alina Oprea · H. Brendan McMahan · Vinith Suriyakumar -
2023 : MatFormer: Nested Transformer for Elastic Inference »
Fnu Devvrit · Sneha Kudugunta · Aditya Kusupati · Tim Dettmers · Kaifeng Chen · Inderjit Dhillon · Yulia Tsvetkov · Hannaneh Hajishirzi · Sham Kakade · Ali Farhadi · Prateek Jain -
2023 Poster: Label Poisoning is All You Need »
Rishi Jha · Jonathan Hayase · Sewoong Oh -
2023 Poster: Unleashing the Power of Randomization in Auditing Differentially Private ML »
Krishna Pillutla · Galen Andrew · Peter Kairouz · H. Brendan McMahan · Alina Oprea · Sewoong Oh -
2023 Poster: DataComp: In search of the next generation of multimodal datasets »
Samir Yitzhak Gadre · Gabriel Ilharco · Alex Fang · Jonathan Hayase · Georgios Smyrnis · Thao Nguyen · Ryan Marten · Mitchell Wortsman · Dhruba Ghosh · Jieyu Zhang · Eyal Orgad · Rahim Entezari · Giannis Daras · Sarah Pratt · Vivek Ramanujan · Yonatan Bitton · Kalyani Marathe · Stephen Mussmann · Richard Vencu · Mehdi Cherti · Ranjay Krishna · Pang Wei Koh · Olga Saukh · Alexander Ratner · Shuran Song · Hannaneh Hajishirzi · Ali Farhadi · Romain Beaumont · Sewoong Oh · Alex Dimakis · Jenia Jitsev · Yair Carmon · Vaishaal Shankar · Ludwig Schmidt -
2023 Oral: DataComp: In search of the next generation of multimodal datasets »
Samir Yitzhak Gadre · Gabriel Ilharco · Alex Fang · Jonathan Hayase · Georgios Smyrnis · Thao Nguyen · Ryan Marten · Mitchell Wortsman · Dhruba Ghosh · Jieyu Zhang · Eyal Orgad · Rahim Entezari · Giannis Daras · Sarah Pratt · Vivek Ramanujan · Yonatan Bitton · Kalyani Marathe · Stephen Mussmann · Richard Vencu · Mehdi Cherti · Ranjay Krishna · Pang Wei Koh · Olga Saukh · Alexander Ratner · Shuran Song · Hannaneh Hajishirzi · Ali Farhadi · Romain Beaumont · Sewoong Oh · Alex Dimakis · Jenia Jitsev · Yair Carmon · Vaishaal Shankar · Ludwig Schmidt -
2023 Poster: Private (Stochastic) Non-Convex Optimization Revisited: Second-Order Stationary Points and Excess Risks »
Daogao Liu · Arun Ganesh · Sewoong Oh · Abhradeep Guha Thakurta -
2023 Poster: AdANNS: A Framework for Adaptive Semantic Search »
Aniket Rege · Aditya Kusupati · Sharan Ranjit S · Alan Fan · Qingqing Cao · Sham Kakade · Prateek Jain · Ali Farhadi -
2023 Poster: Responsible AI (RAI) Games and Ensembles »
Yash Gupta · Runtian Zhai · Arun Suggala · Pradeep Ravikumar -
2023 Poster: Simplicity Bias in 1-Hidden Layer Neural Networks »
Depen Morwani · Jatin Batra · Prateek Jain · Praneeth Netrapalli -
2023 Poster: Blocked Collaborative Bandits: Online Collaborative Filtering with Per-Item Budget Constraints »
Soumyabrata Pal · Arun Suggala · Karthikeyan Shanmugam · Prateek Jain -
2023 Poster: On the Connection between Pre-training Data Diversity and Fine-tuning Robustness »
Vivek Ramanujan · Thao Nguyen · Sewoong Oh · Ali Farhadi · Ludwig Schmidt -
2023 Poster: Improving multimodal datasets with image captioning »
Thao Nguyen · Samir Yitzhak Gadre · Gabriel Ilharco · Sewoong Oh · Ludwig Schmidt -
2022 Poster: Trimmed Maximum Likelihood Estimation for Robust Generalized Linear Model »
Pranjal Awasthi · Abhimanyu Das · Weihao Kong · Rajat Sen -
2022 Poster: DP-PCA: Statistically Optimal and Differentially Private PCA »
Xiyang Liu · Weihao Kong · Prateek Jain · Sewoong Oh -
2022 Poster: Zonotope Domains for Lagrangian Neural Network Verification »
Matt Jordan · Jonathan Hayase · Alex Dimakis · Sewoong Oh -
2022 Poster: S3GC: Scalable Self-Supervised Graph Clustering »
Fnu Devvrit · Aditya Sinha · Inderjit Dhillon · Prateek Jain -
2022 Poster: Reproducibility in Optimization: Theoretical Framework and Limits »
Kwangjun Ahn · Prateek Jain · Ziwei Ji · Satyen Kale · Praneeth Netrapalli · Gil I Shamir -
2022 Poster: Quality Not Quantity: On the Interaction between Dataset Design and Robustness of CLIP »
Thao Nguyen · Gabriel Ilharco · Mitchell Wortsman · Sewoong Oh · Ludwig Schmidt -
2022 Poster: Bring Your Own Algorithm for Optimal Differentially Private Stochastic Minimax Optimization »
Liang Zhang · Kiran Thekumparampil · Sewoong Oh · Niao He -
2022 Poster: Matryoshka Representation Learning »
Aditya Kusupati · Gantavya Bhatt · Aniket Rege · Matthew Wallingford · Aditya Sinha · Vivek Ramanujan · William Howard-Snyder · Kaifeng Chen · Sham Kakade · Prateek Jain · Ali Farhadi -
2021 Poster: Divergence Frontiers for Generative Models: Sample Complexity, Quantization Effects, and Frontier Integrals »
Lang Liu · Krishna Pillutla · Sean Welleck · Sewoong Oh · Yejin Choi · Zaid Harchaoui -
2021 Poster: Robust and differentially private mean estimation »
Xiyang Liu · Weihao Kong · Sham Kakade · Sewoong Oh -
2021 Poster: Differentially Private Model Personalization »
Prateek Jain · John Rush · Adam Smith · Shuang Song · Abhradeep Guha Thakurta -
2021 Poster: Streaming Linear System Identification with Reverse Experience Replay »
Suhas Kowshik · Dheeraj Nagaraj · Prateek Jain · Praneeth Netrapalli -
2021 Poster: Gradient Inversion with Generative Image Prior »
Jinwoo Jeon · jaechang Kim · Kangwook Lee · Sewoong Oh · Jungseul Ok -
2021 Poster: LLC: Accurate, Multi-purpose Learnt Low-dimensional Binary Codes »
Aditya Kusupati · Matthew Wallingford · Vivek Ramanujan · Raghav Somani · Jae Sung Park · Krishna Pillutla · Prateek Jain · Sham Kakade · Ali Farhadi -
2021 Poster: Do Input Gradients Highlight Discriminative Features? »
Harshay Shah · Prateek Jain · Praneeth Netrapalli -
2021 Poster: Near-optimal Offline and Streaming Algorithms for Learning Non-Linear Dynamical Systems »
Suhas Kowshik · Dheeraj Nagaraj · Prateek Jain · Praneeth Netrapalli -
2021 Poster: Statistically and Computationally Efficient Linear Meta-representation Learning »
Kiran Thekumparampil · Prateek Jain · Praneeth Netrapalli · Sewoong Oh -
2020 Poster: Projection Efficient Subgradient Method and Optimal Nonsmooth Frank-Wolfe Method »
Kiran Thekumparampil · Prateek Jain · Praneeth Netrapalli · Sewoong Oh -
2020 Spotlight: Projection Efficient Subgradient Method and Optimal Nonsmooth Frank-Wolfe Method »
Kiran Thekumparampil · Prateek Jain · Praneeth Netrapalli · Sewoong Oh -
2020 Poster: RNNPool: Efficient Non-linear Pooling for RAM Constrained Inference »
Oindrila Saha · Aditya Kusupati · Harsha Vardhan Simhadri · Manik Varma · Prateek Jain -
2020 Poster: Robust Meta-learning for Mixed Linear Regression with Small Batches »
Weihao Kong · Raghav Somani · Sham Kakade · Sewoong Oh -
2020 Spotlight: RNNPool: Efficient Non-linear Pooling for RAM Constrained Inference »
Oindrila Saha · Aditya Kusupati · Harsha Vardhan Simhadri · Manik Varma · Prateek Jain -
2020 Poster: The Pitfalls of Simplicity Bias in Neural Networks »
Harshay Shah · Kaustav Tamuly · Aditi Raghunathan · Prateek Jain · Praneeth Netrapalli -
2020 Poster: Least Squares Regression with Markovian Data: Fundamental Limits and Algorithms »
Dheeraj Nagaraj · Xian Wu · Guy Bresler · Prateek Jain · Praneeth Netrapalli -
2020 Spotlight: Least Squares Regression with Markovian Data: Fundamental Limits and Algorithms »
Dheeraj Nagaraj · Xian Wu · Guy Bresler · Prateek Jain · Praneeth Netrapalli -
2019 Poster: Efficient Algorithms for Smooth Minimax Optimization »
Kiran Thekumparampil · Prateek Jain · Praneeth Netrapalli · Sewoong Oh -
2019 Poster: Turbo Autoencoder: Deep learning based channel codes for point-to-point communication channels »
Yihan Jiang · Hyeji Kim · Himanshu Asnani · Sreeram Kannan · Sewoong Oh · Pramod Viswanath -
2019 Poster: Minimax Optimal Estimation of Approximate Differential Privacy on Neighboring Databases »
Xiyang Liu · Sewoong Oh -
2018 Poster: Estimating Learnability in the Sublinear Data Regime »
Weihao Kong · Gregory Valiant -
2018 Poster: Deepcode: Feedback Codes via Deep Learning »
Hyeji Kim · Yihan Jiang · Sreeram Kannan · Sewoong Oh · Pramod Viswanath -
2018 Poster: Robustness of conditional GANs to noisy labels »
Kiran Thekumparampil · Ashish Khetan · Zinan Lin · Sewoong Oh -
2018 Spotlight: Robustness of conditional GANs to noisy labels »
Kiran Thekumparampil · Ashish Khetan · Zinan Lin · Sewoong Oh -
2018 Poster: PacGAN: The power of two samples in generative adversarial networks »
Zinan Lin · Ashish Khetan · Giulia Fanti · Sewoong Oh -
2017 : New perspective from Blackwell's "comparisons of experiments" on generative adversarial networks »
Sewoong Oh -
2017 Poster: Optimal Sample Complexity of M-wise Data for Top-K Ranking »
Minje Jang · Sunghyun Kim · Changho Suh · Sewoong Oh -
2017 Poster: Learning Populations of Parameters »
Kevin Tian · Weihao Kong · Gregory Valiant -
2017 Poster: Estimating Mutual Information for Discrete-Continuous Mixtures »
Weihao Gao · Sreeram Kannan · Sewoong Oh · Pramod Viswanath -
2017 Poster: Matrix Norm Estimation from a Few Entries »
Ashish Khetan · Sewoong Oh -
2017 Spotlight: Estimating Mutual Information for Discrete-Continuous Mixtures »
Weihao Gao · Sreeram Kannan · Sewoong Oh · Pramod Viswanath -
2017 Spotlight: Matrix Norm Estimation from a Few Entries »
Ashish Khetan · Sewoong Oh -
2017 Poster: Discovering Potential Correlations via Hypercontractivity »
Hyeji Kim · Weihao Gao · Sreeram Kannan · Sewoong Oh · Pramod Viswanath -
2016 : Sewoong Oh: "The Minimax Rate for Adaptive Crowdsourcing" »
Sewoong Oh -
2016 Poster: Breaking the Bandwidth Barrier: Geometrical Adaptive Entropy Estimation »
Weihao Gao · Sewoong Oh · Pramod Viswanath -
2016 Poster: Computational and Statistical Tradeoffs in Learning to Rank »
Ashish Khetan · Sewoong Oh -
2016 Poster: Achieving budget-optimality with adaptive schemes in crowdsourcing »
Ashish Khetan · Sewoong Oh -
2015 Workshop: Non-convex Optimization for Machine Learning: Theory and Practice »
Anima Anandkumar · Niranjan Uma Naresh · Kamalika Chaudhuri · Percy Liang · Sewoong Oh -
2015 Poster: Secure Multi-party Differential Privacy »
Peter Kairouz · Sewoong Oh · Pramod Viswanath -
2015 Poster: Collaboratively Learning Preferences from Ordinal Data »
Sewoong Oh · Kiran Thekumparampil · Jiaming Xu -
2014 Workshop: Analysis of Rank Data: Confluence of Social Choice, Operations Research, and Machine Learning »
Shivani Agarwal · Hossein Azari Soufiani · Guy Bresler · Sewoong Oh · David Parkes · Arun Rajkumar · Devavrat Shah -
2014 Poster: Provable Tensor Factorization with Missing Data »
Prateek Jain · Sewoong Oh -
2014 Poster: Extremal Mechanisms for Local Differential Privacy »
Peter Kairouz · Sewoong Oh · Pramod Viswanath -
2014 Poster: Minimax-optimal Inference from Partial Rankings »
Bruce Hajek · Sewoong Oh · Jiaming Xu -
2014 Poster: Learning Mixed Multinomial Logit Model from Ordinal Data »
Sewoong Oh · Devavrat Shah -
2012 Poster: Iterative ranking from pair-wise comparisons »
Sahand N Negahban · Sewoong Oh · Devavrat Shah -
2012 Spotlight: Iterative ranking from pair-wise comparisons »
Sahand N Negahban · Sewoong Oh · Devavrat Shah -
2011 Poster: Iterative Learning for Reliable Crowdsourcing Systems »
David R Karger · Sewoong Oh · Devavrat Shah -
2011 Oral: Iterative Learning for Reliable Crowdsourcing Systems »
David R Karger · Sewoong Oh · Devavrat Shah -
2009 Poster: Matrix Completion from Noisy Entries »
Raghunandan Keshavan · Andrea Montanari · Sewoong Oh