Timezone: »
While statistics and machine learning offers numerous methods for ensuring generalization, these methods often fail in the presence of post selection---the common practice in which the choice of analysis depends on previous interactions with the same dataset. A recent line of work has introduced powerful, general purpose algorithms that ensure a property called post hoc generalization (Cummings et al., COLT'16), which says that no person when given the output of the algorithm should be able to find any statistic for which the data differs significantly from the population it came from.
In this work we show several limitations on the power of algorithms satisfying post hoc generalization. First, we show a tight lower bound on the error of any algorithm that satisfies post hoc generalization and answers adaptively chosen statistical queries, showing a strong barrier to progress in post selection data analysis. Second, we show that post hoc generalization is not closed under composition, despite many examples of such algorithms exhibiting strong composition properties.
Author Information
Jonathan Ullman (Northeastern University)
Adam Smith (Boston University)
Kobbi Nissim (Georgetown University)
Uri Stemmer (Ben-Gurion University)
Thomas Steinke (IBM Research - Almaden)
Thomas Steinke is a Research Staff Member at the IBM Almaden Research Center. He has been at IBM since completing his PhD at Harvard University in 2016. His research focuses on data privacy (specifically, differential privacy) and its connections to machine learning (particularly, generalization when data is reused adaptively).
More from the Same Authors
-
2021 Spotlight: Covariance-Aware Private Mean Estimation Without Private Covariance Estimation »
Gavin Brown · Marco Gaboardi · Adam Smith · Jonathan Ullman · Lydia Zakynthinou -
2021 Poster: Covariance-Aware Private Mean Estimation Without Private Covariance Estimation »
Gavin Brown · Marco Gaboardi · Adam Smith · Jonathan Ullman · Lydia Zakynthinou -
2021 Poster: On the Sample Complexity of Privately Learning Axis-Aligned Rectangles »
Menachem Sadigurschi · Uri Stemmer -
2021 Poster: Differentially Private Multi-Armed Bandits in the Shuffle Model »
Jay Tenenbaum · Haim Kaplan · Yishay Mansour · Uri Stemmer -
2020 Poster: The Flajolet-Martin Sketch Itself Preserves Differential Privacy: Private Counting with Minimal Space »
Adam Smith · Shuang Song · Abhradeep Guha Thakurta -
2020 Poster: CoinPress: Practical Private Mean and Covariance Estimation »
Sourav Biswas · Yihe Dong · Gautam Kamath · Jonathan Ullman -
2020 Poster: Private Identity Testing for High-Dimensional Distributions »
Clément L Canonne · Gautam Kamath · Audra McMillan · Jonathan Ullman · Lydia Zakynthinou -
2020 Poster: Auditing Differentially Private Machine Learning: How Private is Private SGD? »
Matthew Jagielski · Jonathan Ullman · Alina Oprea -
2020 Poster: Adversarially Robust Streaming Algorithms via Differential Privacy »
Avinatan Hassidim · Haim Kaplan · Yishay Mansour · Yossi Matias · Uri Stemmer -
2020 Poster: Private Learning of Halfspaces: Simplifying the Construction and Reducing the Sample Complexity »
Haim Kaplan · Yishay Mansour · Uri Stemmer · Eliad Tsfadia -
2020 Spotlight: Private Identity Testing for High-Dimensional Distributions »
Clément L Canonne · Gautam Kamath · Audra McMillan · Jonathan Ullman · Lydia Zakynthinou -
2020 Oral: Adversarially Robust Streaming Algorithms via Differential Privacy »
Avinatan Hassidim · Haim Kaplan · Yishay Mansour · Yossi Matias · Uri Stemmer -
2019 Poster: Differentially Private Algorithms for Learning Mixtures of Separated Gaussians »
Gautam Kamath · Or Sheffet · Vikrant Singhal · Jonathan Ullman -
2019 Poster: Efficiently Estimating Erdos-Renyi Graphs with Node Differential Privacy »
Jonathan Ullman · Adam Sealfon -
2018 Poster: Local Differential Privacy for Evolving Data »
Matthew Joseph · Aaron Roth · Jonathan Ullman · Bo Waggoner -
2018 Poster: Differentially Private k-Means with Constant Multiplicative Error »
Uri Stemmer · Haim Kaplan -
2018 Spotlight: Differentially Private k-Means with Constant Multiplicative Error »
Uri Stemmer · Haim Kaplan -
2018 Spotlight: Local Differential Privacy for Evolving Data »
Matthew Joseph · Aaron Roth · Jonathan Ullman · Bo Waggoner -
2017 Poster: Practical Locally Private Heavy Hitters »
Raef Bassily · Kobbi Nissim · Uri Stemmer · Abhradeep Guha Thakurta -
2016 Poster: Privacy Odometers and Filters: Pay-as-you-Go Composition »
Ryan Rogers · Salil Vadhan · Aaron Roth · Jonathan Ullman