Poster
Estimating Entropy of Distributions in Constant Space
Jayadev Acharya · Sourbh Bhadane · Piotr Indyk · Ziteng Sun
Thu Dec 12 10:45 AM -- 12:45 PM (PST) @ East Exhibition Hall B + C #237
We consider the task of estimating the entropy of $k$-ary distributions from samples in the streaming model, where space is limited. Our main contribution is an algorithm that requires $O\left(\frac{k \log^2 (1/\varepsilon)}{\varepsilon^3}\right)$ samples and only $O(1)$ words of memory, and outputs a $\pm\varepsilon$ estimate of $H(p)$. Without space limitations, the sample complexity has been established as $S(k,\varepsilon)=\Theta\left(\frac k{\varepsilon\log k}+\frac{\log^2 k}{\varepsilon^2}\right)$, which is sub-linear in the domain size $k$. However, the current algorithms that achieve this optimal sample complexity also require space that is nearly linear in $k$.
Our algorithm partitions $[0,1]$ into intervals and estimates the entropy contribution of the probability values that fall in each interval. The intervals are designed to trade off bias against variance.
Distribution property estimation and testing with limited memory is a largely unexplored research area. We hope our work will motivate research in this field.
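To make the constant-space setting concrete, here is a minimal Python sketch of a simple baseline estimator. It is not the interval-partitioning algorithm described in the abstract. It relies only on the identity $H(p)=\mathbb{E}_{X\sim p}[\log(1/p(X))]$: draw a symbol from the stream, estimate its probability from the next `window` samples, and average the resulting log terms. The function names, the `rounds` and `window` parameters, and the add-one smoothing are all assumptions made for illustration.

```python
import math
import random
from typing import Iterator, Sequence


def constant_space_entropy(stream: Iterator[int], rounds: int, window: int) -> float:
    """Estimate H(p) in nats while storing only a handful of scalars.

    State kept at any time: the current symbol, a hit counter, a running
    sum, and loop indices, independent of the domain size k.
    """
    total = 0.0
    for _ in range(rounds):
        x = next(stream)                 # x is drawn with probability p(x)
        hits = 0
        for _ in range(window):          # count how often x recurs in fresh samples
            if next(stream) == x:
                hits += 1
        # Add-one smoothing avoids log(0); this choice is an assumption
        # of the sketch, not taken from the paper.
        p_hat = (hits + 1) / (window + 1)
        total += math.log(1.0 / p_hat)
    return total / rounds


def sample_stream(probs: Sequence[float], seed: int = 0) -> Iterator[int]:
    """Hypothetical helper: an endless i.i.d. stream of symbols from probs."""
    rng = random.Random(seed)
    symbols = range(len(probs))
    while True:
        yield rng.choices(symbols, weights=probs)[0]
```

For example, `constant_space_entropy(sample_stream([1/8] * 8), rounds=1000, window=200)` should land near $\log 8$ nats. The estimate carries a bias that depends on `window` and a variance that depends on `rounds`, which is the kind of bias-variance trade-off that the interval design in the abstract is meant to control more carefully.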
Author Information
Jayadev Acharya (Cornell University)
Sourbh Bhadane (Cornell University)
Piotr Indyk (MIT)
Ziteng Sun (Cornell University)
More from the Same Authors
-
2022 : Hidden Poison: Machine Unlearning Enables Camouflaged Poisoning Attacks »
Jimmy Di · Jack Douglas · Jayadev Acharya · Gautam Kamath · Ayush Sekhari -
2022 : Hidden Poison: Machine unlearning enables camouflaged poisoning attacks »
Jimmy Di · Jack Douglas · Jayadev Acharya · Gautam Kamath · Ayush Sekhari -
2023 Poster: Worst-case Performance of Popular Approximate Nearest Neighbor Search Implementations: Guarantees and Limitations »
Piotr Indyk · Haike Xu -
2023 Poster: Differentially Private Approximate Near Neighbor Counting in High Dimensions »
Alexandr Andoni · Piotr Indyk · Sepideh Mahabadi · Shyam Narayanan -
2023 Poster: SpecTr: Fast Speculative Decoding via Optimal Transport »
Ziteng Sun · Ananda Theertha Suresh · Jae Hun Ro · Ahmad Beirami · Himanshu Jain · Felix Yu -
2023 Poster: Near-Linear Time Algorithm for the Chamfer Distance »
Ainesh Bakshi · Piotr Indyk · Rajesh Jayaram · Sandeep Silwal · Erik Waingarten -
2023 Poster: Hidden Poison: Machine Unlearning Enables Camouflaged Poisoning Attacks »
Jimmy Di · Jack Douglas · Jayadev Acharya · Gautam Kamath · Ayush Sekhari -
2023 Poster: Unified Lower Bounds for Interactive High-dimensional Estimation under Information Constraints »
Jayadev Acharya · Clément L Canonne · Ziteng Sun · Himanshu Tyagi -
2022 Poster: Faster Linear Algebra for Distance Matrices »
Piotr Indyk · Sandeep Silwal -
2022 Poster: (Optimal) Online Bipartite Matching with Degree Information »
Anders Aamand · Justin Chen · Piotr Indyk -
2022 Poster: Exponentially Improving the Complexity of Simulating the Weisfeiler-Lehman Test with Graph Neural Networks »
Anders Aamand · Justin Chen · Piotr Indyk · Shyam Narayanan · Ronitt Rubinfeld · Nicholas Schiefer · Sandeep Silwal · Tal Wagner -
2021 Poster: Distributed Estimation with Multiple Samples per User: Sharp Rates and Phase Transition »
Jayadev Acharya · Clement Canonne · Yuhan Liu · Ziteng Sun · Himanshu Tyagi -
2021 Poster: Information-constrained optimization: can adaptive processing of gradients help? »
Jayadev Acharya · Clement Canonne · Prathamesh Mayekar · Himanshu Tyagi -
2021 Poster: Learning with User-Level Privacy »
Daniel Levy · Ziteng Sun · Kareem Amin · Satyen Kale · Alex Kulesza · Mehryar Mohri · Ananda Theertha Suresh -
2021 Poster: Remember What You Want to Forget: Algorithms for Machine Unlearning »
Ayush Sekhari · Jayadev Acharya · Gautam Kamath · Ananda Theertha Suresh -
2021 Poster: Few-Shot Data-Driven Algorithms for Low Rank Approximation »
Piotr Indyk · Tal Wagner · David Woodruff -
2021 Poster: Optimal Rates for Nonparametric Density Estimation under Communication Constraints »
Jayadev Acharya · Clement Canonne · Aditya Vikram Singh · Himanshu Tyagi -
2019 : Poster Session »
Lili Yu · Aleksei Kroshnin · Alex Delalande · Andrew Carr · Anthony Tompkins · Aram-Alexandre Pooladian · Arnaud Robert · Ashok Vardhan Makkuva · Aude Genevay · Bangjie Liu · Bo Zeng · Charlie Frogner · Elsa Cazelles · Esteban G Tabak · Fabio Ramos · François-Pierre PATY · Georgios Balikas · Giulio Trigila · Hao Wang · Hinrich Mahler · Jared Nielsen · Karim Lounici · Kyle Swanson · Mukul Bhutani · Pierre Bréchet · Piotr Indyk · samuel cohen · Stefanie Jegelka · Tao Wu · Thibault Sejourne · Tudor Manole · Wenjun Zhao · Wenlin Wang · Wenqi Wang · Yonatan Dukler · Zihao Wang · Chaosheng Dong -
2019 : Poster Session »
Jonathan Scarlett · Piotr Indyk · Ali Vakilian · Adrian Weller · Partha P Mitra · Benjamin Aubin · Bruno Loureiro · Florent Krzakala · Lenka Zdeborová · Kristina Monakhova · Joshua Yurtsever · Laura Waller · Hendrik Sommerhoff · Michael Moeller · Rushil Anirudh · Shuang Qiu · Xiaohan Wei · Zhuoran Yang · Jayaraman Thiagarajan · Salman Asif · Michael Gillhofer · Johannes Brandstetter · Sepp Hochreiter · Felix Petersen · Dhruv Patel · Assad Oberai · Akshay Kamath · Sushrut Karmalkar · Eric Price · Ali Ahmed · Zahra Kadkhodaie · Sreyas Mohan · Eero Simoncelli · Carlos Fernandez-Granda · Oscar Leong · Wesam Sakla · Rebecca Willett · Stephan Hoyer · Jascha Sohl-Dickstein · Sam Greydanus · Gauri Jagatap · Chinmay Hegde · Michael Kellman · Jonathan Tamir · Nouamane Laanait · Ousmane Dia · Mirco Ravanelli · Jonathan Binas · Negar Rostamzadeh · Shirin Jalali · Tiantian Fang · Alex Schwing · Sébastien Lachapelle · Philippe Brouillard · Tristan Deleu · Simon Lacoste-Julien · Stella Yu · Arya Mazumdar · Ankit Singh Rawat · Yue Zhao · Jianshu Chen · Xiaoyang Li · Hubert Ramsauer · Gabrio Rizzuti · Nikolaos Mitsakos · Dingzhou Cao · Thomas Strohmer · Yang Li · Pei Peng · Gregory Ongie -
2019 : Learning-Based Low-Rank Approximations »
Piotr Indyk -
2019 Poster: Learning-Based Low-Rank Approximations »
Piotr Indyk · Ali Vakilian · Yang Yuan -
2019 Poster: Space and Time Efficient Kernel Density Estimation in High Dimensions »
Arturs Backurs · Piotr Indyk · Tal Wagner -
2018 Poster: Learning and Testing Causal Models with Interventions »
Jayadev Acharya · Arnab Bhattacharyya · Constantinos Daskalakis · Saravanan Kandasamy -
2018 Poster: Differentially Private Testing of Identity and Closeness of Discrete Distributions »
Jayadev Acharya · Ziteng Sun · Huanyu Zhang -
2018 Spotlight: Differentially Private Testing of Identity and Closeness of Discrete Distributions »
Jayadev Acharya · Ziteng Sun · Huanyu Zhang -
2017 : Data-dependent methods for similarity search in high dimensions »
Piotr Indyk -
2017 Poster: Practical Data-Dependent Metric Compression with Provable Guarantees »
Piotr Indyk · Ilya Razenshteyn · Tal Wagner -
2017 Poster: On the Fine-Grained Complexity of Empirical Risk Minimization: Kernel Methods and Neural Networks »
Arturs Backurs · Piotr Indyk · Ludwig Schmidt -
2016 Poster: Fast recovery from a union of subspaces »
Chinmay Hegde · Piotr Indyk · Ludwig Schmidt -
2015 Poster: Practical and Optimal LSH for Angular Distance »
Alexandr Andoni · Piotr Indyk · Thijs Laarhoven · Ilya Razenshteyn · Ludwig Schmidt -
2014 Workshop: Optimal Transport and Machine Learning »
Marco Cuturi · Gabriel Peyré · Justin Solomon · Alexander Barvinok · Piotr Indyk · Robert McCann · Adam Oberman