Timezone: »
Poster
NearOptimal Entrywise Sampling for Data Matrices
Dimitris Achlioptas · Zohar Karnin · Edo Liberty
Fri Dec 06 07:00 PM  11:59 PM (PST) @ Harrah's Special Events Center, 2nd Floor
We consider the problem of independently sampling $s$ nonzero entries of a matrix $A$ in order to produce a sparse sketch of it, $B$, that minimizes $\AB\_2$. For large $m \times n$ matrices, such that $n \gg m$ (for example, representing $n$ observations over $m$ attributes) we give distributions exhibiting four important properties. First, they have closed forms for the probability of sampling each item which are computable from minimal information regarding $A$. Second, they allow sketching of matrices whose nonzeros are presented to the algorithm in arbitrary order as a stream, with $O(1)$ computation per nonzero. Third, the resulting sketch matrices are not only sparse, but their nonzero entries are highly compressible. Lastly, and most importantly, under mild assumptions, our distributions are provably competitive with the optimal offline distribution. Note that the probabilities in the optimal offline distribution may be complex functions of all the entries in the matrix. Therefore, regardless of computational complexity, the optimal distribution might be impossible to compute in the streaming model.
Author Information
Dimitris Achlioptas (UC Santa Cruz)
Zohar Karnin (Yahoo Research)
Edo Liberty (Yahoo! Research)
More from the Same Authors

2016 Poster: Multiarmed Bandits: Competing with Optimal Sequences »
Zohar Karnin · Oren Anava 
2016 Poster: Verification Based Solution for Structured MAB Problems »
Zohar Karnin 
2015 Poster: Copeland Dueling Bandits »
Masrour Zoghi · Zohar Karnin · Shimon Whiteson · Maarten de Rijke 
2013 Poster: Distributed Exploration in MultiArmed Bandits »
Eshcar Hillel · Zohar Karnin · Tomer Koren · Ronny Lempel · Oren Somekh 
2013 Spotlight: Distributed Exploration in MultiArmed Bandits »
Eshcar Hillel · Zohar Karnin · Tomer Koren · Ronny Lempel · Oren Somekh