A cluster tree provides an intuitive summary of a density function that reveals essential structure about the high-density clusters. The true cluster tree is estimated from a finite sample from an unknown true density. This paper addresses the basic question of quantifying our uncertainty by assessing the statistical significance of different features of an empirical cluster tree. We first study a variety of metrics that can be used to compare different trees, analyzing their properties and assessing their suitability for our inference task. We then propose methods to construct and summarize confidence sets for the unknown true cluster tree. We introduce a partial ordering on cluster trees which we use to prune some of the statistically insignificant features of the empirical tree, yielding interpretable and parsimonious cluster trees. Finally, we provide a variety of simulations to illustrate our proposed methods and furthermore demonstrate their utility in the analysis of a Graft-versus-Host Disease (GvHD) data set.
Author Information
Jisu Kim (Carnegie Mellon University)
Yen-Chi Chen (Carnegie Mellon University)
Sivaraman Balakrishnan (Carnegie Mellon University)
Alessandro Rinaldo (Carnegie Mellon University)
Larry Wasserman (Carnegie Mellon University)
More from the Same Authors
- 2022 Spotlight: Lightning Talks 1B-4
  Andrei Atanov · Shiqi Yang · Wanshan Li · Yongchang Hao · Ziquan Liu · Jiaxin Shi · Anton Plaksin · Jiaxiang Chen · Ziqi Pan · Yaxing Wang · Yuxin Liu · Stepan Martyanov · Alessandro Rinaldo · Yuhao Zhou · Li Niu · Qingyuan Yang · Andrei Filatov · Yi Xu · Liqing Zhang · Lili Mou · Ruomin Huang · Teresa Yeo · Kai Wang · Daren Wang · Jessica Hwang · Yuanhong Xu · Qi Qian · Hu Ding · Michalis Titsias · Shangling Jui · Ajay Sohmshetty · Lester Mackey · Joost van de Weijer · Hao Li · Amir Zamir · Xiangyang Ji · Antoni Chan · Rong Jin
- 2022 Spotlight: Detecting Abrupt Changes in Sequential Pairwise Comparison Data
  Wanshan Li · Alessandro Rinaldo · Daren Wang
- 2022 Poster: Detecting Abrupt Changes in Sequential Pairwise Comparison Data
  Wanshan Li · Alessandro Rinaldo · Daren Wang
- 2021 Poster: Lattice partition recovery with dyadic CART
  Oscar Hernan Madrid Padilla · Yi Yu · Alessandro Rinaldo
- 2020 Poster: A Unified View of Label Shift Estimation
  Saurabh Garg · Yifan Wu · Sivaraman Balakrishnan · Zachary Lipton
- 2020 Poster: PLLay: Efficient Topological Layer based on Persistent Landscapes
  Kwangho Kim · Jisu Kim · Manzil Zaheer · Joon Kim · Frederic Chazal · Larry Wasserman
- 2019 Poster: Statistical Analysis of Nearest Neighbor Methods for Anomaly Detection
  Xiaoyi Gu · Leman Akoglu · Alessandro Rinaldo
- 2019 Poster: Are sample means in multi-armed bandits positively or negatively biased?
  Jaehyeok Shin · Aaditya Ramdas · Alessandro Rinaldo
- 2019 Spotlight: Are sample means in multi-armed bandits positively or negatively biased?
  Jaehyeok Shin · Aaditya Ramdas · Alessandro Rinaldo
- 2017: Introduction to the R package TDA
  Jisu Kim
- 2017: Persistent homology of KDE filtration of Rips complexes
  Jaehyeok Shin · Alessandro Rinaldo
- 2017 Poster: A Sharp Error Analysis for the Fused Lasso, with Application to Approximate Changepoint Screening
  Kevin Lin · James Sharpnack · Alessandro Rinaldo · Ryan Tibshirani
- 2016 Poster: Local Maxima in the Likelihood of Gaussian Mixture Models: Structural Results and Algorithmic Consequences
  Chi Jin · Yuchen Zhang · Sivaraman Balakrishnan · Martin J Wainwright · Michael Jordan
- 2015 Poster: Optimal Ridge Detection using Coverage Risk
  Yen-Chi Chen · Christopher Genovese · Shirley Ho · Larry Wasserman
- 2015 Poster: Nonparametric von Mises Estimators for Entropies, Divergences and Mutual Informations
  Kirthevasan Kandasamy · Akshay Krishnamurthy · Barnabas Poczos · Larry Wasserman · James M Robins
- 2013 Poster: Cluster Trees on Manifolds
  Sivaraman Balakrishnan · Srivatsan Narayanan · Alessandro Rinaldo · Aarti Singh · Larry Wasserman
- 2012 Workshop: Algebraic Topology and Machine Learning
  Sivaraman Balakrishnan · Alessandro Rinaldo · Donald Sheehy · Aarti Singh · Larry Wasserman
- 2012 Workshop: Modern Nonparametric Methods in Machine Learning
  Sivaraman Balakrishnan · Arthur Gretton · Mladen Kolar · John Lafferty · Han Liu · Tong Zhang
- 2012 Poster: Optimal kernel choice for large-scale two-sample tests
  Arthur Gretton · Bharath Sriperumbudur · Dino Sejdinovic · Heiko Strathmann · Sivaraman Balakrishnan · Massimiliano Pontil · Kenji Fukumizu
- 2012 Poster: Exponential Concentration for Mutual Information Estimation with Application to Forests
  Han Liu · John Lafferty · Larry Wasserman
- 2011 Workshop: Philosophy and Machine Learning
  Marcello Pelillo · Joachim M Buhmann · Tiberio Caetano · Bernhard Schölkopf · Larry Wasserman
- 2011 Poster: Minimax Localization of Structural Information in Large Noisy Matrices
  Mladen Kolar · Sivaraman Balakrishnan · Alessandro Rinaldo · Aarti Singh
- 2011 Poster: Noise Thresholds for Spectral Clustering
  Sivaraman Balakrishnan · Min Xu · Akshay Krishnamurthy · Aarti Singh
- 2011 Spotlight: Noise Thresholds for Spectral Clustering
  Sivaraman Balakrishnan · Min Xu · Akshay Krishnamurthy · Aarti Singh
- 2011 Spotlight: Minimax Localization of Structural Information in Large Noisy Matrices
  Mladen Kolar · Sivaraman Balakrishnan · Alessandro Rinaldo · Aarti Singh
- 2010 Spotlight: Graph-Valued Regression
  Han Liu · Xi Chen · John Lafferty · Larry Wasserman
- 2010 Poster: Graph-Valued Regression
  Han Liu · Xi Chen · John Lafferty · Larry Wasserman
- 2010 Poster: Stability Approach to Regularization Selection (StARS) for High Dimensional Graphical Models
  Han Liu · Kathryn Roeder · Larry Wasserman
- 2008 Poster: Nonparametric regression and classification with joint sparsity constraints
  Han Liu · John Lafferty · Larry Wasserman
- 2008 Spotlight: Nonparametric regression and classification with joint sparsity constraints
  Han Liu · John Lafferty · Larry Wasserman
- 2007 Poster: SpAM: Sparse Additive Models
  Pradeep Ravikumar · Han Liu · John Lafferty · Larry Wasserman
- 2007 Spotlight: SpAM: Sparse Additive Models
  Pradeep Ravikumar · Han Liu · John Lafferty · Larry Wasserman
- 2007 Spotlight: Statistical Analysis of Semi-Supervised Regression
  John Lafferty · Larry Wasserman
- 2007 Poster: Statistical Analysis of Semi-Supervised Regression
  John Lafferty · Larry Wasserman
- 2007 Poster: Compressed Regression
  Shuheng Zhou · John Lafferty · Larry Wasserman