Learning a decision tree from data is a difficult optimization problem. The most widespread algorithm in practice, dating to the 1980s, grows the tree greedily by recursively splitting nodes, possibly pruning back the final tree. The parameters (decision function) of each internal node are approximately estimated by minimizing an impurity measure. We give an algorithm that, given an input tree (its structure and the parameter values at its nodes), produces a new tree with the same or smaller structure but new parameter values that provably lower or leave unchanged the misclassification error. This applies to both axis-aligned and oblique trees, and our experiments show it consistently outperforms various other algorithms while scaling to large datasets and trees. Further, the same algorithm can handle a sparsity penalty, so it can learn sparse oblique trees, whose structure is a subset of the original tree and which have few nonzero parameters. This combines the best of axis-aligned and oblique trees: flexibility to model correlated data, low generalization error, fast inference, and interpretable nodes that involve only a few features in their decision.
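To make the node-wise alternation concrete, here is a minimal sketch for the simplest case, a depth-1 oblique tree (one linear split, two leaves) on binary-class data. This is an illustration of the general idea only, not the authors' implementation: with the split fixed, each leaf's optimal label is the majority class of the points reaching it; with the leaves fixed, the root reduces to a binary classification problem over "which child's leaf predicts each point's true class", and an L1 penalty on the split weights (here via scikit-learn's `LogisticRegression`) illustrates how a sparsity penalty yields sparse oblique nodes. All function names and the `C=10.0` setting are illustrative choices.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def tao_step(X, y, w, b, leaf_labels):
    """One alternating pass over a depth-1 oblique tree: leaves, then root."""
    # Leaves: with the split fixed, the best label at each leaf is the
    # majority class among the points that reach it.
    side = (X @ w + b > 0).astype(int)            # 0 = left leaf, 1 = right leaf
    for c in (0, 1):
        reached = y[side == c]
        if reached.size:
            leaf_labels[c] = int(np.bincount(reached).argmax())
    # Root: with the leaves fixed, the reduced problem is binary
    # classification -- each point's pseudo-label is the child whose leaf
    # predicts its true class (if both leaves agree, the split is irrelevant).
    if leaf_labels[0] != leaf_labels[1]:
        pseudo = (y == leaf_labels[1]).astype(int)
        clf = LogisticRegression(penalty="l1", solver="liblinear", C=10.0)
        clf.fit(X, pseudo)                        # L1 encourages a sparse split
        w, b = clf.coef_.ravel(), float(clf.intercept_[0])
    return w, b, leaf_labels

def predict(X, w, b, leaf_labels):
    side = (X @ w + b > 0).astype(int)
    return np.array([leaf_labels[s] for s in side])

# Toy data: two well-separated clusters; start from a poor initial split.
X = np.array([[0, 0], [1, 0], [0, 1], [1, 1],
              [4, 4], [5, 4], [4, 5], [5, 5]], dtype=float)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
w, b, leaves = np.array([0.0, 1.0]), -0.5, [0, 1]
for _ in range(3):
    w, b, leaves = tao_step(X, y, w, b, leaves)
```

Each pass changes parameters only when doing so does not increase the training error on the points involved, which is the source of the monotone-decrease guarantee described in the abstract; deeper trees repeat the same two reduced problems node by node.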
Author Information
- Miguel A. Carreira-Perpinan (University of California, Merced)
- Pooya Tavallali (UC Merced)
More from the Same Authors
- 2022 Poster: Semi-Supervised Learning with Decision Trees: Graph Laplacian Tree Alternating Optimization
  Arman Zharmagambetov · Miguel A. Carreira-Perpinan
- 2017: Poster Session 2
  Farhan Shafiq · Antonio Tomas Nevado Vilchez · Takato Yamada · Sakyasingha Dasgupta · Robin Geyer · Moin Nabi · Crefeda Rodrigues · Edoardo Manino · Alexantrou Serb · Miguel A. Carreira-Perpinan · Kar Wai Lim · Bryan Kian Hsiang Low · Rohit Pandey · Marie C White · Pavel Pidlypenskyi · Xue Wang · Christine Kaeser-Chen · Michael Zhu · Suyog Gupta · Sam Leroux
- 2017: Poster Session (encompasses coffee break)
  Beidi Chen · Borja Balle · Daniel Lee · Iuri Frosio · Jitendra Malik · Jan Kautz · Ke Li · Masashi Sugiyama · Miguel A. Carreira-Perpinan · Ramin Raziperchikolaei · Theja Tulabandhula · Yung-Kyun Noh · Adams Wei Yu
- 2016 Poster: An ensemble diversity approach to supervised binary hashing
  Miguel A. Carreira-Perpinan · Ramin Raziperchikolaei
- 2016 Poster: Optimizing affinity-based binary hashing using auxiliary coordinates
  Ramin Raziperchikolaei · Miguel A. Carreira-Perpinan
- 2015 Poster: A fast, universal algorithm to learn parametric nonlinear embeddings
  Miguel A. Carreira-Perpinan · Max Vladymyrov
- 2011 Poster: A Denoising View of Matrix Completion
  Weiran Wang · Miguel A. Carreira-Perpinan · Zhengdong Lu
- 2007 Poster: People Tracking with the Laplacian Eigenmaps Latent Variable Model
  Zhengdong Lu · Miguel A. Carreira-Perpinan · Cristian Sminchisescu