in

Workshop: OPT 2021: Optimization for Machine Learning

Abstract:

Boosting is a standard framework for learning a large-margin sparse ensemble of base hypotheses, where the algorithm assumes an oracle called the base learner, which returns a base hypothesis h with maximum edge Ei[yih(xi)] with respect to a given distribution over the sample ((x1, y1), . . . , (xm, ym)). In particular, for the l1-norm regularized soft margin optimization problem, there are several boosting algorithms that have theoretical iteration bound for finding ε- approximate solutions. They are not as fast as classical LPBoost by Demiriz et al., which has no non-trivial iteration bound. In this paper, we propose a new boosting scheme, where we assume a special base learner, which returns the average margin distribution vector Eh[(yih(xi))mi=1] with respect to a certain distribution over the base hypothesis class H. Under this scheme, we pro- pose a boosting algorithm whose iteration bound is O((r ln m)/ε2) and running time is O(m ln m) per iteration, where r is the VC dimension of H. Moreover, we also propose an efficient implementation for the new base learner, given that a relevant sub-class H|S of H, is represented by a non-deterministic version of the Zero-suppressed Decision Diagram (NZDD), where NZDD is a data structure for representing a family of sets succinctly.