Timezone: »

Multiclass Boosting and the Cost of Weak Learning
Nataly Brukhim · Elad Hazan · Shay Moran · Indraneel Mukherjee · Robert E Schapire

Thu Dec 09 08:30 AM -- 10:00 AM (PST) @ None #None
Boosting is an algorithmic approach which is based on the idea of combining weak and moderately inaccurate hypotheses to a strong and accurate one. In this work we study multiclass boosting with a possibly large number of classes or categories. Multiclass boosting can be formulated in various ways. Here, we focus on an especially natural formulation in which the weak hypotheses are assumed to belong to an ''easy-to-learn'' base class, and the weak learner is an agnostic PAC learner for that class with respect to the standard classification loss. This is in contrast with other, more complicated losses as have often been considered in the past. The goal of the overall boosting algorithm is then to learn a combination of weak hypotheses by repeatedly calling the weak learner.We study the resources required for boosting, especially how theydepend on the number of classes $k$, for both the booster and weak learner.We find that the boosting algorithm itself only requires $O(\log k)$samples, as we show by analyzing a variant of AdaBoost for oursetting. In stark contrast, assuming typical limits on the number of weak-learner calls,we prove that the number of samples required by a weak learner is at least polynomial in $k$, exponentially more than thenumber of samples needed by the booster.Alternatively, we prove that the weak learner's accuracy parametermust be smaller than an inverse polynomial in $k$, showing that the returned weakhypotheses must be nearly the best in their class when $k$ is large.We also prove a trade-off between number of oracle calls and theresources required of the weak learner, meaning that the fewer calls to theweak learner the more that is demanded on each call.

Author Information

Nataly Brukhim (Princeton University)
Elad Hazan (Princeton University)
Shay Moran (Technion)
Indraneel Mukherjee (Princeton University)
Robert E Schapire (MIcrosoft Research)

Robert Schapire received his ScB in math and computer science from Brown University in 1986, and his SM (1988) and PhD (1991) from MIT under the supervision of Ronald Rivest. After a short post-doc at Harvard, he joined the technical staff at AT&T Labs (formerly AT&T Bell Laboratories) in 1991 where he remained for eleven years. At the end of 2002, he became a Professor of Computer Science at Princeton University. His awards include the 1991 ACM Doctoral Dissertation Award, the 2003 Gödel Prize and the 2004 Kanelakkis Theory and Practice Award (both of the last two with Yoav Freund). His main research interest is in theoretical and applied machine learning.

More from the Same Authors