Overview of New Developments in Boosting
I will give an overview of recent developments in boosting, focusing on three papers which take very different approaches towards making boosting more efficient and effective. Boosters iteratively choose base classifiers via a weak learner and then update a distribution over training examples. Roughly, the three papers show progress on the three issues implicit in this one-sentence description of boosting: the number of iterations required, the computational cost of choosing good base classifiers, and the time and space complexity from maintaining a distribution over training examples. Warmuth, Liao, and Ratsch (2006) propose TotalBoost, which is a "totally corrective" boosting algorithm. Intuitively, on each round, totally corrective boosters choose base classifiers which give more information not present in previously chosen base classifiers. This leads to fewer iterations and smaller final hypotheses. Barutcuoglu, Long, and Servedio (2007) describe an alternative model for boosting where assumptions about the diversity of base classifiers allow the booster to learn in a single pass over the set of base classifiers. This eliminates the need to optimize over all base classifiers on each round. Bradley and Schapire (2007) propose an algorithm called FilterBoost which trains on examples drawn from an oracle rather than a fixed set of examples. This alternative learning framework can model learning via a single pass over the set of training examples and allows the booster to train efficiently on very large datasets.
Author: Joseph K. Bradley, Machine Learning Department; School Of Computer Science; Carnegie Mellon University