A study of new algorithms that improve “online boosting” has won a Best Paper award at the world’s leading academic conference on machine learning.
In the paper Optimal and Adaptive Algorithms for Online Boosting, presented at the recent International Conference on Machine Learning (ICML), Alina Beygelzimer, Satyen Kale, and Haipeng Luo, researchers at Yahoo Labs and Princeton University, reported test results showing an average performance gain of more than 5% over existing boosting methods.
The findings build on more than a decade of machine learning research into boosting, a technique for producing more accurate predictions by combining many so-called "weak learners": simple prediction rules that may individually be unreliable, inaccurate, or only slightly better than random guessing. As long as each weak learner performs at least marginally better than chance, boosting algorithms can combine their outputs into a far more accurate predictor.
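To make that intuition concrete, the short Python snippet below (an illustration only, not from the paper; the predictor count and the 55% accuracy are arbitrary choices) simulates independent predictors that are each barely better than a coin flip and compares one predictor's accuracy with the accuracy of their majority vote. Real boosting algorithms go further by training and weighting the weak learners adaptively rather than assuming they vote independently.

```python
import random

random.seed(0)
N_WEAK = 101       # number of weak predictors (arbitrary)
P_CORRECT = 0.55   # each one is right just 55% of the time
TRIALS = 10_000    # number of simulated examples

single_hits = 0
vote_hits = 0
for _ in range(TRIALS):
    # Simulate whether each weak predictor gets this example right.
    votes = [random.random() < P_CORRECT for _ in range(N_WEAK)]
    single_hits += votes[0]               # track one weak predictor on its own
    vote_hits += sum(votes) > N_WEAK / 2  # track the majority vote

print(f"one weak predictor: {single_hits / TRIALS:.1%}")
print(f"majority of {N_WEAK}:    {vote_hits / TRIALS:.1%}")
```

Run over many simulated examples, the lone 55%-accurate predictor stays near 55%, while the majority vote of 101 such predictors is right far more often, which is the basic leverage that boosting methods exploit.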
Testing of the two new algorithm variants described in the study, OnlineBBM (optimal) and AdaBoost.OL (adaptive), was performed with Vowpal Wabbit (VW), the out-of-core learning system from Microsoft Research, on 13 publicly available benchmark datasets covering categories such as news and gaming (poker). Results showed that OnlineBBM made fewer errors than all other boosting algorithms on 10 of 12 datasets.
The study credited three factors for the improved performance:
- Weaker learning requirements. Works even when the weak learners are less accurate.
- No weighting requirements. Does not require “importance-weighted online learning.”
- Optimal utilization. No other online boosting algorithm achieves “the same error rate with fewer weak learners or examples asymptotically.”
OnlineBBM is suited to situations in which certain parameters, such as how much better than random guessing the weak learners are guaranteed to be, are known ahead of time. For cases in which such parameters remain unknown, the study introduced AdaBoost.OL, an adaptive variant of the classic AdaBoost algorithm that uses "online loss minimization" to achieve its results.
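To give a flavor of what an adaptive online-boosting loop looks like, here is a minimal Python sketch built around the "online loss minimization" idea: a handful of weak online learners vote on each arriving example, the booster's weights on those learners are nudged along the gradient of a logistic loss once the true label arrives, and the weak learners then update themselves. This is a simplified illustration, not the paper's OnlineBBM or AdaBoost.OL; the tiny perceptron weak learners, the learning rate, and all other constants are assumptions made for the example.

```python
import math
import random

class WeakPerceptron:
    """A deliberately weak online learner: a perceptron that sees only a few features."""
    def __init__(self, dim, n_feats, rng):
        self.feats = rng.sample(range(dim), n_feats)
        self.w = [0.0] * n_feats

    def predict(self, x):
        score = sum(wi * x[f] for wi, f in zip(self.w, self.feats))
        return 1 if score >= 0 else -1

    def update(self, x, y):
        if self.predict(x) != y:              # standard mistake-driven perceptron step
            for i, f in enumerate(self.feats):
                self.w[i] += y * x[f]

def online_boost(stream, dim, n_weak=10, lr=0.1, seed=0):
    """Run a simplified adaptive online-boosting loop over a stream of (x, y) pairs."""
    rng = random.Random(seed)
    learners = [WeakPerceptron(dim, 3, rng) for _ in range(n_weak)]
    alpha = [0.0] * n_weak                    # booster's weight on each weak learner
    mistakes = 0
    for x, y in stream:                       # y is +1 or -1
        preds = [wl.predict(x) for wl in learners]
        score = sum(a * p for a, p in zip(alpha, preds))
        if (1 if score >= 0 else -1) != y:
            mistakes += 1
        # Online loss minimization (simplified): step each alpha_i along the
        # gradient of the logistic loss log(1 + exp(-y * score)).
        z = max(-35.0, min(35.0, y * score))  # clamp to keep exp() well-behaved
        g = -y / (1.0 + math.exp(z))
        for i, p in enumerate(preds):
            alpha[i] -= lr * g * p
        for wl in learners:                   # every weak learner sees every example here
            wl.update(x, y)
    return mistakes

# Toy usage: a linearly separable stream of 2,000 examples in 20 dimensions.
rng = random.Random(1)
true_w = [rng.uniform(-1, 1) for _ in range(20)]

def stream():
    for _ in range(2000):
        x = [rng.uniform(-1, 1) for _ in range(20)]
        yield x, (1 if sum(w * xi for w, xi in zip(true_w, x)) >= 0 else -1)

print("online mistakes:", online_boost(stream(), dim=20))
```

One notable simplification: the loop above shows every example to every weak learner, whereas the algorithms in the paper also control how examples are routed to the weak learners (by sampling rather than importance weighting, as noted in the list above).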
The paper concludes: "Although the number of weak learners and excess loss for AdaBoost.OL are suboptimal, the adaptivity of AdaBoost.OL is an appealing feature and leads to good performance in experiments. The possibility of obtaining an algorithm that is both adaptive and optimal is left as an open question."
A total of 270 papers were accepted at the ICML conference, held July 6-11, 2015, in Lille, France.
—John Kaiser, Research News
For more computer science research news, visit ResearchNews.com.