Poster
Stabilizing Linear Passive-Aggressive Online Learning with Weighted Reservoir Sampling
Skyler Wu · Fred Lu · Edward Raff · James Holt
West Ballroom A-D #5601
Abstract:
Online learning methods, like the seminal Passive-Aggressive (PA) classifier, are still highly effective for high-dimensional streaming data, out-of-core processing, and other throughput-sensitive applications. Many such algorithms rely on fast adaptation to individual errors as a key to their convergence. While such algorithms enjoy low theoretical risk bounds, in real-world deployment they can be sensitive to individual outliers that cause the algorithm to over-correct. When such outliers occur at the end of the data stream, this can cause the final solution to have unexpectedly low accuracy. We design a weighted reservoir sampling (WRS) approach to remedy this risk without requiring additional passes over the data, hold-out sets, or a constant factor of additional memory. Our key insight is that good solutions tend to be error-free for more iterations than bad solutions, and thus the number of passive rounds provides an estimate of a solution's relative quality. Our reservoir thus contains $K$ previous intermediate weight vectors of (expected) high accuracy. We demonstrate our WRS approach on the Passive-Aggressive Classifier (PAC) and First-Order Sparse Online Learning (FSOL), where our method consistently and significantly outperforms the unmodified approach.
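The core mechanism described above — retaining $K$ intermediate weight vectors with inclusion probability tied to how long each survived without an error — can be sketched with a standard weighted reservoir sampler. The sketch below is an illustration, not the authors' implementation: it uses the Efraimidis–Spirakis A-Res scheme (key $u^{1/w}$, keep the top-$K$ keys), with the weight assumed to be the length of the passive streak a snapshot survived. The class name `WeightedReservoir` and the streak-counting loop are hypothetical.

```python
import heapq
import random


class WeightedReservoir:
    """Size-K weighted reservoir (Efraimidis-Spirakis A-Res scheme).

    Each offered item receives key u**(1/w) for u ~ Uniform(0, 1);
    the K items with the largest keys are retained, so inclusion
    probability grows with the item's weight w.
    """

    def __init__(self, k, seed=0):
        self.k = k
        self.heap = []  # min-heap of (key, tiebreak, item)
        self.rng = random.Random(seed)
        self._count = 0  # tiebreaker so items themselves are never compared

    def offer(self, item, weight):
        if weight <= 0:
            return  # zero-weight snapshots are never sampled
        key = self.rng.random() ** (1.0 / weight)
        entry = (key, self._count, item)
        self._count += 1
        if len(self.heap) < self.k:
            heapq.heappush(self.heap, entry)
        elif key > self.heap[0][0]:
            heapq.heapreplace(self.heap, entry)  # evict smallest key

    def items(self):
        return [item for _, _, item in self.heap]


# Hypothetical usage inside an online-learning loop: each time an
# aggressive (error-driven) update ends a passive streak, offer the
# pre-update weight vector with weight = length of that streak.
def stream_with_reservoir(updates, k=4):
    """`updates` yields (weight_vector, was_passive) pairs per round."""
    reservoir = WeightedReservoir(k)
    streak, last_w = 0, None
    for w, was_passive in updates:
        if was_passive:
            streak += 1
            last_w = w
        else:
            if streak > 0 and last_w is not None:
                reservoir.offer(last_w, streak)
            streak = 0
    if streak > 0 and last_w is not None:  # flush the final streak
        reservoir.offer(last_w, streak)
    return reservoir.items()
```

The min-heap makes each offer $O(\log K)$, so the reservoir adds negligible overhead to a throughput-sensitive stream while using only $K$ extra weight-vector slots of memory.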