
Linear Classification and Selective Sampling Under Low Noise Conditions
Giovanni Cavallanti · Nicolò Cesa-Bianchi · Claudio Gentile

Tue Dec 09 07:30 PM -- 12:00 AM (PST)
We provide a new analysis of an efficient margin-based algorithm for selective sampling in classification problems. Using the so-called Tsybakov low noise condition to parametrize the instance distribution, we show bounds on the convergence rate to the Bayes risk of both the fully supervised and the selective sampling versions of the basic algorithm. Our analysis reveals that, excluding logarithmic factors, the average risk of the selective sampler converges to the Bayes risk at rate $n^{-(1+\alpha)/(3+\alpha)}$, with labels being sampled at the same rate (here $n$ denotes the sample size, and $\alpha > 0$ is the exponent in the low noise condition). We compare this convergence rate to the rate $n^{-(1+\alpha)/(2+\alpha)}$ achieved by the fully supervised algorithm using all labels. Experiments on textual data reveal that simple variants of the proposed selective sampler perform much better than popular and similarly efficient competitors.
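The abstract describes a margin-based selective sampler: a predictor that queries a label only when its margin on the current instance is small. The sketch below is a minimal illustration of that general idea using an online regularized least-squares predictor with a fixed confidence threshold; the specific update rule, threshold, and parameter names are simplifying assumptions for exposition, not the paper's exact algorithm or its adaptive sampling condition.

```python
import numpy as np

def selective_sampler(X, y, threshold=0.2, reg=1.0):
    """Illustrative margin-based selective sampler (not the paper's algorithm).

    Maintains a regularized least-squares predictor and queries the label
    y[t] only when the predicted margin |w . x_t| falls below `threshold`.
    `threshold` and `reg` are hypothetical illustration parameters.
    """
    n, d = X.shape
    A = reg * np.eye(d)   # regularized second-moment matrix
    b = np.zeros(d)       # sum of label-weighted queried examples
    queried = 0
    preds = np.empty(n)
    for t in range(n):
        x = X[t]
        w = np.linalg.solve(A, b)          # current RLS weight vector
        margin = w @ x
        preds[t] = 1.0 if margin >= 0 else -1.0
        if abs(margin) <= threshold:       # low confidence: query the label
            queried += 1
            A += np.outer(x, x)            # update only on queried rounds
            b += y[t] * x
    return preds, queried
```

On easy (large-margin) streams, such a sampler quickly stops querying labels once its margins grow, which is the intuition behind the sublinear label rate stated in the abstract.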

Author Information

Giovanni Cavallanti (Università degli Studi di Milano)
Nicolò Cesa-Bianchi (Università degli Studi di Milano, Italy)
Claudio Gentile (INRIA)
