Abstract
We study how well minimizing a convex surrogate loss corresponds to minimizing the misclassification error rate in binary classification with linear predictors. We consider the agnostic setting and investigate guarantees that bound the misclassification error of the surrogate-loss minimizer in terms of the margin error rate of the best predictor. We show that, for guarantees of this form, the hinge loss is essentially optimal among all convex losses.
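To make the three quantities named above concrete, the following is a minimal sketch of the misclassification error, the margin error, and the average hinge loss of a linear predictor on a finite sample. The function names, the margin threshold gamma = 1, the convention that a margin of exactly zero counts as an error, and the toy data are all assumptions made for illustration; the abstract itself does not fix these conventions.

```python
import numpy as np

def margins(w, X, y):
    # Signed margins y_i * <w, x_i>, with labels y_i in {-1, +1}.
    return y * (X @ w)

def misclassification_error(w, X, y):
    # Fraction of examples with non-positive margin (assumed convention:
    # a margin of exactly 0 counts as an error).
    return float(np.mean(margins(w, X, y) <= 0))

def margin_error(w, X, y, gamma=1.0):
    # Fraction of examples whose margin falls below gamma
    # (gamma = 1 is an assumed default, not taken from the source).
    return float(np.mean(margins(w, X, y) < gamma))

def hinge_loss(w, X, y):
    # Average hinge loss max(0, 1 - y_i * <w, x_i>), a convex surrogate.
    return float(np.mean(np.maximum(0.0, 1.0 - margins(w, X, y))))

# Hypothetical toy sample: four points in the plane, predictor w = (1, 0).
X = np.array([[2.0, 0.0], [0.5, 0.0], [-1.0, 0.0], [-3.0, 0.0]])
y = np.array([1.0, 1.0, 1.0, -1.0])
w = np.array([1.0, 0.0])

err = misclassification_error(w, X, y)   # 0.25
merr = margin_error(w, X, y)             # 0.5
hinge = hinge_loss(w, X, y)              # 0.625
```

On this sample the misclassification error is bounded above by both the margin error and the hinge loss. This holds in general under the conventions assumed here, because the indicator of a non-positive margin is pointwise at most the indicator of a margin below 1 and at most the hinge loss.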