Abstract
Multi-fold cross-validation is an established practice to estimate the error rate of a learning algorithm. Quantifying the variance reduction gains due to cross-validation has been challenging due to the inherent correlations introduced by the folds. In this work we introduce a new and weak measure called loss stability and relate the cross-validation performance to this measure; we also establish that this relationship is near-optimal. Our work thus quantitatively improves the current best bounds on cross-validation.