Abstract
Hierarchical classification problems are multiclass supervised learning problems with a predefined hierarchy over the set of class labels. In this work, we study the consistency of hierarchical classification algorithms with respect to a natural loss, namely the tree distance metric on the hierarchy tree of class labels, via the usage of calibrated surrogates. We first show that the Bayes optimal classifier for this loss classifies an instance according to the deepest node in the hierarchy such that the total conditional probability of the subtree rooted at the node is greater than 21 . We exploit this insight to develo new consistent algorithm for hierarchical classification, that makes use of an algorithm known to be consistent for the “multiclass classification with reject option (MCRO)” problem as a subroutine. Our experiments on a number of benchmark datasets show that the resulting algorithm, which we term OvA-Cascade, gives improved performance over other state-of-the-art hierarchical classification algorithms.