Abstract
Discriminative part-based models are currently a main theme in object detection. The powerful model that combines all parts is then typically only feasible for a few constituents, which are in turn iteratively trained to make them as strong as possible. We follow the opposite strategy by randomly sampling a large number of instance-specific part classifiers. Due to their number, we cannot directly train a powerful classifier to combine all parts. Therefore, we randomly group them into fewer, overlapping compositions that are trained using a maximum-margin approach. In contrast to the common rationale of compositional approaches, we do not aim for semantically meaningful ensembles. Rather, we seek randomized compositions that are discriminative and generalize over all instances of a category. Our approach not only localizes objects in cluttered scenes, but also explains them by parsing with compositions and their constituent parts. We conducted experiments on PASCAL VOC07, on the VOC10 evaluation server, and on the MIT Indoor scene dataset. To the best of our knowledge, our randomized max-margin compositions (RM²C) are currently the best-performing single-class object detector using only HOG features. Moreover, the individual contributions of compositions and their parts are evaluated in separate experiments that demonstrate their potential.