Abstract
We present a probabilistic framework for recognizing objects in images of cluttered scenes. Hundreds of objects may be considered and searched in parallel. Each object is learned from a single training image and modeled by the visual appearance of a set of features and their positions with respect to a common reference frame. The recognition process computes the identity and position of objects in the scene by finding the best interpretation of the scene in terms of learned objects. Features detected in an input image are either paired with database features or marked as clutter. Each hypothesis is scored using a generative model of the image, which is defined using the learned objects and a model for clutter. Although the space of possible hypotheses is extremely large, the best hypothesis can be found efficiently, and we explore several heuristics for doing so. Our algorithm compares favorably with state-of-the-art recognition systems.
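To make the notion of a scored hypothesis concrete, the following is a minimal illustrative sketch, not the model or the search procedure of this paper. It assumes scalar appearance and position values already expressed in a common frame, independent Gaussian likelihoods for matched features, a constant log-density for clutter, and exhaustive enumeration in place of the heuristics mentioned above. All names and parameter values (`score_hypothesis`, `best_hypothesis`, `sigma_app`, `sigma_pos`, `log_clutter`) are hypothetical.

```python
import itertools
import math


def gaussian_logpdf(x, mean, sigma):
    """Log-density of a 1-D Gaussian evaluated at x."""
    return -0.5 * ((x - mean) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))


def score_hypothesis(assignment, image_feats, db_feats,
                     sigma_app=0.1, sigma_pos=2.0, log_clutter=-6.0):
    """Log-likelihood of the image features under one hypothesis.

    assignment[i] is an index into db_feats, or None if image feature i
    is explained as clutter. All parameter values are assumptions.
    """
    total = 0.0
    for (app, pos), match in zip(image_feats, assignment):
        if match is None:
            total += log_clutter  # assumed constant clutter log-density
        else:
            db_app, db_pos = db_feats[match]
            total += gaussian_logpdf(app, db_app, sigma_app)
            total += gaussian_logpdf(pos, db_pos, sigma_pos)
    return total


def best_hypothesis(image_feats, db_feats):
    """Exhaustively search all assignments; feasible only for toy sizes."""
    choices = [None] + list(range(len(db_feats)))
    best = None
    for assignment in itertools.product(choices, repeat=len(image_feats)):
        used = [m for m in assignment if m is not None]
        if len(used) != len(set(used)):
            continue  # a database feature explains at most one image feature
        score = score_hypothesis(assignment, image_feats, db_feats)
        if best is None or score > best[0]:
            best = (score, assignment)
    return best


# Toy data: (appearance, position) pairs, made up for illustration.
db_feats = [(0.2, 10.0), (0.8, 30.0)]
image_feats = [(0.21, 10.5), (0.5, 80.0), (0.79, 29.0)]
score, assignment = best_hypothesis(image_feats, db_feats)
```

On this toy input the first and third image features are paired with the two database features, and the second, which matches neither in appearance or position, is labeled clutter. Exhaustive enumeration grows exponentially with the number of image features, which is why an efficient search strategy is needed in practice.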