Abstract
A visual representation of an object must meet at least three basic requirements. First, it must allow identification of the object in the presence of slight but unpredictable changes in its visual appearance. Second, it must account for larger changes in appearance due to variations in the object’s fundamental degrees of freedom, such as changes in pose. Third, any object representation must be derivable from visual input alone, i.e., it must be learnable. Here we construct such a representation by deriving transformations between the different views of a given object, so that they can be parameterized in terms of the object’s physical degrees of freedom. Our method automatically derives the appearance representations of an object, together with their linear deformation model, from example images. These are subsequently used to provide linear charts to the entire appearance manifold of a three-dimensional object. In contrast to approaches aiming at mere dimensionality reduction, the local linear charts to the object’s appearance manifold are estimated on a strictly local basis, avoiding any reference to a metric embedding space for all views. In this way, a real understanding of the object’s appearance in terms of its physical degrees of freedom is learned from single views alone.