Abstract. We study the problem of building models that can transfer
selected attributes from one image to another without affecting the other
attributes. Towards this goal, we develop an analysis and a training methodology for autoencoding models whose encoded features aim to disentangle
attributes. These features are explicitly split into two components: one
that should represent attributes in common between pairs of images, and
another that should represent attributes that change between pairs of
images. We show that achieving this objective faces two main challenges.
The first is that the model may learn degenerate mappings, which we call
the shortcut problem. The second is that the attribute representation
for one image is not guaranteed to have the same interpretation on another
image, which we call the reference ambiguity. To address the shortcut problem, we introduce novel constraints on image pairs and triplets and show
their effectiveness both analytically and experimentally. In the case of
the reference ambiguity, we formally prove that a model that guarantees
an ideal feature separation cannot be built. We validate our findings
on several datasets and show that, surprisingly, trained neural networks
often do not exhibit the reference ambiguity.