Abstract
We propose a new technique to jointly recover cosegmentation and dense per-pixel correspondence in two images. Our method parameterizes the correspondence field using piecewise similarity transformations and recovers a mapping between the estimated common “foreground” regions in the two images, allowing them to be precisely aligned. Our formulation is based on a hierarchical Markov random field model with segmentation and transformation labels. The hierarchical structure uses nested image regions to constrain inference across multiple scales. Unlike prior hierarchical methods, which assume that the structure is given, our proposed iterative technique dynamically recovers the structure along with the labeling. This joint inference is performed in an energy minimization framework using iterated graph cuts. We evaluate our method on a new dataset of 400 image pairs with manually obtained ground truth, where it outperforms state-of-the-art methods designed specifically for either cosegmentation or correspondence estimation.