Abstract
This paper presents an approach for modeling landmark sites such as the Statue of Liberty based on large-scale contaminated image collections gathered from the Internet. Our system combines 2D appear- ance and 3D geometric constraints to efficiently extract scene summaries, build 3D models, and recognize instances of the landmark in new test im- ages. We start by clustering images using low-dimensional global “gist” descriptors. Next, we perform geometric verification to retain only the clusters whose images share a common 3D structure. Each valid cluster is then represented by a single iconic view, and geometric relationships between iconic views are captured by an iconic scene graph. In addition to serving as a compact scene summary, this graph is used to guide struc- ture from motion to efficiently produce 3D models of the different aspects of the landmark. The set of iconic images is also used for recognition, i.e., determining whether new test images contain the landmark. Results on three data sets consisting of tens of thousands of images demonstrate the potential of the proposed approach.