Abstract
We propose a novel approach to fine-grained image classification in which instances from different classes share common parts but have wide variation in shape and appearance. We use dog breed identification as a test case to show that extracting corresponding parts improves classification performance. This domain is especially challenging since the appearance of corresponding parts can vary dramatically, e.g., the faces of bulldogs and beagles are very different. To find accurate correspondences, we build exemplar-based geometric and appearance models of dog breeds and their face parts. Part correspondence allows us to extract and compare descriptors in like image locations. Our approach also features a hierarchy of parts (e.g., face and eyes) and breed-specific part localization. We achieve a 67% recognition rate on a large real-world dataset including 133 dog breeds and 8,351 images, and experimental results show that accurate part localization significantly increases classification performance compared to state-of-the-art approaches.