Abstract
One of the cornerstone principles of deep models is their abstraction capacity, i.e. their ability to learn abstract concepts from ‘simpler’ ones. Through extensive experiments, we characterize the nature of the relationshipbetween abstract concepts (specifically objects in images)learned by popular and high performing convolutional net-works (conv-nets) and established mid-level representationsused in computer vision (specifically semantic visual at-tributes). We focus on attributes due to their impact on several applications, such as object description, retrieval and mining, and active (and zero-shot) learning. Among the findings we uncover, we show empirical evidence of the existence of Attribute Centric Nodes (ACNs) within a conv-net, which is trained to recognize objects (not attributes) in images. These special conv-net nodes (1) collectively encode information pertinent to visual attribute representation and discrimination, (2) are unevenly and sparsely distribution across all layers of the conv-net, and (3) play an important role in conv-net based object recognition.