Abstract
Bilinear models have been shown to achieve impressive performance on a wide range of visual tasks, such as semantic segmentation, fine-grained recognition, and face recognition. However, bilinear features are high dimensional, typically on the order of hundreds of thousands to a few million, which makes them impractical for subsequent analysis. We propose two compact bilinear representations with the same discriminative power as the full bilinear representation but with only a few thousand dimensions. Our compact representations allow back-propagation of classification errors, enabling end-to-end optimization of the visual recognition system. The compact bilinear representations are derived through a novel kernelized analysis of bilinear pooling, which provides insights into the discriminative power of bilinear pooling and a platform for further research in compact pooling methods. Experiments illustrate the utility of the proposed representations for image classification and few-shot learning across several datasets.