Abstract
Visual vocabulary construction is an integral part of the popular Bag-of-Features (BOF) model. When visual data scale up (in terms of the dimensionality of features and/or the number of samples), most existing algorithms (e.g. k-means) become impractical because of their prohibitive time and space requirements. In this paper we propose the random locality sensitive vocabulary (RLSV) scheme for efficient visual vocabulary construction in such scenarios. Integrating ideas from Locality Sensitive Hashing (LSH) and the Random Forest (RF), RLSV generates and aggregates multiple visual vocabularies based on random projections, without any clustering or training. This simple scheme demonstrates superior time and space efficiency over prior methods, in both theory and practice, while often achieving comparable or even better performance. Extensions to supervised and kernelized vocabulary construction are also discussed and evaluated experimentally.
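To make the idea of a clustering-free vocabulary concrete, the following is a minimal sketch rather than the exact RLSV procedure. It assumes sign-of-random-projection hashing as the LSH family, treats each binary code as a visual word, and aggregates the vocabularies by concatenating their histograms; these choices, as well as the names `make_vocabularies` and `encode` and the parameters `n_vocab` and `n_bits`, are illustrative assumptions not taken from the abstract.

```python
import numpy as np

def make_vocabularies(dim, n_vocab=4, n_bits=6, seed=0):
    """Draw n_vocab independent sets of n_bits random hyperplanes.

    No clustering or training is involved: each vocabulary is fully
    defined by its random projection directions (an assumed design).
    """
    rng = np.random.default_rng(seed)
    return rng.standard_normal((n_vocab, n_bits, dim))

def encode(features, vocabularies):
    """Map local features (n, dim) to a concatenated bag-of-features histogram."""
    n_vocab, n_bits, _ = vocabularies.shape
    weights = 1 << np.arange(n_bits)          # bit weights 1, 2, 4, ...
    hists = []
    for planes in vocabularies:
        bits = (features @ planes.T) > 0      # sign of each projection, shape (n, n_bits)
        words = bits.astype(np.int64) @ weights  # binary code -> visual word index
        hists.append(np.bincount(words, minlength=2 ** n_bits))
    # Aggregate the vocabularies by concatenation (one assumed option).
    return np.concatenate(hists).astype(np.float64)

# Hypothetical usage with synthetic 128-dimensional descriptors.
feats = np.random.default_rng(1).standard_normal((500, 128))
vocabs = make_vocabularies(dim=128)
hist = encode(feats, vocabs)
```

Under these assumptions the histogram has `n_vocab * 2**n_bits` bins, and building the vocabularies costs only the time to draw the random projection matrices, independent of the number of samples.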