Abstract
In content-based image retrieval (CBIR), performance characterization is easily neglected. A major difficulty lies in the fact that ground truth and the definition of benchmarks are extremely user and application dependent. This paper proposes a two-stage CBIR framework that makes it possible to predict the behavior of the retrieval system as well as to optimize its performance. In particular, it is possible to maximize precision, recall, or precision and recall jointly. The framework is based on the detection of high-level concepts in images. These concepts correspond to the vocabulary with which users can query the database. Performance optimization is carried out on the basis of the user query, the performance of the concept detectors, and an estimated distribution of the concepts in the database. The optimization is transparent to the user and leads to a set of internal parameters that optimize the succeeding retrieval. Depending only on the query and the desired concept, precision and recall of the retrieval can be increased by up to 40%. The paper discusses the theoretical and empirical results of the optimization as well as its dependency on the estimate of the concept distribution.