Abstract
For many tasks in computer vision, producing ground-truth data is essential. At present, this is mostly done manually. Manual data labeling is labor-intensive and prone to human error, and the training data it produces is often lacking in both quantity and quality. Fully automatic data labeling, on the other hand, is neither feasible nor reliable. In this paper, we propose an interactive image labeling technique for efficient and accurate data labeling. The proposed technique consists of two parts: an automatic labeling part and a human intervention part. Built on a Bayesian network, the automatic image labeler produces an initial labeling of the image. A person then examines the initial labeling and makes a few minor corrections. The selected human corrections and the image measurements are then integrated by the Bayesian network framework to produce a refined labeling. To minimize human involvement, we develop an active user feedback strategy that determines the optimal user feedback, so that the labeling errors in the subsequent re-labeling process are maximally reduced. The proposed framework combines the advantages of human input with those of the machine so that reliable, accurate, and efficient data labeling can be achieved. We demonstrate the validity of the proposed framework for interactive labeling of facial action units. The proposed methodology, however, is not limited to facial action units; it can be easily extended to other areas such as interactive image segmentation.
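The loop summarized above (initial labeling, one human correction, refined labeling, and selection of the most useful correction to request) can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the model used in this paper: the three binary labels, the prior, the detector accuracies, and every function name are hypothetical. Exact enumeration stands in for Bayesian network inference, and expected remaining marginal entropy stands in for the feedback criterion.

```python
import itertools
import math

# Hypothetical toy model: three binary labels (e.g., three action units).
N = 3
CONFIGS = list(itertools.product([0, 1], repeat=N))

def prior(x):
    # Assumed prior: labels 0 and 1 tend to co-occur; label 2 is independent.
    pair = 0.4 if x[0] == x[1] else 0.1
    return pair * 0.5

def likelihood(m, x, acc=(0.9, 0.6, 0.7)):
    # Assumed measurement model: detector i reports the true label
    # with probability acc[i], independently of the other detectors.
    p = 1.0
    for mi, xi, a in zip(m, x, acc):
        p *= a if mi == xi else 1.0 - a
    return p

def posterior(m, evidence=None):
    # Joint posterior over labelings given measurements m and any
    # user-confirmed labels in evidence ({label index: value}).
    evidence = evidence or {}
    w = {x: prior(x) * likelihood(m, x) for x in CONFIGS
         if all(x[i] == v for i, v in evidence.items())}
    z = sum(w.values())
    return {x: p / z for x, p in w.items()}

def marginal(post, i):
    # Posterior probability that label i equals 1.
    return sum(p for x, p in post.items() if x[i] == 1)

def entropy(p):
    # Binary entropy in bits, guarded against rounding at 0 and 1.
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))

def map_labeling(post):
    # Most probable joint labeling under the posterior.
    return max(post, key=post.get)

def expected_remaining_entropy(m, i, evidence=None):
    # Expected sum of marginal entropies of the other unconfirmed labels
    # after the user reveals label i (assumed to be answered correctly).
    evidence = dict(evidence or {})
    p1 = marginal(posterior(m, evidence), i)
    others = [j for j in range(N) if j != i and j not in evidence]
    total = 0.0
    for v, pv in ((0, 1.0 - p1), (1, p1)):
        if pv <= 0.0:
            continue
        post_v = posterior(m, {**evidence, i: v})
        total += pv * sum(entropy(marginal(post_v, j)) for j in others)
    return total

def best_query(m, evidence=None):
    # Ask about the label whose answer leaves the least expected uncertainty.
    evidence = evidence or {}
    candidates = [i for i in range(N) if i not in evidence]
    return min(candidates,
               key=lambda i: expected_remaining_entropy(m, i, evidence))

m = (1, 0, 1)                                   # noisy detector outputs
print(map_labeling(posterior(m)))               # (1, 1, 1): initial labeling
print(best_query(m))                            # 1: label to ask the user about
print(map_labeling(posterior(m, {1: 0})))       # (1, 0, 1): after the user sets label 1 to 0
```

In this toy, the prior overrides the weak detector for label 1 in the initial labeling, and the selection rule asks about exactly that label. A single correction then yields the refined labeling. For a real network, enumeration would be replaced by a proper inference algorithm.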