Abstract. We present a new combined approach for monocular model-based 3D tracking. A preliminary object pose is estimated by using a
keypoint-based technique. The pose is then refined by optimizing the
contour energy function. The energy determines the degree of correspondence between the contour of the model projection and the image edges.
It is calculated based on both the intensity and orientation of the raw
image gradient. For optimization, we propose a technique and search-area
constraints that make it possible to escape local optima and to take into
account the information obtained from keypoint-based pose estimation.
Owing to its combined nature, our method eliminates numerous issues
of keypoint-based and edge-based approaches. We demonstrate the
efficiency of our method by comparing it with state-of-the-art methods on
a public benchmark dataset that includes videos with various lighting
conditions, movement patterns, and speeds.
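
To make the idea of a contour energy that depends on both gradient intensity and orientation more concrete, the following is a minimal sketch and not the formulation used in this work. It assumes the projected model contour is already available as sampled pixel positions with unit normals, and it scores each sample by the absolute projection of the raw image gradient onto the contour normal. The function name `contour_energy` and its parameters are hypothetical and chosen only for illustration.

```python
import numpy as np

def contour_energy(image, points, normals):
    """Illustrative contour energy (an assumption, not the paper's formula).

    image   : 2D grayscale array
    points  : (N, 2) array of (x, y) contour samples in pixel coordinates
    normals : (N, 2) array of unit contour normals at those samples
    """
    # np.gradient returns derivatives along axis 0 (rows, y) and axis 1 (columns, x).
    gy, gx = np.gradient(image.astype(float))
    # Nearest-pixel sampling, clamped to the image bounds.
    xs = np.clip(np.rint(points[:, 0]).astype(int), 0, image.shape[1] - 1)
    ys = np.clip(np.rint(points[:, 1]).astype(int), 0, image.shape[0] - 1)
    g = np.stack([gx[ys, xs], gy[ys, xs]], axis=1)
    # |g . n| = |g| * |cos(angle)|: high for a strong gradient aligned with the normal.
    return float(np.mean(np.abs(np.sum(g * normals, axis=1))))

# Synthetic image with a vertical step edge between columns 4 and 5.
img = np.zeros((10, 10))
img[:, 5:] = 1.0
pts = np.array([[5.0, y] for y in range(2, 8)])
aligned = contour_energy(img, pts, np.tile([1.0, 0.0], (len(pts), 1)))
rotated = contour_energy(img, pts, np.tile([0.0, 1.0], (len(pts), 1)))
```

In this toy setup the energy is higher when the contour normals agree with the gradient direction at the edge (`aligned`) than when they are perpendicular to it (`rotated`), which is the kind of behavior a pose refinement step would exploit when maximizing such a score over pose parameters.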