Abstract
This paper presents a prior-less method for tracking
and clustering an unknown number of human faces and
maintaining their individual identities in unconstrained
videos. The key challenge is to accurately track faces
with partial occlusion and drastic appearance changes
across multiple shots resulting from significant variations in
makeup, facial expression, head pose and illumination.
To address this challenge, we propose a new multi-face
tracking and re-identification algorithm, which provides
high accuracy in face association across the entire video,
automatically determines the number of clusters, and is robust
to outliers. We develop a co-occurrence model of
multiple body parts to seamlessly create face tracklets,
and recursively link tracklets to construct a graph for
extracting clusters. A Gaussian Process model is introduced
to compensate for the insufficiency of deep features, and is further
used to refine the linking results. The advantages of the
proposed algorithm are demonstrated using a variety of
challenging music videos and newly introduced body-worn
camera videos. The proposed method obtains significant
improvements over the state of the art [51], while relying
less on video-specific prior information to achieve
high performance.