Face Tracking and Recognition with Visual Constraints in Real-World Videos


We address the problem of tracking and recognizing faces in real-world, noisy videos. We track faces using a tracker that adaptively builds a target model reflecting changes in appearance, typical of a video setting. However, adaptive appearance trackers often suffer from drift, a gradual adaptation of the tracker to non-targets. To alleviate this problem, our tracker introduces visual constraints using a combination of generative and discriminative models in a particle filtering framework. The generative term conforms the particles to the space of generic face poses while the discriminative one ensures rejection of poorly aligned targets. This leads to a tracker that significantly improves robustness against abrupt appearance changes and occlusions, critical for the subsequent recognition phase. Identity of the tracked subject is established by fusing pose-discriminant and person-discriminant features over the duration of a video sequence. This leads to a robust video-based face recognizer with state-of-the-art recognition performance. We test the quality of tracking and face recognition on realworld noisy videos from YouTube as well as the standard Honda/UCSD database. Our approach produces successful face tracking results on over 80%% of all videos without video or person-specific parameter tuning. The good tracking performance induces similarly high recognition rates: 100%% on Honda/UCSD and over 70%% on the YouTube set containing 35 celebrities in 1500 sequences.