This research aims to realize a video surveillance system for real-time 3D tracking of multiple people moving over an extended area, as seen from a rotating and zooming camera. The proposed method exploits multi-view image matching techniques to calibrate the camera dynamically and to track many ground targets simultaneously, by slewing the video sensor from target to target and zooming in and out as necessary.
The image-to-world relation obtained with dynamic calibration is further exploited to infer target scale from the focal length, and to make tracking robust through scale-invariant template matching and joint data-association techniques. We achieve an almost constant standard deviation error of less than 0.3 meters in recovering the 3D trajectories of multiple moving targets in an area of 70×15 meters.
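As a rough illustration of how an image-to-world relation and the focal length can together predict target scale, the following minimal sketch assumes a pinhole camera, a focal length expressed in pixels, and a 3×3 image-to-ground-plane homography supplied by the calibration step. The function names, the homography values, and the reference-template convention are hypothetical and are not taken from the system described here.

```python
def image_to_ground(H, u, v):
    """Map pixel (u, v) to ground-plane coordinates with a 3x3 homography H.

    H is assumed to come from the current camera calibration (hypothetical input).
    """
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under H")
    return x / w, y / w


def expected_pixel_height(focal_px, target_height_m, distance_m):
    """Pinhole model: apparent height in pixels of an upright target."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return focal_px * target_height_m / distance_m


def template_scale(focal_px, distance_m, ref_focal_px, ref_distance_m):
    """Resize factor for a template captured at a reference focal length and distance.

    Apparent size grows linearly with focal length and shrinks with distance.
    """
    if distance_m <= 0 or ref_distance_m <= 0 or ref_focal_px <= 0:
        raise ValueError("distances and reference focal length must be positive")
    return (focal_px / ref_focal_px) * (ref_distance_m / distance_m)


if __name__ == "__main__":
    # Illustrative homography only: 0.1 m per pixel, no rotation or perspective.
    H = [[0.1, 0.0, 0.0],
         [0.0, 0.1, 0.0],
         [0.0, 0.0, 1.0]]
    print(image_to_ground(H, 320, 240))
    print(expected_pixel_height(1000.0, 1.8, 20.0))
    print(template_scale(2000.0, 40.0, 1000.0, 20.0))
```

Under these assumptions, the predicted scale can be used to resize a stored template before matching, so that the matcher does not need to search over scale when the camera zooms.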
This general framework will support the future development of a sensor resource manager component that schedules camera pan, tilt, and zoom; supports kinematic tracking, multiple-target track association, scene context modeling, confirmatory identification, and collateral damage avoidance; and, more generally, enhances multiple-target tracking in PTZ camera networks.