We are developing a system that reduces human intervention in football video shooting and highlights editing by automatically panning and zooming on salient areas of the playing field. The system consists of two main components: the video shooting subsystem and the recognition subsystem. The video shooting subsystem comprises two synchronized 4K network cameras, each covering one half of the pitch. The cameras are calibrated to remove the radial distortion of their lenses. The undistorted videos are then stitched into a single frame representing the entire playing field.
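As a rough illustration of the distortion-removal step, the sketch below inverts a two-coefficient radial distortion model for a single normalized image point using fixed-point iteration. The model, the coefficient values `k1` and `k2`, and the function names `distort` and `undistort` are assumptions made for illustration only; the description above does not state which distortion model or calibration tool the system uses.

```python
def distort(x, y, k1, k2):
    """Apply a two-coefficient radial model to an ideal normalized point."""
    r2 = x * x + y * y
    factor = 1.0 + k1 * r2 + k2 * r2 * r2
    return x * factor, y * factor


def undistort(xd, yd, k1, k2, iterations=20):
    """Recover the ideal point from a distorted one by fixed-point iteration.

    The radial model has no closed-form inverse in general, so we start
    from the distorted point and repeatedly divide by the distortion
    factor evaluated at the current estimate.
    """
    x, y = xd, yd
    for _ in range(iterations):
        r2 = x * x + y * y
        factor = 1.0 + k1 * r2 + k2 * r2 * r2
        x, y = xd / factor, yd / factor
    return x, y


# Hypothetical coefficients; real values would come from camera calibration.
K1, K2 = -0.2, 0.05
xd, yd = distort(0.3, 0.2, K1, K2)
xu, yu = undistort(xd, yd, K1, K2)
```

In practice this mapping would be applied to every pixel of each camera frame before the two undistorted halves are stitched together; the iteration converges quickly when the distortion is mild, as assumed here.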
The recognition subsystem will apply computer vision algorithms to the stitched video to perform automatic pan and zoom and to extract salient segments of a match. Recorded matches are processed offline by an external server, which extracts and analyzes motion and visual features of the elements in the scene, i.e. the players and, if possible, the ball. Because we use ultra-high-definition cameras, zooming on a salient area does not degrade the quality of the final video: we crop a portion of the original frame instead of enlarging it.
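The crop-based zoom described above can be sketched as follows. The function `crop_window`, the stitched frame size of 7680x2160 pixels, and the 1920x1080 output size are all hypothetical values chosen for illustration; the description does not give the actual resolutions. The sketch centers a fixed-size window on a point of interest and clamps it so that it never leaves the frame, which means the output pixels are copied from the source without interpolation.

```python
def crop_window(cx, cy, frame_w, frame_h, crop_w=1920, crop_h=1080):
    """Return (x0, y0, width, height) of a crop centered near (cx, cy).

    The window is shifted, not shrunk, when the center is too close to
    a border, so the output size stays constant across frames.
    """
    if crop_w > frame_w or crop_h > frame_h:
        raise ValueError("crop window is larger than the source frame")
    x0 = int(round(cx - crop_w / 2))
    y0 = int(round(cy - crop_h / 2))
    # Clamp the top-left corner so the window stays inside the frame.
    x0 = max(0, min(x0, frame_w - crop_w))
    y0 = max(0, min(y0, frame_h - crop_h))
    return x0, y0, crop_w, crop_h


# Assumed stitched frame size: two 4K frames side by side.
FRAME_W, FRAME_H = 7680, 2160
window = crop_window(3840, 1080, FRAME_W, FRAME_H)
```

Moving `(cx, cy)` smoothly over time would produce the panning effect, while choosing a larger window and downscaling it to the output size would produce a wider shot.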
Automatic summarization is performed by classifying subsequences of a match with machine learning algorithms pretrained on previously acquired and annotated videos of other matches. The salient events identified include corner kicks, free kicks, shots on goal, penalties and goals.
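To make the idea of classifying subsequences concrete, the sketch below splits a sequence of per-frame feature vectors into overlapping windows, averages each window, and assigns it the label of the nearest class centroid learned from annotated examples. The nearest-centroid classifier is a deliberately simple stand-in, not the method used by the system, which is not specified above; the two-dimensional features and all function names are likewise hypothetical.

```python
from math import dist


def windows(frames, size, step):
    """Split a sequence into fixed-length, possibly overlapping windows."""
    return [frames[i:i + size] for i in range(0, len(frames) - size + 1, step)]


def mean_vector(vectors):
    """Average a list of equal-length feature vectors component-wise."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def train_centroids(labeled):
    """Compute one centroid per label from (label, feature_vector) pairs."""
    by_label = {}
    for label, vector in labeled:
        by_label.setdefault(label, []).append(vector)
    return {label: mean_vector(vs) for label, vs in by_label.items()}


def classify(vector, centroids):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(vector, centroids[label]))


# Hypothetical annotated examples: (label, [feature_a, feature_b]).
annotated = [
    ("corner_kick", [0.1, 0.9]),
    ("corner_kick", [0.2, 0.8]),
    ("shot_on_goal", [0.9, 0.2]),
    ("shot_on_goal", [0.8, 0.1]),
]
centroids = train_centroids(annotated)

# Hypothetical per-frame features for a short stretch of a new match.
match_features = [[0.2, 0.7], [0.1, 0.8], [0.3, 0.9], [0.2, 0.8]]
labels = [classify(mean_vector(w), centroids)
          for w in windows(match_features, size=2, step=1)]
```

Windows labeled with a salient event would then be cut from the stitched video and concatenated into the summary; a real system would also need a "no event" class so that ordinary play is not forced into one of the event labels.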
Summarized videos of salient actions will be available online at www.higoal.it.