Abstract
Conference Title: ICASSP 2014 - 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Conference Start Date: 2014, May 4. Conference End Date: 2014, May 9. Conference Location: Florence, Italy.

We propose a spatio-temporal pyramid representation (STPR) of video based on the Accordion image. The Accordion image places pixels that have a high temporal correlation in spatial adjacency. The STPR adds spatial and temporal layout information to the local SIFT features computed on the Accordion image. It first applies a temporal pyramid decomposition to the video, dividing it into a sequence of increasingly finer temporal blocks, and then builds a spatial pyramid representation on the Accordion image of each temporal block. Multiple Kernel Learning is used to combine the histograms obtained from the different spatio-temporal pyramid levels. Experiments on the human action recognition datasets Hollywood2 and Olympic Sports show the effectiveness of the proposed approach.
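To make the two-stage decomposition concrete, the following minimal sketch splits a video into temporal blocks and builds one Accordion-style image per block. It is an illustration under stated assumptions, not the authors' implementation: the abstract gives neither the splitting rule nor the exact pixel layout, so the dyadic split (2^l blocks at level l) and the column interleaving with alternating temporal order are assumptions. The function names `temporal_pyramid` and `accordion_image` are hypothetical.

```python
def temporal_pyramid(num_frames, levels):
    """Return, for each level l, the (start, end) frame ranges of 2**l blocks.

    Assumption: a dyadic split into equal-length blocks; the abstract only
    says the blocks become "increasingly finer".
    """
    pyramid = []
    for level in range(levels):
        n_blocks = 2 ** level
        bounds = [i * num_frames // n_blocks for i in range(n_blocks + 1)]
        pyramid.append(list(zip(bounds[:-1], bounds[1:])))
    return pyramid


def accordion_image(frames):
    """Build an H x (W*T) image from T frames of size H x W.

    Assumption: for each spatial column x, the T temporal samples of every
    pixel are laid out side by side, and the temporal order is reversed on
    odd columns so that neighbouring pixels stay temporally close.
    `frames` is a list of frames, each a list of rows of pixel values.
    """
    t, h, w = len(frames), len(frames[0]), len(frames[0][0])
    image = []
    for y in range(h):
        row = []
        for x in range(w):
            order = range(t) if x % 2 == 0 else range(t - 1, -1, -1)
            row.extend(frames[k][y][x] for k in order)
        image.append(row)
    return image


# Toy video: 4 frames of 2 x 3 pixels, with value = 6 * frame + 3 * y + x.
video = [[[6 * f + 3 * y + x for x in range(3)] for y in range(2)]
         for f in range(4)]

for level, blocks in enumerate(temporal_pyramid(len(video), 2)):
    for start, end in blocks:
        acc = accordion_image(video[start:end])
        print(level, (start, end), len(acc), "x", len(acc[0]))
```

In the pipeline described above, SIFT descriptors would then be extracted from each Accordion image, pooled into histograms over a spatial pyramid grid, and the per-level histograms combined with Multiple Kernel Learning. Those steps are omitted here.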