Abstract
This article addresses dynamic facial emotion recognition from online video. We combine deep neural networks with transfer learning theory and propose a novel model named DT-EFER. Specifically, DT-EFER uses GoogLeNet to extract deep features from the key images of video clips. To handle the dynamic recognition scenario, the framework then introduces transfer learning theory. To improve recognition performance, DT-EFER focuses on the differences between key images rather than on the images themselves. Moreover, the time complexity of the model remains modest even though previous exemplars are introduced. Experiments on two datasets, BAUM-1s and Extended Cohn–Kanade, show the efficiency of the proposed DT-EFER model in comparison with other exemplar-based models.
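
As a minimal illustration of working with differences between key images rather than with the images themselves, the following Python sketch computes difference vectors between the features of consecutive key images. It assumes that each key image has already been mapped to a fixed-length feature vector (for example by GoogLeNet). The function name `difference_features`, the choice of consecutive-frame differences, and the toy values are assumptions made for illustration; they are not taken from the DT-EFER model.

```python
from typing import List, Sequence


def difference_features(
    key_frame_features: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Return differences between consecutive key-frame feature vectors.

    key_frame_features holds one feature vector per key image, in temporal
    order. The vectors are assumed (not confirmed by the source) to come from
    a CNN feature extractor such as GoogLeNet.
    """
    if len(key_frame_features) < 2:
        raise ValueError("need at least two key frames to form a difference")
    dim = len(key_frame_features[0])
    if any(len(row) != dim for row in key_frame_features):
        raise ValueError("all feature vectors must have the same length")
    # Entry i of the result is features[i + 1] - features[i], element-wise.
    return [
        [later - earlier for earlier, later in zip(prev_row, next_row)]
        for prev_row, next_row in zip(key_frame_features, key_frame_features[1:])
    ]


# Toy stand-in for the deep features of three key images (hypothetical values).
toy_features = [
    [0.0, 1.0, 2.0],
    [0.5, 1.0, 1.0],
    [1.5, 0.0, 1.0],
]
toy_differences = difference_features(toy_features)
```

In this sketch, a clip with n key images yields n - 1 difference vectors, which a downstream classifier could consume in place of the raw per-image features.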