Dictionary based action video classification with action bank

Classifying action videos has become a challenging problem in the computer vision community. In this work, action videos are represented by dictionaries learned with online dictionary learning (ODL). We use two simple measures to classify action videos: reconstruction error and projection. The sparse approximation algorithm LASSO is used to reconstruct the test video, and the reconstruction error is calculated for each of the dictionaries.
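
The reconstruction-error measure can be illustrated with a minimal sketch. It makes several assumptions that are not taken from this work: the per-class dictionaries are random stand-ins rather than dictionaries learned with ODL, the LASSO objective is solved with a plain ISTA loop (the solver actually used is not stated here), and the names `lasso_ista`, `reconstruction_error`, `classify_by_reconstruction`, the class labels, and the value of `lam` are all hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    # Element-wise soft-thresholding, the proximal operator of the L1 norm.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_ista(D, x, lam=0.05, n_iter=500):
    # Approximately minimize 0.5 * ||x - D a||^2 + lam * ||a||_1 with ISTA.
    # ISTA is an assumed solver choice, used here only for illustration.
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)           # gradient of the squared-error term
        a = soft_threshold(a - grad / L, lam / L)
    return a

def reconstruction_error(D, x, lam=0.05):
    # Sparse-code x over dictionary D, then measure the residual norm.
    a = lasso_ista(D, x, lam)
    return np.linalg.norm(x - D @ a)

def classify_by_reconstruction(dictionaries, x, lam=0.05):
    # Pick the class whose dictionary reconstructs x with the smallest error.
    errors = {label: reconstruction_error(D, x, lam)
              for label, D in dictionaries.items()}
    return min(errors, key=errors.get), errors

# Toy data: random unit-norm atoms stand in for learned dictionaries.
rng = np.random.default_rng(0)
dictionaries = {}
for name in ("hand_waving", "boxing"):
    D = rng.standard_normal((20, 5))
    dictionaries[name] = D / np.linalg.norm(D, axis=0, keepdims=True)

# A test vector built from two atoms of the "hand_waving" dictionary.
x = dictionaries["hand_waving"] @ np.array([1.0, 0.0, -0.5, 0.0, 0.0])

label, errors = classify_by_reconstruction(dictionaries, x)
print(label, errors)
```

In this toy setup the test vector lies in the span of one dictionary, so that dictionary yields the smaller residual; with real action bank features the gap between classes would depend on how well each learned dictionary captures its class.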

To obtain the second discriminative measure, projection, the test vector is projected onto the atoms in the dictionary. The minimum reconstruction error and the maximum projection indicate the action category of the test vector. With action bank as the feature vector, our best performance is 59.3% on UCF50 (benchmark is 57.9%), 97.7% on KTH (benchmark is 98.2%), and 23.63% on HMDB51 (benchmark is 26.9%).
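
The projection measure can be sketched in a similar way. The text does not specify how the projections onto individual atoms are aggregated into one score per dictionary, so the sketch below assumes the score is the largest absolute inner product between the test vector and the unit-normalized atoms; other aggregations (for example a sum over atoms) are equally plausible. The function names, class labels, and toy dictionaries are hypothetical.

```python
import numpy as np

def projection_score(D, x):
    # Normalize atoms (columns) to unit length so projections are comparable.
    D_unit = D / np.linalg.norm(D, axis=0, keepdims=True)
    # Assumed aggregation: the largest absolute projection onto any atom.
    return np.max(np.abs(D_unit.T @ x))

def classify_by_projection(dictionaries, x):
    # Pick the class whose dictionary gives the largest projection score.
    scores = {label: projection_score(D, x)
              for label, D in dictionaries.items()}
    return max(scores, key=scores.get), scores

# Toy dictionaries with two orthogonal atoms each in a 4-dimensional space.
toy_dictionaries = {
    "walking": np.array([[1.0, 0.0],
                         [0.0, 1.0],
                         [0.0, 0.0],
                         [0.0, 0.0]]),
    "boxing": np.array([[0.0, 0.0],
                        [0.0, 0.0],
                        [1.0, 0.0],
                        [0.0, 1.0]]),
}

# A test vector that is mostly aligned with the first "walking" atom.
test_vector = np.array([0.9, 0.1, 0.2, 0.0])

predicted, scores = classify_by_projection(toy_dictionaries, test_vector)
print(predicted, scores)
```

A classifier in the spirit of the abstract would look at both measures, expecting the correct class to show the minimum reconstruction error and the maximum projection; how the two are combined is not specified here.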
