The P3D-CTN is a framework for spatio-temporal action detection. It
integrates the frame-based two-dimensional convolutional module with the
video-based P3D convolutional module.
Detection proceeds in two steps. First, tube proposals are generated by
the P3D-module. Then box proposals are produced by the 2D-module based on
those tube proposals.
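The two-step data flow can be illustrated with a minimal Python sketch. It is not the repository's code: the `TubeProposal` container, the `two_step_detect` function, and both stand-in modules are hypothetical names, and the representation of a tube as one box per frame is an assumption made only to show how the output of the first step feeds the second.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Assumed box layout: (x1, y1, x2, y2) in pixel coordinates.
Box = Tuple[float, float, float, float]


@dataclass
class TubeProposal:
    """Hypothetical container: one box per frame over a contiguous clip."""
    start_frame: int
    boxes: List[Box]
    score: float


def two_step_detect(
    frames: Sequence[object],
    tube_module: Callable[[Sequence[object]], List[TubeProposal]],
    box_module: Callable[[object, Box], List[Box]],
) -> List[Tuple[int, List[Box]]]:
    """Run step 1 (tube proposals), then step 2 (per-frame box proposals)."""
    results = []
    # Step 1: the video-based module proposes tubes over the whole clip.
    for tube in tube_module(frames):
        # Step 2: the frame-based module proposes boxes for each frame,
        # conditioned on the tube's box in that frame.
        for offset, tube_box in enumerate(tube.boxes):
            index = tube.start_frame + offset
            results.append((index, box_module(frames[index], tube_box)))
    return results


# Stand-ins for the real networks, used only to exercise the data flow.
def dummy_tube_module(frames: Sequence[object]) -> List[TubeProposal]:
    # One tube spanning every frame with the same fixed box.
    box = (0.0, 0.0, 10.0, 10.0)
    return [TubeProposal(start_frame=0, boxes=[box] * len(frames), score=1.0)]


def dummy_box_module(frame: object, tube_box: Box) -> List[Box]:
    # Return the tube's box unchanged as the only frame-level proposal.
    return [tube_box]


detections = two_step_detect(["f0", "f1", "f2"], dummy_tube_module, dummy_box_module)
```

In practice the two stand-ins would be replaced by calls into the trained P3D-module and 2D-module. How the 2D-module actually uses the tube proposals is not specified here, so the per-frame conditioning above is only one plausible reading.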
Installation
Follow the standard Caffe installation instructions.
Then run the P3D-module setup.py to build the base environment:
python P3D-module/setup.py
Datasets
Download the three benchmark datasets (JHMDB, UCF101, UCFSports). Use the scripts in P3D-module/datasets to convert the data into the format required for training.
P3D-module
Training
P3D_cls_train.sh and P3D_loc_train.sh are used to train the P3D-module:
sh P3D-module/P3D_cls_train.sh
sh P3D-module/P3D_loc_train.sh
Testing
P3D_cls_eval.py and P3D_loc_eval.py are used to test the P3D-module.