
tensorflow-CoViAR

2020-02-24


This repository reimplements CoViAR in TensorFlow, with some modifications to the architectural design of the original PyTorch version. As in the original paper, the model can be trained on both UCF-101 and HMDB-51.

Prerequisites

The model is implemented in the following environment:

  • Python 3.5+

  • CUDA 9.0

  • cuDNN 7.2+

  • TensorFlow 1.11

  • FFmpeg (Please follow this description from the author to install FFmpeg for using the data loader)

Datasets

As described in the paper, the original .avi input videos must first be re-encoded into MPEG-4 with the GOP structure that CoViAR expects. For convenience, we provide the converted training data at the following links.
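The re-encoding step can be sketched as a small helper that builds the ffmpeg command. The exact codec flags come from the author's re-encoding script, so the ones below (`-q:v 1 -c:v mpeg4 -f rawvideo`) are an approximation to verify against that script before converting a whole dataset:

```python
def reencode_cmd(src, dst):
    # Re-encode an .avi into MPEG-4 so the compressed-domain loader can
    # parse its GOP structure. The flags approximate the author's
    # reencode script; verify them against pytorch-coviar before use.
    return [
        "ffmpeg",
        "-i", src,        # original .avi video
        "-q:v", "1",      # keep quality high during re-encoding
        "-c:v", "mpeg4",  # MPEG-4 Part 2 codec expected by the loader
        "-f", "rawvideo",
        dst,              # output .mp4 path
    ]
```

The command list can then be passed to `subprocess.run` once per video.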

Data Loader

The author provides an excellent data loader that extracts the compressed representation (i.e., I-frames, motion vectors, and residuals, as described in the paper) from .mpeg4 videos. Based on their implementation, we integrate this loader with the TensorFlow dataset API to train our model more efficiently.

To use the data loader, please follow the author's instructions. Before running our code, make sure the data loader builds in its folder without any error messages.
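The loader addresses frames by GOP rather than by a global frame number, so the glue code mostly amounts to converting a frame index into a `(gop_index, within_gop_index)` pair before handing it over. The GOP length of 12 below is an assumption; use whatever value the videos were actually encoded with:

```python
GOP_SIZE = 12  # assumed GOP length of the re-encoded videos

def to_gop_index(frame_idx, gop_size=GOP_SIZE):
    # Map a global frame index to the (gop_index, within-GOP index)
    # pair that the compressed-domain loader expects.
    return frame_idx // gop_size, frame_idx % gop_size
```

Inside the `tf.data` pipeline, a `tf.py_func` wrapper around the loader can then consume these indices to produce I-frame, motion-vector, and residual tensors.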

Pretrained Weights

Loading existing pretrained models and converting their weights for TensorFlow can be an arduous and tedious task. We therefore provide ResNet-152 and ResNet-50 weights for initializing our model.
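One detail worth noting when restoring ImageNet weights: the classification head must be skipped, since its shape depends on the number of action classes rather than ImageNet's 1000. A minimal sketch of that name filtering (the scope names `fc`/`logits` are hypothetical placeholders):

```python
def init_assignments(pretrained, model_var_names, skip_scopes=("fc", "logits")):
    # Keep only pretrained tensors that (a) exist in the new model and
    # (b) do not belong to the classification head, whose shape differs
    # between ImageNet and UCF-101/HMDB-51.
    return {
        name: value
        for name, value in pretrained.items()
        if name in model_var_names
        and not any(scope in name for scope in skip_scopes)
    }
```

The resulting dictionary can seed the model variables before training starts.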

Train the Model

After preparing the above data, you are ready to train the CoViAR model. Simply type

python3 tf-coviar.py

to run the training script. At the top of the script, several hyperparameters and the dataset directory are defined via tf.app.flags. Please adjust these settings to fit your training environment.
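The flag block at the top of the script looks roughly like the sketch below; the flag names and default values here are placeholders, not the script's actual settings:

```python
# Placeholder flag definitions in the style of tf.app.flags; the real
# names and defaults live at the top of tf-coviar.py.
try:
    import tensorflow.compat.v1 as tf  # newer TensorFlow installs
except ImportError:
    import tensorflow as tf  # plain TF 1.x, as listed in the prerequisites

flags = tf.app.flags
flags.DEFINE_string("data_dir", "/path/to/ucf101", "dataset directory (placeholder)")
flags.DEFINE_integer("batch_size", 16, "mini-batch size (placeholder default)")
flags.DEFINE_float("learning_rate", 1e-3, "initial learning rate (placeholder default)")
FLAGS = flags.FLAGS
```

Each flag can then be overridden on the command line, e.g. `python3 tf-coviar.py --batch_size=8`.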

Note that, unlike the original PyTorch implementation, which trains each model separately, we train the three prediction models end-to-end and update the fusion parameters accordingly.
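End-to-end training means the per-stream class scores are combined by trainable fusion weights rather than fixed ones. A NumPy sketch of such a late-fusion step; softmax-normalized weights are our assumption about how the fusion parameters could be constrained, not necessarily how the script implements it:

```python
import numpy as np

def fuse_logits(iframe_logits, mv_logits, residual_logits, fusion_params):
    # Softmax keeps the three fusion weights positive and summing to 1,
    # so gradient updates adjust the relative trust in each stream.
    w = np.exp(fusion_params - np.max(fusion_params))
    w = w / w.sum()
    return w[0] * iframe_logits + w[1] * mv_logits + w[2] * residual_logits
```

With equal fusion parameters this reduces to a plain average of the three streams.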

Visualization

It is also possible to visualize the classification loss via TensorBoard:

tensorboard --logdir=logs/coviar

Visualization of classification loss

Visualization of validation accuracy

Note that the I-frame model is based on ResNet-152 and therefore takes longer to converge during training.

Reference

Chao-Yuan Wu, et al., "Compressed Video Action Recognition", CVPR 2018.

"pytorch-coviar", the original implementation from the author

