efficient-densenet-pytorch

Memory-efficient implementation of DenseNets, supporting both the DenseNet and DenseNet-BC series.

Environment: Linux (CPU/GPU), Python 3, PyTorch 1.0

Check implementation correctness with python -m utils.gradient_checking, trying the different settings in utils/gradient_checking.py (CPU, single GPU, multiple GPUs).
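For intuition, a minimal standalone check along these lines compares gradients from the two implementations; this is only a sketch, assuming the repo's DenseNet constructor and efficient flag as used later in this README (num_classes=10 is an assumption):

import torch
from models import DenseNet  # repo model; arguments as used later in this README

def build(efficient):
    torch.manual_seed(0)  # identical initialization for both variants
    return DenseNet(num_init_features=24, block_config=(12, 12, 12), compression=1,
                    input_size=32, bn_size=None, num_classes=10, efficient=efficient)

x = torch.randn(4, 3, 32, 32)
grads = []
for eff in (True, False):
    model = build(eff)
    model(x).sum().backward()
    grads.append(torch.cat([p.grad.flatten() for p in model.parameters()]))

# The two gradient vectors should agree up to floating-point noise
print('max abs gradient diff:', (grads[0] - grads[1]).abs().max().item())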

Benchmark the forward/backward passes of the efficient and non-efficient DenseNet with python -m utils.benchmark_effi (CPU, single GPU, multiple GPUs). The following results were measured on a Linux system with 40 Intel(R) Xeon(R) E5-2630 v4 CPUs @ 2.20GHz and NVIDIA GTX 1080 Ti GPUs.

Model setting: num_init_features=24, block_config=(12, 12, 12), compression=1, input_size=32, bn_size=None, batch size=128.

| Model         | CPU                     | 1 GPU                   | 2 GPUs                 | 4 GPUs                 |
|---------------|-------------------------|-------------------------|------------------------|------------------------|
| Efficient     | F=15849, B=36738, R=2.3 | F=38.0, B=103.5, R=2.72 | F=38.5, B=64.0, R=1.66 | F=63.3, B=77.5, R=1.23 |
| Non-efficient | F=24889, B=36732, R=1.5 | F=38.0, B=77.9, R=2.05  | F=36.2, B=42.1, R=1.16 | F=56.2, B=31.8, R=0.57 |

F is the average forward time (ms), B the average backward time (ms), and R = B/F.
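Timings like these are typically collected with explicit CUDA synchronization around the forward and backward passes. The actual measurement logic lives in utils/benchmark_effi.py; the following is only a minimal sketch of the idea:

import time
import torch

def avg_times(model, x, iters=50, warmup=10):
    # Returns average forward/backward time in ms; model and x must be on GPU.
    fwd = bwd = 0.0
    for i in range(warmup + iters):
        torch.cuda.synchronize()
        t0 = time.time()
        loss = model(x).sum()
        torch.cuda.synchronize()  # wait for the forward kernels to finish
        t1 = time.time()
        loss.backward()
        torch.cuda.synchronize()  # wait for the backward kernels to finish
        t2 = time.time()
        model.zero_grad()
        if i >= warmup:  # discard warm-up iterations
            fwd += (t1 - t0) * 1e3
            bwd += (t2 - t1) * 1e3
    return fwd / iters, bwd / iters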

The efficient version can fit a batch size of up to 1450 on a single GPU (~12GB), compared with 350 for the non-efficient version. That is, the efficient version is roughly 4x as memory-efficient as the non-efficient one.
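That figure is easy to reproduce with PyTorch's built-in memory counters; a minimal sketch, with model arguments per the setting above (num_classes is assumed):

import torch
from models import DenseNet  # repo model, as above

model = DenseNet(num_init_features=24, block_config=(12, 12, 12), compression=1,
                 input_size=32, bn_size=None, num_classes=10, efficient=True).cuda()
x = torch.randn(1450, 3, 32, 32, device='cuda')  # batch size from the claim above
model(x).sum().backward()
print('peak memory: %.1f MB' % (torch.cuda.max_memory_allocated() / 2 ** 20))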

How to load the pretrained DenseNet into the efficient version?

It is simple.

Take DenseNet-121 as an example.

First, download the checkpoint:

wget https://download.pytorch.org/models/densenet121-a639ec97.pth

Then run

cd utils
python convert.py --to efficient --checkpoint densenet121-a639ec97.pth  --output densenet121_effi.pth

Done.
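Under the hood, converters like this typically just rename state-dict keys. The authoritative mapping is whatever utils/convert.py implements; purely as an illustration, the snippet below applies the regex rename that torchvision uses to modernize old DenseNet checkpoint keys (e.g. norm.1.weight to norm1.weight):

import re
import torch

state_dict = torch.load('densenet121-a639ec97.pth')
# Illustrative only: rename keys such as 'denselayer1.norm.1.weight'
# to 'denselayer1.norm1.weight' (the pattern torchvision applies).
pattern = re.compile(
    r'^(.*denselayer\d+\.(?:norm|relu|conv))\.'
    r'((?:[12])\.(?:weight|bias|running_mean|running_var))$')
for key in list(state_dict.keys()):
    match = pattern.match(key)
    if match:
        state_dict[match.group(1) + match.group(2)] = state_dict.pop(key)
torch.save(state_dict, 'densenet121_remapped_keys.pth')  # illustrative output name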

You can now load the state dict saved in densenet121_effi.pth into the efficient model:

import torch
from models import DenseNet

model = DenseNet(num_init_features=64, block_config=(6, 12, 24, 16), compression=0.5,
                 input_size=224, bn_size=4, num_classes=1000, efficient=True)
state_dict = torch.load('densenet121_effi.pth')
model.load_state_dict(state_dict, strict=True)
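Continuing the snippet above, a quick sanity check on the loaded weights:

model.eval()
with torch.no_grad():
    logits = model(torch.randn(1, 3, 224, 224))  # shape per input_size=224 above
print(logits.shape)  # expect torch.Size([1, 1000])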

Train efficient DenseNet on ImageNet

It is easy to configure and run:

  1. Install the requirements via pip install -r requirements.txt

  2. Prepare the ImageNet dataset following the installation instructions; the shell scripts in datasets/imagenet_pre should be helpful.

  3. Configure the experiment settings in config.yaml.

  4. Run the training with a command like ./run.sh 0,1,2,3 config.yaml (a sketch of what such a script typically does follows below).

You will find that 4 GPUs are enough to train the efficient model with a batch size of 256!
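As noted in step 4, a launcher like run.sh usually just pins the listed GPUs and starts training. The real logic ships in the repo's run.sh; a hypothetical two-line stand-in (train.py and --config are placeholder names):

# Hypothetical stand-in for ./run.sh <gpu_ids> <config>; entry point name is assumed
CUDA_VISIBLE_DEVICES=$1 python train.py --config $2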

CIFAR training is also provided and easy to configure :-)

References

Pleiss, Geoff, et al. "Memory-Efficient Implementation of DenseNets." arXiv preprint arXiv:1707.06990 (2017).

