DPN
This repository contains the code and trained models of:
Yunpeng Chen, Jianan Li, Huaxin Xiao, Xiaojie Jin, Shuicheng Yan, Jiashi Feng. "Dual Path Networks" (NIPS17).
DPNs helped us win 1st place in the Object Localization Task in ILSVRC 2017, with all competition tasks within the Top 3. (Team: NUS-Qihoo_DPNs)
DPNs are implemented with MXNet @92053bd.
| Method | Settings |
| :------------- | :--------: |
| Random Mirror | True |
| Random Crop | 8% - 100% |
| Aspect Ratio | 3/4 - 4/3 |
| Random HSL | [20,40,50] |
Note: We did not use PCA Lighting or any other advanced augmentation methods. Input images are resized by bicubic interpolation.
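The Random Crop and Aspect Ratio settings above can be read as a randomly sized, randomly shaped crop box. The sketch below is one plausible interpretation, not the repository's code: it assumes the 8% - 100% range is the fraction of the image area, that both ranges are sampled uniformly, and that Random Mirror flips with probability 0.5. The names `sample_crop` and `sample_mirror` are hypothetical, and Random HSL is not covered.

```python
import random


def sample_crop(width, height, rng, area_range=(0.08, 1.0),
                ratio_range=(3 / 4, 4 / 3), max_tries=10):
    """Return (x, y, w, h) of a random crop box inside a width x height image.

    Assumption: area fraction and aspect ratio are both drawn uniformly.
    """
    for _ in range(max_tries):
        area = rng.uniform(*area_range) * width * height
        ratio = rng.uniform(*ratio_range)  # crop width / crop height
        w = int(round((area * ratio) ** 0.5))
        h = int(round((area / ratio) ** 0.5))
        if 0 < w <= width and 0 < h <= height:
            x = rng.randint(0, width - w)
            y = rng.randint(0, height - h)
            return x, y, w, h
    # Fallback: keep the whole image when no sampled box fits.
    return 0, 0, width, height


def sample_mirror(rng):
    """Decide whether to flip horizontally (assumed probability 0.5)."""
    return rng.random() < 0.5


rng = random.Random(0)
box = sample_crop(480, 360, rng)
flip = sample_mirror(rng)
```

The sampled box would then be cut from the image and resized to the training resolution with bicubic interpolation, as the note states.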
The mean RGB = [ 124, 117, 104 ] is subtracted from the augmented input images, and the result is then multiplied by 0.0167.
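As a minimal sketch of this normalization, the per-pixel arithmetic looks like the following. It is written in plain Python for portability; a real pipeline would apply the same arithmetic to whole arrays. The function names are hypothetical, and the RGB channel order is assumed from the "mean RGB" wording.

```python
MEAN_RGB = (124.0, 117.0, 104.0)  # per-channel mean from the text
SCALE = 0.0167                    # multiplier from the text


def normalize_pixel(rgb):
    """Subtract the channel mean, then scale, for one (R, G, B) pixel."""
    return tuple((value - mean) * SCALE for value, mean in zip(rgb, MEAN_RGB))


def normalize_image(image):
    """Normalize an image given as rows of (R, G, B) tuples."""
    return [[normalize_pixel(pixel) for pixel in row] for row in image]


tiny_image = [[(124, 117, 104), (255, 255, 255)]]
normalized = normalize_image(tiny_image)
```

A pixel equal to the mean maps to zero in every channel, which is a quick sanity check when porting the preprocessing to another framework.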
Here, we introduce a new testing technique, Mean-Max Pooling, which can further improve the performance of a well-trained CNN in the testing phase without any training or fine-tuning. This technique is designed for the case where the testing images are larger than the training crops. The idea is to first convert a trained CNN model into a fully convolutional network and then insert the following Mean-Max Pooling layer (a.k.a. Max-Avg Pooling), i.e. 0.5 * (global average pooling + global max pooling), just before the final softmax layer.
Based on our observations, Mean-Max Pooling consistently boosts the testing accuracy. We adopted this testing strategy in both ILSVRC16 and ILSVRC17.
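The pooling formula can be illustrated on a toy example. The sketch below assumes the converted network outputs one spatial score map per class; it applies 0.5 * (global average pooling + global max pooling) to each map and then a softmax. It uses plain Python lists rather than MXNet symbols, and the function names are hypothetical.

```python
import math


def mean_max_pool(score_maps):
    """Reduce each H x W class score map to 0.5 * (global mean + global max)."""
    pooled = []
    for score_map in score_maps:
        values = [v for row in score_map for v in row]
        mean_value = sum(values) / len(values)
        pooled.append(0.5 * (mean_value + max(values)))
    return pooled


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Two classes, each with a 2 x 2 score map.
score_maps = [
    [[1.0, 3.0], [2.0, 2.0]],  # mean 2.0, max 3.0
    [[0.0, 0.0], [0.0, 8.0]],  # mean 2.0, max 8.0
]
pooled = mean_max_pool(score_maps)  # [2.5, 5.0]
probabilities = softmax(pooled)
```

Both maps have the same mean, so average pooling alone would not separate the two classes; the max term is what favors the second one here.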
Single Model, Single Crop Validation Error:
| Model | Size | GFLOPs | 224x224 Top 1 | 224x224 Top 5 | 320x320 Top 1 | 320x320 Top 5 | 320x320 (with mean-max pooling) Top 1 | 320x320 (with mean-max pooling) Top 5 |
|---|---|---|---|---|---|---|---|---|
| DPN-68 | 49 MB | 2.5 | 23.57 | 6.93 | 22.15 | 5.90 | 21.51 | 5.52 |
| DPN-92 | 145 MB | 6.5 | 20.73 | 5.37 | 19.34 | 4.66 | 19.04 | 4.53 |
| DPN-98 | 236 MB | 11.7 | 20.15 | 5.15 | 18.94 | 4.44 | 18.72 | 4.40 |
| DPN-131 | 304 MB | 16.0 | 19.93 | 5.12 | 18.62 | 4.23 | 18.55 | 4.16 |
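For readers reproducing these numbers, the sketch below shows how Top 1 and Top 5 error are usually computed, assuming the common definition: the percentage of validation images whose true label is not among the k highest-scoring classes. The function name and the toy data are illustrative and are not taken from the repository.

```python
def top_k_error(scores, labels, k):
    """Percentage of samples whose true label is not in the k best classes."""
    misses = 0
    for sample_scores, label in zip(scores, labels):
        ranked = sorted(range(len(sample_scores)),
                        key=lambda c: sample_scores[c], reverse=True)
        if label not in ranked[:k]:
            misses += 1
    return 100.0 * misses / len(labels)


# Four samples, three classes (toy scores).
scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
]
labels = [1, 1, 0, 2]
top1 = top_k_error(scores, labels, k=1)
top2 = top_k_error(scores, labels, k=2)
```
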
Single Model, Single Crop Validation Error:
| Model | Size | GFLOPs | 224x224 Top 1 | 224x224 Top 5 | 320x320 Top 1 | 320x320 Top 5 | 320x320 (with mean-max pooling) Top 1 | 320x320 (with mean-max pooling) Top 5 |
|---|---|---|---|---|---|---|---|---|
| DPN-68 | 49 MB | 2.5 | 22.45 | 6.09 | 20.92 | 5.26 | 20.62 | 5.07 |
| DPN-92 | 145 MB | 6.5 | 19.98 | 5.06 | 19.00 | 4.37 | 18.79 | 4.19 |
| DPN-107 | 333 MB | 18.3 | 19.75 | 4.94 | 18.34 | 4.19 | 18.15 | 4.03 |
Note: DPN-107 is not well trained.
Single Model, Single Crop Validation Accuracy:
| Model | Size | GFLOPs | 224x224 Top 1 | 224x224 Top 5 | 320x320 Top 1 | 320x320 Top 5 | 320x320 (with mean-max pooling) Top 1 | 320x320 (with mean-max pooling) Top 5 |
|---|---|---|---|---|---|---|---|---|
| DPN-68 | 61 MB | 2.5 | 61.27 | 85.46 | 61.54 | 85.99 | 62.35 | 86.20 |
| DPN-92 | 184 MB | 6.5 | 67.31 | 89.49 | 66.84 | 89.38 | 67.42 | 89.76 |
Note: The higher model complexity comes from the final classifier. Models trained on ImageNet-5k learn much richer feature representations than models trained on ImageNet-1k.
The training speed is tested based on MXNet @92053bd.
Multiple Nodes (Without specific code optimization):
| Model | CUDA / cuDNN | #Node | GPU Card (per node) | Batch Size (per GPU) | kvstore | GPU Mem (per GPU) | Training Speed* (per node) |
|:-------|:------------:|:----:|:---------------------:|:----------------------:|:---------:|:---------:|:-----------:|
| DPN-68 | 8.0 / 5.1 | 10 | 4 x K80 (Tesla) | 64 | dist_sync | 9337 MiB | 284 img/sec |
| DPN-92 | 8.0 / 5.1 | 10 | 4 x K80 (Tesla) | 32 | dist_sync | 8017 MiB | 133 img/sec |
| DPN-98 | 8.0 / 5.1 | 10 | 4 x K80 (Tesla) | 32 | dist_sync | 11128 MiB | 85 img/sec |
| DPN-131 | 8.0 / 5.1 | 10 | 4 x K80 (Tesla) | 24 | dist_sync | 11448 MiB | 60 img/sec |
| DPN-107 | 8.0 / 5.1 | 10 | 4 x K80 (Tesla) | 24 | dist_sync | 12086 MiB | 55 img/sec |
*This is the actual training speed, which includes data augmentation, forward, backward, parameter update, network communication, etc.

MXNet is awesome: we observed a linear speedup, as has been shown in link
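Because the table reports speed per node, a rough cluster-wide figure follows from multiplying by the node count, provided the linear speedup noted above holds. The sketch below does that arithmetic. The per-node speeds and the 10-node count come from the table; the function names are hypothetical, and the ImageNet-1k training set size (1,281,167 images) is an outside figure not stated in this README.

```python
# Per-node training speed in img/sec, copied from the table above.
PER_NODE_SPEED = {
    "DPN-68": 284,
    "DPN-92": 133,
    "DPN-98": 85,
    "DPN-131": 60,
    "DPN-107": 55,
}
NUM_NODES = 10  # from the "#Node" column


def cluster_throughput(model, nodes=NUM_NODES):
    """Total img/sec, assuming speed scales linearly with the node count."""
    return PER_NODE_SPEED[model] * nodes


def epoch_hours(model, num_images, nodes=NUM_NODES):
    """Estimated hours for one pass over num_images training images."""
    return num_images / cluster_throughput(model, nodes) / 3600.0


IMAGENET_1K_TRAIN_IMAGES = 1_281_167  # assumption: not given in this README
for name in PER_NODE_SPEED:
    hours = epoch_hours(name, IMAGENET_1K_TRAIN_IMAGES)
    print(f"{name}: {cluster_throughput(name)} img/sec, {hours:.2f} h/epoch")
```

These are estimates only; real epoch times also depend on data loading and on whether the speedup stays linear at this scale.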
| Model | Size | Dataset | MXNet Model |
|:--------|:------:|:---------:|:-----------------------------------:|
| DPN-68 | 49 MB | ImageNet-1k | GoogleDrive |
| DPN-68* | 49 MB | ImageNet-1k | GoogleDrive |
| DPN-68 | 61 MB | ImageNet-5k | GoogleDrive |
| DPN-92 | 145 MB | ImageNet-1k | GoogleDrive |
| DPN-92 | 138 MB | Places365-Standard | GoogleDrive |
| DPN-92* | 145 MB | ImageNet-1k | GoogleDrive |
| DPN-92 | 184 MB | ImageNet-5k | GoogleDrive |
| DPN-98 | 236 MB | ImageNet-1k | GoogleDrive |
| DPN-131 | 304 MB | ImageNet-1k | GoogleDrive |
| DPN-107* | 333 MB | ImageNet-1k | GoogleDrive |
*Pretrained on ImageNet-5k and then fine-tuned on ImageNet-1k.
Caffe Implementation with trained models by soeaver
PyTorch Implementation with trained models by rwightman
- ImageNet-1k Training/Validation List:
  - Download link: GoogleDrive
- ImageNet-1k category name mapping table:
  - Download link: GoogleDrive
- ImageNet-5k Raw Images:
  - The ImageNet-5k is a subset of ImageNet10K provided by this paper.
  - Please download the ImageNet10K and then extract the ImageNet-5k by the list below.
- ImageNet-5k Training/Validation List:
  - It contains about 5k leaf categories from ImageNet10K. There is no category overlap between our provided ImageNet-5k and the official ImageNet-1k.
  - Download link: GoogleDrive
- Places365-Standard Validation List & Matlab code for 10 crops testing:
  - Download link: GoogleDrive
If you use DPN in your research, please cite the paper:
    @article{Chen2017,
      title={Dual Path Networks},
      author={Yunpeng Chen and Jianan Li and Huaxin Xiao and Xiaojie Jin and Shuicheng Yan and Jiashi Feng},
      journal={arXiv preprint arXiv:1707.01629},
      year={2017}
    }