This repository contains PyTorch implementations of multiple CNN architectures and improvement methods based on the following papers. We hope the implementations and results are helpful for your research!
```bash
## 1 GPU for lenet
CUDA_VISIBLE_DEVICES=0 python -u train.py --work-path ./experiments/cifar10/lenet

## resume from ckpt
CUDA_VISIBLE_DEVICES=0 python -u train.py --work-path ./experiments/cifar10/lenet --resume

## 2 GPUs for preresnet1202
CUDA_VISIBLE_DEVICES=0,1 python -u train.py --work-path ./experiments/cifar10/preresnet1202

## 4 GPUs for densenet190bc
CUDA_VISIBLE_DEVICES=0,1,2,3 python -u train.py --work-path ./experiments/cifar10/densenet190bc
```
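For reference, resuming from a checkpoint in PyTorch typically means reloading the model and optimizer state from a saved file. The snippet below is a minimal sketch of that pattern using standard PyTorch APIs; the checkpoint key names and path are assumptions, not the exact logic behind `--resume` in `train.py`:

```python
import torch

def resume_from_ckpt(model, optimizer, ckpt_path="checkpoint.pth"):
    """Minimal resume sketch. Assumes a checkpoint dict with
    "state_dict", "optimizer", and "epoch" keys (hypothetical layout;
    check train.py for the actual format used by --resume)."""
    ckpt = torch.load(ckpt_path, map_location="cpu")
    model.load_state_dict(ckpt["state_dict"])
    optimizer.load_state_dict(ckpt["optimizer"])
    return ckpt["epoch"] + 1  # epoch to continue training from
```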
We use the YAML file `config.yaml` to store the training parameters; check any of the files in `./experiments` for more details. You can view the training curves via TensorBoard: `tensorboard --logdir path-to-event --port your-port`. The training log is dumped via `logging`; check `log.txt` in your work path.
## Results on CIFAR

### Vanilla architectures
| architecture          | params | batch size | epochs | C10 test acc (%) | C100 test acc (%) |
| :-------------------- | :----: | :--------: | :----: | :--------------: | :---------------: |
| Lecun                 | 62K    | 128        | 250    | 67.46            | 34.10             |
| alexnet               | 2.4M   | 128        | 250    | 75.56            | 38.67             |
| vgg19                 | 20M    | 128        | 250    | 93.00            | 72.07             |
| preresnet20           | 0.27M  | 128        | 250    | 91.88            | 67.03             |
| preresnet110          | 1.7M   | 128        | 250    | 94.24            | 72.96             |
| preresnet1202         | 19.4M  | 128        | 250    | 94.74            | 75.28             |
| densenet100bc         | 0.76M  | 64         | 300    | 95.08            | 77.55             |
| densenet190bc         | 25.6M  | 64         | 300    | 96.11            | 82.59             |
| resnext29_16x64d      | 68.1M  | 128        | 300    | 95.94            | 83.18             |
| se_resnext29_16x64d   | 68.6M  | 128        | 300    | 96.15            | 83.65             |
| cbam_resnext29_16x64d | 68.7M  | 128        | 300    | 96.27            | 83.62             |
| ge_resnext29_16x64d   | 70.0M  | 128        | 300    | 96.21            | 83.57             |
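The `se_` prefix above denotes Squeeze-and-Excitation: an attention mechanism that recalibrates channels with a global pooling step followed by a two-layer bottleneck. The sketch below is a minimal standalone version of such a block, not this repository's exact module; the reduction ratio of 16 is the default from the SE paper:

```python
import torch.nn as nn

class SEBlock(nn.Module):
    """Minimal squeeze-and-excitation block: global average pooling
    followed by a two-layer bottleneck that rescales each channel."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # squeeze: NxCxHxW -> NxCx1x1
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                    # per-channel gates in (0, 1)
        )

    def forward(self, x):
        n, c, _, _ = x.size()
        w = self.fc(self.pool(x).view(n, c)).view(n, c, 1, 1)
        return x * w                         # excite: rescale the channels
```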
### With additional regularization
PS: The default data augmentation methods are RandomCrop + RandomHorizontalFlip + Normalize, and √ indicates which additional method is used.
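As an illustration, the default pipeline corresponds to a torchvision transform like the one below. This is a sketch: the CIFAR-10 mean/std constants shown are commonly used values and may differ from the ones defined in this repo's configs.

```python
import torchvision.transforms as transforms

# Commonly used CIFAR-10 per-channel mean/std (assumed values; the repo's
# config.yaml may define its own normalization constants).
CIFAR10_MEAN = (0.4914, 0.4822, 0.4465)
CIFAR10_STD = (0.2470, 0.2435, 0.2616)

train_transform = transforms.Compose([
    transforms.RandomCrop(32, padding=4),  # random 32x32 crop with padding
    transforms.RandomHorizontalFlip(),     # flip with probability 0.5
    transforms.ToTensor(),
    transforms.Normalize(CIFAR10_MEAN, CIFAR10_STD),
])
```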