shuffleNet-cifar10

A PyTorch implementation of ShuffleNet on CIFAR-10.

Channel shuffle is an operation proposed in ShuffleNet to address the information isolation between channel groups that arises when successive group convolutions are stacked.

Channel shuffle can be done in just a few lines of code:

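# channel shuffle
# self.g: number of groups, self.n: channels per group (so c == self.g * self.n)
n, c, h, w = x.shape
x = x.view(n, self.g, self.n, h, w)   # split channels into (groups, channels per group)
x = x.transpose(1, 2).contiguous()    # swap the two axes so the groups interleave
x = x.view(n, c, h, w)                # flatten back to the original shape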
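
To see what the view/transpose/view sequence does to the channel order, here is a minimal plain-Python sketch that needs no PyTorch. It assumes, as the snippet above suggests, that the channels are split into `groups` blocks of equal size; the function name `shuffled_channel_order` is hypothetical and is not part of this repository.

```python
def shuffled_channel_order(channels, groups):
    """Return, for each output position, the index of the input channel that lands there.

    Mirrors reshaping the channel axis to (groups, per_group), swapping the
    two axes, and flattening again.
    """
    if channels % groups != 0:
        raise ValueError("channels must be divisible by groups")
    per_group = channels // groups
    # After the transpose, we walk over positions inside a group first,
    # and over the groups second.
    return [g * per_group + j for j in range(per_group) for g in range(groups)]


order = shuffled_channel_order(8, 2)
print(order)  # [0, 4, 1, 5, 2, 6, 3, 7]

# Which input group each shuffled channel came from:
per_group = 8 // 2
print([idx // per_group for idx in order])  # [0, 1, 0, 1, 0, 1, 0, 1]
```

With 8 channels and 2 groups, the next group convolution sees channels `[0, 4, 1, 5]` and `[2, 6, 3, 7]`, so each of its groups now receives inputs from both of the previous groups.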

To fit CIFAR-10's small image size, I disabled some of the downsampling operations (i.e. max pooling or stride = 2) and kept only the last two.
Because group convolution runs inefficiently, training takes a relatively long time. More details are given below.

scale factor	groups	params (M)	FLOPs (M)	training time	accuracy
1.0	8	0.9131	161.70	11.4h	92.29%
0.5	8	0.2507	43.43	6.5h	91.48%
0.5	3	0.2427	42.97	4.0h	92.60%
0.5	1	0.2487	44.63	3.6h	91.44%

Here, accuracy means the maximum accuracy reached on the validation set.
Training time was measured on a Titan X (Pascal) GPU.
The results are comparable with ResNet-20, which has a similar number of parameters:
ResNet-20 params: 0.27M, accuracy: 91.25%
