Abstract
Owing to its great success on many tasks, deep
learning is increasingly deployed on devices with limited computational resources, e.g., smartphones and embedded devices. To reduce
the high computational and memory cost, we
propose a fully learnable group convolution module (FLGC
for short) that is efficient and can be embedded into
any deep neural network for acceleration. Specifically, our
proposed method automatically learns the group structure
in the training stage in a fully end-to-end manner, leading to a better structure than the existing pre-defined, two-step, or iterative strategies. Moreover, our method can be
further combined with depthwise separable convolution, yielding a 5× speedup over the vanilla ResNet50 on a single CPU. An additional advantage is that in our FLGC the
number of groups can be set to any value, rather than being restricted to 2^k as in most existing methods, which allows a better trade-off
between accuracy and speed. As evaluated in our experiments, our method achieves better performance than existing learnable group convolution and standard group convolution when using the same number of groups.
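To make the accuracy/speed trade-off concrete, the following is a minimal sketch of how the multiply-accumulate (MAC) count of a group convolution depends on the group structure. It is an illustration only, not the FLGC implementation: the function names, the cost model (each filter is connected only to input channels in its own group), and the example sizes are all assumptions introduced here.

```python
from collections import Counter


def standard_conv_macs(c_in, c_out, kernel_size, out_h, out_w):
    """MACs of a dense convolution: every filter sees every input channel."""
    return c_in * c_out * kernel_size ** 2 * out_h * out_w


def grouped_conv_macs(in_groups, out_groups, kernel_size, out_h, out_w):
    """MACs of a group convolution under a hard group assignment.

    in_groups[i]  is the (hypothetical) group index of input channel i.
    out_groups[j] is the (hypothetical) group index of filter j.
    A filter is connected only to the input channels in its own group.
    """
    in_count = Counter(in_groups)
    out_count = Counter(out_groups)
    # Counter returns 0 for a group that has no input channels.
    connections = sum(in_count[g] * out_count[g] for g in out_count)
    return connections * kernel_size ** 2 * out_h * out_w


# Assumed example: a 1x1 convolution with 12 input channels and 12 filters
# on an 8x8 output map, split evenly into 3 groups (not a power of two).
in_groups = [i % 3 for i in range(12)]
out_groups = [j % 3 for j in range(12)]

dense = standard_conv_macs(12, 12, 1, 8, 8)
grouped = grouped_conv_macs(in_groups, out_groups, 1, 8, 8)
print(dense, grouped, dense / grouped)
```

With an even split, the cost falls by a factor equal to the number of groups, so allowing group counts such as 3, 5, or 6 gives finer steps between accuracy and speed than powers of two alone. The assignment lists can also be uneven, in which case the same function gives the cost of that particular structure.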