CapsNet
This is an implementation of CapsNet for MNIST based on the paper Dynamic Routing Between Capsules by Sara Sabour, Nicholas Frosst, and Geoffrey E. Hinton. I used TensorFlow and the MNIST data downloaded from Yann LeCun's website.
The code is developed in a Jupyter notebook. I added naive unit tests to make sure the output of each layer matches the specs. For clarity, I tried to follow the paper's naming conventions in most cases.
The following is the main architecture of CapsNet:
Convolution layer
It consists of a convolution layer with 256 filters, 9x9 kernels, a stride of 1, and ReLU activation. Applied to a 28x28 MNIST image, this produces a 20x20x256 output.
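As a quick sanity check of the shapes (assuming 'valid' padding, i.e. no zero-padding, as in the paper):

```python
import numpy as np

# Output spatial size of a 'valid' convolution: (input - kernel) // stride + 1.
def conv_out_size(input_size, kernel, stride):
    return (input_size - kernel) // stride + 1

side = conv_out_size(28, 9, 1)   # MNIST images are 28x28
print(side)                      # 20 -> output is 20x20x256 with 256 filters
```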
Primary capsule layer
It consists of 32 channels of 6x6 grids of 8D capsules. This translates to 32 convolution layers, each with 8 filters, 9x9 kernels, a stride of 2, and linear activation. The output (u) is 32x6x6x8 and is reshaped to 1152x8, where 1152 = 32 x 6 x 6 is the total number of capsule outputs.
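A NumPy sketch of the reshape step (using random values in place of the actual convolution outputs):

```python
import numpy as np

# Second 'valid' convolution over the 20x20 feature map: (20 - 9) // 2 + 1 = 6.
grid = (20 - 9) // 2 + 1

# Simulated primary capsule output: 32 channels, each a 6x6 grid of 8D capsules.
u = np.random.randn(32, grid, grid, 8)
u = u.reshape(-1, 8)             # flatten to one capsule vector per row
print(u.shape)                   # (1152, 8): 1152 = 32 * 6 * 6
```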
Transformation
The output of the primary capsule layer is multiplied by a weight matrix (W) to create the prediction vectors u_hat. Since the DigitCaps layer has ten 16D vectors, u_hat has the shape 1152x10x16.
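In shape terms, this is one 16x8 matrix per (primary capsule, digit capsule) pair. A NumPy sketch with random weights:

```python
import numpy as np

num_primary, num_digit, in_dim, out_dim = 1152, 10, 8, 16

u = np.random.randn(num_primary, in_dim)                     # primary capsule outputs
W = np.random.randn(num_primary, num_digit, out_dim, in_dim) # one 16x8 matrix per (i, j) pair

# u_hat[i, j] = W[i, j] @ u[i]: primary capsule i's prediction for digit capsule j.
u_hat = np.einsum('ijab,ib->ija', W, u)
print(u_hat.shape)               # (1152, 10, 16)
```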
Routing
At this stage, the logits (bij) are determined iteratively by the routing algorithm. The logits are translated into coupling coefficients (cij) using a softmax function (calculated over the DigitCaps). This determines how strongly each capsule output is coupled to each parent (DigitCaps 1 to 10). The weighted sum of predictions per parent is called (s); (s) then goes through the squash function to create (v), whose norm lies between 0 and 1. The routing algorithm has to be executed for each sample in the batch, so I used tf.while_loop.
The number of routing iterations is set to two. The following shows the routing algorithm, which is executed consecutively for every sample in the batch.
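A single-sample NumPy sketch of the routing loop (the notebook's TensorFlow version runs this per sample inside tf.while_loop; this is a shape-level illustration, not the actual implementation):

```python
import numpy as np

def squash(s, axis=-1, eps=1e-9):
    """Scale vectors to a norm in [0, 1) while preserving direction."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def route(u_hat, iterations=2):
    """Dynamic routing for one sample; u_hat has shape (1152, 10, 16)."""
    b = np.zeros(u_hat.shape[:2])                 # logits b_ij, one per (primary, digit) pair
    for _ in range(iterations):
        c = softmax(b, axis=1)                    # coupling coefficients c_ij over the 10 parents
        s = np.einsum('ij,ijk->jk', c, u_hat)     # weighted sum of predictions per parent
        v = squash(s)                             # (10, 16) parent outputs, each with norm < 1
        b = b + np.einsum('ijk,jk->ij', u_hat, v) # agreement update: u_hat . v
    return v

v = route(np.random.randn(1152, 10, 16))
print(v.shape)                                    # (10, 16)
```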
Fully connected (decoder) layer
This layer reconstructs the input from the DigitCaps layer outputs. This forces the 16D vectors in the DigitCaps layer to represent the actual digits.
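A NumPy sketch of the decoder's forward pass, assuming the layer sizes from the paper (160 -> 512 -> 1024 -> 784) and random weights; only the true class's capsule is fed in, the rest are masked out:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
W1 = rng.standard_normal((160, 512)) * 0.01    # 160 = 10 capsules x 16 dims
W2 = rng.standard_normal((512, 1024)) * 0.01
W3 = rng.standard_normal((1024, 784)) * 0.01   # 784 = 28 * 28 output pixels

def decode(v, target):
    """Reconstruct a 28x28 image from the DigitCaps output v (shape (10, 16)),
    masking out every capsule except the target class."""
    mask = np.zeros((10, 1))
    mask[target] = 1.0
    x = (v * mask).reshape(-1)                 # masked, flattened DigitCaps activity
    h = relu(relu(x @ W1) @ W2)
    return sigmoid(h @ W3).reshape(28, 28)     # sigmoid -> pixel intensities in (0, 1)

img = decode(rng.standard_normal((10, 16)), target=3)
print(img.shape)                               # (28, 28)
```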
Loss
The loss is a combination of the margin loss for present classes, the margin loss for non-present classes, and the reconstruction loss, each with a different scaling factor.
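The margin loss from the paper, sketched in NumPy (m+ = 0.9, m- = 0.1, lambda = 0.5 are the paper's values; the reconstruction loss, a scaled sum of squared pixel differences, is omitted here):

```python
import numpy as np

def margin_loss(v, labels, m_plus=0.9, m_minus=0.1, lam=0.5):
    """Margin loss: present classes are pushed above m_plus, absent classes
    below m_minus, with the absent term down-weighted by lam."""
    lengths = np.linalg.norm(v, axis=-1)                       # ||v_k|| per digit capsule
    present = labels * np.maximum(0.0, m_plus - lengths) ** 2
    absent = lam * (1.0 - labels) * np.maximum(0.0, lengths - m_minus) ** 2
    return np.sum(present + absent)

# One sample: capsule 3 is the true class and already has a long vector.
v = np.full((10, 16), 0.01)
v[3] = 0.25                                                    # length = sqrt(16 * 0.0625) = 1.0
labels = np.eye(10)[3]
print(margin_loss(v, labels))                                  # 0.0: both margins are satisfied
```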
Optimization
The Adam optimizer is used with default parameters.
Training
Training the model with a batch size of 100 for 5 epochs gave me an accuracy of 0.98 (98%) on the test data.
Interesting characteristics of coupling coefficients
After training, I built a graph to visualize how the coupling coefficients (cij) choose their parent capsules in the DigitCaps layer. I used the coupling coefficient vectors to visualize the relationship between the primary capsules and the DigitCaps on a graph. As we can see, primary capsules from different channels gather around the DigitCaps.