
CapsNet-tf


A partial implementation of "Dynamic Routing Between Capsules".
Currently this repo contains only the experiments on MNIST/Multi-MNIST. My experiment on Multi-MNIST wasn't successful; the reconstructed images were too blurry.

The CapsuleNet has an amazingly simple architecture. In my opinion, it has at least two special features (both appear in the sketch after this list):

  1. Iterative attention mechanism (they call it "dynamic routing").

  2. Vector representation of concepts/objects, with the magnitude representing the probability of an object's existence.
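
To make both points concrete, here is a minimal NumPy sketch of the routing-by-agreement loop for a single example. This is my simplification, not the actual code in this repo; the names (u_hat, b, c, v) follow the paper's notation.

    import numpy as np

    def squash(s, axis=-1, eps=1e-9):
        # Squash non-linearity: keeps a vector's direction but maps its
        # length into [0, 1), so the magnitude can act as a probability.
        sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
        return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

    def dynamic_routing(u_hat, num_iters=3):
        # u_hat: (num_in, num_out, dim_out) prediction ("vote") vectors
        # from the lower-level capsules.
        num_in, num_out, _ = u_hat.shape
        b = np.zeros((num_in, num_out))                  # routing logits
        for _ in range(num_iters):
            c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)  # softmax over output capsules
            s = np.einsum('ij,ijd->jd', c, u_hat)        # weighted sum of votes
            v = squash(s)                                # output capsule vectors
            b = b + np.einsum('ijd,jd->ij', u_hat, v)    # reward agreeing votes
        return v

    # Toy usage: 1152 primary capsules voting for 10 digit capsules of dim 16.
    v = dynamic_routing(np.random.randn(1152, 10, 16) * 0.01)
    print(np.linalg.norm(v, axis=-1))  # per-digit "existence" magnitudes

Feature 1 is the iterative update of the coupling coefficients c; feature 2 is the final norm of v.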

Dependencies

Python 3.5
Tensorflow 1.4

Usage

Run python capsule.py and use tensorboard --logdir [logdir/train/datetime-Capsule] to see the results.

To run the Multi-MNIST test, first build the dataset with
python build_multimnist.py
and then run the experiment with
python train_multimnist.py

Results

Their figure showing the result of dimension perturbation.


Each column is a dimension of the Digit Capsule.
Each row is a perturbation in [-0.25, 0.25] with a 0.05 increment.
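
The sweep itself is simple to reproduce. A sketch, assuming the trained reconstruction net is available as a callable decode (a hypothetical stand-in, not a function in this repo):

    import numpy as np

    def perturbation_grid(v, decode, low=-0.25, high=0.25, step=0.05):
        # v: the 16-D activity vector of the predicted digit capsule.
        # decode: hypothetical callable mapping a capsule vector to an image.
        offsets = np.arange(low, high + step / 2, step)   # -0.25, -0.20, ..., 0.25
        grid = []
        for off in offsets:                               # one row per offset
            row = []
            for d in range(v.shape[0]):                   # one column per dimension
                v_tweaked = v.copy()
                v_tweaked[d] += off                       # nudge a single dimension
                row.append(decode(v_tweaked))
            grid.append(row)
        return grid   # grid[k][d]: dimension d shifted by offsets[k]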


Indeed, there are always some dimensions that represent orientation and thickness.

The image, reconstruction, and latent representation for an input of digit 8.

The 8th row is highly activated, so the prediction is naturally 8 (note that the competing digit is 3, but its magnitude is clearly lower). In addition, the highly activated dimensions (7 & 8) represent thickness and orientation. (Use TensorBoard to view the results above.)
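
A sketch of how that prediction falls out of the capsule magnitudes, assuming digit_caps holds the ten 16-D digit capsule vectors for this input (the random array is only a stand-in):

    import numpy as np

    digit_caps = np.random.randn(10, 16) * 0.3        # stand-in for the network output

    magnitudes = np.linalg.norm(digit_caps, axis=-1)  # one "existence" score per digit
    prediction = int(np.argmax(magnitudes))           # would be 8 in the example above

    # The two most activated dimensions of the winning capsule
    # (7 & 8 in the example above: thickness and orientation).
    top_dims = np.argsort(np.abs(digit_caps[prediction]))[-2:]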

Differences

  1. I rescaled the input to [-1, 1] and used tanh as the output non-linearity of the reconstruction net.

  2. Not all hyper-parameters were specified in their paper.

  3. (Note that my Multi-MNIST experiment failed.)

  4. In my Multi-MNIST implementation, I merge two images at runtime instead of precomputing a 60M-image dataset, which differs from the paper's method (see the merging sketch after this list).

  5. How "accuracy" should be defined for Multi-MNIST is unclear to me. Should it be an "unweighted recall" rate, or an "exact match" rate? I ended up adding both to the summary (both computed in the metrics sketch below).

  6. In my Multi-MNIST experiment, the network seemed to lose its ability to infer from images with a single digit, let alone reconstruct them. This wasn't something I expected.
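
For points 4 and 1, here is roughly how the runtime merging works, under my reading of the paper's setup (each 28x28 digit shifted by up to 4 px inside a 36x36 canvas; the exact shifts and the pixel-wise overlay rule here are assumptions). The last line is the [-1, 1] rescaling from point 1:

    import numpy as np

    def merge_pair(img_a, img_b, canvas=36, max_shift=4, rng=np.random):
        # Overlay two 28x28 MNIST digits (pixel values in [0, 1]) into one
        # Multi-MNIST sample at runtime instead of precomputing the dataset.
        def place(img):
            frame = np.zeros((canvas, canvas), dtype=np.float32)
            margin = (canvas - 28) // 2
            dy, dx = rng.randint(-max_shift, max_shift + 1, size=2)
            frame[margin + dy: margin + dy + 28,
                  margin + dx: margin + dx + 28] = img
            return frame
        merged = np.maximum(place(img_a), place(img_b))  # pixel-wise max overlay
        return merged * 2.0 - 1.0                        # rescale [0, 1] -> [-1, 1]

And for point 5, the two candidate metrics can be computed from the two true labels and the two highest-magnitude predictions per sample:

    def multimnist_metrics(pred_pairs, true_pairs):
        # pred_pairs, true_pairs: lists of 2-element sets of digit labels.
        n = len(true_pairs)
        hits = sum(len(p & t) for p, t in zip(pred_pairs, true_pairs))
        recall = hits / (2 * n)                              # "unweighted recall"
        exact = sum(p == t for p, t in zip(pred_pairs, true_pairs)) / n  # "exact match"
        return recall, exact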

Discussion

  1. Their experimental setting for Multi-MNIST isn't 100% reasonable, in my opinion. They assumed a strongly supervised condition in which the two original images are known, which is less practical in real applications. Besides, this condition may allow us to do some other tricks during training...


