I've verified that, given the same arguments, my network has exactly the
same number of parameters as the original model. It reaches the same
loss/accuracy on these tasks, but sometimes converges slightly more
slowly than the original Torch implementation.
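As a sanity check on parameter counts, the total for a generic TCN can be computed analytically from its hyperparameters. The sketch below assumes the block structure described in Bai et al. [1]: each temporal block has two dilated causal Conv1d layers (weight plus bias), and a 1x1 downsample convolution when the input and output channel counts differ. The function name and signature are illustrative, not part of this repository's API.

```python
def tcn_param_count(num_inputs, num_channels, kernel_size):
    """Analytic parameter count for a generic TCN (structure per Bai et al. [1]).

    num_inputs:   number of input channels
    num_channels: output channels of each temporal block, in order
    kernel_size:  convolution kernel size shared by all blocks
    """
    total = 0
    c_in = num_inputs
    for c_out in num_channels:
        # First dilated causal conv: weight (c_out, c_in, k) + bias (c_out,)
        total += c_out * c_in * kernel_size + c_out
        # Second dilated causal conv: weight (c_out, c_out, k) + bias (c_out,)
        total += c_out * c_out * kernel_size + c_out
        # 1x1 downsample conv on the residual path when channels change
        if c_in != c_out:
            total += c_out * c_in + c_out
        c_in = c_out
    return total

# Example: 1 input channel, two blocks of 25 channels, kernel size 7
print(tcn_param_count(1, [25, 25], 7))  # → 13450
```

Comparing this number against `sum(p.numel() for p in model.parameters())` on the instantiated model is a quick way to confirm the two implementations agree. Note that weight normalization [2] reparameterizes each weight tensor as a direction `v` and a scalar gain `g` per output channel, so the raw parameter count differs slightly from the unnormalized layer.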
This repository largely follows the structure of the original repo. For
illustrations of the different tasks, take a look at keras TCN, where
the author provides some nice figures.
[1] Bai, Shaojie, J. Zico Kolter, and Vladlen Koltun. "An empirical
evaluation of generic convolutional and recurrent networks for sequence
modeling." arXiv preprint arXiv:1803.01271 (2018).
[2] Salimans, Tim, and Diederik P. Kingma. "Weight normalization: A
simple reparameterization to accelerate training of deep neural
networks." Advances in Neural Information Processing Systems. 2016.