
Character-level-cnn-pytorch


[PYTORCH] Character-level Convolutional Networks for Text Classification

Introduction

Here is my PyTorch implementation of the model described in the paper Character-level Convolutional Networks for Text Classification.
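For context, the paper represents each text as a sequence of one-hot character vectors over a fixed 70-character alphabet, truncated or padded to 1014 characters, which is then fed to a stack of 1-D convolutions. The snippet below is a minimal sketch of that quantization step based on the paper's description, not the exact preprocessing code of this repository:

```python
import numpy as np

# 70-character alphabet listed in the paper (lowercase letters, digits,
# punctuation, and the newline character). Illustrative sketch only.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-,;.!?:'\"/\\|_@#$%^&*~`+-=<>()[]{}\n"
MAX_LENGTH = 1014  # maximum number of characters per document in the paper

CHAR_TO_INDEX = {c: i for i, c in enumerate(ALPHABET)}

def quantize(text: str) -> np.ndarray:
    """One-hot encode the first MAX_LENGTH characters of `text`.
    Characters outside the alphabet become all-zero columns."""
    encoded = np.zeros((len(ALPHABET), MAX_LENGTH), dtype=np.float32)
    for position, char in enumerate(text.lower()[:MAX_LENGTH]):
        index = CHAR_TO_INDEX.get(char)
        if index is not None:
            encoded[index, position] = 1.0
    return encoded
```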

Datasets:

Statistics of the datasets I used for the experiments. These datasets can be downloaded from link

Dataset                   Classes   Train samples   Test samples
AG’s News                 4         120,000         7,600
Sogou News                5         450,000         60,000
DBPedia                   14        560,000         70,000
Yelp Review Polarity      2         560,000         38,000
Yelp Review Full          5         650,000         50,000
Yahoo! Answers            10        1,400,000       60,000
Amazon Review Full        5         3,000,000       650,000
Amazon Review Polarity    2         3,600,000       400,000

Setting:

I mostly keep the default settings described in the paper. For the optimizer and learning rate, I use two settings:

  • SGD optimizer with initial learning rate of 0.01. The learning rate is halved every 3 epochs.

  • Adam optimizer with initial learning rate of 0.001.

Additionally, in the original model, one epoch is a loop over batch_size x num_batch records (128x5000, 128x10000, or 128x30000), which means some records are used more than once within an epoch. In my model, one epoch is a complete pass over the whole dataset, where each record is used exactly once.
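As a rough sketch of how these two settings and this epoch definition translate into PyTorch (the function names and objects below are placeholders, not the actual code of train.py):

```python
import torch

def build_optimizer(model, name="sgd"):
    # Setting 1: SGD with an initial learning rate of 0.01,
    # halved every 3 epochs via a step scheduler.
    if name == "sgd":
        optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
        scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=3, gamma=0.5)
    # Setting 2: Adam with an initial learning rate of 0.001, no schedule.
    else:
        optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
        scheduler = None
    return optimizer, scheduler

def train_one_epoch(model, loader, optimizer, criterion, device="cpu"):
    # One epoch = exactly one full pass over the DataLoader,
    # so every record is used exactly once per epoch.
    model.train()
    for features, labels in loader:
        features, labels = features.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(features), labels)
        loss.backward()
        optimizer.step()
```

For the SGD setting, scheduler.step() would be called once at the end of every epoch, so the learning rate is halved after every third epoch.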

Training

If you want to train a model with a common dataset and default parameters, you can run:

  • python train.py -d dataset_name: For example, python train.py -d dbpedia

If you want to train a model with a common dataset and your preferred parameters, such as the optimizer and learning rate, you can run:

  • python train.py -d dataset_name -p optimizer_name -l learning_rate: For example, python train.py -d dbpedia -p sgd -l 0.01

If you want to train a model with your own dataset, you need to specify the paths to the input and output folders:

  • python train.py -i path/to/input/folder -o path/to/output/folder

You can find all the models I have trained in link
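For orientation, the flags used above would correspond to a command-line interface roughly like the sketch below; the long option names and defaults are inferred from the examples in this section and may differ from the actual train.py:

```python
import argparse

def get_args():
    parser = argparse.ArgumentParser(
        description="Character-level CNN for text classification")
    parser.add_argument("-d", "--dataset", type=str, default="ag_news",
                        help="name of a common dataset, e.g. dbpedia")
    parser.add_argument("-p", "--optimizer", type=str, choices=["sgd", "adam"],
                        default="sgd", help="optimizer to use")
    parser.add_argument("-l", "--lr", type=float, default=0.01,
                        help="initial learning rate")
    parser.add_argument("-i", "--input", type=str, default="input",
                        help="path to the input folder for a custom dataset")
    parser.add_argument("-o", "--output", type=str, default="output",
                        help="folder where logs and trained models are written")
    return parser.parse_args()
```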

Experiments:

I ran the experiments on two machines, one with an NVIDIA TITAN X 12 GB GPU and the other with an NVIDIA Quadro 6000 24 GB GPU. The small and large model configurations need about 1.6 GB and 3.5 GB of GPU memory respectively.

Results on the test sets are presented as A(B), where:

  • A is the accuracy reproduced here.

  • B is the accuracy reported in the paper.

I used SGD and Adam as optimizers, with different initial learning rates. You can find the specific configuration for each experiment in output/datasetname_scale/logs.txt, for example output/ag_news_small/logs.txt.

Each experiment runs for at most 20 epochs. Early stopping is applied, with the patience set to 3 by default.
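A minimal sketch of that stopping rule, assuming a one-epoch training step and a test-set evaluation function are supplied by the training script (both are placeholders here):

```python
def fit_with_early_stopping(train_fn, eval_fn, max_epochs=20, patience=3):
    """Run up to `max_epochs` epochs and stop once `eval_fn()` (test accuracy)
    has not improved for `patience` consecutive epochs."""
    best_accuracy = 0.0
    epochs_without_improvement = 0
    for epoch in range(max_epochs):
        train_fn()                      # train for one epoch
        accuracy = eval_fn()            # evaluate on the test set
        if accuracy > best_accuracy:
            best_accuracy = accuracy    # typically the best checkpoint is saved here
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break                   # early stopping triggered
    return best_accuracy
```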

Dataset            Small           Large
ag_news            86.71 (84.35)   88.13 (87.18)
sogou_news         95.08 (91.35)   94.90 (95.12)
db_pedia           97.53 (98.02)   97.60 (98.27)
yelp_polarity      91.40 (93.47)   93.50 (94.11)
yelp_review        56.09 (59.16)   58.93 (60.38)
yahoo_answer       65.91 (70.16)   64.93 (70.45)
amazon_review      56.77 (59.47)   59.01 (58.69)
amazon_polarity    92.54 (94.50)   93.85 (94.49)

The training/test loss and accuracy curves for each dataset's experiments (figures for the small model are on the left) are shown below:

  • ag_news

[Figures: training/test loss and accuracy curves for ag_news, small model (left) and large model (right)]

You can find a detailed log for each experiment, containing the loss, accuracy, and confusion matrix at the end of each epoch, in output/datasetname_scale/logs.txt, for example output/ag_news_small/logs.txt.
