Chinese_QAnet

2020-02-20

QAnet Chinese Reading Comprehension

Demo screen: https://github.com/sinlin0908/QAmodel_Demo

Reference

Environment

  • OS: Ubuntu 18.04 LTS

  • GPU: GTX 1080Ti 11G

  • CPU: i7-4770

  • RAM: 16G

Requirement

  • Python 3.6

  • NumPy

  • tqdm

  • Tensorflow>=1.5

  • Jieba

  • opencc

  • bottle

Data Set

  • train set : 26936 questions

  • dev set : 3524 questions

  • test set : 3493 questions

Modify

prepro file

  • word tokenization: use jieba.cut(context, cut_all=False)

  • _getword(): removed the word.lower(), word.capitalize() and word.upper() lookups
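
The two preprocessing changes can be sketched as follows. This is a minimal illustration, not the repository's code: `token_spans` and `get_word_index` are hypothetical names, the token list is written by hand where the real pipeline would call `list(jieba.cut(context, cut_all=False))`, and the out-of-vocabulary index of 1 is an assumption.

```python
def token_spans(text, tokens):
    """Map each token to its (start, end) character offsets in text."""
    spans = []
    current = 0
    for token in tokens:
        start = text.find(token, current)
        if start < 0:
            raise ValueError("token {!r} not found in text".format(token))
        spans.append((start, start + len(token)))
        current = start + len(token)
    return spans


def get_word_index(word, word2idx, oov_index=1):
    """Look up a word exactly as written.

    No lower(), capitalize() or upper() variants are tried, matching the
    modification described above. oov_index is an assumed placeholder.
    """
    return word2idx.get(word, oov_index)


context = "我喜歡自然語言處理"
# In the real pipeline: tokens = list(jieba.cut(context, cut_all=False))
tokens = ["我", "喜歡", "自然語言", "處理"]  # hand-written segmentation
word2idx = {"我": 2, "喜歡": 3, "處理": 4}   # hypothetical vocabulary

spans = token_spans(context, tokens)
ids = [get_word_index(t, word2idx) for t in tokens]
print(spans)
print(ids)
```

The character offsets are what let a predicted token span be mapped back to a substring of the original context.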

embedding data set

  • use 1292607 words 300d embedding data set

  • use 14082 characters 300d character embedding data set
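
To get a feel for the size of these tables, the arithmetic below estimates their in-memory footprint. It assumes 4-byte float32 values, which the README does not state; the row counts and the dimension come from the list above, and `embedding_bytes` is a hypothetical helper.

```python
def embedding_bytes(num_rows, dim, bytes_per_value=4):
    """Size in bytes of a dense num_rows x dim embedding table."""
    return num_rows * dim * bytes_per_value


GIB = 1024 ** 3
MIB = 1024 ** 2

word_bytes = embedding_bytes(1292607, 300)  # word embeddings
char_bytes = embedding_bytes(14082, 300)    # character embeddings

print("word table: {:.2f} GiB".format(word_bytes / GIB))
print("char table: {:.2f} MiB".format(char_bytes / MIB))
```

Under that assumption the word table dominates by roughly two orders of magnitude, so the vocabulary size is the setting to watch when memory is tight.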

Config

  • word size: 1292607

  • hidden size: 128

  • num_head: 8

  • batch size: 12

  • char_emb_size : 300d

  • pretrain_char -> True
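
As a sketch, the settings above can be collected in one place and sanity-checked. The key names below are illustrative and are not necessarily the flag names used in `config.py`; the divisibility check reflects the usual multi-head attention layout, in which the hidden size is split evenly across the heads.

```python
config = {
    "word_size": 1292607,    # rows in the word embedding table
    "hidden_size": 128,
    "num_head": 8,
    "batch_size": 12,
    "char_emb_size": 300,
    "pretrain_char": True,   # use the pretrained character embeddings
}


def head_dim(cfg):
    """Per-head width when hidden_size is split evenly across heads."""
    if cfg["hidden_size"] % cfg["num_head"] != 0:
        raise ValueError("hidden_size must be divisible by num_head")
    return cfg["hidden_size"] // cfg["num_head"]


print(head_dim(config))  # 128 // 8
```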

Usage

prepro

python config.py --mode prepro

train

python config.py --mode train

test

python config.py --mode test

demo

python config.py --mode demo
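
The four modes suggest a single entry point that dispatches on `--mode`. The sketch below shows one way to structure that with `argparse`; it is an assumption about the layout, not a copy of `config.py`, and the handler bodies, the descriptions they return and the `train` default are placeholders.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="QANet runner (illustrative)")
    parser.add_argument(
        "--mode",
        choices=["prepro", "train", "test", "demo"],
        default="train",  # assumed default
        help="which stage to run",
    )
    return parser


def run(mode):
    # Placeholder handlers; the real entry points live in the repository.
    handlers = {
        "prepro": lambda: "build vocabulary and features",
        "train": lambda: "train the model",
        "test": lambda: "evaluate on the test set",
        "demo": lambda: "start the demo",
    }
    return handlers[mode]()


# Parse an explicit argument list so the sketch runs without a command line.
args = build_parser().parse_args(["--mode", "prepro"])
print(run(args.mode))
```

Restricting `--mode` with `choices` makes a mistyped mode fail immediately with a usage message instead of silently doing nothing.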

Performance

  • F1 score: 70.0496230556

  • EM: 70.0257658173

  • time cost: 6 hours

  • GPU memory used: 9.4G
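
For readers unfamiliar with the two metrics, the sketch below computes exact match (EM) and overlap F1 for a single prediction in the SQuAD style. Treating each Chinese character as a token is an assumption made here for illustration; the README does not say how the repository tokenizes answers for scoring.

```python
from collections import Counter


def exact_match(prediction, truth):
    """1.0 if the predicted answer equals the reference exactly, else 0.0."""
    return float(prediction == truth)


def f1_score(prediction, truth):
    """Overlap F1 between two answers, compared character by character."""
    pred_tokens = list(prediction)
    true_tokens = list(truth)
    common = Counter(pred_tokens) & Counter(true_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(true_tokens)
    return 2 * precision * recall / (precision + recall)


pred = "台北市政府"
gold = "台北市"
print(exact_match(pred, gold))
print(f1_score(pred, gold))
```

A prediction that contains the reference plus extra characters scores zero on EM but still earns partial credit on F1. Reported scores are these per-question values averaged over the evaluation set and multiplied by 100.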

TensorBoard

tensorboard --logdir=./

dev loss

Comparison Chart

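| number | hidden size | attention head | step | data size | word embedding size | F1 | EM |
|---|---|---|---|---|---|---|---|
| 1 | 96 | 1 | 60000 | 15320 | 636086 | 51 | 51 |
| 2 | 96 | 1 | 60000 | 26936 | 636086 | 63 | 63 |
| 3 | 128 | 8 | 60000 | 26936 | 1292607 | 70 | 70 |
| 4 | 128 | 8 | 150000 | 26936 | 1292607 | 69 | 69 |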

Note: the character embedding has only a small effect.

