QAnet Chinese Reading Comprehension
Demo screenshots: https://github.com/sinlin0908/QAmodel_Demo
Reference
Environment
OS: Ubuntu 18.04 LTS
GPU: GTX 1080Ti 11G
CPU: i7-4770
RAM: 16G
Requirement
Python 3.6
NumPy
tqdm
TensorFlow>=1.5
Jieba
opencc
bottle
Data Set
Modify
prepro file
word tokenization: use jieba.cut(context, cut_all=False)
_getword()
removed the word.lower(), word.capitalize(), and word.upper() lookups
embedding data set
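The two preprocessing changes above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the names `word_tokenize` and `get_word_index`, the `cut` parameter, and the out-of-vocabulary index `1` are assumptions made for the example. Only the call `jieba.cut(context, cut_all=False)` comes from the text.

```python
def word_tokenize(context, cut=None):
    """Split Chinese text into words using jieba's accurate mode."""
    if cut is None:
        # Imported lazily so a stub tokenizer can be passed in instead.
        import jieba
        cut = lambda text: jieba.cut(text, cut_all=False)
    # Drop whitespace-only tokens produced by the segmenter.
    return [token for token in cut(context) if token.strip()]


def get_word_index(word, word2idx, oov_index=1):
    """Look up a word exactly as written.

    Chinese has no letter case, so the lower()/capitalize()/upper()
    fallbacks used for English text are not tried here.
    oov_index=1 is an assumed placeholder for unknown words.
    """
    return word2idx.get(word, oov_index)
```

Passing a custom `cut` function makes it possible to swap in another segmenter without touching the rest of the preprocessing.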
Config
word embedding size: 1292607
hidden size: 128
num_head: 8
batch size: 12
char_emb_size: 300d
pretrain_char: True
Usage
prepro
python config.py --mode prepro
train
python config.py --mode train
test
python config.py --mode test
demo
python config.py --mode demo
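For readers unfamiliar with this pattern, the sketch below shows one way a single entry point can dispatch on `--mode`. It is an illustration under assumptions: it uses the standard-library `argparse`, whereas the real `config.py` may use TensorFlow's flag helpers, and the handler functions and the default mode `train` are hypothetical placeholders.

```python
import argparse


# Hypothetical placeholder handlers; the real ones would run each stage.
def prepro():
    return "prepro"


def train():
    return "train"


def test():
    return "test"


def demo():
    return "demo"


MODES = {"prepro": prepro, "train": train, "test": test, "demo": demo}


def parse_mode(argv=None):
    """Read --mode from the command line (the default 'train' is an assumption)."""
    parser = argparse.ArgumentParser(description="QANet entry point (sketch)")
    parser.add_argument("--mode", choices=sorted(MODES), default="train")
    return parser.parse_args(argv).mode


def main(argv=None):
    """Run the handler selected by --mode and return its result."""
    return MODES[parse_mode(argv)]()
```

Calling `main(["--mode", "prepro"])` in this sketch runs the placeholder `prepro` handler. An unknown mode is rejected by `argparse` before any handler runs.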
Performance
F1 score: 70.0496230556
EM: 70.0257658173
training time: 6 hours
GPU memory usage: 9.4G
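This README does not state how F1 and EM are computed. As a rough reference only, the sketch below follows the usual SQuAD-style definitions: EM checks whether the predicted answer equals the reference, and F1 is the harmonic mean of precision and recall over shared tokens. Treating each character as a token is an assumption made here for Chinese text, and the function names are hypothetical.

```python
from collections import Counter


def exact_match(prediction, truth):
    """Return 1.0 if the predicted answer equals the reference, else 0.0."""
    return float(prediction == truth)


def f1_score(prediction, truth):
    """Token-overlap F1, using single characters as tokens (an assumption)."""
    pred_chars = list(prediction)
    truth_chars = list(truth)
    # Multiset intersection counts each shared character at most as often
    # as it appears in both strings.
    overlap = sum((Counter(pred_chars) & Counter(truth_chars)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_chars)
    recall = overlap / len(truth_chars)
    return 2 * precision * recall / (precision + recall)
```

Scores like those reported above would be the average of these per-question values over the test set, multiplied by 100.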
TensorBoard
tensorboard --logdir=./
Comparison Chart
| No. | Hidden size | Attention heads | Steps | Data size | Word embedding size | F1 | EM |
|---|---|---|---|---|---|---|---|
| 1 | 96 | 1 | 60000 | 15320 | 636086 | 51 | 51 |
| 2 | 96 | 1 | 60000 | 26936 | 636086 | 63 | 63 |
| 3 | 128 | 8 | 60000 | 26936 | 1292607 | 70 | 70 |
| 4 | 128 | 8 | 150000 | 26936 | 1292607 | 69 | 69 |
Note: character embeddings have only a small effect on the results.