QANet_keras

2020-02-20

QANet in Keras

QANet: https://arxiv.org/abs/1804.09541

This Keras model is based on the TensorFlow implementation of QANet (https://github.com/NLPLearn/QANet).

I find that the convolution-based multi-head attention from tensor2tensor (as used in https://github.com/NLPLearn/QANet/blob/master/layers.py) performs 3% to 4% better than the matrix-multiplication-based one in (https://github.com/bojone/attention/blob/master/attention_keras.py).

Pipeline

Download the SQuAD data files dev-v1.1.json and train-v1.1.json from (https://rajpurkar.github.io/SQuAD-explorer/) to the folder ./original_data.
Download glove.840B.300d.txt from (https://nlp.stanford.edu/projects/glove/) to the folder ./original_data.
Run python preprocess.py to generate the wordpiece-based preprocessed data.
Run python train_QANet.py to start training.
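
Before running preprocess.py, it can help to confirm that the three downloaded files are where the steps above put them. The following is a minimal sketch, assuming the scripts are run from the repository root so that ./original_data resolves correctly. The helper name missing_inputs is hypothetical and is not part of this repository.

```python
from pathlib import Path

# File names taken from the download steps above.
REQUIRED_FILES = ("train-v1.1.json", "dev-v1.1.json", "glove.840B.300d.txt")


def missing_inputs(data_dir="./original_data", required=REQUIRED_FILES):
    """Return the required file names that are not present in data_dir."""
    root = Path(data_dir)
    return [name for name in required if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_inputs()
    if missing:
        print("Missing from ./original_data:", ", ".join(missing))
    else:
        print("All inputs found; run preprocess.py, then train_QANet.py.")
```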

Updates

I find that EMA (exponential moving average of the weights) is hard to implement on the GPU in Keras, and it slows training considerably. It is also hard to add the slice op in Keras, which slows training further (it takes about twice as long as the optimized TensorFlow version).

Now, the GPU version of EMA works properly in Keras.
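
For readers unfamiliar with EMA, the idea is to keep a shadow copy of each weight and update it after every training step as shadow = decay * shadow + (1 - decay) * weight, then evaluate with the shadow values. The sketch below is framework-free and only illustrates that update rule. It is not the GPU implementation used in this repository, and the class name WeightEMA is hypothetical.

```python
class WeightEMA:
    """Keeps an exponential moving average (shadow copy) of named weights."""

    def __init__(self, weights, decay=0.9999):
        self.decay = decay
        # Start the shadow values at the initial weights.
        self.shadow = {name: list(values) for name, values in weights.items()}

    def update(self, weights):
        """Blend the current weights into the shadow copy; call once per step."""
        d = self.decay
        for name, values in weights.items():
            self.shadow[name] = [
                d * s + (1.0 - d) * w
                for s, w in zip(self.shadow[name], values)
            ]


# A small decay is used here only so that the effect is visible in one step;
# the results in this document use ema_decay=0.9999.
weights = {"kernel": [0.0, 1.0]}
ema = WeightEMA(weights, decay=0.5)
weights["kernel"] = [1.0, 1.0]  # pretend an optimizer step changed the weights
ema.update(weights)
print(ema.shadow["kernel"])  # [0.5, 1.0]
```

With a decay close to 1, the shadow weights change slowly and average over many recent steps.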

Results

All models use 8 attention heads and 128 filters.

Setting	Epoch	EM / F1
batch_size=24	11	66.24% / 76.75%
batch_size=24 + ema_decay=0.9999	14	69.51% / 79.13%
batch_size=24 + ema_decay=0.9999 + wordpiece	17	70.07% / 79.52%
batch_size=24 + ema_decay=0.9999 + wordpiece + CoVe	13	71.48% / 80.85%
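
EM and F1 here are presumably the usual SQuAD v1.1 metrics: exact match after answer normalization, and token-level F1 between the predicted and reference answers, each taken as the best score over the available reference answers. The sketch below follows the normalization steps of the official SQuAD v1.1 evaluation script (lowercase, strip punctuation, drop English articles, collapse whitespace). It is an illustration and is not taken from this repository, and the helper name best_over_truths is hypothetical.

```python
import re
import string
from collections import Counter

PUNCTUATION = set(string.punctuation)


def normalize_answer(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, truth):
    """True if the two answers are identical after normalization."""
    return normalize_answer(prediction) == normalize_answer(truth)


def f1_score(prediction, truth):
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    truth_tokens = normalize_answer(truth).split()
    overlap = sum((Counter(pred_tokens) & Counter(truth_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)


def best_over_truths(metric, prediction, truths):
    """Score a prediction against every reference answer and keep the best."""
    return max(metric(prediction, truth) for truth in truths)


truths = ["Eiffel Tower", "Eiffel Tower in Paris"]
print(best_over_truths(exact_match, "the Eiffel Tower", truths))  # True
print(f1_score("Eiffel Tower", "Eiffel Tower in Paris"))
```

The reported percentages would then be these per-question scores averaged over the dev set.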
