资源算法Bert_Chinese_Ner_pytorch

Bert_Chinese_Ner_pytorch

2020-02-18 | |  40 |   0 |   0

Bert_Chinese_NER_By_pytorch

A implements of Named Recognition Entity based on BERT in the framework of Pytorch

Data

Tips

  • Given the BertTokenizer use a greedy longest-match-first algorithm to perform tokenization using a given vocabulary, a word is likely to be splitted into more than one spieces. For example, the input "unaffable" is splitted into ["un", "##aff", "##able"]. This means the number of words processed by BertTokenizer is generally larger than that of the raw inputs. Jocob keeps the first sub_word as the feature sent to crf in his paper, we do so. In fact, in Chinese NER, this case is few. But for robustness, we use a "output_mask" (see preprocessing.data_processor.convert_examples_to_features) to filter the non-first sub_word.

  • Note that if your raw inputs have a word as: "谢ing", it would be tokenized as "谢 ing", instead of "谢 ##ing", which couldn't be filtered by the "output_mask". So we need perform another preprocessing on our raw data to avoid this.

result

classify_report: precision recall f1-score support

  0       0.98      0.97      0.98     43652
  1       0.98      0.98      0.98     84099
  2       1.00      1.00      1.00     43323
  3       1.00      1.00      1.00    117899
  4       0.96      0.88      0.92      3569
  5       0.86      0.99      0.92      7847
  6       0.99      0.99      0.99     49667
  7       0.98      0.99      0.99     78875
  8       1.00      1.00      1.00   4055215
  avg / total 1.00 1.00 1.00 4484146

Epoch 2 - train_loss: 0.047266 - eval_loss: 0.037106 - train_acc:0.997468 - eval_acc:0.997581 - eval_f1:0.973249






上一篇:BERT-NER-TF

下一篇:keras-bert-ner

用户评价
全部评价

热门资源

  • Keras-ResNeXt

    Keras ResNeXt Implementation of ResNeXt models...

  • seetafaceJNI

    项目介绍 基于中科院seetaface2进行封装的JAVA...

  • spark-corenlp

    This package wraps Stanford CoreNLP annotators ...

  • capsnet-with-caps...

    CapsNet with capsule-wise convolution Project ...

  • inferno-boilerplate

    This is a very basic boilerplate example for pe...