PyTorch Pretrained Bert Annotation
This BERT annotation repo is for my personal study. The raw README of PyTorch Pretrained Bert is here. A very nice PPT to help with understanding: the Synthetic Self-Training PPT.
Arch
The architectures of BertModel and BertForMaskedLM, as printed from the model objects.
BertModel Arch

BertEmbeddings
  word_embeddings: Embedding(30522, 768)
  position_embeddings: Embedding(512, 768)
  token_type_embeddings: Embedding(2, 768)
  LayerNorm: BertLayerNorm()
  dropout: Dropout(p=0.1)
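A minimal sketch (mine, not the repo's code) of how BertEmbeddings combines the three embedding tables: word, position, and token-type embeddings are summed, then normalized and dropped out. `nn.LayerNorm` stands in for BertLayerNorm here.

```python
import torch
import torch.nn as nn

class BertEmbeddingsSketch(nn.Module):
    """Sum word + position + segment embeddings, then LayerNorm and dropout."""
    def __init__(self, vocab_size=30522, hidden=768, max_pos=512, type_vocab=2):
        super().__init__()
        self.word_embeddings = nn.Embedding(vocab_size, hidden)
        self.position_embeddings = nn.Embedding(max_pos, hidden)
        self.token_type_embeddings = nn.Embedding(type_vocab, hidden)
        self.LayerNorm = nn.LayerNorm(hidden, eps=1e-12)  # stand-in for BertLayerNorm
        self.dropout = nn.Dropout(p=0.1)

    def forward(self, input_ids, token_type_ids=None):
        # Position ids are just 0..seq_len-1, broadcast over the batch.
        seq_len = input_ids.size(1)
        position_ids = torch.arange(seq_len, device=input_ids.device)
        position_ids = position_ids.unsqueeze(0).expand_as(input_ids)
        if token_type_ids is None:
            token_type_ids = torch.zeros_like(input_ids)
        emb = (self.word_embeddings(input_ids)
               + self.position_embeddings(position_ids)
               + self.token_type_embeddings(token_type_ids))
        return self.dropout(self.LayerNorm(emb))

emb = BertEmbeddingsSketch()
out = emb(torch.randint(0, 30522, (2, 16)))
print(out.shape)  # torch.Size([2, 16, 768])
```

Note that all three tables map into the same 768-dim hidden space, which is what makes the elementwise sum possible.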
BertEncoder
  BertAttention
    BertSelfAttention
      query: Linear(in_features=768, out_features=768, bias=True)
      key: Linear(in_features=768, out_features=768, bias=True)
      value: Linear(in_features=768, out_features=768, bias=True)
      dropout: Dropout(p=0.1)
    BertSelfOutput
      dense: Linear(in_features=768, out_features=768, bias=True)
      LayerNorm: BertLayerNorm()
      dropout: Dropout(p=0.1)
  BertIntermediate
    dense: Linear(in_features=768, out_features=3072, bias=True)
    activation: gelu
  BertOutput
    dense: Linear(in_features=3072, out_features=768, bias=True)
    LayerNorm: BertLayerNorm()
    dropout: Dropout(p=0.1)
BertPooler
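The data flow through one encoder layer above can be sketched as follows (my simplification, not the repo's code): multi-head self-attention from the query/key/value projections, then the 768→3072→768 feed-forward, each sub-block followed by a residual add and LayerNorm. `nn.LayerNorm` and `F.gelu` stand in for BertLayerNorm and the repo's gelu.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BertLayerSketch(nn.Module):
    """One encoder layer: self-attention + feed-forward, each with residual + norm."""
    def __init__(self, hidden=768, heads=12, intermediate=3072):
        super().__init__()
        self.heads, self.head_dim = heads, hidden // heads
        self.query = nn.Linear(hidden, hidden)         # BertSelfAttention.query
        self.key = nn.Linear(hidden, hidden)           # BertSelfAttention.key
        self.value = nn.Linear(hidden, hidden)         # BertSelfAttention.value
        self.attn_out = nn.Linear(hidden, hidden)      # BertSelfOutput.dense
        self.attn_norm = nn.LayerNorm(hidden)          # stand-in for BertLayerNorm
        self.inter = nn.Linear(hidden, intermediate)   # BertIntermediate.dense
        self.out = nn.Linear(intermediate, hidden)     # BertOutput.dense
        self.out_norm = nn.LayerNorm(hidden)
        self.dropout = nn.Dropout(0.1)

    def split(self, x):
        # (B, T, hidden) -> (B, heads, T, head_dim)
        B, T, _ = x.shape
        return x.view(B, T, self.heads, self.head_dim).transpose(1, 2)

    def forward(self, x):
        q, k, v = self.split(self.query(x)), self.split(self.key(x)), self.split(self.value(x))
        scores = q @ k.transpose(-1, -2) / self.head_dim ** 0.5
        probs = self.dropout(scores.softmax(dim=-1))
        ctx = (probs @ v).transpose(1, 2).reshape(x.shape)
        # BertSelfOutput: project, residual, norm.
        x = self.attn_norm(x + self.dropout(self.attn_out(ctx)))
        # BertIntermediate + BertOutput: expand to 3072, gelu, project back.
        ff = self.out(F.gelu(self.inter(x)))
        return self.out_norm(x + self.dropout(ff))

layer = BertLayerSketch()
y = layer(torch.randn(2, 16, 768))
print(y.shape)  # torch.Size([2, 16, 768])
```

The residual connections are why every dense block ends back at 768: each sub-block's output must be addable to its input.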
BertForMaskedLM Arch

BertModel
  BertEmbeddings
    word_embeddings: Embedding(30522, 768)
    position_embeddings: Embedding(512, 768)
    token_type_embeddings: Embedding(2, 768)
    LayerNorm: BertLayerNorm()
    dropout: Dropout(p=0.1)
  BertEncoder
    BertLayer (12 layers)
      BertAttention
        BertSelfAttention
          query: Linear(in_features=768, out_features=768, bias=True)
          key: Linear(in_features=768, out_features=768, bias=True)
          value: Linear(in_features=768, out_features=768, bias=True)
          dropout: Dropout(p=0.1)
        BertSelfOutput
          dense: Linear(in_features=768, out_features=768, bias=True)
          LayerNorm: BertLayerNorm()
          dropout: Dropout(p=0.1)
      BertIntermediate
        dense: Linear(in_features=768, out_features=3072, bias=True)
        activation: gelu
      BertOutput
        dense: Linear(in_features=3072, out_features=768, bias=True)
        LayerNorm: BertLayerNorm()
        dropout: Dropout(p=0.1)
  BertPooler
    dense: Linear(in_features=768, out_features=768, bias=True)
    activation: Tanh()
BertOnlyMLMHead
  BertLMPredictionHead
    transform: BertPredictionHeadTransform
      dense: Linear(in_features=768, out_features=768, bias=True)
      LayerNorm: BertLayerNorm()
    decoder: Linear(in_features=768, out_features=30522, bias=False)
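A minimal sketch (mine, not the repo's code) of the BertLMPredictionHead above: hidden states pass through the transform (dense, activation, LayerNorm), then the bias-free decoder projects back to the 30522-word vocabulary. In BERT the decoder's weight is tied to word_embeddings.weight, which is why decoder has no bias of its own and why its shape mirrors the embedding table. `nn.LayerNorm` and `F.gelu` are stand-ins for BertLayerNorm and the transform's activation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MLMHeadSketch(nn.Module):
    """BertLMPredictionHead sketch: transform, then project to vocab logits."""
    def __init__(self, word_embeddings: nn.Embedding):
        super().__init__()
        vocab, hidden = word_embeddings.weight.shape
        self.dense = nn.Linear(hidden, hidden)               # transform.dense
        self.LayerNorm = nn.LayerNorm(hidden)                # stand-in for BertLayerNorm
        self.decoder = nn.Linear(hidden, vocab, bias=False)  # 768 -> 30522
        self.decoder.weight = word_embeddings.weight         # weight tying

    def forward(self, hidden_states):
        h = self.LayerNorm(F.gelu(self.dense(hidden_states)))
        return self.decoder(h)  # (B, T, vocab) logits over the vocabulary

emb = nn.Embedding(30522, 768)
head = MLMHeadSketch(emb)
logits = head(torch.randn(2, 16, 768))
print(logits.shape)  # torch.Size([2, 16, 30522])
```

Weight tying reuses the input embedding matrix as the output projection, saving 30522 × 768 parameters and coupling the input and output representations of each word.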