You will get a result like this: [[0.98687243 0.01312758]]
Parameter
| name | type | detail |
| --- | --- | --- |
| gpu_no | int | which GPU is used to initialize the BERT NER graph |
| log_dir | str | log directory |
| verbose | bool | whether to show the TensorFlow log |
| bert_sim_model | str | path to the BERT similarity model |
Train
Code
This project only fine-tunes the pre-trained BERT model, so it uses the original BERT code. I tried writing my own training code, but it turned out to be the same as the original, so I gave up on that.
Dataset
My work is in judicial examination education, so I did not use a common public dataset. My dataset was labeled by hand and contains 80,000+ sentence pairs: 50,000+ similar and 30,000+ dissimilar. For privacy reasons, I cannot open-source this dataset.
Suggestion:
The original code only uses the model's pooled output. I thought there might be other ways to increase accuracy, tried several, and found one that works: concatenate the [CLS] embeddings from the last four layers (the fourth from last through the final one) of the encoder output list. To use this approach, make the following change.
Delete the following line:
output_layer = model.get_pooled_output()
Use the following line instead; it can increase the accuracy by 1%.
output_layer = tf.concat([tf.squeeze(model.all_encoder_layers[i][:, 0:1, :], axis=1) for i in range(-4, 0, 1)], axis=-1)
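To make the shapes in the replacement line easier to follow, here is a minimal NumPy sketch of the same slicing and concatenation. It is not the project's code: the function name `concat_last_four_cls`, the random stand-in data, and the sizes (batch of 2, sequence length 8, hidden size 768, 12 layers) are all assumptions chosen for illustration.

```python
import numpy as np


def concat_last_four_cls(all_encoder_layers):
    """Concatenate the [CLS] vector (token position 0) of the last four layers.

    Hypothetical helper mirroring the tf.concat line above, using NumPy arrays
    of shape [batch_size, seq_length, hidden_size] in place of tensors.
    """
    cls_vectors = [layer[:, 0, :] for layer in all_encoder_layers[-4:]]
    return np.concatenate(cls_vectors, axis=-1)


# Illustrative sizes only (assumptions, not values taken from the project).
batch_size, seq_length, hidden_size, num_layers = 2, 8, 768, 12

# Random arrays standing in for model.all_encoder_layers.
rng = np.random.default_rng(0)
layers = [
    rng.standard_normal((batch_size, seq_length, hidden_size))
    for _ in range(num_layers)
]

output_layer = concat_last_four_cls(layers)
print(output_layer.shape)  # (2, 3072), i.e. 4 * hidden_size per example
```

The last dimension grows from `hidden_size` to `4 * hidden_size`, so any classification layer placed on top of `output_layer` needs its weights sized from the new last dimension rather than from the model's configured hidden size.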