ABSA-BERT-pair
Code and corpora for the paper "Utilizing BERT for Aspect-Based Sentiment Analysis via Constructing Auxiliary Sentence" (NAACL 2019)
pytorch: 1.0.0
python: 3.7.1
tensorflow: 1.13.1 (only needed for converting BERT-tensorflow-model to pytorch-model)
numpy: 1.15.4
nltk
sklearn
The link given in the paper that released the SentiHood dataset no longer works, so we use the dataset mirror listed in NLP-progress and fix some mistakes (several sentences contained duplicated aspect data). See directory: data/sentihood/.
Run the following commands to prepare the datasets for the tasks:
cd generate/
bash make.sh sentihood
The training data is available in SemEval-2014 ABSA Restaurant Reviews - Train Data, and the gold test data is available in SemEval-2014 ABSA Test Data - Gold Annotations. See directory: data/semeval2014/.
Run the following commands to prepare the datasets for the tasks:
cd generate/
bash make.sh semeval
Download BERT-Base (Google's pre-trained models) and then convert a tensorflow checkpoint to a pytorch model.
For example:
python convert_tf_checkpoint_to_pytorch.py --tf_checkpoint_path uncased_L-12_H-768_A-12/bert_model.ckpt --bert_config_file uncased_L-12_H-768_A-12/bert_config.json --pytorch_dump_path uncased_L-12_H-768_A-12/pytorch_model.bin
For example, to train the BERT-pair-NLI_M task on the SentiHood dataset:
CUDA_VISIBLE_DEVICES=0,1,2,3 python run_classifier_TABSA.py --task_name sentihood_NLI_M --data_dir data/sentihood/bert-pair/ --vocab_file uncased_L-12_H-768_A-12/vocab.txt --bert_config_file uncased_L-12_H-768_A-12/bert_config.json --init_checkpoint uncased_L-12_H-768_A-12/pytorch_model.bin --eval_test --do_lower_case --max_seq_length 512 --train_batch_size 24 --learning_rate 2e-5 --num_train_epochs 6.0 --output_dir results/sentihood/NLI_M --seed 42
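When several tasks have to be trained, it can help to assemble the command above programmatically. The following is a minimal sketch, not part of the repository: the function name `build_train_command` and its `bert_dir` parameter are hypothetical, while every flag and value is copied from the example command.

```python
import shlex


def build_train_command(task_name, data_dir, output_dir,
                        bert_dir="uncased_L-12_H-768_A-12"):
    """Return the argument list for one run_classifier_TABSA.py training run.

    Hypothetical helper: the flags and hyperparameters mirror the example
    command in this README; nothing else is assumed about the script.
    """
    return [
        "python", "run_classifier_TABSA.py",
        "--task_name", task_name,
        "--data_dir", data_dir,
        "--vocab_file", f"{bert_dir}/vocab.txt",
        "--bert_config_file", f"{bert_dir}/bert_config.json",
        "--init_checkpoint", f"{bert_dir}/pytorch_model.bin",
        "--eval_test",
        "--do_lower_case",
        "--max_seq_length", "512",
        "--train_batch_size", "24",
        "--learning_rate", "2e-5",
        "--num_train_epochs", "6.0",
        "--output_dir", output_dir,
        "--seed", "42",
    ]


cmd = build_train_command(
    "sentihood_NLI_M",
    "data/sentihood/bert-pair/",
    "results/sentihood/NLI_M",
)
# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)`, with `CUDA_VISIBLE_DEVICES` set through the `env` argument if specific GPUs should be used.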
Note:
For SentiHood, --task_name must be one of sentihood_NLI_M, sentihood_QA_M, sentihood_NLI_B, sentihood_QA_B and sentihood_single. For the sentihood_single task, 8 different tasks (using the datasets generated in the data preparation step, step 1; see directory data/sentihood/bert-single) should be trained separately and then evaluated together.
For SemEval-2014, --task_name must be one of semeval_NLI_M, semeval_QA_M, semeval_NLI_B, semeval_QA_B and semeval_single. For the semeval_single task, 5 different tasks (using the datasets generated in the data preparation step, step 1; see directory data/semeval2014/bert-single) should be trained separately and then evaluated together.
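The rules in this note can be checked before launching a run. The sketch below is an illustration only: the names `SENTIHOOD_TASKS`, `SEMEVAL_TASKS` and `runs_required` are hypothetical and do not come from the repository; the task names and run counts are the ones listed above.

```python
# Valid --task_name values, as listed in the note above.
SENTIHOOD_TASKS = (
    "sentihood_NLI_M", "sentihood_QA_M",
    "sentihood_NLI_B", "sentihood_QA_B",
    "sentihood_single",
)
SEMEVAL_TASKS = (
    "semeval_NLI_M", "semeval_QA_M",
    "semeval_NLI_B", "semeval_QA_B",
    "semeval_single",
)


def runs_required(task_name):
    """Return how many separate training runs a task name needs.

    The *_single tasks are trained once per sub-task (8 for SentiHood,
    5 for SemEval-2014); every other task needs a single run.
    """
    if task_name not in SENTIHOOD_TASKS + SEMEVAL_TASKS:
        raise ValueError(f"unknown task name: {task_name!r}")
    if task_name == "sentihood_single":
        return 8
    if task_name == "semeval_single":
        return 5
    return 1


print(runs_required("sentihood_NLI_M"))
print(runs_required("semeval_single"))
```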
Evaluate the results on the test set (calculate Acc, F1, etc.).
For example, to evaluate the BERT-pair-NLI_M task on the SentiHood dataset:
python evaluation.py --task_name sentihood_NLI_M --pred_data_dir results/sentihood/NLI_M/test_ep_4.txt
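For readers unfamiliar with the metrics named above, here is a generic sketch of accuracy and macro-averaged F1 over two label lists. It is not the code in evaluation.py and may differ from the exact metric definitions that script uses; the label strings in the example are hypothetical, and parsing of the prediction file is deliberately left out because its format is not described here.

```python
def accuracy(gold, pred):
    """Fraction of positions where the predicted label equals the gold label."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and of equal length")
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred):
    """Unweighted mean of per-label F1 over all labels seen in gold or pred."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and of equal length")
    scores = []
    for label in sorted(set(gold) | set(pred)):
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the label never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels, for illustration only.
gold_labels = ["positive", "negative", "none", "positive"]
pred_labels = ["positive", "none", "none", "positive"]
print(accuracy(gold_labels, pred_labels))
print(macro_f1(gold_labels, pred_labels))
```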
Note:
As mentioned in the training step (step 3), for the sentihood_single task, 8 different tasks should be trained separately and then evaluated together. --pred_data_dir should be a directory that contains 8 files named as follows: loc1_general.txt, loc1_price.txt, loc1_safety.txt, loc1_transit.txt, loc2_general.txt, loc2_price.txt, loc2_safety.txt and loc2_transit.txt.
As mentioned in the training step (step 3), for the semeval_single task, 5 different tasks should be trained separately and then evaluated together. --pred_data_dir should be a directory that contains 5 files named as follows: price.txt, anecdotes.txt, food.txt, ambience.txt and service.txt.
For the remaining 8 tasks, --pred_data_dir should be a single file, as in the example above.
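Before running the evaluation, the layout described in this note can be verified with a few lines of Python. This is a sketch under the assumptions stated in the note; the helper name `missing_prediction_files` and the two constants are hypothetical and not part of the repository.

```python
import os

# File names required for the *_single tasks, as listed in the note above.
SENTIHOOD_SINGLE_FILES = [
    f"{location}_{aspect}.txt"
    for location in ("loc1", "loc2")
    for aspect in ("general", "price", "safety", "transit")
]
SEMEVAL_SINGLE_FILES = [
    "price.txt", "anecdotes.txt", "food.txt", "ambience.txt", "service.txt",
]


def missing_prediction_files(task_name, pred_data_dir):
    """Return the paths or file names that are missing for the given task.

    For the *_single tasks pred_data_dir must be a directory holding one
    file per sub-task; for every other task it must be a single file.
    """
    if task_name == "sentihood_single":
        expected = SENTIHOOD_SINGLE_FILES
    elif task_name == "semeval_single":
        expected = SEMEVAL_SINGLE_FILES
    else:
        return [] if os.path.isfile(pred_data_dir) else [pred_data_dir]
    return [
        name for name in expected
        if not os.path.isfile(os.path.join(pred_data_dir, name))
    ]


missing = missing_prediction_files("semeval_single", "results/semeval2014/single")
if missing:
    print("missing prediction files:", ", ".join(missing))
```

The directory `results/semeval2014/single` in the usage lines is only a placeholder; substitute the directory that holds your own prediction files.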
@inproceedings{sun-etal-2019-utilizing,
    title = "Utilizing {BERT} for Aspect-Based Sentiment Analysis via Constructing Auxiliary Sentence",
    author = "Sun, Chi and Huang, Luyao and Qiu, Xipeng",
    booktitle = "Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)",
    month = jun,
    year = "2019",
    address = "Minneapolis, Minnesota",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/N19-1035",
    pages = "380--385"
}