X-BERT
This is a README for the experimental code in our paper
Wei-Cheng Chang, Hsiang-Fu Yu, Kai Zhong, Yiming Yang, Inderjit Dhillon
Preprint 2019

    > conda create -n xbert-env python=3.7 --file environment.yml
    > source activate xbert-env
    > (xbert-env) pip install -e .
Notice: the following examples are executed under the (xbert-env) conda virtual environment.
We demonstrate how to reproduce the evaluation results in our paper by downloading the raw dataset and pretrained models.
Change directory into ./datasets folder, download and unzip each dataset
    cd ./datasets
    bash download-data.sh Eurlex-4K
    bash download-data.sh Wiki10-31K
    bash download-data.sh AmazonCat-13K
    bash download-data.sh Wiki-500K
    cd ../
Each dataset contains the following files:
- X.trn.npz, X.val.npz, X.tst.npz: tf-idf sparse data matrices
- Y.trn.npz, Y.val.npz, Y.tst.npz: label sparse matrices
- L.elmo.npz, L.pifa.npz: label embedding matrices
- mlc2seq/{train,valid,test}.txt: each line is label_ids, a tab, then raw_text
- mlc2seq/label_vocab.txt: each line is label_count, a tab, then label_text
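To make the mlc2seq line layout concrete, here is a minimal Python sketch that splits one line into its label ids and raw text. The function name `parse_mlc2seq_line` is hypothetical, and the comma separator between label ids is an assumption; the README only states that the two fields are separated by a tab.

```python
from typing import List, Tuple


def parse_mlc2seq_line(line: str) -> Tuple[List[int], str]:
    """Split one mlc2seq line into (label_ids, raw_text).

    Assumption: label ids are comma-separated integers placed
    before the first tab. Adjust the separator if the files differ.
    """
    label_field, raw_text = line.rstrip("\n").split("\t", 1)
    label_ids = [int(tok) for tok in label_field.split(",") if tok]
    return label_ids, raw_text


# Made-up example line, not taken from any dataset.
example = "12,345,6789\tcouncil regulation on fishery quotas\n"
print(parse_mlc2seq_line(example))
```

Splitting on the first tab only keeps any later tabs inside the raw text intact.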
Change directory into ./pretrained_models folder, download and unzip models for each dataset
    cd ./pretrained_models
    bash download-models.sh Eurlex-4K
    bash download-models.sh Wiki10-31K
    bash download-models.sh AmazonCat-13K
    bash download-models.sh Wiki-500K
    cd ../
Load the indexing codes, generate predicted codes from the pretrained matchers, and predict labels from the pretrained rankers.

    export DATASETS=Eurlex-4K
    bash scripts/run_linear_eval.sh ${DATASETS}
    bash scripts/run_xbert_eval.sh ${DATASETS}
    bash scripts/run_xttention_eval.sh ${DATASETS}

- DATASETS: the dataset name, such as Eurlex-4K, Wiki10-31K, AmazonCat-13K, or Wiki-500K.
    python -m xbert.evaluator -y [path to Y.tst.npz] -e prediction-path [prediction-path ... ]

For example, given the ranker prediction files (tst.pred.xbert.npz),

    python -m xbert.evaluator -y datasets/Eurlex-4K/Y.tst.npz -e pretrained_models/Eurlex-4K/*/ranker/tst.pred.xbert.npz

which computes the metric for the X-BERT ensemble of the label_emb={elmo,pifa} and seed={0,1,2} combinations.
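As a rough illustration of what evaluating an ensemble involves, the sketch below averages per-label scores from several models and computes Precision@k for one test instance. Both choices are assumptions: the README does not say which metric xbert.evaluator reports or how it combines the prediction files. Precision@k is a standard metric for extreme multi-label classification, and the function names here are hypothetical.

```python
from typing import Dict, List, Set


def average_scores(predictions: List[Dict[int, float]]) -> Dict[int, float]:
    """Average label scores across models (assumed ensembling rule).

    A label that a model did not score counts as 0 for that model.
    """
    labels = set().union(*predictions)
    return {
        label: sum(p.get(label, 0.0) for p in predictions) / len(predictions)
        for label in labels
    }


def precision_at_k(scores: Dict[int, float], true_labels: Set[int], k: int) -> float:
    """Fraction of the k highest-scored labels that are true labels."""
    # Ties are broken by label id so the ranking is deterministic.
    top_k = sorted(scores, key=lambda label: (-scores[label], label))[:k]
    return sum(1 for label in top_k if label in true_labels) / k


# Two made-up models scoring labels for a single test instance.
model_a = {1: 0.9, 2: 0.4, 3: 0.1}
model_b = {2: 0.8, 3: 0.7}
ensemble = average_scores([model_a, model_b])
print(precision_at_k(ensemble, true_labels={2, 3}, k=1))
```

In a full evaluation this per-instance value would be averaged over all test instances.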
We support ELMo and PIFA label embeddings, given the file label_vocab.txt.

    cd ./datasets/
    python label_embedding.py --dataset ${DATASET} --embed-type ${LABEL_EMB}
    cd ../

- DATASET: the customized dataset name, whose folder contains the necessary files as described for the downloaded datasets
- LABEL_EMB: currently supports either elmo or pifa
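For intuition about the pifa option, here is a minimal sketch of Positive Instance Feature Aggregation as it is usually described: each label's embedding is the sum of the feature vectors of the instances tagged with that label, scaled to unit length. This is not the code in label_embedding.py, which may differ in detail; the dict-based sparse vectors and the name `pifa_embeddings` are choices made only for this example.

```python
import math
from collections import defaultdict
from typing import Dict, List

SparseVec = Dict[int, float]  # feature index -> value


def pifa_embeddings(X: List[SparseVec], Y: List[List[int]]) -> Dict[int, SparseVec]:
    """Sum the features of each label's positive instances, then L2-normalize."""
    sums: Dict[int, Dict[int, float]] = defaultdict(lambda: defaultdict(float))
    for features, labels in zip(X, Y):
        for label in labels:
            for idx, val in features.items():
                sums[label][idx] += val

    embeddings: Dict[int, SparseVec] = {}
    for label, vec in sums.items():
        norm = math.sqrt(sum(v * v for v in vec.values()))
        # Leave all-zero vectors unscaled to avoid dividing by zero.
        embeddings[label] = {i: v / norm for i, v in vec.items()} if norm > 0 else dict(vec)
    return embeddings


# Toy data: two instances, both tagged with label 7.
X = [{0: 1.0}, {1: 1.0}]
Y = [[7], [7]]
print(pifa_embeddings(X, Y)[7])
```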
Before training the deep neural matcher, we first obtain the indexed label codes and the linear ranker. The following examples assume a directory structure similar to that of the pretrained_models folder.
An example usage would be:

    OUTPUT_DIR=save_models/${DATASET}/${LABEL_EMB}-a${ALGO}-s${SEED}
    mkdir -p ${OUTPUT_DIR}/indexer
    python -m xbert.indexer \
        -i datasets/${DATASET}/L.${LABEL_EMB}.npz \
        -o ${OUTPUT_DIR}/indexer \
        -d ${DEPTH} --algo ${ALGO} --seed ${SEED} --max-iter 20

- ALGO: clustering algorithm. 0 for KMEANS, 5 for SKMEANS
- DEPTH: the depth of hierarchical 2-means
- SEED: random seed
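The role of DEPTH can be seen in a small sketch of hierarchical 2-means: every level splits each cluster in two, so depth d yields at most 2**d clusters, and each label receives the code of the cluster it ends up in. The sketch below uses plain Euclidean 2-means on dense lists and is only an illustration; it is not the xbert.indexer implementation, and all function names are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def _sqdist(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _mean(vectors: List[Sequence[float]]) -> List[float]:
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def two_means(points, ids: List[int], rng: random.Random,
              max_iter: int = 20) -> Tuple[List[int], List[int]]:
    """Split ids into two groups with plain Euclidean 2-means."""
    c0, c1 = (points[i] for i in rng.sample(ids, 2))
    left: List[int] = []
    right: List[int] = []
    for _ in range(max_iter):
        new_left = [i for i in ids if _sqdist(points[i], c0) <= _sqdist(points[i], c1)]
        new_right = [i for i in ids if _sqdist(points[i], c0) > _sqdist(points[i], c1)]
        if not new_left or not new_right:
            # Degenerate split (e.g. duplicate points): fall back to halving.
            half = len(ids) // 2
            return ids[:half], ids[half:]
        if new_left == left:
            break  # assignments stopped changing
        left, right = new_left, new_right
        c0 = _mean([points[i] for i in left])
        c1 = _mean([points[i] for i in right])
    return left, right


def hierarchical_two_means(points, depth: int, seed: int = 0) -> List[int]:
    """Return one cluster code per point after `depth` levels of 2-way splits."""
    rng = random.Random(seed)
    groups = [list(range(len(points)))]
    for _ in range(depth):
        next_groups = []
        for group in groups:
            if len(group) < 2:
                next_groups.append(group)  # cannot be split further
            else:
                next_groups.extend(two_means(points, group, rng))
        groups = next_groups
    codes = [0] * len(points)
    for code, group in enumerate(groups):
        for i in group:
            codes[i] = code
    return codes


# Four made-up label vectors forming two obvious groups.
label_vectors = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
print(hierarchical_two_means(label_vectors, depth=1, seed=0))
```

The seed only affects which points start as centers, which is why the README exposes it as a parameter worth varying across ensemble members.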
Before training, we need to generate the preprocessed data as binary pickle files.

    OUTPUT_DIR=save_models/${DATASET}/${LABEL_EMB}-a${ALGO}-s${SEED}
    mkdir -p $OUTPUT_DIR/data-bin-${MATCHER}
    CUDA_VISIBLE_DEVICES=${GPUS} python -m xbert.preprocess \
        -m ${MATCHER} \
        -i datasets/${DATASET} \
        -c ${OUTPUT_DIR}/indexer/code.npz \
        -o ${OUTPUT_DIR}/data-bin-${MATCHER}

- GPUS: the available gpu_id
- MATCHER: currently supports xttention or xbert
Set the hyper-parameters properly. An example would be:

    GPUS=0,1,2,3,4,5
    MATCHER=xbert
    TRAIN_BATCH_SIZE=36
    EVAL_BATCH_SIZE=64
    LOG_INTERVAL=1000
    EVAL_INTERVAL=10000
    NUM_TRAIN_EPOCHS=12
    LEARNING_RATE=5e-5
    WARMUP_RATE=0.1

Users can also check scripts/run_xbert.sh to see the detailed settings for each dataset used in the paper.
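To get a feel for how these values interact, the following sketch estimates the total number of optimization steps and the number of warmup steps. It assumes that TRAIN_BATCH_SIZE is the global batch size, that each batch triggers one optimizer step, and that WARMUP_RATE is the fraction of all steps spent warming up; none of this is confirmed by the README. The training-set size is a made-up number.

```python
import math
from typing import Tuple


def schedule_lengths(num_train_examples: int, train_batch_size: int,
                     num_train_epochs: int, warmup_rate: float) -> Tuple[int, int]:
    """Return (total_steps, warmup_steps) under the assumptions stated above."""
    steps_per_epoch = math.ceil(num_train_examples / train_batch_size)
    total_steps = steps_per_epoch * num_train_epochs
    warmup_steps = int(total_steps * warmup_rate)
    return total_steps, warmup_steps


# 15000 is a hypothetical training-set size, not a dataset statistic.
total, warmup = schedule_lengths(15000, train_batch_size=36,
                                 num_train_epochs=12, warmup_rate=0.1)
print(total, warmup)  # 5004 500
```

Comparing the total against LOG_INTERVAL and EVAL_INTERVAL shows how often logging and evaluation would run during training.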
We are now ready to run the xbert models:

    OUTPUT_DIR=save_models/${DATASET}/${LABEL_EMB}-a${ALGO}-s${SEED}
    mkdir -p ${OUTPUT_DIR}/matcher/${MATCHER}
    CUDA_VISIBLE_DEVICES=${GPUS} python -u -m xbert.matcher.bert \
        -i ${OUTPUT_DIR}/data-bin-${MATCHER}/data_dict.pt \
        -o ${OUTPUT_DIR}/matcher/${MATCHER} \
        --bert_model bert-base-uncased \
        --do_train --do_eval --stop_by_dev \
        --learning_rate ${LEARNING_RATE} \
        --warmup_proportion ${WARMUP_RATE} \
        --train_batch_size ${TRAIN_BATCH_SIZE} \
        --eval_batch_size ${EVAL_BATCH_SIZE} \
        --num_train_epochs ${NUM_TRAIN_EPOCHS} \
        --log_interval ${LOG_INTERVAL} \
        --eval_interval ${EVAL_INTERVAL} \
        > ${OUTPUT_DIR}/matcher/${MATCHER}.log

Set the hyper-parameters properly. An example would be:

    GPUS=0
    MATCHER=xttention
    TRAIN_BATCH_SIZE=128
    LOG_INTERVAL=100
    EVAL_INTERVAL=1000
    NUM_TRAIN_EPOCHS=10

Users can also check scripts/run_xttention.sh to see the detailed settings for each dataset used in the paper.
We are now ready to run the xttention models:

    OUTPUT_DIR=save_models/${DATASET}/${LABEL_EMB}-a${ALGO}-s${SEED}
    mkdir -p ${OUTPUT_DIR}/matcher/${MATCHER}
    CUDA_VISIBLE_DEVICES=${GPUS} python -u -m xbert.matcher.attention \
        -i ${OUTPUT_DIR}/data-bin-${MATCHER}/data_dict.pt \
        -o ${OUTPUT_DIR}/matcher/${MATCHER} \
        --do_train --do_eval --cuda --stop_by_dev \
        --train_batch_size ${TRAIN_BATCH_SIZE} \
        --num_train_epochs ${NUM_TRAIN_EPOCHS} \
        --log_interval ${LOG_INTERVAL} \
        --eval_interval ${EVAL_INTERVAL} \
        > ${OUTPUT_DIR}/matcher/${MATCHER}.log

Predict the indices using the trained X-BERT or Xttention model. The example below uses the X-BERT matcher:

    OUTPUT_DIR=save_models/${DATASET}/${LABEL_EMB}-a${ALGO}-s${SEED}
    CUDA_VISIBLE_DEVICES=${GPUS} python -u -m xbert.matcher.bert \
        -i ${OUTPUT_DIR}/data-bin-${MATCHER}/data_dict.pt \
        -o ${OUTPUT_DIR}/matcher/${MATCHER} \
        --bert_model bert-base-uncased \
        --do_eval \
        --init_checkpoint_dir ${OUTPUT_DIR}/matcher/${MATCHER}

The prediction output will be stored in ${OUTPUT_DIR}/matcher/${MATCHER}/C_eval_pred.npz
An example usage for training the linear ranker would be:

    OUTPUT_DIR=save_models/${DATASET}/${LABEL_EMB}-a${ALGO}-s${SEED}
    mkdir -p $OUTPUT_DIR/ranker
    python -m xbert.ranker train \
        -x datasets/${DATASET}/X.trn.npz \
        -y datasets/${DATASET}/Y.trn.npz \
        -c ${OUTPUT_DIR}/indexer/code.npz \
        -o ${OUTPUT_DIR}/ranker

An example usage for predicting with the trained ranker would be:

    OUTPUT_DIR=save_models/${DATASET}/${LABEL_EMB}-a${ALGO}-s${SEED}
    mkdir -p $OUTPUT_DIR/ranker
    python -m xbert.ranker predict \
        -m ${OUTPUT_DIR}/ranker \
        -x datasets/${DATASET}/X.tst.npz \
        -y datasets/${DATASET}/Y.tst.npz \
        -c ${OUTPUT_DIR}/matcher/${MATCHER}/C_eval_pred.npz \
        -o ${OUTPUT_DIR}/ranker/tst.prediction.npz

Some portions of this repo are borrowed from the following repos: