A Semi-Markov Structured Support Vector Machine Model for High-Precision Named Entity Recognition
Abstract
Named entity recognition (NER) is the backbone of many NLP solutions. F1 score, the
harmonic mean of precision and recall, is often
used to select/evaluate the best models. However, when precision needs to be prioritized
over recall, a state-of-the-art model might not
be the best choice. There is little in the literature that directly addresses training-time modifications to achieve higher-precision information extraction. In this paper, we propose a
neural semi-Markov structured support vector machine model that controls the precision-recall trade-off by assigning weights to different types of errors in the loss-augmented inference during training. The semi-Markov property provides more accurate phrase-level predictions, thereby improving performance. We
empirically demonstrate the advantage of our
model when high precision is required by comparing against strong baselines based on CRF.
In our experiments with the CoNLL 2003
dataset, our model achieves a better precision-recall trade-off at various precision levels.
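
As a rough illustration of how weighting error types in loss-augmented inference can steer the precision-recall trade-off, the following minimal Python sketch may help. It is not the model described in this paper: the cost function, the weight names `w_fp` and `w_fn`, and the brute-force search over a small candidate set are assumptions made for illustration, and a real semi-Markov model would replace the enumeration with a dynamic program over segments.

```python
from typing import Dict, FrozenSet, Tuple

# A segment is (start, end, label) with an exclusive end index.
Segment = Tuple[int, int, str]
Segmentation = FrozenSet[Segment]


def weighted_cost(gold: Segmentation, pred: Segmentation,
                  w_fp: float = 2.0, w_fn: float = 1.0) -> float:
    """Hypothetical segment-level cost: spurious and missed entities
    are penalized with separate weights."""
    false_pos = len(pred - gold)  # predicted segments not in the gold set
    false_neg = len(gold - pred)  # gold segments the prediction missed
    return w_fp * false_pos + w_fn * false_neg


def loss_augmented_argmax(scores: Dict[Segmentation, float],
                          gold: Segmentation,
                          w_fp: float = 2.0, w_fn: float = 1.0) -> Segmentation:
    """Brute-force stand-in for loss-augmented inference: pick the
    candidate maximizing model score plus weighted cost."""
    return max(scores,
               key=lambda y: scores[y] + weighted_cost(gold, y, w_fp, w_fn))


def structured_hinge(scores: Dict[Segmentation, float],
                     gold: Segmentation,
                     w_fp: float = 2.0, w_fn: float = 1.0) -> float:
    """Margin-rescaled structured hinge loss over the candidate set."""
    y_hat = loss_augmented_argmax(scores, gold, w_fp, w_fn)
    violation = (scores[y_hat]
                 + weighted_cost(gold, y_hat, w_fp, w_fn)
                 - scores[gold])
    return max(0.0, violation)


# Toy candidate set with made-up model scores.
gold = frozenset({(0, 2, "PER")})
spurious = frozenset({(0, 2, "PER"), (3, 4, "LOC")})  # one false positive
missed = frozenset()                                   # one false negative
scores = {gold: 1.0, spurious: 0.5, missed: 0.5}

# With w_fp > w_fn, the most violating candidate is the spurious one,
# so training pushes hardest against false positives.
most_violating = loss_augmented_argmax(scores, gold, w_fp=2.0, w_fn=1.0)
```

In this toy setting, raising `w_fp` relative to `w_fn` makes the candidate with the false positive the most violating one, so the margin it must be separated by grows; swapping the weights shifts that pressure to the candidate with the missed entity.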