Predicting Humorousness and Metaphor Novelty
with Gaussian Process Preference Learning
Abstract
The inability to quantify key aspects of creative language is a frequent obstacle to natural
language understanding. To address this, we
introduce novel tasks for evaluating the creativity of language, namely scoring and
ranking texts by humorousness and metaphor
novelty. To sidestep the difficulty of assigning
discrete labels or numeric scores, we learn from
pairwise comparisons between texts. We introduce a Bayesian approach for predicting humorousness and metaphor novelty using Gaussian
process preference learning (GPPL), which
achieves a Spearman's ρ of 0.56 against the gold standard
using word embeddings and linguistic features.
Our experiments show that given sparse, crowdsourced annotation data, ranking using GPPL
outperforms best–worst scaling. We release
a new dataset for evaluating humour containing 28,210 pairwise comparisons of 4,030 texts,
and make our software freely available.