Abstract
Authorship verification is the task of determining whether or not two texts were written
by the same author. This paper deals with
the adversarial task, called authorship obfuscation: preventing verification by altering a to-be-obfuscated text. We introduce an approach
that (1) models writing style difference as the
Jensen-Shannon distance between the character n-gram distributions of texts, and (2) manipulates an author’s subconsciously encoded
writing style in a sophisticated manner using
heuristic search. To obfuscate, we explore the
huge space of textual variants in order to find
a paraphrased version of the to-be-obfuscated
text that has a sufficient Jensen-Shannon distance at minimal cost in terms of text quality
loss. We analyze, quantify, and illustrate the
rationale of this approach, define paraphrasing
operators, derive obfuscation thresholds, and
develop an effective obfuscation framework.
Our authorship obfuscation approach defeats
state-of-the-art verification approaches, including unmasking and compression models, while
keeping text changes to a minimum.
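
To make the style measure named in (1) concrete, the following minimal sketch computes a Jensen-Shannon distance between the character n-gram distributions of two texts. It is an illustration under stated assumptions, not the paper's implementation: the function names, the choice of n = 3, and the base-2 logarithm (which bounds the distance to [0, 1]) are all hypothetical choices made here.

```python
import math
from collections import Counter


def char_ngram_distribution(text, n=3):
    """Relative frequencies of the character n-grams in `text`.

    n=3 is an assumption for illustration; the abstract does not fix n.
    """
    if len(text) < n:
        raise ValueError("text must contain at least n characters")
    counts = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = sum(counts.values())
    return {gram: count / total for gram, count in counts.items()}


def jensen_shannon_distance(p, q):
    """Square root of the Jensen-Shannon divergence (base-2 logarithm).

    `p` and `q` map n-grams to probabilities; n-grams missing from one
    distribution are treated as having probability zero.
    """
    divergence = 0.0
    for gram in set(p) | set(q):
        p_g = p.get(gram, 0.0)
        q_g = q.get(gram, 0.0)
        m_g = 0.5 * (p_g + q_g)  # mixture distribution
        if p_g > 0.0:
            divergence += 0.5 * p_g * math.log2(p_g / m_g)
        if q_g > 0.0:
            divergence += 0.5 * q_g * math.log2(q_g / m_g)
    # Guard against tiny negative values caused by rounding.
    return math.sqrt(max(divergence, 0.0))


def style_distance(text_a, text_b, n=3):
    """Jensen-Shannon distance between two texts' n-gram distributions."""
    return jensen_shannon_distance(
        char_ngram_distribution(text_a, n),
        char_ngram_distribution(text_b, n),
    )
```

Under these assumptions, identical texts have distance 0 and texts that share no n-gram have distance 1. An obfuscator in the spirit of the abstract would search for a paraphrase whose distance to the author's other texts exceeds a chosen threshold.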