Context-specific language modeling for human trafficking detection from
online advertisements
Abstract
Human trafficking is a worldwide crisis. Traffickers exploit their victims by anonymously
offering sexual services through online advertisements. These ads often contain clues that
law enforcement can use to distinguish potential trafficking cases from voluntary sex advertisements. However, the sheer
volume of ads is too great for manual processing. Ideally, a centralized, semi-automated tool would assist law enforcement agencies with this task. Here, we
present an approach using natural language
processing to identify trafficking ads on these
websites. We propose a classifier that integrates multiple text feature sets, including features from the
publicly available pre-trained language
model Bidirectional Encoder Representations
from Transformers (BERT). In this paper, we
demonstrate that a classifier using this composite feature set performs significantly better than one using any single feature set
alone.