Abstract

While paragraph embedding models are remarkably effective for downstream classification tasks, what they learn and encode into a single vector remains opaque. In this paper, we investigate a state-of-the-art paragraph embedding method proposed by Zhang et al. (2017) and discover that it cannot reliably tell whether a given sentence occurs in the input paragraph. We formulate a sentence content task to probe for this basic linguistic property and find that even a much simpler bag-of-words method has no trouble solving it. This result motivates us to replace the reconstruction-based objective of Zhang et al. (2017) with our sentence content probe objective in a semi-supervised setting. Despite its simplicity, our objective improves over paragraph reconstruction in terms of (1) downstream classification accuracy on benchmark datasets, (2) training speed, and (3) generalization ability.