Abstract
We study two problems in high-dimensional robust statistics: robust mean estimation and outlier detection. In robust mean estimation the goal is to estimate the mean µ of a distribution on given n independent samples, an -fraction of which have been corrupted by a malicious adversary. In outlier detection the goal is to assign an outlier score to each element of a data set such that elements more likely to be outliers are assigned higher scores. Our algorithms for both problems are based on a new outlier scoring method we call QUE-scoring based on quantum entropy regularization. For robust mean estimation, this yields the first algorithm with optimal error rates and nearly-linear running time O(nd) e in all 6 parameters, improving on the previous fastest running time . For outlier detection, we evaluate the performance of QUE-scoring via extensive experiments on synthetic and real data, and demonstrate that it often performs better than previously proposed algorithms. Code for these experiments is available at https://github.com/twistedcubic/que-outlier-detection.