资源论文Stratified Sampling Meets Machine Learning

Stratified Sampling Meets Machine Learning

2020-03-06 | |  61 |   38 |   0

Abstract

This paper solves a specialized regression problem to obtain sampling probabilities for records in databases. The goal is to sample a small set of records over which evaluating aggregate queries can be done both efficiently and accurately. We provide a principled and provable solution for this problem; it is parameterless and requires no data insights. Unlike standard regression problems, the loss is inversely proportional to the regressed-to values. Moreover, a cost zero solution always exists and can only be excluded by hard budget constraints. A unique form of regularization is also needed. We provide an efficient and simple regularized Empirical Risk Minimization (ERM) algorithm along with a theoretical generalization result. Our extensive experimental results significantly improve over both uniform sampling and standard stratified sampling which are de-facto the industry standards.

上一篇:Dynamic Memory Networks for Visual and Textual Question Answering

下一篇:Fast K-Means with Accurate Bounds

用户评价
全部评价

热门资源

  • The Variational S...

    Unlike traditional images which do not offer in...

  • Learning to Predi...

    Much of model-based reinforcement learning invo...

  • Stratified Strate...

    In this paper we introduce Stratified Strategy ...

  • Learning to learn...

    The move from hand-designed features to learned...

  • A Mathematical Mo...

    Direct democracy, where each voter casts one vo...