Near-Optimal Policies for Dynamic Multinomial Logit Assortment Selection Models

2020-02-14

Abstract 

In this paper we consider the dynamic assortment selection problem under an uncapacitated multinomial-logit (MNL) model. By carefully analyzing a revenue potential function, we show that a trisection-based algorithm achieves an item-independent regret bound of $O(\sqrt{T \log\log T})$, which matches information-theoretic lower bounds up to iterated logarithmic terms. Our proof technique draws on tools from the unimodal/convex bandit literature as well as adaptive confidence parameters in minimax multi-armed bandit problems.

Keywords: dynamic assortment planning, multinomial logit choice model, trisection algorithm, regret analysis.
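To make the trisection idea mentioned in the abstract concrete, here is a minimal sketch of trisection search for the maximizer of a unimodal function. It is an illustration only, under strong assumptions: the function is evaluated exactly (no noise) and is strictly unimodal on the interval. The paper's algorithm works from noisy purchase feedback with confidence parameters, which this sketch does not attempt to reproduce. The name `trisection_max` and its parameters are hypothetical.

```python
from typing import Callable


def trisection_max(f: Callable[[float], float], lo: float, hi: float,
                   tol: float = 1e-6) -> float:
    """Approximate the maximizer of a strictly unimodal f on [lo, hi].

    Assumption: f can be evaluated exactly. Each iteration discards one
    third of the interval that cannot contain the maximizer.
    """
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0  # left trisection point
        m2 = hi - (hi - lo) / 3.0  # right trisection point
        if f(m1) < f(m2):
            lo = m1  # the maximizer cannot lie in [lo, m1)
        else:
            hi = m2  # the maximizer cannot lie in (m2, hi]
    return (lo + hi) / 2.0


# Usage: a toy concave function whose maximizer is 0.3.
x_star = trisection_max(lambda x: -(x - 0.3) ** 2, 0.0, 1.0)
```

Each iteration shrinks the search interval by a factor of 2/3, so the number of function evaluations grows only logarithmically in 1/tol.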
