Short Text Classification Improved by Learning Multi-Granularity Topics

2019-11-12

Abstract: Understanding the rapidly growing body of short text is very important. Short text differs from traditional documents in its shortness and sparsity, which hinders the application of conventional machine learning and text mining algorithms. Two major approaches have been exploited to enrich the representation of short text. One is to fetch contextual information about a short text to directly add more text; the other is to derive latent topics from an existing large corpus, which are used as features to enrich the representation of short text. The latter approach is elegant and efficient in most cases. The major trend along this direction is to derive latent topics of a certain granularity through well-known topic models such as latent Dirichlet allocation (LDA). However, topics of a single granularity are usually not sufficient to set up an effective feature space. In this paper, we move forward along this direction by proposing a method that leverages topics at multiple granularities, which can model short text more precisely. Taking short text classification as an example, we compared our proposed method with the state-of-the-art baseline on one open data set. Our method reduced the classification error by 20.25% and 16.68% respectively on two classifiers.
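The core idea in the abstract — training topic models at several granularities and concatenating the per-document topic distributions as enriched features — can be sketched roughly as follows. This is a minimal illustration, not the paper's exact pipeline: it assumes scikit-learn's LDA implementation, and the toy corpus and topic counts are made up for demonstration.

```python
# Hedged sketch of multi-granularity topic features (illustrative only).
# Assumes scikit-learn; the corpus and topic counts are hypothetical.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "apple releases new phone",
    "stock market falls sharply",
    "new phone camera review",
    "investors fear market crash",
]

# Bag-of-words term counts for the toy corpus.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(docs)

# Learn topics at several granularities (coarse to fine) and
# concatenate the per-document topic distributions as features.
granularities = [2, 3]
features = []
for k in granularities:
    lda = LatentDirichletAllocation(n_components=k, random_state=0)
    features.append(lda.fit_transform(X))  # shape: (n_docs, k)

enriched = np.hstack(features)  # shape: (n_docs, sum of granularities)
print(enriched.shape)           # (4, 5)
```

The enriched matrix would then be fed (possibly alongside the original sparse term features) to an ordinary classifier; the paper evaluates two classifiers, but which ones and how the features are combined is not stated in this abstract.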

