Large Dataset and Language Model Fun-Tuning for Humor Recognition

2019-09-20
Abstract: The task of humor recognition has attracted a lot of attention recently due to the need to process large amounts of user-generated text and the rise of conversational agents. We collected a dataset of jokes and funny dialogues in Russian from various online resources and carefully complemented them with unfunny texts with similar lexical properties. The dataset comprises more than 300,000 short texts, which is significantly larger than any previous humor-related corpus. Manual annotation of about 2,000 items confirmed the reliability of the corpus construction approach. Further, we applied language model fine-tuning for text classification and obtained an F1 score of 0.91 on the test set, which constitutes a considerable gain over baseline methods. The dataset is freely available to the research community.
