nlp_made_easy

2020-03-02

NLP Made Easy

Simple code notes for explaining NLP building blocks

  • Subword Segmentation Techniques

    • Let's compare several tokenizers: NLTK, BPE, SentencePiece, and the BERT tokenizer.

  • Beam Decoding

    • Beam decoding is essential for seq2seq tasks, but it's notoriously complicated to implement. Here's a relatively easy implementation that batches the candidates.

  • How to get the last hidden vector of RNNs properly

    • We'll see how to get the last hidden states of RNNs in TensorFlow and PyTorch.

  • TensorFlow seq2seq template based on the g2p task

    • We'll write a simple seq2seq template in TensorFlow. For demonstration, we tackle the g2p task, which converts graphemes (spelling) to phonemes (pronunciation). It's a good fit for this purpose because it's simple enough for you to get up and running quickly.

  • PyTorch seq2seq template based on the g2p task

    • We'll write a simple seq2seq template in PyTorch. For demonstration, we tackle the g2p task, which converts graphemes (spelling) to phonemes (pronunciation). It's a good fit for this purpose because it's simple enough for you to get up and running quickly.

  • Attention mechanism (work in progress)

  • POS-tagging with BERT Fine-tuning

    • BERT is known to be good at sequence tagging tasks like Named Entity Recognition. Let's see if that holds for POS tagging too.

  • Dropout in a minute

    • Dropout is arguably the most popular regularization technique in deep learning. Let's check again how it works.

  • N-gram LM vs. RNNLM (work in progress)

  • Data Augmentation for Quora Question Pairs

    • Let's see whether augmenting the training data helps on the Quora Question Pairs task.
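
To give a feel for the BPE entry in the subword segmentation note, here is a minimal pure-Python sketch of learning byte-pair merges from word counts. It is not taken from the notebook: the function names (`learn_bpe`, `merge_pair`, `segment`), the `</w>` end-of-word marker, and the toy counts are all assumptions made for illustration.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(word_counts, num_merges):
    """Learn up to `num_merges` merges, most frequent adjacent pair first."""
    # Each word starts as a sequence of characters plus an end-of-word marker.
    vocab = {tuple(word) + ("</w>",): count for word, count in word_counts.items()}
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, count in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += count
        if not pairs:
            break
        # Ties are broken by first occurrence, which is an arbitrary choice.
        best = max(pairs, key=pairs.get)
        merges.append(best)
        vocab = {merge_pair(symbols, best): count for symbols, count in vocab.items()}
    return merges


def segment(word, merges):
    """Segment a word by applying the learned merges in order."""
    symbols = tuple(word) + ("</w>",)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return symbols


# Toy corpus (hypothetical counts).
toy_counts = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
toy_merges = learn_bpe(toy_counts, 3)
print(toy_merges)
print(segment("lowest", toy_merges))
```

With these counts the frequent suffix "est" gets merged early, so an unseen word such as "lowest" is split into single characters plus a shared suffix unit. Production tokenizers add many details (byte fallback, normalization, efficient pair updates) that this sketch leaves out.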

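
As a companion to the beam decoding entry, here is a minimal sketch of beam search over a toy next-token model. It is deliberately unbatched and framework-free, so it is not the batched implementation the notebook describes. The `step_fn` callback, the `toy_model` table, and the `<s>` / `</s>` tokens are hypothetical names chosen for illustration.

```python
import math


def beam_search(step_fn, start_token, end_token, beam_size=3, max_len=10):
    """Return the best (sequence, log-probability) pair found by beam search.

    `step_fn(prefix)` is assumed to return a dict {token: probability}
    for the next token given `prefix` (a tuple of tokens).
    """
    beams = [((start_token,), 0.0)]  # (sequence, cumulative log-probability)
    finished = []
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            for token, prob in step_fn(seq).items():
                if prob > 0.0:
                    candidates.append((seq + (token,), score + math.log(prob)))
        if not candidates:
            break
        # Keep only the `beam_size` highest-scoring expansions.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_size]:
            if seq[-1] == end_token:
                finished.append((seq, score))
            else:
                beams.append((seq, score))
        if not beams:
            break
    # Fall back to unfinished hypotheses if nothing reached the end token.
    finished.extend(beams)
    return max(finished, key=lambda c: c[1])


def toy_model(prefix):
    """Hypothetical next-token distribution where the greedy first step is a trap."""
    table = {
        ("<s>",): {"a": 0.6, "b": 0.4},
        ("<s>", "a"): {"x": 0.5, "y": 0.5},
        ("<s>", "b"): {"z": 0.9, "x": 0.1},
    }
    return table.get(prefix, {"</s>": 1.0})


greedy_seq, greedy_score = beam_search(toy_model, "<s>", "</s>", beam_size=1)
beam_seq, beam_score = beam_search(toy_model, "<s>", "</s>", beam_size=2)
print(greedy_seq, greedy_score)
print(beam_seq, beam_score)
```

With a beam of one, the search commits to "a" (probability 0.6) and ends with a total probability of 0.3. With a beam of two it keeps "b" alive and finds the sequence with probability 0.36. This sketch applies no length normalization, which real decoders usually need because longer sequences accumulate lower log-probabilities.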

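
For the dropout entry, the core idea fits in a few lines. Below is a minimal sketch of the "inverted" variant, which rescales the surviving activations during training so that nothing needs to change at inference time. It uses plain Python lists rather than a deep learning framework, and the function name and signature are assumptions for illustration, not the notebook's code.

```python
import random


def dropout(values, p=0.5, training=True, rng=random):
    """Inverted dropout over a list of floats.

    Each value is zeroed with probability `p`; survivors are divided by
    (1 - p) so the expected value of each output matches its input.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if not training or p == 0.0:
        return list(values)  # identity at inference time
    keep = 1.0 - p
    return [v / keep if rng.random() < keep else 0.0 for v in values]


rng = random.Random(0)  # seeded generator so the demo is repeatable
activations = [1.0] * 10000
dropped = dropout(activations, p=0.5, rng=rng)
mean_after = sum(dropped) / len(dropped)
print(sorted(set(dropped)))  # only 0.0 and the rescaled value appear
print(mean_after)            # close to the original mean of 1.0
```

Roughly half of the outputs are zero and the rest are doubled, so the mean stays near the original. That preserved expectation is why the same layer can simply be switched off with `training=False` at evaluation time.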