pytorch-sgns

2019-09-17

PyTorch SGNS

Word2Vec's SkipGramNegativeSampling in Python.

Yet another but quite general negative sampling loss implemented in PyTorch.

It can be used with ANY embedding scheme! Pretty fast, I bet.

vocab_size = 20000
word2vec = Word2Vec(vocab_size=vocab_size, embedding_size=300)
sgns = SGNS(embedding=word2vec, vocab_size=vocab_size, n_negs=20)
optim = Adam(sgns.parameters())
for batch, (iword, owords) in enumerate(dataloader):
    loss = sgns(iword, owords)
    optim.zero_grad()
    loss.backward()
    optim.step()
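
For readers who want to see what a negative sampling loss computes, here is a minimal sketch in plain Python for a single (center, context) pair. It is not the repo's implementation: the names sgns_loss, log_sigmoid and dot are hypothetical, and the vectors are ordinary lists of floats rather than PyTorch tensors. It follows the standard SGNS objective, which rewards a high dot product with the true context word and a low dot product with each sampled negative word.

```python
import math


def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)) for both signs of x.
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dot(a, b):
    # Dot product of two equal-length vectors.
    return sum(x * y for x, y in zip(a, b))


def sgns_loss(center, context, negatives):
    """Negative sampling loss for one (center, context) pair.

    center, context: embedding vectors (lists of floats).
    negatives: list of embedding vectors for the sampled negative words.
    (Hypothetical helper for illustration, not part of the repo.)
    """
    # The true context word should score high against the center word.
    positive_term = log_sigmoid(dot(center, context))
    # Each sampled negative word should score low against the center word.
    negative_term = sum(log_sigmoid(-dot(center, neg)) for neg in negatives)
    return -(positive_term + negative_term)
```

With all-zero vectors every term equals log(1/2), so the loss for one context word and n negatives is (n + 1) * log 2. Training pushes the loss below that baseline by aligning the center vector with true contexts and away from negatives.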

New: supports negative sampling from the word frequency distribution raised to the 0.75 power, and subsampling of frequent words (to reduce word frequency imbalance).
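
The two ideas in the note above can be sketched in plain Python as follows. This is an illustration, not the repo's code: the function names, the default threshold of 1e-5, and the keep-probability formula sqrt(threshold / frequency) (taken from the original word2vec paper) are assumptions, and the repo may differ in detail.

```python
import math
import random
from collections import Counter


def noise_distribution(counts, power=0.75):
    # Raise raw counts to the given power, then normalize to probabilities.
    # A power below 1 flattens the distribution, so rare words are drawn
    # as negatives more often than their raw frequency would suggest.
    weights = {w: c ** power for w, c in counts.items()}
    total = sum(weights.values())
    return {w: v / total for w, v in weights.items()}


def keep_probability(counts, threshold=1e-5):
    # Subsampling: probability of keeping each occurrence of a word.
    # Assumed formula: min(1, sqrt(threshold / relative_frequency)).
    total = sum(counts.values())
    return {w: min(1.0, math.sqrt(threshold / (c / total)))
            for w, c in counts.items()}


def sample_negatives(dist, n_negs, rng):
    # Draw n_negs words (with replacement) from the noise distribution.
    words = list(dist)
    return rng.choices(words, weights=[dist[w] for w in words], k=n_negs)


# Toy corpus: "the" is much more frequent than the other words.
counts = Counter("the the the the the the cat sat on the mat".split())
dist = noise_distribution(counts)
keep = keep_probability(counts, threshold=0.1)
negatives = sample_negatives(dist, n_negs=5, rng=random.Random(0))
```

On the toy corpus, the 0.75 power lowers the sampling share of "the" relative to its raw frequency, and the subsampling step keeps every occurrence of the rare words while dropping some occurrences of "the".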

To test this repo, place a space-delimited corpus at data/corpus.txt, then run python preprocess.py followed by python train.py --weights --cuda (use the -h option for help).