PyTorch-NLP
PyTorch-NLP, or torchnlp for short, is a library of neural network layers, text processing modules and datasets designed to accelerate Natural Language Processing (NLP) research.
Join our community and add datasets and neural network layers! Chat with us on Gitter and join the Google Group; we're eager to collaborate with you.
Make sure you have Python 3.5+ and PyTorch 0.2.0 or newer. You can then install pytorch-nlp using pip:

```
pip install pytorch-nlp
```
The complete documentation for PyTorch-NLP is available via our ReadTheDocs website.
Add PyTorch-NLP to your project by following one of the common use cases:
Load the IMDB dataset, for example:
```python
from torchnlp.datasets import imdb_dataset

# Load the imdb training dataset
train = imdb_dataset(train=True)
train[0]  # RETURNS: {'text': 'For a movie that gets..', 'sentiment': 'pos'}
```
For example, from the neural network package, apply a Simple Recurrent Unit (SRU):
```python
from torchnlp.nn import SRU
import torch

input_ = torch.autograd.Variable(torch.randn(6, 3, 10))
sru = SRU(10, 20)

# Apply a Simple Recurrent Unit to `input_`
sru(input_)
# RETURNS: (
#   output [torch.FloatTensor (6x3x20)],
#   hidden_state [torch.FloatTensor (2x3x20)]
# )
```
Tokenize and encode text as a tensor. For example, a WhitespaceEncoder
breaks text into terms whenever it encounters a whitespace character.
```python
from torchnlp.text_encoders import WhitespaceEncoder

# Create a `WhitespaceEncoder` with a corpus of text
encoder = WhitespaceEncoder(["now this ain't funny", "so don't you dare laugh"])

# Encode and decode phrases
encoder.encode("this ain't funny.")  # RETURNS: torch.LongTensor([6, 7, 1])
encoder.decode(encoder.encode("This ain't funny."))  # RETURNS: "this ain't funny."
```
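Conceptually, a whitespace encoder amounts to building a vocabulary of whitespace-separated tokens and mapping each token to an integer index, with a reserved index for unknown tokens. The following is a minimal pure-Python sketch of that idea, not torchnlp's actual implementation (the helper names and the choice of index 0 for unknowns are illustrative assumptions):

```python
# Illustrative sketch of whitespace tokenization + integer encoding.
# NOT torchnlp's implementation; names and index layout are assumptions.

def build_vocab(corpus):
    """Assign each whitespace-separated token a unique integer index."""
    vocab = {'<unk>': 0}  # reserve index 0 for unknown tokens
    for text in corpus:
        for token in text.split():
            vocab.setdefault(token, len(vocab))
    return vocab

def encode(text, vocab):
    """Map each token to its index, falling back to <unk>."""
    return [vocab.get(token, vocab['<unk>']) for token in text.split()]

vocab = build_vocab(["now this ain't funny", "so don't you dare laugh"])
print(encode("this ain't funny", vocab))   # → [2, 3, 4]
print(encode("unknown word", vocab))       # → [0, 0]
```

The real `WhitespaceEncoder` additionally returns a `torch.LongTensor` and supports decoding indices back to text, as shown above.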
For example, load FastText, state-of-the-art English word vectors:
```python
from torchnlp.word_to_vector import FastText

vectors = FastText()
# Load vectors for any word as a `torch.FloatTensor`
vectors['hello']  # RETURNS: [torch.FloatTensor of size 100]
```
Finally, compute common metrics such as the BLEU score.
```python
from torchnlp.metrics import get_moses_multi_bleu

hypotheses = ["The brown fox jumps over the dog "]
references = ["The quick brown fox jumps over the lazy dog "]

# Compute BLEU score with the official BLEU perl script
get_moses_multi_bleu(hypotheses, references, lowercase=True)  # RETURNS: 47.9
```
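For intuition about what the score measures: BLEU combines modified n-gram precisions (how many of the hypothesis's n-grams appear in the reference, clipped by reference counts) with a brevity penalty for short hypotheses. Below is a simplified single-sentence sketch of that computation; it is not the Moses `multi-bleu.perl` script that `get_moses_multi_bleu` wraps, and the smoothing constant is an assumption for illustration:

```python
# Simplified single-sentence BLEU sketch (illustration only; the real
# metric above uses the Moses multi-bleu.perl script with its own
# tokenization and smoothing).
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def simple_bleu(hypothesis, reference, max_n=4):
    hyp, ref = hypothesis.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = Counter(ngrams(hyp, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clipped overlap: each hypothesis n-gram counts at most as
        # often as it appears in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = max(sum(hyp_counts.values()), 1)
        precisions.append(max(overlap, 1e-9) / total)  # tiny floor avoids log(0)
    # Brevity penalty: penalize hypotheses shorter than the reference.
    brevity = min(1.0, math.exp(1 - len(ref) / len(hyp)))
    return brevity * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

An identical hypothesis and reference score 1.0, and scores drop as n-gram overlap shrinks or the hypothesis gets shorter than the reference.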
Longer examples are available in examples/.
Need more help? We are happy to answer your questions via Gitter Chat.
We've released PyTorch-NLP because we found a lack of basic toolkits for NLP in PyTorch. We hope that other organizations can benefit from the project. We are thankful for any contributions from the community.
Read our contributing guide to learn about our development process, how to propose bugfixes and improvements, and how to build and test your changes to PyTorch-NLP.
torchtext and PyTorch-NLP differ in architecture and feature set; otherwise, they are similar. Both provide pre-trained word vectors, datasets, iterators and text encoders. PyTorch-NLP also provides neural network modules and metrics. From an architecture standpoint, torchtext is object oriented with external coupling while PyTorch-NLP is object oriented with low coupling.
AllenNLP is designed to be a platform for research. PyTorch-NLP is designed to be a lightweight toolkit.
Michael Petrochuk: Developer
Chloe Yeo: Logo Design
If you find PyTorch-NLP useful for an academic publication, then please use the following BibTeX to cite it:
```
@misc{pytorch-nlp,
  author = {Petrochuk, Michael},
  title = {PyTorch-NLP: Rapid Prototyping with PyTorch Natural Language Processing (NLP) Tools},
  year = {2018},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/PetrochukM/PyTorch-NLP}},
}
```