One of 2019's most important machine learning stories is the progress made by applying transfer learning to massive language models.
I have been experimenting with retraining GPT-2 on authors I like, and using the model as a writing partner. The process has been enlightening, and points towards a future where human and machine can write creatively together.
GPT-2 is not ready to write text on its own - but with a bit of human supervision, you can shape what it generates into interesting writing!
GPT-2 was originally trained on 40 GB of web text (OpenAI's WebText dataset, scraped from pages linked on Reddit). This library can be used to generate text with the base GPT-2 model, and to fine-tune the base GPT-2 model on text of your choosing.
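The repo's own interface isn't shown here, but as a rough sketch of what generating from the base model involves, here is the equivalent with the third-party gpt-2-simple library (the model name and prompt are placeholder assumptions - this repo may wrap things differently):

```python
# Sketch: generating text from the base (not fine-tuned) GPT-2 model
# with gpt-2-simple - an assumption, not necessarily this repo's internals.
import gpt_2_simple as gpt2

gpt2.download_gpt2(model_name="124M")   # fetch the base GPT-2 weights
sess = gpt2.start_tf_sess()
gpt2.load_gpt2(sess, model_name="124M")
gpt2.generate(sess, model_name="124M",
              prefix="The robot said",  # placeholder prompt
              length=100)
```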
The library has a number of datasets in creative-writing-with-gpt2/data. A dataset is defined as a text file called clean.txt - for example asimov/clean.txt.
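For illustration, the data directory looks something like this (another-author is hypothetical):

```
creative-writing-with-gpt2/
  data/
    asimov/
      clean.txt
    another-author/   # hypothetical second dataset
      clean.txt
```

A fine-tuning run on one of these files might look like the following gpt-2-simple sketch (again an assumption about tooling; the step count is arbitrary):

```python
# Sketch: fine-tuning the base model on one of the repo's datasets.
import gpt_2_simple as gpt2

sess = gpt2.start_tf_sess()
gpt2.finetune(sess,
              dataset="data/asimov/clean.txt",
              model_name="124M",
              steps=500,          # arbitrary; more steps fit the author more closely
              run_name="asimov")  # checkpoints land in checkpoint/asimov
```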
A number of pre-fine-tuned models are available via creative-writing-with-gpt2/models.py - you can download them to your machine by running python models.py.
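Once a fine-tuned checkpoint is on disk, it can be loaded for generation. A minimal sketch with gpt-2-simple, assuming the checkpoint ends up where gpt-2-simple expects it (checkpoint/asimov here is an assumed location, not a documented detail of models.py):

```python
# Sketch: generating from a downloaded fine-tuned model.
import gpt_2_simple as gpt2

sess = gpt2.start_tf_sess()
gpt2.load_gpt2(sess, run_name="asimov")  # assumes checkpoint/asimov exists
gpt2.generate(sess, run_name="asimov",
              prefix="The robot said",   # placeholder prompt
              length=200)
```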