Abstract
Recent progress in hardware and methodology for training neural networks has ushered in a new generation of large networks trained on abundant data. These models have obtained notable gains in accuracy across many NLP tasks. However, these accuracy improvements depend on the availability of exceptionally large computational resources that necessitate similarly substantial energy consumption. As a result, these models are costly to train and develop, both financially, due to the cost of hardware and electricity or of cloud compute time, and environmentally, due to the carbon footprint required to fuel modern tensor processing hardware. In this paper, we bring this issue to the attention of NLP researchers by quantifying the approximate financial and environmental costs of training a variety of recently successful neural network models for NLP. Based on these findings, we propose actionable recommendations to reduce costs and improve equity in NLP research and practice.