Abstract
We introduce a general method for the interpretation and comparison of neural models. The
method is used to factor a complex neural
model into its functional components, which
consist of sets of co-firing neurons that
cut across layers of the network architecture,
and which we call neural pathways. The function of these pathways can be understood by
identifying task-level and linguistic heuristics that correlate with them,
so that this knowledge
acts as a lens for approximating what the network has learned to apply to its intended task.
As a case study of the utility of
these pathways, we examine
pathways identified in models trained for two
standard tasks, namely Named Entity Recognition and Recognizing Textual Entailment.