Abstract
Analysis methods which enable us to better understand the representations and functioning of neural models of language are increasingly needed as deep learning becomes the dominant approach in NLP. Here we present two methods based on Representational Similarity Analysis (RSA) and Tree Kernels (TK), which allow us to directly quantify how strongly the information encoded in neural activation patterns corresponds to information represented by symbolic structures such as syntax trees. We first validate our methods on the case of a simple synthetic language for arithmetic expressions with clearly defined syntax and semantics, and show that they exhibit the expected pattern of results. We then apply our methods to correlate neural representations of English sentences with their constituency parse trees.
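To make the general recipe concrete (this is an illustrative sketch, not the paper's exact implementation), the code below pairs a simple subset tree kernel with an RSA-style correlation: the tree kernel measures pairwise similarity between parse trees on the symbolic side, cosine similarity measures pairwise similarity between activation vectors on the neural side, and a Spearman correlation between the two similarity structures quantifies how closely they correspond. The function names, the decay parameter `lam`, and the nested-tuple tree encoding are all assumptions made for the example.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.stats import spearmanr

def sst_kernel(t1, t2, lam=1.0):
    """Subset tree kernel (Collins & Duffy style): counts tree
    fragments shared by t1 and t2. Trees are nested tuples,
    e.g. ("S", ("NP", "we"), ("VP", ("V", "run")))."""
    def nodes(t):
        # All internal nodes of the tree (leaves are plain strings).
        if isinstance(t, str):
            return []
        out = [t]
        for child in t[1:]:
            out.extend(nodes(child))
        return out

    def production(n):
        # Node label plus the labels of its immediate children.
        return (n[0],) + tuple(c if isinstance(c, str) else c[0] for c in n[1:])

    def delta(n1, n2):
        # Number of shared fragments rooted at this pair of nodes.
        if production(n1) != production(n2):
            return 0.0
        score = lam
        for c1, c2 in zip(n1[1:], n2[1:]):
            # Leaf (word) children contribute a factor of 1.
            if not (isinstance(c1, str) or isinstance(c2, str)):
                score *= 1.0 + delta(c1, c2)
        return score

    return sum(delta(n1, n2) for n1 in nodes(t1) for n2 in nodes(t2))

def rsa_correlation(activations, trees, lam=1.0):
    """Spearman correlation between the neural and symbolic
    similarity structures over the same n sentences.

    activations: (n, d) array of sentence activation vectors
    trees:       list of n parse trees as nested tuples"""
    n = len(trees)
    # Symbolic side: normalized tree-kernel similarity matrix.
    k = np.array([[sst_kernel(a, b, lam) for b in trees] for a in trees])
    tree_sim = k / np.sqrt(np.outer(np.diag(k), np.diag(k)))
    # Neural side: pairwise cosine similarities between activations.
    act_sim = 1.0 - squareform(pdist(activations, metric="cosine"))
    # Correlate the two similarity structures over off-diagonal pairs.
    iu = np.triu_indices(n, k=1)
    rho, _ = spearmanr(act_sim[iu], tree_sim[iu])
    return rho
```

In use, one would pass a model's sentence activation vectors together with the corresponding constituency trees to `rsa_correlation`; a higher coefficient indicates that the network's representational geometry more closely tracks the similarity structure induced by the syntax trees.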