Abstract
The purpose of this research is to determine
whether linguistic information is retained
in vector representations of sentences.
We introduce a method for analysing the content of sentence embeddings based on universal probing tasks, along with classification
datasets for two contrasting languages. We perform a series of probing and downstream experiments with different types of sentence embeddings, followed by a thorough analysis of
the experimental results. Aside from dependency parser-based embeddings, linguistic information is best retained in the recently proposed LASER sentence embeddings.