Attention Understands Semantic Relations
Today, natural language processing relies heavily on pre-trained large language models. Although such models are criticized
for their poor interpretability, they still yield state-of-the-art solutions for a wide range of very different tasks. While many probing
studies have been conducted to measure models' awareness of grammatical knowledge, semantic probing is less common.
In this work, we introduce a probing pipeline to study how semantic relations are represented in transformer language
models. We show that on this task, attention scores are nearly as expressive as the layers' output activations, despite their lesser
ability to represent surface cues. This supports the hypothesis that attention mechanisms focus not only on syntactic
relational information but also on semantic information.
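The probing setup the abstract describes can be illustrated with a minimal sketch: a simple linear classifier is trained to predict a semantic relation label from a frozen model's attention scores. Everything below (feature layout, synthetic data, the binary label) is an assumption for illustration, not the paper's actual pipeline.

```python
# Minimal probing sketch (illustrative; data and feature layout are assumed,
# not taken from the paper). A linear probe is trained to predict whether a
# semantic relation holds for a word pair, using only attention scores.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Synthetic stand-in features: one attention-score vector per word pair,
# e.g. the score every head of every layer assigns between the two tokens
# (12 layers x 12 heads = 144 in a BERT-base-sized model).
n_pairs, n_heads = 1000, 144
X = rng.normal(size=(n_pairs, n_heads))
y = rng.integers(0, 2, size=n_pairs)  # binary label: relation holds or not

# Inject a recoverable signal into a few heads so the probe has
# something to find (stands in for heads that track the relation).
X[y == 1, :8] += 1.0

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, random_state=0
)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
acc = accuracy_score(y_te, probe.predict(X_te))
print(f"probe accuracy on attention scores: {acc:.2f}")
```

If the probe scores well above chance, the attention scores are taken to encode the relation; the paper's finding is that such probes come close to probes trained on the layers' output activations.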