Visualizing and Measuring the Geometry of BERT

Jun 06, 2019
Andy Coenen, Emily Reif, Ann Yuan, Been Kim, Adam Pearce, Fernanda Viégas, Martin Wattenberg

Transformer architectures show significant promise for natural language processing. Given that a single pretrained model can be fine-tuned to perform well on many different tasks, these networks appear to extract generally useful linguistic features. A natural question is how such networks represent this information internally. This paper describes qualitative and quantitative investigations of one particularly effective model, BERT. At a high level, linguistic features seem to be represented in separate semantic and syntactic subspaces. We find evidence of a fine-grained geometric representation of word senses. We also present empirical descriptions of syntactic representations in both attention matrices and individual word embeddings, as well as a mathematical argument to explain the geometry of these representations.

* 8 pages, 5 figures  
