BertViz
BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)
... Tensor2Tensor visualization tool. The model view shows a bird's-eye view of attention across all layers and heads. The neuron view visualizes individual neurons in the query and key vectors and shows how they are used to compute attention.
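To illustrate what "individual neurons in the query and key vectors" means here, the following is a minimal pure-Python sketch of the query-key computation, not BertViz's own code. The function name `neuron_view_scores` and the toy vectors are hypothetical, and the scaling by the square root of the vector size is the standard Transformer convention, assumed rather than taken from the text above.

```python
import math

def neuron_view_scores(query, keys):
    """Hypothetical helper: per-neuron products, scaled scores, attention weights.

    query: one query vector (list of floats)
    keys:  one key vector per attended-to token
    """
    d = len(query)
    # Elementwise product q_i * k_i: how much each neuron (dimension)
    # contributes to the match between the query and each key.
    products = [[q * k for q, k in zip(query, key)] for key in keys]
    # Summing over neurons gives the dot product; dividing by sqrt(d)
    # follows standard scaled dot-product attention (an assumption here).
    scores = [sum(p) / math.sqrt(d) for p in products]
    # Softmax over key positions turns scores into attention weights.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    return products, scores, weights

# Toy data: one query token attending over three key tokens.
query = [1.0, 0.0, -1.0, 2.0]
keys = [
    [1.0, 1.0, 0.0, 1.0],
    [0.0, 2.0, 1.0, -1.0],
    [0.5, 0.5, 0.5, 0.5],
]
products, scores, weights = neuron_view_scores(query, keys)
```

In this sketch, `products` holds the per-neuron contributions, `scores` the summed and scaled query-key match, and `weights` the resulting attention distribution over the three keys.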