Uncertainty Estimation of Transformer Predictions for Misclassification Detection
Uncertainty estimation (UE) of model predictions is a crucial step for a variety of tasks, such as active learning, misclassification detection, adversarial attack detection, and out-of-distribution detection. Most work on modeling the uncertainty of deep neural networks evaluates UE methods on image classification tasks, and little attention has been paid to UE in natural language processing. To fill this gap, we perform a large-scale empirical investigation of state-of-the-art UE methods for Transformer models on misclassification detection in named entity recognition and text classification tasks. We also propose two computationally efficient modifications, one of which improves on the state of the art and outperforms computationally intensive methods.