Calibration Of Encoder Decoder Models For Neural Machine Translation

Aviral Kumar, Sunita Sarawagi. arXiv 2019 – 49 citations

[Paper] | Search on Google Scholar | Search on Semantic Scholar
Interdisciplinary Approaches · Neural Machine Translation

We study the calibration of several state-of-the-art neural machine translation (NMT) systems built on attention-based encoder-decoder models. For structured outputs like those in NMT, calibration is important not just for reliable confidence estimates on predictions, but also for the proper functioning of beam-search inference. We show that most modern NMT models are surprisingly miscalibrated even when conditioned on the true previous tokens. Our investigation points to two main causes: severe miscalibration of EOS (the end-of-sequence marker) and suppression of attention uncertainty. We design recalibration methods based on these signals and demonstrate improved accuracy, better sequence-level calibration, and more intuitive beam-search results.
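As a concrete illustration of the token-level calibration the abstract refers to, here is a minimal sketch (not the paper's code) that computes expected calibration error (ECE) over next-token predictions under teacher forcing, i.e. with the model conditioned on the true previous tokens. The function name, binning scheme, and synthetic example are our own assumptions for illustration.

```python
# Illustrative sketch: token-level expected calibration error (ECE).
# Inputs are per-token confidences (max softmax probability) and
# whether the argmax token matched the reference under teacher forcing.
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE = sum over bins of (bin weight) * |accuracy - mean confidence|."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    bin_edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap  # weight bin by its share of tokens
    return ece

# Synthetic check: when accuracy tracks confidence, ECE should be near 0.
rng = np.random.default_rng(0)
conf = rng.uniform(0.5, 1.0, size=10_000)
hits = rng.random(10_000) < conf  # hit rate matches stated confidence
print(f"ECE = {expected_calibration_error(conf, hits):.3f}")
```

A well-calibrated model drives this quantity toward zero; the paper's finding is that modern NMT systems do not, with EOS tokens being a particularly severe contributor.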

Similar Work