The E2E Dataset: New Challenges For End-to-end Generation

Jekaterina Novikova, Ondřej Dušek, Verena Rieser. Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue, 2017 – 188 citations


This paper describes the E2E data, a new dataset for training end-to-end, data-driven natural language generation systems in the restaurant domain, which is ten times bigger than existing, frequently used datasets in this area. The E2E dataset poses new challenges: (1) its human reference texts show more lexical richness and syntactic variation, including discourse phenomena; (2) generating from this set requires content selection. As such, learning from this dataset promises more natural, varied and less template-like system utterances. We also establish a baseline on this dataset, which illustrates some of the difficulties associated with this data.
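To make the task concrete, here is a minimal Python sketch of generating text from a meaning representation (MR) written in the `slot[value]` style that the E2E data uses. The restaurant and its values are invented for illustration, and `parse_mr` and `realise` are hypothetical helpers, not part of any released tooling. The fixed template is deliberately naive and is not the baseline system from the paper; it only shows what "content selection" means here: the generator decides which slots of the MR to verbalise.

```python
import re

# Illustrative MR in the slot[value] style; the values are made up.
MR = "name[The Eagle], eatType[coffee shop], food[French], area[riverside]"


def parse_mr(mr):
    """Parse 'slot[value], slot[value]' into a slot -> value dict."""
    pairs = re.findall(r"([^,\[\]]+)\[([^\[\]]*)\]", mr)
    return {slot.strip(): value.strip() for slot, value in pairs}


def realise(slots, selected=("name", "eatType", "food", "area")):
    """Verbalise only the selected slots with one fixed (hypothetical) template."""
    s = {k: v for k, v in slots.items() if k in selected}
    food = s["food"] + " " if "food" in s else ""
    text = f"{s['name']} is a {food}{s.get('eatType', 'venue')}"
    if "area" in s:
        text += f" in the {s['area']} area"
    return text + "."


slots = parse_mr(MR)
print(realise(slots))                                # all four slots
print(realise(slots, selected=("name", "eatType")))  # a smaller selection
```

A template like this yields exactly the repetitive, template-like output that the abstract contrasts with the dataset's more varied human references.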
