Deep Latent-variable Models For Text Generation | Awesome LLM Papers Add your paper to Awesome LLM Papers

Deep Latent-variable Models For Text Generation

Xiaoyu Shen . Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) 2019 – 52 citations

[Paper]   Search on Google Scholar   Search on Semantic Scholar
Applications Compositional Generalization Content Enrichment EMNLP Interdisciplinary Approaches Interpretability Model Architecture RAG Training Techniques Variational Autoencoders Visual Question Answering

Text generation aims to produce human-like natural language output for down-stream tasks. It covers a wide range of applications like machine translation, document summarization, dialogue generation and so on. Recently deep neural network-based end-to-end architectures have been widely adopted. The end-to-end approach conflates all sub-modules, which used to be designed by complex handcrafted rules, into a holistic encode-decode architecture. Given enough training data, it is able to achieve state-of-the-art performance yet avoiding the need of language/domain-dependent knowledge. Nonetheless, deep learning models are known to be extremely data-hungry, and text generated from them usually suffer from low diversity, interpretability and controllability. As a result, it is difficult to trust the output from them in real-life applications. Deep latent-variable models, by specifying the probabilistic distribution over an intermediate latent process, provide a potential way of addressing these problems while maintaining the expressive power of deep neural networks. This dissertation presents how deep latent-variable models can improve over the standard encoder-decoder model for text generation.

Similar Work