Extending Machine Language Models Toward Human-level Language Understanding

James L. McClelland, Felix Hill, Maja Rudolph, Jason Baldridge, Hinrich Schütze. arXiv 2019 – 68 citations

Compositional Generalization, Interactive Environments, Interdisciplinary Approaches, Multimodal Semantic Representation, Neural Machine Translation, Tools, Variational Autoencoders

Language is crucial for human intelligence, but what exactly is its role? We take language to be a part of a system for understanding and communicating about situations. The human ability to understand and communicate about situations emerges gradually from experience and depends on domain-general principles of biological neural networks: connection-based learning, distributed representation, and context-sensitive, mutual constraint satisfaction-based processing. Current artificial language processing systems rely on the same domain-general principles, embodied in artificial neural networks. Indeed, recent progress in this field depends on query-based attention, which extends the ability of these systems to exploit context and has contributed to remarkable breakthroughs. Nevertheless, most current models focus exclusively on language-internal tasks, limiting their ability to perform tasks that depend on understanding situations. These systems also lack memory for the contents of prior situations outside of a fixed contextual span. We describe the organization of the brain’s distributed understanding system, which includes a fast learning system that addresses the memory problem. We sketch a framework for future models of understanding, drawing equally on cognitive neuroscience and artificial intelligence and exploiting query-based attention. We highlight relevant current directions and consider further developments needed to fully capture human-level language understanding in a computational system.

Similar Work