Language Models as World Models? Insights from AI, Psychology, and Neuroscience
Jacob Andreas from MIT explores the potential of language models to serve as comprehensive world models, drawing connections between artificial intelligence, psychology, and neuroscience to understand higher-level intelligence.

Simons Institute for the Theory of Computing
826 views • Aug 1, 2024

About this video
Jacob Andreas (MIT)
https://simons.berkeley.edu/talks/jacob-andreas-mit-2024-06-24
Understanding Higher-Level Intelligence from AI, Psychology, and Neuroscience Perspectives
The extent to which language modeling induces representations of the world described by text, and the broader question of what can be learned about meaning from text alone, have remained subjects of ongoing debate across NLP and the cognitive sciences. I'll discuss a few pieces of recent work aimed at understanding whether (and how) representations in transformer LMs linearly encode interpretable and controllable representations of facts and situations. I'll begin by presenting evidence from probing experiments suggesting that LM representations encode (rudimentary) information about entities' properties and dynamic state, and that these representations are causally implicated in downstream language generation. Despite this, even today's largest LMs are prone to glaring semantic errors: they hallucinate facts, contradict input text, or even contradict their own previous outputs. Building on our understanding of how LM representations influence behavior, I'll describe a "representation editing" model called REMEDI that can correct these errors by intervening directly in LM activations. I'll close with some recent experiments that complicate this story: much of LMs' "knowledge" remains inaccessible to readout or manipulation with simple probes. A great deal of work is still needed to build language generation systems with fully transparent and controllable models of the world.
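The probing setup the abstract refers to can be illustrated with a short sketch. This is not code from the talk: the hidden states below are random synthetic stand-ins for transformer activations, and the binary "property" label is an assumption made for the example. The point is only that a linear probe is a single learned linear layer read off frozen representations.

```python
# Illustrative sketch (not the talk's code): train a linear probe to read a
# binary entity property out of frozen hidden-state vectors.
# The "hidden states" here are synthetic stand-ins for transformer activations.
import torch

torch.manual_seed(0)
d_model, n_examples = 64, 512

# Assume examples where the property holds share a consistent direction.
property_direction = torch.randn(d_model)
labels = torch.randint(0, 2, (n_examples,)).float()
hidden_states = torch.randn(n_examples, d_model) + labels[:, None] * property_direction

probe = torch.nn.Linear(d_model, 1)               # the linear probe
optimizer = torch.optim.Adam(probe.parameters(), lr=1e-2)
loss_fn = torch.nn.BCEWithLogitsLoss()

for step in range(200):
    optimizer.zero_grad()
    logits = probe(hidden_states).squeeze(-1)
    loss = loss_fn(logits, labels)
    loss.backward()
    optimizer.step()

with torch.no_grad():
    preds = probe(hidden_states).squeeze(-1) > 0
    accuracy = (preds == labels.bool()).float().mean().item()
print(f"probe accuracy on synthetic activations: {accuracy:.2f}")
```

If the probe recovers the property well above chance, the representation linearly encodes it; causal claims of the kind discussed in the talk additionally require interventions, as sketched next.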
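The "representation editing" idea can likewise be sketched, under assumptions, as an intervention on a model's activations during generation. The sketch below is not REMEDI itself (which learns the edit from data); it uses the Hugging Face transformers library, GPT-2, an arbitrary layer index, and a random edit vector purely to show where such an intervention would plug in.

```python
# Illustrative sketch (not REMEDI): steer generation by adding an edit vector
# to one transformer block's output via a forward hook.
# Model choice, layer index, and the edit vector are assumptions for the demo.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

layer_idx = 6                                          # block to intervene on
edit_vector = 4.0 * torch.randn(model.config.n_embd)   # stand-in edit direction

def add_edit(module, inputs, output):
    hidden = output[0]                 # block output tuple starts with hidden states
    hidden = hidden + edit_vector      # intervene directly in the activations
    return (hidden,) + output[1:]

handle = model.transformer.h[layer_idx].register_forward_hook(add_edit)

prompt = tokenizer("The Eiffel Tower is located in", return_tensors="pt")
with torch.no_grad():
    out = model.generate(**prompt, max_new_tokens=10, do_sample=False)
print(tokenizer.decode(out[0]))

handle.remove()                        # restore the unedited model
```

A learned editor would replace the random `edit_vector` with a vector computed from the fact or attribute being asserted; the hook mechanism for injecting it into the activations stays the same.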
Video Information
Views: 826 | Likes: 15 | Duration: 46:13 | Published: Aug 1, 2024