Unlocking the Secrets of AI: Key Challenges in Mechanistic Interpretability 🚀
Join Neel Nanda's insightful Google TechTalk to explore the open problems and latest advances in understanding how AI models work behind the scenes. Perfect for AI enthusiasts and researchers alike!

Google TechTalks
7.0K views • Jun 22, 2023

About this video
A Google TechTalk, presented by Neel Nanda, 2023/06/20
Google Algorithms Seminar - ABSTRACT: Mechanistic Interpretability is the study of reverse engineering the learned algorithms in a trained neural network, in the hopes of applying this understanding to make powerful systems safer and more steerable. In this talk Neel will give an overview of the field, summarise some key works, and outline what he sees as the most promising areas of future work and open problems. This will touch on techniques in causal abstraction and mediation analysis, understanding superposition and distributed representations, model editing, and studying individual circuits and neurons.
About the Speaker: Neel works on the mechanistic interpretability team at Google DeepMind. He previously worked with Chris Olah at Anthropic on the transformer circuits agenda, and has done independent work on reverse-engineering modular addition and using this to understand grokking.
Video Information
Views: 7.0K
Likes: 207
Duration: 55:27
Published: Jun 22, 2023
User Reviews: 4.6 (1)