Open Problems in Mechanistic Interpretability: A Whirlwind Tour | Neel Nanda | EAGxVirtual 2023

Effective Altruism | 1.1K views | 51:03

About this video

Mechanistic Interpretability is a sub-field of AI Alignment that studies trained neural networks and tries to reverse-engineer the algorithms they've learned. In this talk, Neel Nanda gave an overview of the field, key works, and some of the open problems.

Learn more about effective altruism at: www.effectivealtruism.org
Find out more about EA Global conferences at: www.eaglobal.org

Video Information

Views: 1.1K
Likes: 21
Duration: 51:03
Published: Jan 14, 2024
Quality: HD