Open Problems in Mechanistic Interpretability: A Whirlwind Tour | Neel Nanda | EAGxVirtual 2023

Mechanistic Interpretability is a sub-field of AI Alignment that studies trained neural networks and tries to reverse-engineer the algorithms they've learned...

Effective Altruism•1.1K views•Jan 14, 2024•51:03

About this video

Mechanistic Interpretability is a sub-field of AI Alignment that studies trained neural networks and tries to reverse-engineer the algorithms they've learned. In this talk, Neel Nanda gave an overview of the field, key works, and some of the open problems.

Learn more about effective altruism at www.effectivealtruism.org. Find out more about EA Global conferences at www.eaglobal.org.

Video Information

Views: 1.1K (total since publication)
Likes: 21
Duration: 51:03
Published: Jan 14, 2024
Quality: HD

About the Channel

Effective Altruism
