
Neel Nanda (at ICLR)

neelnanda.io
1 Follower
community-curated profile

Mechanistic Interpretability research @DeepMind. Formerly @AnthropicAI, independent. In this to reduce AI X-risk. Neural networks can be understood, let's do it!

Recommendations
Paper May 4, 2023
AttentionViz: A Global View of Transformer Attention
by Martin Wattenberg and 4 others
Paper May 7, 2023
Language Models Don't Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting
by Miles Turpin
Article May 4, 2023
Distributed Representations: Composition & Superposition
by Chris Olah
Article Apr 19, 2023
Why do some AI researchers dismiss the potential risks to humanity?
by David Krueger
Paper Mar 20, 2023
What does it take to catch a Chinchilla? Verifying Rules on Large-Scale Neural Network Training via Compute Monitoring
by Yo Shavit