upcarta
Miles Turpin
community-curated profile
Language model alignment @nyuniversity, @CohereAI
Most Recommended
Paper
May 7, 2023
Language Models Don't Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting
by Miles Turpin
Tweet
May 9, 2023
⚡️New paper!⚡️ It’s tempting to interpret chain-of-thought explanations as the LLM's process for solving a task. In this new work, we show that CoT explanations can systematically misrepresent the true reason for model predictions. arxiv.org/abs/2305.
by Miles Turpin