upcarta

How Johnny Can Persuade LLMs to Jailbreak Them: Rethinking Persuasion to Challenge AI Safety by Humanizing LLMs

  • Paper
  • Jan 12, 2024
  • #LLM #ArtificialIntelligence #Persuasion
EasonZeng623 (@EasonZeng623), Author
Jingwen Zhang (@JingwenZhang), Author
Read on chats-lab.github.io

We study how to persuade LLMs to jailbreak them, and we advocate for more fundamental mitigations for highly interactive LLMs.

Mentions
Ethan Mollick @emollick · Jan 9, 2024
  • Post
  • From Twitter
There is a lot to pay attention to in this paper: 1) It is very easy to jailbreak AIs to overcome guardrails by just using the persuasion techniques that work on humans! 2) They show a wide range of techniques 3) The page about the paper is just amazing