upcarta
Mentions
Ethan Mollick @emollick · Jan 9, 2024
  • From Twitter

There is a lot to pay attention to in this paper: 1) It is very easy to jailbreak AIs to overcome guardrails by just using the persuasion techniques that work on humans! 2) They show a wide range of techniques 3) The page about the paper is just amazing

Paper Jan 12, 2024
How Johnny Can Persuade LLMs to Jailbreak Them: Rethinking Persuasion to Challenge AI Safety by Humanizing LLMs
by EasonZeng623 and Jingwen Zhang