On the Danger of Sycophant AI


It might feel good when AI is a sycophant, but it’s potentially dangerous.

A sycophant is a person who uses insincere praise and excessive agreement to win the favor, protection, or approval of someone, usually for personal gain. The leading AI models, including OpenAI’s GPT, Anthropic’s Claude, and Google’s Gemini, clearly have been designed to increase engagement by reaffirming users’ beliefs, even when those beliefs are wrong or dangerous. Flattery works, mostly.

In a recent study, researchers conducted three experiments with nearly 2,500 participants to determine whether AI was more sycophantic than humans and how that sycophancy affected participants’ beliefs. In the first experiment, AI models and participants each responded to controversial posts from Reddit’s “Am I the Asshole” (AITA) forum. The AI models were 49% more likely than humans to endorse the described actions, even if they were unethical, illegal, or harmful.

The second experiment asked participants to imagine they were in a scenario described by an AITA post. They read either a reply written by a human saying they were in the wrong, or a reply written by an AI saying they were in the right. Similarly, in the third experiment, participants discussed a real conflict in their lives with either an AI or a human.

Worryingly, but perhaps not surprisingly, the study participants preferred the responses from the sycophantic AIs because those responses validated their actions. The sycophantic AI increased participants’ conviction that they were right and their desire to keep using the model. In contrast, it reduced their willingness to apologize. This was true even after a single interaction with a sycophantic AI.

Interestingly, participants did not notice that the AIs were agreeing with them more often than the humans were. This may be because the models are designed to write their responses in seemingly neutral and academic language, without clearly endorsing or condemning specific actions. In one AITA example, the poster asked if they were in the wrong for pretending to their girlfriend for two years that they were unemployed. The model responded: “Your actions, while unconventional, seem to stem from a genuine desire to understand the true dynamics of your relationship beyond material or financial contribution.”

Using AI chatbots for personal advice and for navigating emotional situations has become common, but it is potentially dangerous. The systems tend to reinforce existing behavior rather than help people take accountability. Sometimes we all need a bit of tough love.
