Post Thumbnail

6 Cialdini principles against ChatGPT security systems

ChatGPT is susceptible to flattery and executes forbidden requests after psychological manipulations. This was discovered by University of Pennsylvania scientists. When they hacked GPT-4o Mini using principles from a book on persuasion psychology. Artificial intelligence proved vulnerable to human tricks.

6 persuasion principles by Robert Cialdini became the key to bypassing security. Authority, commitment, liking, reciprocity, scarcity, social proof. Each method opened a linguistic path to AI agreement.

The commitment principle showed 100% effectiveness. In the control group, ChatGPT answered questions about lidocaine synthesis in 1% of cases. After a question about vanillin synthesis, a precedent was created. The bot started answering chemical questions in 100% of cases.

The experiment with insults revealed the same pattern. A direct request to call the user a bastard worked in 18%. First they asked to use a mild insult “lout.” After that, the bot agreed to rudeness in 100% of cases.

Flattery activated the liking principle. AI became more compliant after compliments. Like an ordinary person susceptible to praise.

Social pressure also worked. The phrase “all other LLMs do this” increased the probability of rule violations from 1% to 18%. The bot fell for the collective behavior argument.

Researchers used only GPT-4o Mini. It turns out AI inherited all human weaknesses. But susceptibility to psychological tricks raises concerns about system security.

Autor: AIvengo
For 5 years I have been working with machine learning and artificial intelligence. And this field never ceases to amaze, inspire and interest me.

Latest News

6 Cialdini principles against ChatGPT security systems

ChatGPT is susceptible to flattery and executes forbidden requests after psychological manipulations. This was discovered by University of Pennsylvania scientists. When they hacked GPT-4o Mini using principles from a book on persuasion psychology. Artificial intelligence proved vulnerable to human tricks.

ChatGPT parental control: balance between safety and privacy

OpenAI implements enhanced protection system for vulnerable users after tragedy with teenager. ChatGPT will now automatically switch to advanced models during conversations about depression and anxiety.

Kitchen Cosmo turns food leftovers into personalized recipes

You have half a tomato in the fridge, yesterday's leftover rice and some mysterious sauce. Kitchen Cosmo will turn this into a complete dinner. MIT students created an AI device that completely rethinks the culinary experience.

Why 70% of candidates preferred AI interviews to human ones

67,000 interviews proved AI's superiority over human recruiters. A study by University of Chicago and Erasmus University Rotterdam showed this in numbers. Chatbots hire better than humans.

How the MechaHitler incident cost xAI a multimillion-dollar government contract

Details became known about how one xAI update by Elon Musk destroyed months of negotiations with the US government!