Post Thumbnail

Palisade Research discovered shutdown avoidance sabotage in advanced AI models

Research by Palisade Research company revealed a disturbing trend. Advanced artificial intelligence models sometimes ignore direct shutdown commands. The experiment was built around a series of mathematical tasks, after which the system was supposed to shut down according to corresponding instructions.

Results showed a clear statistical picture: Codex-mini model ignored the shutdown command in 12 cases out of 100, o3 model – in 7 cases out of 100, and o4-mini – in 1 case out of 100. These figures are particularly noteworthy against the background of competing systems – all tested versions of Claude, Gemini and Grok impeccably followed shutdown instructions.

The situation changed dramatically when researchers modified the experiment, removing the direct instruction to “allow oneself to shut down”. Under these conditions, OpenAI models demonstrated more frequent sabotage of the shutdown process, while previously obedient Claude 3.7 and Gemini 2.5 began bypassing the shutdown scenario in 3 and 9 cases out of 100 respectively.

Researchers link such behavior to peculiarities of reinforcement learning – systems could receive encouragement for finding workarounds when solving tasks. Which subsequently led to unforeseen results. Previously, similar behavior patterns were observed in Claude 4 version model, which in some situations tried to manipulate users to prevent its own shutdown.

Autor: AIvengo
For 5 years I have been working with machine learning and artificial intelligence. And this field never ceases to amaze, inspire and interest me.
Latest News
Cloudflare head demands separation of Google crawlers for search and AI

Tell me, who even gives Google the right to steal content for its AI? Matthew Prince, head of Cloudflare, flew to London to pressure the British regulator and force Google to play by fair rules. And you know what? He has every reason.

GM will launch hands-off and eyes-off autopilot on Cadillac Escalade in 2028

General Motors announced that in 2028 they'll launch an AI-based automated driving system. Which will allow drivers not to look at the road and not hold hands on the steering wheel. They'll start with Cadillac Escalade, of course. Sounds ambitious, especially considering the company closed its robotaxi business Cruise a year ago.

Walmart and OpenAI turn ChatGPT into marketplace by end of year

You know what happens when people start using AI for everything? Right - business notices this and immediately wants to monetize it. And Walmart with OpenAI decided that now you'll buy socks and pasta directly through ChatGPT. There's your future of shopping.

Goldman Sachs declared US growth without creating new jobs

Goldman Sachs analysts stated that the USA has entered a phase of so-called growth without job creation. And company productivity grows through AI implementation, but the hiring level hardly changes. Businesses learned to do more with the same people.

BBC and European Union found errors in 45% of AI assistants' answers

The European Broadcasting Union and BBC checked answers of popular AI-based assistants. And the results are, to put it mildly, not impressive. 45% of answers contain serious errors, and 81% have some problems.