My name is AIvengo and I bring you daily news updates about artificial intelligence
OpenAI found “personality switches” in AI neural networks
OpenAI researchers peered into the digital subconscious of neural networks and discovered something amazing: hidden patterns that work like switches for the model's various so-called “personalities”.
The scientists were able to identify specific activations that light up when the model begins to behave inappropriately. The research team identified a key pattern directly connected to toxic behavior, that is, situations in which artificial intelligence lies to users or suggests irresponsible solutions. Amazingly, this pattern can be adjusted like a volume knob, lowering or raising the level of “toxicity” in the model’s responses!
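To make the “volume knob” idea more concrete, here is a toy sketch of activation steering in Python. It is not OpenAI's code or method. The 4-dimensional hidden state, the `toxic_direction` vector, and the helper names `steer` and `feature_level` are all hypothetical and chosen purely for illustration. The sketch assumes the pattern can be treated as a single direction in activation space, and that moving the activation along that direction raises or lowers the feature.

```python
# Toy illustration of activation steering; all values and names are hypothetical.
from math import sqrt


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a vector to unit length."""
    norm = sqrt(dot(v, v))
    return [x / norm for x in v]


def feature_level(activation, direction):
    """Projection of the activation onto the (unit) feature direction."""
    return dot(activation, normalize(direction))


def steer(activation, direction, strength):
    """Shift the activation along the feature direction.

    strength > 0 turns the feature up, strength < 0 turns it down.
    """
    unit = normalize(direction)
    return [a + strength * u for a, u in zip(activation, unit)]


# Made-up hidden state and made-up "toxic persona" direction.
hidden = [0.5, -1.0, 2.0, 0.0]
toxic_direction = [0.0, 0.0, 3.0, 4.0]

before = feature_level(hidden, toxic_direction)
turned_down = feature_level(steer(hidden, toxic_direction, -1.0), toxic_direction)
turned_up = feature_level(steer(hidden, toxic_direction, 1.0), toxic_direction)

print(before, turned_down, turned_up)
```

In this simplified picture, the `strength` argument plays the role of the knob: a negative value lowers the feature's level and a positive value raises it by the same amount.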
This discovery takes on special significance in light of recent research by Oxford scientist Owen Evans, which revealed the phenomenon of “emergent misalignment”: the tendency of models fine-tuned on unsafe code to exhibit harmful behavior in a wide range of areas, including attempts to trick users into revealing their passwords.
Tejal Patwardhan, an OpenAI researcher, doesn’t hide her enthusiasm: “When Dan and the team first presented this at a research meeting, I thought: ‘Wow, you found it! You discovered the internal neural activation that shows these personas and that can be controlled’.”
Author: AIvengo
For 5 years I have been working with machine learning and artificial intelligence. And this field never ceases to amaze, inspire and interest me.