
AI from Google scored 130 IQ points, but it means nothing

Gemini 3 Pro became the first artificial intelligence to achieve an IQ of 130. This is impressive and, at the same time, means nothing.

The preview version scored 130 points on the offline Mensa benchmark, a special version of the famous IQ test adapted for evaluating artificial intelligence. The tasks are rewritten and kept private so that models cannot be trained on them. Models with computer vision are shown the test as images; the rest receive the tasks described in text.

Gemini 3 Pro finished 4 points ahead of the previous leader, Grok 4 Heavy, the mode from the 300-dollar subscription in which several versions of the model work on the task at once. Next come Claude Opus 4.1, GPT-5 Thinking and GPT-5 Pro.

A curious detail: on the classic Mensa Norway test, all leading models show higher results. This suggests that at least some of the tasks from that test made it into their training corpora. The average human IQ is 100 points, and Gemini 3 Pro's result on the offline test places it among the top 2 percent of people.
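The "top 2 percent" figure can be checked with a little arithmetic. The sketch below assumes that IQ scores follow a normal distribution with a mean of 100 (as stated above) and a standard deviation of 15; the standard deviation is a common convention and an assumption here, not something the benchmark's author specifies. The function name `share_at_or_above` is purely illustrative.

```python
from statistics import NormalDist

# Assumption: IQ is normally distributed with mean 100 and standard deviation 15.
# The mean comes from the article; the standard deviation is a common convention.
IQ_SCALE = NormalDist(mu=100, sigma=15)


def share_at_or_above(iq: float) -> float:
    """Return the expected fraction of people scoring at or above `iq`."""
    return 1.0 - IQ_SCALE.cdf(iq)


if __name__ == "__main__":
    top_share = share_at_or_above(130)
    # Print the share of people expected to score 130 or higher.
    print(f"Share of people at or above IQ 130: {top_share:.1%}")
```

Under these assumptions, a score of 130 sits two standard deviations above the mean, which corresponds to roughly 2.3 percent of people scoring that high or higher.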

But here is what really matters. Maxim Lott, the author of the offline benchmark, warns directly that his charts do not mean a “victory of machines over people”. He measures a very narrow skill: the ability to solve abstract matrix puzzles presented as pictures.

In real life, intelligence is much broader: common sense, intuition, motivation, experience, responsibility. Here people have no competitors yet. Artificial intelligence has learned to crack puzzles better than 98 percent of people, but this still does not make it smarter than a person.

Author: AIvengo
For 5 years I have been working with machine learning and artificial intelligence, and this field never ceases to amaze, inspire and interest me.