OpenAI solves IMO math problems better than most humans
The math world has just witnessed a historic event: an experimental reasoning model from OpenAI solved problems from the International Mathematical Olympiad (IMO) at a gold-medal level. The exact model name hasn’t been disclosed, but the model is known to be unreleased and is not GPT-5.
The AI system solved 5 of the 6 problems. The evaluation followed the same rules that apply to human participants: the model had 9 hours to think, no internet access, and had to provide fully reasoned proofs in natural language.
In total, the AI scored 35 out of a possible 42 points (each of the six problems is worth 7), enough for a gold medal. No AI model had previously achieved such a result at the Math Olympiad.
Interestingly, researchers at Google DeepMind were also ready to announce that their own model had solved the same IMO problems, also at a gold-medal level. However, they had to wait for marketing approval, so their official announcement is expected later this week. Meanwhile, OpenAI CEO Sam Altman has already announced the achievement, and it has received widespread recognition.