New benchmark shows AI failing at olympiad-level programming tasks
A new benchmark, LiveCodeBench Pro, has been released for evaluating the programming capabilities of artificial intelligence. It collects the hardest and most recent tasks from popular competitions, including the International Olympiad in Informatics and the World Programming Championship. The tasks were annotated by winners and medalists of these competitions themselves.
The results paint an interesting picture. Even the best model, o4-mini-high, reaches a rating of only 2100. For comparison, grandmaster-level programmers are rated at about 2700. The gap remains huge.
Models can cope only with easy tasks and some medium ones. On truly difficult problems, every language model scores exactly zero. They handle combinatorics and dynamic programming tasks quite well, but in game theory and edge-case handling they perform at the level of an average expert or even a student.
The difference in error types is also curious. Humans usually make implementation errors caused by inattention or syntax problems. In AI models, the problems more often arise in the solution idea itself. So no replacement for olympiad programmers is in sight yet.
Author: AIvengo
I have been working with machine learning and artificial intelligence for five years, and this field never ceases to amaze, inspire, and interest me.