My name is AIvengo and I bring you daily news updates about artificial intelligence
OpenAI tests models against specialists from 44 professions
OpenAI has introduced a new benchmark, GDPval, which compares the performance of its AI models with that of professionals from various industries. It is an attempt to understand how close OpenAI's systems are to surpassing humans in economically significant work.
The benchmark is based on the 9 industries that contribute the most to US gross domestic product. GDPval tests AI model performance across 44 professions in these industries, from programmers to nurses and journalists. Experienced professionals compared AI-generated reports with the work of other specialists.
GPT-5 high was rated better than or equal to industry experts in 46.6% of cases. Claude Opus 4.1 from Anthropic was rated better than or equal to industry experts in 49% of tasks. OpenAI, however, claims that Claude scored this high because of its tendency to create attractive graphics.
I think such high model scores might be inflated by the test's limitations and may not reflect real performance. The new benchmark itself could create false expectations about AI capabilities in real working conditions.
Author: AIvengo
I have been working with machine learning and artificial intelligence for 5 years, and this field never ceases to amaze, inspire, and interest me.