
DeepSeek R1 surpassed Qwen 3 and narrowed the gap with Gemini 2.5 Pro

New data is in on DeepSeek R1, which received a major update, and the results are impressive. The model now confidently surpasses its competitor Qwen 3 with 235 billion parameters. It still lags behind flagships like Gemini 2.5 Pro and O3, but the gap has narrowed significantly. The main improvement comes from greater reasoning depth: the model now uses an average of 23,000 tokens to solve a task, while the previous version used about 12,000. This deeper analysis paid off. On the AIME test, for example, accuracy grew from 70% to 87.5%. Beyond the benchmark gains, the new version hallucinates much less and is significantly better at frontend development, although it still has some way to go to reach Claude’s level in that area.

I think that within the next year we will see a new wave of knowledge distillation built around large language models, in which giant models act as “teachers” for compact versions. This will lead to a rapid breakthrough in small-model efficiency and in their deployment on mobile devices.
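
For readers unfamiliar with the "teacher" and "student" idea, here is a minimal sketch of a classic distillation loss: the student is penalized for diverging from the teacher's temperature-softened output distribution. This is a generic textbook formulation, not a description of how DeepSeek or any other lab trains its models. The function names `softmax` and `distillation_loss` and the default temperature are illustrative assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities; higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Hypothetical helper for illustration: real training code would compute
    this over batches of tensors and usually mix it with a label loss.
    """
    p = softmax(teacher_logits, temperature)  # teacher's soft targets
    q = softmax(student_logits, temperature)  # student's predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


if __name__ == "__main__":
    teacher = [4.0, 1.0, 0.5]
    close_student = [3.5, 1.2, 0.4]
    far_student = [0.5, 1.0, 4.0]
    print(distillation_loss(teacher, close_student))
    print(distillation_loss(teacher, far_student))
```

A student whose scores resemble the teacher's gets a small loss, and one that ranks the answers very differently gets a large one. Minimizing this value is what pushes the compact model toward the large model's behavior.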

Author: AIvengo
I have been working with machine learning and artificial intelligence for 5 years, and this field never ceases to amaze, inspire and interest me.
