New platform for fair AI competition in science
The Allen Institute for AI (Ai2) has launched a new platform called SciArena. It is similar to Chatbot Arena but designed specifically for comparing AI models on scientific questions. For study or research, you can now get two answers for free, each with references to scientific sources.
How is model performance evaluated? The platform uses the Ai2 Scholar QA search engine to find papers relevant to your query in the Semantic Scholar database. Two randomly selected models then receive the same input: your question and the retrieved papers. Each model must write a detailed response that backs up every claim with a citation. You then vote for the better answer, and those votes feed the rankings.
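The matchup setup described above can be sketched in a few lines of Python. This is only an illustration, not SciArena's actual code: the `Matchup` class, the `build_matchup` function, and the placeholder model names are all hypothetical, and the retrieval step is assumed to have already produced a list of paper snippets.

```python
from __future__ import annotations

import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Matchup:
    """One head-to-head comparison: both models see identical input."""
    question: str
    papers: tuple[str, ...]
    model_a: str
    model_b: str


def build_matchup(question: str, papers: list[str], models: list[str],
                  rng: random.Random | None = None) -> Matchup:
    """Pick two distinct models at random and give them the same context.

    Hypothetical helper: the real platform's selection logic is not
    described in the article beyond "two randomly selected models".
    """
    if len(models) < 2:
        raise ValueError("need at least two models for a matchup")
    rng = rng or random.Random()
    # sample() draws without replacement, so the two models always differ.
    model_a, model_b = rng.sample(models, 2)
    return Matchup(question, tuple(papers), model_a, model_b)


# Placeholder names, not real leaderboard entries.
models = ["model-x", "model-y", "model-z"]
papers = ["Paper 1 abstract ...", "Paper 2 abstract ..."]
matchup = build_matchup("How do mRNA vaccines work?", papers, models,
                        rng=random.Random(0))
```

Because both models receive the same question and the same retrieved papers, any difference between the two answers comes from the models themselves rather than from what they were given to read.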
SciArena currently ranks 23 models from OpenAI, Google, Anthropic, Alibaba, and other companies. Before the launch, 102 experts judged more than 13,000 head-to-head matchups to build the initial leaderboard.
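To show how a set of pairwise matchups can become a leaderboard, here is a minimal sketch using the classic Elo update. The article does not say which rating method SciArena uses, so Elo itself, the `k` factor of 32, the starting rating of 1000, and the model names are all assumptions made for illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(ratings: dict[str, float], winner: str, loser: str,
               k: float = 32.0, initial: float = 1000.0) -> None:
    """Apply one vote in place; unseen models start at `initial`."""
    r_w = ratings.get(winner, initial)
    r_l = ratings.get(loser, initial)
    gain = k * (1.0 - expected_score(r_w, r_l))
    # The winner gains exactly what the loser gives up.
    ratings[winner] = r_w + gain
    ratings[loser] = r_l - gain


def leaderboard(votes: list[tuple[str, str]]) -> list[tuple[str, float]]:
    """Turn (winner, loser) votes into a ranking, best model first."""
    ratings: dict[str, float] = {}
    for winner, loser in votes:
        update_elo(ratings, winner, loser)
    return sorted(ratings.items(), key=lambda item: item[1], reverse=True)


# Placeholder votes: each tuple is (winner, loser).
votes = [("model-x", "model-y"), ("model-x", "model-z"), ("model-y", "model-z")]
ranking = leaderboard(votes)
```

An upset (a low-rated model beating a high-rated one) moves the ratings more than an expected result, which is why a few thousand votes are enough to separate models of clearly different quality.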
OpenAI o3 currently leads the rankings. It consistently delivers top results in all categories, from engineering to medicine. Claude Opus 4 and Gemini 2.5 Pro round out the top three. You can ask your question in Russian, but note that some models respond only in English.
Author: AIvengo
I have been working with machine learning and artificial intelligence for 5 years, and this field never ceases to amaze, inspire, and interest me.