
BBC and European Broadcasting Union found errors in 45% of AI assistants’ answers

The European Broadcasting Union and the BBC checked the answers of popular AI-based assistants, and the results are, to put it mildly, not impressive: 45% of answers contain serious errors, and 81% have some kind of problem.

Researchers from 22 media organizations analyzed 3,000 answers from ChatGPT, Copilot, Gemini and Perplexity in 14 languages. A third of the answers showed serious problems with sources, which were either missing or incorrect. Gemini had sourcing problems in 72% of cases. For the other assistants this figure is below 25%, but that is not encouraging either.

For their part, OpenAI and Microsoft have acknowledged that hallucinations, where a model outputs incorrect information, do exist, and say they are working on fixes. Perplexity, meanwhile, claims its “Deep Research” mode is 93.9% accurate.

What bothers me about this report is the researchers themselves and their bias. The BBC in particular has been repeatedly caught distorting information, and rallies have repeatedly taken place in front of its London office; you can search Google for them. European publishers are also currently in conflict with Google, and, sure enough, Google’s Gemini comes out as the worst.

Author: AIvengo
I have been working with machine learning and artificial intelligence for 5 years, and this field never ceases to amaze, inspire and interest me.
