
Smarter AIs Are Actually More Likely To Fabricate Facts Instead Of Turning Down Questions They’re Unable To Answer

Black woman, documents or working on laptop planning, creative research or data analysis for marketing project management review. Business, typing corporate report, schedule or SEO calendar agenda.
Kirsten D/peopleimages.com - stock.adobe.com - illustrative purposes only, not the actual person

With each new generation, large language models (LLMs) are becoming more capable and powerful.

That means they can provide more accurate information. But new research has suggested that smarter AI chatbots are actually becoming less reliable because they are more likely to fabricate facts instead of turning down questions they are unable to answer.

In a new study, researchers examined some of the industry’s leading LLMs, including OpenAI’s GPT, Meta’s LLaMA, and an open-source model called BLOOM developed by the research group BigScience.

The researchers found that the models’ responses were becoming more accurate in many cases, but across the board, the newer models were less trustworthy and produced a greater proportion of wrong answers than older models.

“They are answering almost everything these days. And that means more correct, but also more incorrect [answers],” said José Hernández-Orallo, a co-author of the study and a researcher at the Valencian Research Institute for Artificial Intelligence in Spain.
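To see how a model can be more correct and more incorrect at the same time, it helps to count declined questions separately. The short Python sketch below uses made-up tallies for a hypothetical older and newer model; the numbers and the `Tally` helper are illustrative assumptions, not figures or methods from the study.

```python
from dataclasses import dataclass


@dataclass
class Tally:
    """Hypothetical answer counts for one model (not data from the study)."""
    correct: int
    incorrect: int
    avoided: int  # questions the model declined to answer

    @property
    def total(self) -> int:
        return self.correct + self.incorrect + self.avoided

    def share(self, count: int) -> float:
        """Fraction of all questions that fall into the given bucket."""
        return count / self.total


# Illustrative numbers only: the newer model declines far fewer questions.
older = Tally(correct=40, incorrect=20, avoided=40)
newer = Tally(correct=55, incorrect=40, avoided=5)

for name, t in (("older", older), ("newer", newer)):
    print(
        f"{name}: correct {t.share(t.correct):.0%}, "
        f"incorrect {t.share(t.incorrect):.0%}, "
        f"avoided {t.share(t.avoided):.0%}"
    )
```

In this toy example, the questions the newer model no longer declines are split between right and wrong answers, so both shares rise together.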

According to Mike Hicks, a philosopher of science and technology at the University of Glasgow in Scotland, the AI was simply getting better at pretending to be more knowledgeable than it actually was.

The models were quizzed on topics like math and geography. They were also asked to perform tasks, such as listing information in a specific order.

Overall, the bigger, more powerful models gave the most accurate responses, but when it came to harder questions, they were prone to error and had a lower level of correctness.

Some of the biggest liars were OpenAI’s GPT-4 and o1. They would answer almost every question they were asked.

