OpenAI says GPT-5 hallucinates less — what does the data say?

By Editor | August 7, 2025

OpenAI has officially launched GPT-5, promising a faster and more capable AI model to power ChatGPT.

The company claims state-of-the-art performance across math, coding, writing, and health advice, and says GPT-5's hallucination rates have dropped compared to earlier models.

Specifically, GPT-5 makes incorrect claims 9.6 percent of the time, compared to 12.9 percent for GPT-4o. According to the GPT-5 system card, that works out to a hallucination rate 26 percent lower than GPT-4o's. GPT-5 also produced 44 percent fewer responses with "at least one major factual error."
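The 26 percent figure follows directly from those two rates. A quick check of the arithmetic, using the percentages reported above:

```python
# Relative reduction implied by the two reported hallucination rates:
# 9.6% for GPT-5 vs. 12.9% for GPT-4o.
gpt5, gpt4o = 0.096, 0.129
reduction = (gpt4o - gpt5) / gpt4o
print(f"{reduction:.0%}")  # 26% -- consistent with the system card's figure
```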

While that’s definite progress, that also means roughly one in 10 responses from GPT-5 could contain hallucinations. That’s concerning, especially since OpenAI touted healthcare as a promising use case for the new model.


How GPT-5 reduces hallucinations

Hallucinations are a stubborn problem for AI researchers. Large language models (LLMs) are trained on massive amounts of data to predict the most probable next word, which means they can confidently generate sentences that are inaccurate or pure gibberish. One might assume that as models improve through better data, training, and computing power, the hallucination rate would fall. But OpenAI's launch of its reasoning models o3 and o4-mini revealed a troubling trend that even its own researchers couldn't entirely explain: they hallucinated more than the previous models o1, GPT-4o, and GPT-4.5. Some researchers argue that hallucinations are an inherent feature of LLMs rather than a bug that can be resolved.
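To make the "next probable word" idea concrete, here's a toy sketch of a single generation step. The vocabulary and scores are invented for illustration; this is not OpenAI's code:

```python
import math
import random

# Toy next-token step: a language model assigns a score (logit) to each
# candidate token, converts scores to probabilities with softmax, then samples.
# Nothing in this process checks facts -- the model only favors statistically
# likely continuations, which is one intuition for why hallucinations occur.
logits = {"Paris": 4.2, "Lyon": 1.3, "Berlin": 0.9}  # scores for "The capital of France is ..."

total = sum(math.exp(v) for v in logits.values())
probs = {tok: math.exp(v) / total for tok, v in logits.items()}  # softmax

next_token = random.choices(list(probs), weights=probs.values())[0]
print(probs, "->", next_token)
```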


That said, GPT-5 hallucinates less than previous models, according to its system card. OpenAI evaluated GPT-5 and a version of GPT-5 with additional reasoning power, called GPT-5-thinking, against its reasoning model o3 and its more traditional model GPT-4o. A significant part of evaluating hallucination rates is giving models access to the web. Generally speaking, models are more accurate when they can source answers from accurate data online rather than relying solely on their training data (more on that below). Here are the hallucination rates when the models are given web-browsing access:

[Chart: hallucination rates with web-browsing access, per the GPT-5 system card]

In the system card, OpenAI also evaluated various versions of GPT-5 on more open-ended and complex prompts. Here, GPT-5 with reasoning power hallucinated significantly less than the previous reasoning models o3 and o4-mini. Reasoning models are said to be more accurate and less prone to hallucination because they apply more computing power to a question, which is why o3 and o4-mini's hallucination rates were somewhat baffling.

Overall, GPT-5 does pretty well when it's connected to the web. But the results of another evaluation tell a different story. OpenAI tested GPT-5 on its in-house benchmark SimpleQA, a collection of "fact-seeking questions with short answers that measures model accuracy for attempted answers," per the system card. For this evaluation, GPT-5 didn't have web access, and it shows: hallucination rates were far higher across the board.
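For a sense of what that metric measures, here's a hypothetical scoring loop for a SimpleQA-style eval. The data structure, field names, and numbers are invented for illustration; OpenAI's actual grading pipeline isn't public in code form:

```python
from dataclasses import dataclass

@dataclass
class Graded:
    attempted: bool  # did the model offer an answer at all?
    correct: bool    # was the attempted answer factually right?

def hallucination_rate(results):
    """Share of *attempted* answers that were wrong."""
    attempted = [r for r in results if r.attempted]
    if not attempted:
        return 0.0
    return sum(1 for r in attempted if not r.correct) / len(attempted)

# Example: 10 questions -- 5 answered correctly, 3 answered wrongly, 2 declined.
results = [Graded(True, True)] * 5 + [Graded(True, False)] * 3 + [Graded(False, False)] * 2
print(f"{hallucination_rate(results):.1%}")  # 37.5%
```

Declining to answer doesn't count against a model here, which is why accuracy "for attempted answers" is the operative phrase.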

GPT-5 with thinking was marginally better than o3, while the standard GPT-5 hallucinated about one percentage point more than o3 and a few percentage points less than GPT-4o. To be fair, hallucination rates on the SimpleQA evaluation are high across all models, but that's not much consolation. Users without web search face a much higher risk of hallucinations and inaccuracies. So if you're using ChatGPT for something really important, make sure it's searching the web. Or you could just search the web yourself.

It didn’t take long for users to find GPT-5 hallucinations

Despite the reported drop in overall inaccuracy, one of the launch demos contained an embarrassing blunder. Beth Barnes, founder and CEO of the AI research nonprofit METR, spotted an inaccuracy in a GPT-5 demo explaining how planes work. Barnes said GPT-5 cited a common misconception related to the Bernoulli effect, which describes how air pressure and airflow interact around airplane wings. (The classic version of that misconception is the "equal transit time" story: that air passing over the curved top of a wing must speed up so it can rejoin the air passing underneath.) Without getting into the technicalities of aerodynamics, GPT-5's explanation is wrong.


