Categories: The Verge

OpenAI gets caught vibe graphing

During its big GPT-5 livestream on Thursday, OpenAI showed off a few charts that made the model seem quite impressive — but if you look closely, some graphs were a little bit off.

In one, ironically showing how well GPT-5 does in “deception evals across models,” the scale is all over the place. For “coding deception,” for example, GPT-5 apparently gets a 50.0 percent deception rate, but that’s compared to OpenAI’s smaller 47.4 percent o3 score which somehow has a larger bar.

Sponsored
class="wp-block-embed is-type-rich is-provider-twitter wp-block-embed-twitter">

Sponsored

Or this one, where one of GPT-5’s scores is lower than o3’s but is shown with a bigger bar. In this same chart, o3 and GPT-4o’s scores are different but shown with equally-sized bars. That chart was bad enough that CEO Sam Altman commented on it, calling it a “mega chart screwup.” An OpenAI marketing staffer also apologized for the “unintentional chart crime.”

OpenAI didn’t immediately respond to a request for comment. And while it’s unclear if OpenAI used GPT-5 to actually make the charts, it’s still not a great look for the company on its big launch day — especially when it is touting the “significant advances in reducing hallucinations” with its new model.

rssfeeds-admin

Share
Published by
rssfeeds-admin

Recent Posts

Windows 11 23H2 to 25H2 Upgrade Allegedly Breaking Internet Connectivity

A persistent bug in Windows 11 in-place upgrades is reportedly wiping critical 802.1X wired authentication…

2 hours ago

Coruna Exploit Kit With 23 Exploits Hacked Thousands of iPhones

Google’s Threat Intelligence Group (GTIG) has uncovered Coruna, a sophisticated iOS exploit kit containing 23…

2 hours ago

Roy Cooper, Michael Whatley secure US Senate nominations, setting up fierce November election

Former state and national GOP Chair Michael Whatley (left) and former Gov. Roy Cooper are…

2 hours ago

Tillis, more Republicans unload on Noem over Minneapolis operation, FEMA delays

U.S. Sen. Thom Tillis, Republican of North Carolina, speaks as Homeland Security Secretary Kristi Noem…

2 hours ago

Diana Fenton withdraws as nominee for child advocate after questions arise over independence, conflicts of interest

Diana Fenton has withdrawn her name from consideration to be New Hampshire’s next child advocate…

2 hours ago

Byron family shares son’s journey with Severe Hemophilia A

A family in Byron is sharing the story of their 1-year-old son, J.J. Larson and…

2 hours ago

This website uses cookies.