OpenAI gets caught vibe graphing

During its big GPT-5 livestream on Thursday, OpenAI showed off a few charts that made the model seem quite impressive, but if you look closely, some graphs were a little bit off.

In one, ironically showing how well GPT-5 does in “deception evals across models,” the scale is all over the place. For “coding deception,” for example, the chart shown onstage says GPT-5 with thinking apparently gets a 50.0 percent deception rate, yet o3's smaller 47.4 percent score somehow has a larger bar. OpenAI appears to have accurate numbers for this chart in its GPT-5 blog post, however, where GPT-5’s deception rate is labeled as 16.5 percent.

With this chart, OpenAI showed onstage that one of GPT-5’s scores is lower than o3’s but is drawn with a bigger bar. In this same chart, o3 and GPT-4o’s scores are different but shown with equally sized bars. It was bad enough that CEO Sam Altman commented on it, calling it a “mega chart screwup,” though he noted that a correct version is in OpenAI’s blog post.

An OpenAI marketing staffer also apologized, saying, “We fixed the chart in the blog guys, apologies for the unintentional chart crime.”

OpenAI didn’t immediately reply to a request for comment. And while it’s unclear if OpenAI used GPT-5 to actually make the charts, it’s still not a great look for the company on its big launch day, especially when it’s touting the “significant advances in reducing hallucinations” with its new model.
