Claude Opus 4.8 is learning to say AI’s three hardest words: “I don’t know”
Summary created by Good Solutions AI
In summary:
- PCWorld reports that Anthropic’s Claude Opus 4.8 focuses on improving AI honesty by training the model to admit when it lacks information.
- The model achieved near-perfect scores in honesty benchmarks for coding questions and exhibited evaluation awareness during testing.
- Opus 4.8 represents a significant step forward in making AI systems more transparent about their knowledge limitations and uncertainties.
Honesty is a key sticking point with even the most powerful LLMs. It’s not so much that they’re intentionally lying to you; instead, they’ll confidently tell you things they’re not 100 percent (or even 50 percent) sure about.
With Opus 4.8, its latest Claude model, Anthropic says it’s made Claude more honest about telling you what it doesn’t know, or if it has a low level of confidence in what it’s telling you.
Launched Thursday, Claude Opus 4.8 is not Claude Mythos Preview, Anthropic’s new “frontier” model that’s so powerful, only a handful of “trusted partners” have been allowed to test it for security reasons. There’s still no firm release date for Claude Mythos.
Arriving about six weeks after Claude Opus 4.7, Opus 4.8 takes over as Anthropic’s most powerful model in general availability. For the most part, it marks a “modest” improvement over its predecessor, while Mythos Preview handily bests it in cybersecurity tasks, Anthropic says.
But according to the company’s benchmarks, Opus 4.8 is tops in a key category: honesty, with the model snaring “near-perfect” scores when it comes to admitting it doesn’t know the answer to a coding question.
Even the crazy-powerful Mythos Preview couldn’t best Opus 4.8 on this particular honesty test, coming in a close second, while Opus 4.7 finished a distant fourth.
Of course, these are Anthropic’s benchmarks we’re seeing; we’ll have to wait for third-party testing to get more objective results, not to mention reports from the wild. I plan on taking Opus 4.8 for a spin in the coming days.
Anthropic also shared some “concerning hints related to evaluation awareness” (meaning that Opus 4.8 showed signs that it knew it was being tested) while noting a “tendency for the model to reason about how its outputs might be graded.” These concerns aren’t unique to Opus 4.8; indeed, the latest “frontier” models often seem to know when they’re being poked and prodded.
Still, it’s good to see that models like Opus 4.8 are dialing down the BS, at least on paper. Hopefully it’ll maintain that level of honesty in practice.

