Anthropic’s Mythos raises the stakes for security validation
A security team recently walked me through a scenario that illustrates exactly why the industry’s current obsession with autonomous AI is so risky. They had used an agentic tool to uncover a complex attack path that began with a small foothold and ended in a critical exposure. It was a clear win for discovery. They remediated the gaps and restricted access, expecting the issue to be closed.
The trouble started when they went back to prove the fix. Because the tool was driven by a probabilistic model designed to explore and pivot like a human, it did not take the same path twice. When the original path failed to show up, the team could not tell whether the hole was plugged or the system had simply chosen a different route. That kind of unnecessary doubt is the hidden tax of the push toward total autonomy.
That doubt, in a single environment, is the manageable version of the problem. Earlier in April, Anthropic demonstrated what it looks like when the attacker is an AI. Claude Mythos autonomously discovered and chained zero-day vulnerabilities across major operating systems, producing working exploits in hours. That work might have taken elite researchers weeks. Anthropic withheld public release for good reason, but the implication is already here: disclosure now equals weaponisation.
That puts a sharper point on a question security teams were already wrestling with: how do you validate your defences when the threat keeps changing? How do you know your security controls work, and remediate whatever falls short, before those gaps are exploited?
Security validation has always relied on predictability. If you know how attackers operate, you can test your defences against those techniques and know where you stand; that is the difference between knowing your defences work and hoping they do. Historically, attacker behaviour followed well-documented patterns and techniques, which is what made that testing reliable. AI is beginning to erode that predictability, giving attackers the ability to reason about novel paths at machine speed. But even before novel attacks become routine, AI already offers attackers a more immediate advantage: the ability to execute known techniques at a scale no human team can match, covering more of the attack surface faster than the environment changes.
Defenders are responding in kind, and agentic security tools are gaining traction. The most meaningful risks today rarely come from an unpatched server. They come from the connective tissue of the enterprise, where lateral paths are created by service accounts, trust relationships or sets of permissions that made sense once but no longer do. Systems that can piece these together get us closer to how real attacks happen.
But this shift introduces a fundamental conflict between exploration and validation. Agentic systems are designed to explore, not to repeat. In cyber security, that is what makes them effective for discovery, but it is also what makes them a liability for remediation. They can tell you what could happen, but not whether something has actually been fixed.
Answering that requires deterministic execution. It means executing the same techniques, under the same conditions, in a strictly repeatable way. It is not about a variation or a similar route. It is about the exact same sequence, so the outcome can be compared directly. Without that, you are operating on assumption, not confidence.
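To make the idea concrete, here is a minimal sketch, in Python, of what deterministic replay can look like. Everything here is invented for illustration (`Step`, `replay_path`, the executor functions are not any vendor's API): an attack path is recorded as an exact ordered sequence with fixed inputs, then re-executed step by step so a fix can be confirmed at a specific point.

```python
from dataclasses import dataclass

# Hypothetical illustration only: a recorded attack path replayed verbatim.
# All names here are invented for this sketch, not a real product API.

@dataclass(frozen=True)
class Step:
    technique: str   # label for the technique being attempted
    target: str      # the asset the step runs against
    params: tuple    # fixed inputs, so every replay is identical

def replay_path(path, executors):
    """Re-execute a recorded path step by step.

    `executors` maps a technique name to a function that attempts the
    step and returns True if it still succeeds. The replay stops at the
    first step that fails, which pinpoints where the fix took effect.
    """
    for i, step in enumerate(path):
        if not executors[step.technique](step.target, *step.params):
            return {"fixed": True, "blocked_at": i, "step": step.technique}
    return {"fixed": False, "blocked_at": None, "step": None}

# Simulated executors: after remediation, the lateral-movement step fails.
executors = {
    "initial_access":       lambda target, *p: True,
    "lateral_movement":     lambda target, *p: False,  # the fix blocks this
    "privilege_escalation": lambda target, *p: True,
}
path = [
    Step("initial_access", "web-01", ()),
    Step("lateral_movement", "db-02", ("svc-account",)),
    Step("privilege_escalation", "db-02", ()),
]
result = replay_path(path, executors)
# result["fixed"] is True, result["blocked_at"] is 1
```

Because the sequence and inputs are frozen, two runs of `replay_path` on the same path always exercise the same hops, which is exactly the property a probabilistic explorer cannot guarantee.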
The real challenge is meeting user expectations for safety and accountability. People now want systems that behave like agents working on their behalf, but they also expect the vendors building those systems to take responsibility for the outcomes. If a probabilistic model makes a mistake in a live production environment, the customer holds the vendor accountable, not the model provider.
What is emerging is a two-engine architecture in which agentic techniques and deterministic execution work together. Agentic layers handle discovery, surfacing compound exposures that emerge from how systems interact over time rather than from any single misconfiguration. Deterministic engines then take those findings and execute them in a controlled, repeatable way, so security teams can verify that a fix is real and not simply unobserved. Neither layer is sufficient on its own. Discovery without verification leaves you with exactly the doubt problem I opened with. Verification without discovery leaves you testing what you already know, which is not where the real risk lives.
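The division of labour between the two engines can be sketched in a few lines of Python. This is a toy model under stated assumptions (the graph, `explore` and `verify` are all hypothetical): the probabilistic layer may find a different route on each run, while verification of a recorded route is always identical.

```python
import random

# Toy two-engine sketch; all names invented for illustration.
GRAPH = {  # host -> hosts reachable via some exposure
    "workstation": ["file-server", "jump-box"],
    "file-server": ["db"],
    "jump-box": ["db"],
    "db": [],
}

def explore(start, goal, rng):
    """Agentic-style discovery: a random walk until the goal is reached.
    Different runs can surface different paths to the same exposure."""
    path = [start]
    while path[-1] != goal:
        options = GRAPH[path[-1]]
        if not options:
            path = [start]          # dead end: restart the walk
            continue
        path.append(rng.choice(options))
    return path

def verify(path, blocked_edges):
    """Deterministic verification: replay the exact recorded path and
    report the first hop that no longer works after remediation."""
    for a, b in zip(path, path[1:]):
        if (a, b) in blocked_edges:
            return (a, b)           # the fix holds at this hop
    return None                     # the full recorded path still works

# Discovery may return workstation -> file-server -> db on one run and
# workstation -> jump-box -> db on the next...
found = explore("workstation", "db", random.Random())
# ...but verification of the *recorded* finding never varies.
recorded = ["workstation", "file-server", "db"]
blocked = verify(recorded, {("file-server", "db")})
# blocked == ("file-server", "db")
```

The design point is the hand-off: the explorer's non-determinism is a feature for discovery, and it is quarantined there, so the verification result can be compared directly across runs.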
The industry will keep moving toward more autonomous systems. Mythos showed that the trajectory is right, and that the pace just accelerated. But for security leaders, the core requirement has not changed. You need to know a threat has been neutralised, not just that it has not shown up lately. Teams running continuous validation are already ahead. But "ahead" just got redefined. When an adversary can reason about novel attack paths and produce working exploits at machine speed, confidence comes from verification, not from the absence of a finding.
Amitai Ratzon is CEO at Pentera

