Technology

What frontier AI really means for enterprise safety


The “clear and current hazard or scorching air” framing surrounding Anthropic’s Claude Mythos frontier AI mannequin units up the flawed argument. What Mythos represents is a tempo shift in how vulnerabilities are discovered, chained, and exploited.

Anthropic says Mythos can establish and exploit zero-days throughout main working methods and browsers at a degree severe sufficient that they selected to not launch it publicly, limiting entry by way of Challenge Glasswing to a choose group of organisations. The UK AI Safety Institute (AISI) places numbers on that. Earlier than Mythos, no AI mannequin had accomplished a 32-step simulated company assault chain end-to-end. Mythos Preview did so in three out of ten runs. GPT-5.5 did so in two. At Professional degree, GPT-5.5 achieved a 71.4% move fee towards Mythos at 68.6%. The potential hole between the 2 main frontier fashions is narrower than the protection implies. The governance hole is significantly wider.

Amongst practitioners who’ve examined Mythos, the response has been extra restrained than the coverage response suggests, they usually’re largely proper. The vulnerability pipeline was by no means the core drawback for defenders. We had been by no means affected by a scarcity of issues to fret about. We have already got extra disclosures, extra advisories, extra proof-of-concepts, and extra publicity knowledge than most organisations can realistically operationalise.

What Mythos does is speed up that actuality. It compresses the timeline between weak spot, discovery, weaponisation, and the necessity for defensive motion. The limitations to deploying a mannequin like this are actual as we speak. The compute necessities are substantial and the infrastructure calls for are specialised. These limitations will not stay in place for lengthy.

There’s additionally a floor space drawback operating beneath the invention query. Vibe coding ensures we do not hit a plateau. Greater tempo growth, extra dependencies, extra assured transport. Even when defect fee per commit improves, complete assault floor space grows. The amount of code being written with AI help means the goal is increasing similtaneously the instruments for locating flaws in it are bettering.

Present fashions are genuinely succesful at sample bugs: injection flaws, leaked secrets and techniques, identified dangerous dependencies, and chaining findings throughout methods. The place they nonetheless fall quick is anyplace correctness is determined by intent. Enterprise-logic and authorization flaws stay the class the place AI fashions are constantly weakest. In contrast to sample bugs, they require understanding what code is meant to do, not simply what it does. That hole hasn’t but been closed. Human judgment stays irreplaceable within the analysis and safety pipeline. At Vedere Labs we already use Claude Opus 4.6 in our analysis workflow and have reported a number of zero-days discovered by way of that course of. The aim is popping quicker analysis into higher safety.

From a vendor perspective, vulnerability administration and QA are converging as AI tooling improves, however the query every solutions stays distinct and no quantity of converging modifications that. QA asks whether or not one thing works. Vulnerability administration asks whether or not it may be abused and what the blast radius seems like.

When fashions like Mythos grow to be extra broadly accessible, count on a spike in disclosed vulnerabilities, then a reckoning: distributors confronting what was already there however by no means measured. The query is whether or not discovery interprets into remediation or simply accumulates as an even bigger backlog. The stress to ship would not disappear as a result of a mannequin discovered extra bugs. With out arduous blocks for exploitable, high-impact points and agency deadlines for every part else, the surge dangers changing into the brand new regular slightly than a extra sturdy correction.

For defenders, the arduous half has at all times been what to do with the intelligence. The place is the affected asset, is it really uncovered, how essential is it, what’s the possible path to compromise, and what will be carried out proper now to cut back danger? Mythos makes that operational burden extra pressing. The NCSC mentioned as a lot when it warned of a coming vulnerability patch wave, and the AISI benchmark knowledge offers that warning some weight.

The quicker vulnerabilities are discovered, the extra fragile any organisation turns into if its remediation course of cannot hold tempo. That is particularly acute in operational expertise (OT) and important nationwide infrastructure (CNI), the place the methods most important to societal perform are sometimes the least able to aggressive patch velocity with out introducing operational instability. In these environments, patching at scale can itself grow to be a supply of danger.

The main focus has to shift towards operational survivability: preserving visibility, constraining attacker manoeuvre area, limiting blast radius, and sustaining continuity beneath stress. The organisations that may patch at tempo with out the wheels falling off would be the ones which have already carried out the foundational work on asset stock, segmentation, and prioritisation primarily based on precise publicity.

Within the frontier AI period, operational survivability is the measure that issues. The organisations that perceive that now will not be those scrambling up the seashores when the patch wave begins to construct.