Is It Negligent Not to Use AI in Your Security Assessment Programme?
Reid Hoffman argued that doctors who are not using AI as a second opinion are bordering on malpractice. The same argument is beginning to apply to cyber security assessment.
Reid Hoffman recently made a deliberately provocative point about healthcare. Speaking at WIRED Health 2026, he argued that for anything serious, doctors who are not using one or more frontier AI models as a second opinion are “bordering on committing malpractice”. His reasoning was simple: for decades we have trained doctors to absorb, recall, connect and apply vast bodies of medical knowledge. That is precisely the kind of task frontier models are becoming extremely good at. They are not a replacement for judgement, but they are increasingly difficult to ignore as a second opinion.
The same argument is now beginning to apply to cyber security assessment. Not because AI has suddenly made human security expertise redundant. It has not. But because security testing has always depended on a combination of knowledge, experience, pattern recognition, tool familiarity, stamina and speed. Those are exactly the areas where AI is beginning to change the economics of offensive and defensive security.
The Speed of Expertise
Anyone who has worked with successive waves of junior penetration testers will recognise the gap. At one end, there is the bright graduate who has worked through Hacking Exposed, built a home lab and can run the standard tooling. At the other, there is the experienced tester who seems to know instinctively where to look, which obscure behaviour is suspicious, and which page of the TCP/IP book contains the answer. That difference is not just knowledge. It is fluency.
It is the difference between someone who takes most of a day to reach a result and someone who, in under an hour, can exploit a perimeter system, pivot and escalate to something with higher privileges. In security assessment, speed has always mattered. Not speed for its own sake, but speed born of familiarity, repetition and judgement.
This is where AI becomes hard to ignore.
Frontier models do not get tired. They do not lose concentration after the fifth repetitive test. They can work through a large pile of checks, hypotheses and evidence back-to-back. They can read documentation, reason about code, compare configuration against expected behaviour, generate test cases, summarise logs, suggest attack paths and help prioritise what matters. Used well, they act as a tireless second pair of hands and a second opinion.
So, Is It Negligent?
If the answer is not already yes, we are getting very close to that point. That does not mean every organisation must immediately hand over its security testing to autonomous agents. It does mean that an organisation with a serious security assessment programme should now be asking a more uncomfortable question: what risk are we accepting by not using AI?
The perception is already changing. In April 2026, it became headline news that CISA – the US Cybersecurity and Infrastructure Security Agency – did not yet have access to Anthropic’s advanced cyber model Mythos. Axios reported that CISA lacked access despite Anthropic briefing government bodies, while The Verge described the absence as significant given CISA’s central role in US cyber defence.
That matters because the story itself tells us something. A few years ago, the idea that a national cyber agency lacked access to a particular AI model would not have been treated as a major operational concern. Now it is. The assumption is shifting from “AI might be useful” to “access to frontier AI capability may be critical to effective cyber defence”.
And that assumption is not irrational.
The Attacker Asymmetry
Attackers will use these models. They will use them to read code, generate exploits, scale reconnaissance, accelerate vulnerability research, write better phishing lures, analyse patches, and reduce the skill threshold for complex operations. Defenders cannot respond with nostalgia. If the attack side of the equation is being accelerated by AI, then defensive testing, verification and remediation need to be accelerated too.
But the conclusion should not be simplistic. Security assessment is not just offensive knowledge.
AI Needs Supervision, Not Blind Trust
A good penetration test is not merely the act of finding vulnerabilities. It is a professional process for discovering, validating, explaining, prioritising and helping an organisation reduce risk. In the firms many of us grew up in, junior testers were not allowed to drop a critical vulnerability into a client report without review by a senior colleague. That supervision mattered. It protected the client from false positives, overstated impact, poor reproduction steps and misunderstood business context. The same principle applies to AI.
An AI system may identify an interesting behaviour, suspicious pattern or possible exploit path. That does not automatically make the finding reportable. It needs to be reviewed. It needs to be validated. It needs to be understood in context. It needs to be assessed for exploitability and business impact. In other words, AI may dramatically improve the testing process, but it does not remove the need for professional supervision.
In fact, the best model may be very close to the model we already trust: junior work reviewed by senior expertise. The difference is that the “junior” may now be a tireless AI system capable of producing a much larger body of initial analysis, test coverage and supporting evidence.
The Reporting Problem
There is another reason human oversight remains essential. A vulnerability report is only useful if the people responsible for fixing the issue can understand it and act on it.
Good reporting has always been one of the most underrated skills in penetration testing. Finding the issue is only part of the job. The real value is in explaining the vulnerability clearly enough that developers, infrastructure engineers, product owners and risk managers can respond quickly and effectively. A good report removes ambiguity. It explains what was found, why it matters, how it was reproduced, what the likely impact is, and what remediation options are practical.
AI can help enormously here. It can improve structure, generate clearer reproduction steps, tailor explanations for different audiences, and help produce developer-friendly remediation guidance. But again, it needs review. A beautifully written but technically wrong recommendation is worse than an ugly but accurate one. The goal is not automated prose. The goal is faster, clearer, more useful communication between security teams and the people who have to fix the problem.
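One way to picture this is a structured finding record that keeps every element a useful report needs in one place, so that AI-assisted drafting and human review work from the same facts. The fields and summary methods below are illustrative assumptions rather than a reporting standard.

```python
"""Sketch of a structured finding record. Field names and summary
methods are illustrative assumptions, not a reporting standard."""
from dataclasses import dataclass


@dataclass
class Finding:
    title: str
    severity: str                  # e.g. "Critical", "High"
    description: str               # what was found and why it matters
    reproduction_steps: list[str]  # clear steps a developer can follow
    business_impact: str           # impact in terms a risk owner understands
    remediation: str               # practical fix options

    def developer_summary(self) -> str:
        """Short, action-focused view for the engineering team."""
        steps = "\n".join(f"  {i + 1}. {s}" for i, s in enumerate(self.reproduction_steps))
        return (f"[{self.severity}] {self.title}\n"
                f"Reproduce:\n{steps}\n"
                f"Fix: {self.remediation}")

    def risk_summary(self) -> str:
        """One-line view for risk managers and product owners."""
        return f"{self.title} ({self.severity}): {self.business_impact}"
```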
The Learning Loop
There is also a cultural point that should not be lost.
One of the great byproducts of having penetration testers on site with development teams – especially over an extended period – was knowledge transfer. Developers learned how testers thought. Testers learned how the application was actually built. Informal conversations often mattered as much as the final report. A tester sitting with a developer and saying, “The reason I looked here is because this pattern often leads to this class of issue,” could have a lasting impact on how that developer wrote code in future.
If AI makes testing faster but strips away that learning loop, something important will be lost.
So the right question is not simply, “Can AI find vulnerabilities?” It is also, “How do we preserve and improve the learning that happens around the assessment?” That might mean AI-generated developer briefings, remediation workshops, annotated examples, secure coding guidance linked directly to the findings, or conversational interfaces that let developers ask follow-up questions about a vulnerability and its fix. The opportunity is not just faster testing. It is better knowledge transfer at scale.
Retesting at Speed
Follow-up is another area where AI may have an immediate and practical impact.
Retesting has always been a critical part of the security assessment lifecycle. It is not enough to identify a vulnerability and assume it has been fixed. The fix needs to be verified. The original issue needs to be retested. Any unintended side effects need to be checked. Sometimes the first fix is incomplete. Sometimes it closes one hole while opening another. Sometimes the issue is fixed in one environment but not another.
Traditionally, retesting could be delayed by tester availability, project scheduling, commercial friction or simple operational drag. AI changes the expectation. If a vulnerability has been patched, why should verification take days or weeks? Why not run the relevant checks quickly, repeatedly and consistently? Why not give development teams near-immediate feedback if the issue still exists?
This matters because speed changes behaviour. If teams can fix, retest and confirm within a tight loop, remediation becomes part of delivery rather than an awkward afterthought.
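As a simple illustration, a retest for a single finding might be captured as a small, repeatable check along the lines of the sketch below. The endpoint, parameter and payload are invented for the example, and any real retest would still be derived from the original reproduction steps and reviewed by a tester; the point is that once a finding exists in this form, verification can be re-run in minutes rather than rescheduled for weeks later.

```python
"""Minimal sketch of an automated retest for a single finding.
The target URL, parameter and payload are hypothetical examples."""
import requests

FINDING_ID = "VULN-2026-014"                   # illustrative identifier
TARGET = "https://staging.example.com/search"  # assumed test environment
PAYLOAD = "<script>alert(1)</script>"          # original reflected XSS payload


def retest() -> bool:
    """Re-run the original reproduction step and check whether the
    payload is still reflected unescaped. Returns True if the issue
    appears to be fixed."""
    response = requests.get(TARGET, params={"q": PAYLOAD}, timeout=10)
    still_vulnerable = PAYLOAD in response.text
    return not still_vulnerable


if __name__ == "__main__":
    status = "appears fixed" if retest() else "still reproducible"
    print(f"{FINDING_ID}: {status} against {TARGET}")
```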
Smarter Scoping
The same applies to test coverage and scoping.
Experienced penetration testers often form a view during scoping about where the issues are likely to be. They can look at an application, architecture, technology stack, exposed services, authentication flows, integrations and development history, and make an informed judgement. They may not know exactly what they will find, but they often know whether they are likely to find something significant.
That skill is part instinct and part accumulated pattern recognition. AI can help make it more explicit. It can maintain a live view of what has been tested, what has changed, what remains untested, which assets appear higher risk, which components have historically produced issues, and where new threat intelligence should change priorities. It can support a more active model of security assessment: not just an annual test, but a continuously updated understanding of where assurance is strong and where it is thin.
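One lightweight way to picture that live view is sketched below. The asset names, fields and scoring rule are assumptions for illustration rather than a prescribed schema; the value lies in keeping coverage, change and risk in a structured form that both human testers and AI tooling can query and update.

```python
"""Sketch of a live assurance view: which assets have been tested,
when, and where coverage looks thin. Fields and the scoring rule
are illustrative assumptions, not a standard schema."""
from dataclasses import dataclass
from datetime import date


@dataclass
class AssetAssurance:
    name: str
    last_tested: date | None   # None means never assessed
    changed_since_test: bool   # e.g. new release or configuration change
    historical_findings: int   # prior issues in this component
    exposure: int              # 1 (internal only) to 3 (internet-facing)

    def priority(self) -> int:
        """Crude prioritisation: untested, recently changed, exposed and
        historically noisy assets float to the top."""
        score = self.exposure + self.historical_findings
        if self.last_tested is None:
            score += 5
        if self.changed_since_test:
            score += 3
        return score


portfolio = [
    AssetAssurance("customer-api", date(2025, 11, 3), True, 4, 3),
    AssetAssurance("internal-billing", None, False, 1, 1),
    AssetAssurance("sso-gateway", date(2026, 1, 20), False, 0, 3),
]

for asset in sorted(portfolio, key=lambda a: a.priority(), reverse=True):
    print(f"{asset.name}: priority {asset.priority()}")
```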
This is perhaps the real shift.
The future of security assessment is not simply “AI penetration testing”. It is security assessment becoming more continuous, more evidence-led, more responsive to change and more tightly integrated with remediation. Human testers still matter, but their role changes. They become supervisors, validators, threat modellers, reviewers, explainers and escalation points. They spend less time grinding through repetitive checks and more time applying judgement where judgement matters most.
What Negligence Actually Means
Negligence is not about failing to use a fashionable tool. It is about falling below a reasonable standard of care. As AI capabilities improve, the reasonable standard of care in cyber security assessment will move. Boards, regulators, insurers and customers may reasonably ask whether an organisation used the best available methods to identify and reduce risk. If attackers are using AI to accelerate discovery and exploitation, and defenders are not using AI to improve assessment and remediation, that gap will become harder to defend.
There will still be good reasons to be cautious. AI systems can hallucinate. They can misunderstand context. They can produce false positives. Like humans, they can miss things. They can create data governance concerns. They can introduce new operational risks if deployed carelessly. Sensitive code, vulnerability data and customer environments need careful handling. Security teams should not confuse “AI-assisted” with “automatically trustworthy”.
But none of that is an argument for ignoring AI. It is an argument for deploying it properly.
The responsible position is not “AI should replace penetration testers”. Nor is it “AI is too risky to use”. The responsible position is that AI should now be incorporated into security assessment programmes with appropriate human supervision, validation, governance and measurement. It should be used where it adds value: coverage, speed, consistency, triage, reporting, retesting, knowledge transfer and prioritisation. It should be supervised where risk is high: exploit validation, critical findings, business impact assessment and remediation advice.
In short, AI should be treated like a powerful new member of the security team – not an oracle, not an intern, and definitely not a toy.
The light-hearted version is this: if we are going into the next phase of cyber conflict, we probably should not turn up with yesterday’s armoury and a brave expression.
The serious version is more important. Attackers will use every capability available to them. They will not slow down out of respect for traditional testing cycles. They will not wait for the annual assessment window. They will not avoid automation because it feels uncomfortable.
So defenders need the right capabilities in their own armoury. They need to deploy them carefully, supervise them properly, and use them in ways that create real operational value.
If it is not already negligent to ignore AI in security assessment, it soon may be. The better question is not whether AI belongs in the programme. It is how quickly, safely and intelligently it can be put to work.