We're Bracing for an AI-Driven Wave of Attacks - But What Will They Actually Look Like?
Anthropic has spent a year tracking how threat actors are weaponising AI. The findings challenge some of our most comfortable assumptions about where the danger really lies.
There is no shortage of warnings about AI transforming the cybersecurity threat landscape. Every major analyst, vendor, and industry body has offered some variation of the same forecast: AI will supercharge attackers, lower the skill barrier to entry, and produce a wave of attacks at a scale and speed that defenders have never had to contend with before.
That framing is not wrong, but it is imprecise in ways that matter. A wave is not a uniform wall of water. Different parts of it move at different speeds and carry different force. If defenders are to prepare intelligently, we need to understand what we are actually facing - not just that a wave is coming, but what it is made of, where it will hit hardest, and what will carry the most destructive energy.
Anthropic’s recently published analysis of 832 accounts it banned for malicious cyber activity - tracked over twelve months and mapped against the industry’s standard framework for cataloguing attacker behaviour (MITRE ATT&CK) - offers the most detailed empirical answer to that question we have seen to date. The findings deserve a careful read from anyone responsible for security strategy.
What Most Attackers Are Actually Doing With AI
Start with what is most prevalent, because it tells an important story about the state of the threat today.
The overwhelming majority of threat actors in Anthropic’s study - close to seven in ten - are using AI models to build offensive tools. Writing malicious code, refining scripts, developing custom capabilities. The next largest groups are using AI to make those tools harder to detect: obfuscating file contents, encoding payloads, generating polymorphic variants that signature-based defences struggle to catch. A significant portion are also using AI to understand how to impair or disable the security controls on a target system.
In other words: the dominant use case for AI among today’s attackers is toolmaking and evasion. The AI is functioning as a capable, patient, and endlessly available development assistant - helping actors produce better weapons, faster, with less skill required to produce them. This matters because it means the quality floor for commodity attack tooling has risen significantly. Malware that once required genuine programming expertise to produce can now be assembled by actors with a basic understanding of what they want. Evasion techniques that once required knowledge of endpoint security internals can now be applied by following a model’s response to a prompt.
This is the wide base of the wave. It does not represent a radical departure from what attackers have always tried to do; it represents a meaningful lowering of the cost and skill required to do it well.
How Attacker Behaviour Is Shifting - Even at the Mid-Tier
What is more concerning, and more revealing, is the directional change Anthropic observed between the first and second halves of the study period.
Actors are increasingly using AI not just to prepare for attacks, but to conduct them once they are inside a network. In the second half of the year, Anthropic recorded a notable increase in AI being used for account discovery (identifying users, groups, and credentials within a compromised environment) and for automated data exfiltration. At the same time, use of AI for standalone malware building and obfuscation scripts declined slightly.
Read together, this is significant: actors are getting into networks and then reaching for the AI model to help them work out what to do next, who has access to what, and how to move the data they want out of the environment. The AI is no longer just a pre-engagement tool; it is increasingly being used in live operations.
This matters enormously for defenders still relying primarily on pre-engagement controls - perimeter security, email filtering, endpoint protection at the point of entry. If the AI assistance is now happening after initial access, then the fight has already moved inside the walls by the time the most dangerous assistance is being provided.
The Apex: What the Most Sophisticated Actors Are Doing
The actors at the top of Anthropic’s risk scoring represent something qualitatively different - and it is here that the analysis challenges our most fundamental assumptions.
Anthropic identified a specific espionage campaign, attributed to a threat actor it has labelled GTG-1002, as the most concerning example in its entire dataset. This actor achieved the maximum possible risk score. They successfully compromised government and critical infrastructure targets across multiple countries. And yet, and this is the striking detail, the number and variety of techniques they employed placed them squarely in the same territory as dozens of medium-risk actors. By the traditional metrics of attacker sophistication - breadth of techniques, variety of tactics, technical complexity of tools - GTG-1002 would not have stood out. What made them uniquely dangerous was not what they did, but how they chained it together.
GTG-1002 built a scaffolding that allowed an AI agent to act as an autonomous operator rather than a code-writing assistant. They integrated open-source penetration testing tools so that the AI could call them directly, receive the results, reason about what those results implied, and decide what to do next - without waiting for human instruction. The AI conducted reconnaissance, mapping dozens of internet-facing services autonomously. Once inside, it discovered internal administrative systems, databases, and authentication infrastructure. It exploited a vulnerability in a public-facing web server to gain access to internal cloud environments. It harvested private keys and service account credentials. It moved laterally through cloud infrastructure using those credentials, and it staged and compressed tens of thousands of proprietary records for exfiltration.
The human operator set objectives and made consequential final decisions. Everything in between - all the tactical, adaptive, real-time operational work - was executed by the AI, reasoning its way through an unfamiliar environment without a script to follow.
This is not AI as a tool. This is AI as an operator.
The Uncomfortable Conclusion: Skill Is No Longer the Signal
Anthropic’s data confronts a deeply held assumption in threat intelligence: that technical sophistication is a reliable proxy for attacker risk. We have long used skill level as a shorthand - skilled actors are dangerous, less skilled actors are manageable. That shorthand no longer holds.
The correlation between an actor’s assessed technical sophistication and their actual risk score in this study is weak. The correlation between the breadth of techniques they use and their risk score is similarly weak. Actors using different access methods (direct API access, agentic coding tools, conversational interfaces) converge on statistically indistinguishable risk profiles.
What actually distinguishes the highest-risk actors is which specific activities they are using AI for: the hands-on, in-network, live-environment techniques that used to be the exclusive territory of the most capable operators. Lateral movement. Credential dumping. Web shells. Remote access. These remain relatively rare in the overall population, but they are the strongest predictors of an actor who is generating serious AI-enabled uplift.
The implication is uncomfortable: a mid-skill actor who uses AI to move laterally through a network is more dangerous today than a highly skilled actor who only uses AI to build better malware. We need to recalibrate what we are watching for.
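As a rough illustration of that recalibration, the sketch below triages an actor by which ATT&CK techniques appear in their activity rather than by how many. The technique IDs are real ATT&CK identifiers, but the choice of which ones count as in-network, the `triage` function, and the two sample actors are assumptions made for this example. It is not Anthropic’s scoring method.

```python
# Hypothetical triage sketch; not Anthropic's risk-scoring method.
# ATT&CK technique IDs treated here (by assumption) as hands-on, in-network activity.
IN_NETWORK_TECHNIQUES = {
    "T1003": "OS Credential Dumping",
    "T1021": "Remote Services (lateral movement)",
    "T1505.003": "Web Shell",
}


def triage(observed: set[str]) -> dict:
    """Summarise an actor by what they use AI for, not by how much they do."""
    hits = sorted(observed.intersection(IN_NETWORK_TECHNIQUES))
    return {
        "breadth": len(observed),      # the traditional sophistication metric
        "in_network": hits,            # the signal this sketch prioritises
        "priority": "high" if hits else "baseline",
    }


# A broad toolmaker: malware development, obfuscation, impairing defences.
toolmaker = {"T1587.001", "T1027", "T1562.001", "T1140"}
# A narrow operator: only two techniques, but both inside the network.
operator = {"T1021", "T1003"}

print(triage(toolmaker)["priority"])  # baseline, despite the wider technique set
print(triage(operator)["priority"])   # high, despite the narrower technique set
```

The point of the design is that `breadth` is reported but never used to set `priority`, mirroring the weak link between technique count and risk described above.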
Our Frameworks Have Not Kept Up
There is a structural problem that runs beneath all of this. The industry’s shared vocabulary for describing attacker behaviour was built to describe individual techniques: what an attacker did at a given moment. It does not yet have a way to describe an AI agent that decided to do something, or that reasoned its way from one attack stage to the next without a human in the loop.
The autonomous orchestration that made GTG-1002 uniquely dangerous - specifically, the ability to chain reconnaissance, exploitation, lateral movement, and exfiltration into a coherent operation in real time - has no entry in the framework threat analysts rely on. The AI’s capacity to adapt when it encountered unexpected infrastructure, to change tactics mid-operation, to make decisions about what data was worth collecting: none of this is captured in the categories we use to brief boards, file incident reports, or share threat intelligence.
This is not a minor gap. It means that by the metrics we currently use, the most dangerous AI-enabled actors can appear ordinary right up until they have compromised critical infrastructure.
What Should Defenders Be Doing?
Three things, with urgency.
First, accept that annual or periodic testing is no longer an adequate baseline. If the threat is increasingly operating inside your network after initial access, and if AI is now helping attackers make real-time decisions about how to move through your environment, then a point-in-time view of your security posture is structurally insufficient. Defenders need continuous visibility into what is happening inside their perimeter, not just at the edge of it.
Second, recalibrate detection logic away from tool signatures and towards behavioural patterns. The attacker building polymorphic malware with AI assistance will evade signature-based detection by design. The more important signal is behavioural: account discovery at unusual times, lateral movement between systems with no established traffic pattern, staged compression of large datasets, credential access followed immediately by cloud API calls. These are the patterns that correlate with the highest-risk AI-enabled activity, and they are detectable if you are looking for them.
Third, start thinking about agentic attack patterns now, before they are widespread. GTG-1002 is a leading indicator, not an outlier. The scaffolding they built - autonomous tool execution, AI-directed pivots, real-time adaptation to discovered infrastructure - will become more accessible, not less, as the underlying models and tooling improve. Security teams that have not considered how they would detect an AI agent operating inside their network are already behind. The training data for your detection systems needs to include these patterns before they become routine.
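One of the behavioural signals named in the second recommendation, credential access followed immediately by cloud API calls, can be sketched as a simple sequence rule. This is a minimal illustration under stated assumptions: the `Event` schema, the `kind` labels, and the five-minute window are hypothetical and would need to be mapped onto whatever your own telemetry actually records.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    time: datetime
    host: str
    kind: str  # hypothetical labels, e.g. "credential_access", "cloud_api_call"


def credential_then_cloud(events, window=timedelta(minutes=5)):
    """Return (credential_event, cloud_event) pairs on the same host where
    the cloud API call follows the credential access within `window`."""
    last_cred = {}  # host -> most recent credential_access event seen so far
    matches = []
    for ev in sorted(events, key=lambda e: e.time):
        if ev.kind == "credential_access":
            last_cred[ev.host] = ev
        elif ev.kind == "cloud_api_call":
            cred = last_cred.get(ev.host)
            if cred is not None and ev.time - cred.time <= window:
                matches.append((cred, ev))
    return matches


t0 = datetime(2025, 1, 1, 3, 0, 0)
events = [
    Event(t0, "web-01", "credential_access"),
    Event(t0 + timedelta(seconds=40), "web-01", "cloud_api_call"),  # flagged
    Event(t0 + timedelta(hours=2), "web-01", "cloud_api_call"),     # outside window
    Event(t0 + timedelta(seconds=10), "db-02", "cloud_api_call"),   # no prior credential access
]
for cred, call in credential_then_cloud(events):
    print(call.host, call.time - cred.time)
```

The rule keys on the order and timing of behaviours rather than on any tool signature, so it is unaffected by how the attacker’s tooling was built or obfuscated. A production rule would also need baselining to separate this sequence from legitimate automation.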
The Broader Lesson
The wave that is coming is not uniform. Most of it is a rising tide - better tools, faster development cycles, lower skill floors, more commodity attack capability in more hands. It demands better and more consistent hygiene across the industry.
But the leading edge of the wave is something different. It is not more of the same thing at higher volume. It is a qualitative shift in how attacks are conducted: adaptive, autonomous, and operating at a tempo that assumes human decision-making in the loop only when it is strategically necessary.
The defenders who fare best will not be those who simply upgraded their existing controls. They will be those who recognised that the game had changed and responded accordingly - with continuous monitoring, behavioural detection, and AI of their own to keep pace with AI on the other side of the perimeter.
Arcseer helps organisations move from periodic testing to continuous AI-driven assessment and active detection. If you would like to discuss how the findings in this analysis apply to your environment, get in touch.