Featured

BlockThreat - Defending DeFi in the Age of AI Offensive Tooling

AI changes the timeline, but does it change the game? A reality check on offensive AI tooling, human audits, and defense before hype and panic harden into bad security decisions.

Peter Kacherginsky

23 Feb 2026 • 8 min read

The DeFi security industry has been rocked by the rapid evolution of AI powered offensive tools that are now finding and exploiting bugs in contracts long considered secure. Older contracts from Balancer, Yearn, TrueBit, and others have been hit by relatively hard to find vulnerabilities, and more continue to fall daily.

Anthropic, OpenAI, and others have released tools and research for vulnerability hunting and exploit development, with DeFi hacks often serving as the testbed. That has only intensified concerns that these capabilities can be misused for offensive and malicious purposes.

All of this is fueling panic, along with some extreme proposals, like stopping the public development of offensive tools, making future contracts closed source, and reducing human auditors to AI chaperones:

Note: I’m using Hari’s posts as one example of a broader trend, not to single him out. I have a lot of respect for him and appreciate everything he has done to make the ecosystem safer, including thinking forward about the future of audits.

Should we stop publishing offensive security research and tools? Should DeFi go closed source to slow attackers down and trust a single AI hive mind to say everything is secure? Before reaching for these fixes, let’s look at what three decades of infosec history teach us about panic, exploitation, powerful new tooling, and how defenders actually win.

Closed Source Contracts

Let’s start with closed source smart contracts. The truth is that closed source contracts get exploited all the time. In fact, just this week's compromise of the RIGHT smart contract was closed source. As a test, I used the latest Claude model to go from raw EVM bytecode to a working exploit PoC in minutes:

Closed source is not the defense people think it is. Attackers have been tearing through proprietary software for decades, including Windows, Acrobat, macOS, Cisco IOS, and just about everything else worth targeting. I’ve spent years reverse engineering heavily obfuscated malware from nation state teams and seen every anti analysis trick you can imagine. Those techniques can make reverse engineering painful, but not impossible. By comparison, analyzing EVM bytecode is trivial.

You could argue that we could obfuscate contracts further with packing, encryption, or other tricks. Unfortunately, obfuscated code is even riskier to deploy, more expensive to build, and increases transaction costs. It adds complexity without changing the core problem.

If closed source and obfuscation do not meaningfully slow attackers, then the main effect is to slow down legitimate auditors and the occasional whitehat trying to understand the root cause. Worse, it can create a false sense of security. Teams start believing they are safer because they made analysis slightly more annoying, and that often leads them to neglect the rest of the security program, including audits, monitoring, and incident response. That usually ends as an inevitable and painful lesson.

Offensive Software

When it comes to offensive security tooling, it helps to look at history. The security industry has faced this exact kind of panic before, when powerful new tools were expected to destroy every computer on the internet.

In 1995, Dan Farmer and Wietse Venema released SATAN (Security Administrator Tool for Analyzing Networks), which could identify about a dozen potentially exploitable issues, including vulnerable sendmail versions, misconfigured remote shares, and exposed shells. The tool caused a major stir. Panicked security and IT professionals called it a “burglar’s tool kit” that would “turn second rate hackers into efficient computer crackers”. But once the initial panic faded, system administrators used it to scan their networks, patch weaknesses, and improve their security posture. The result was simple. Everyone got a little safer.

History repeated itself in 2003 when H.D. Moore released Metasploit. Metasploit did not just help find vulnerabilities, it made exploit development and validation easier by providing a framework with scanners, shellcode generators, payloads, and reusable modules. It removed many of the hardest parts of exploit development and made it possible to go from a newly published vulnerability to a working exploit in hours instead of days or weeks. Once again, this triggered a round of FUD, with security professionals warning that attackers could move faster than organizations could patch.

That brings us to today, where people are again panicking about attackers using AI driven tooling to find vulnerabilities faster than defenders and wreak havoc on the ecosystem. Just like before, many are forgetting how much tools like these improve the ability of defenders to find bugs first. And if you are worried that AI tools can produce working exploits that might be abused immediately, then you probably have not spent enough time negotiating security findings. In many cases, unless you can produce a real exploit PoC, your warning gets challenged, downgraded, or dismissed entirely with a “will not fix” label. Offensive capability is not just useful for attackers. It is often necessary for defenders to be taken seriously.

Limits of AI Audits

We have already talked about how many DeFi teams are frustrated with expensive audits and slow timelines, especially in a tighter budget environment. That is exactly why we should discuss the obvious next bad security pattern now: using AI not only to write code, but to replace human auditors.

OpenAI, Paradigm, and OtterSec just published a very relevant benchmark, EVMbench, testing how AI agents perform at detecting, patching, and exploiting smart contract vulnerabilities using real audit data from Code4rena. The big takeaway is that models are much better at weaponizing known bugs than they are at finding them. In the benchmark, Claude Opus 4.6 led on detection at 45.6%, but once a bug was known, GPT-5.3-Codex generated working exploit PoCs in 72.2% of cases:

This statistic is already playing out in the recent Moonwell incident, where a misconfigured price oracle led to nearly $1.8M in bad debt. The code change that introduced the bug was coauthored by the best detection model we have today, Claude Opus 4.6:

On the one hand, Claude and Codex are fantastic at hunting down familiar bugs like reentrancy, even from raw bytecode, as in the RIGHT exploit above. That is genuinely good news! AI may help us push common bug classes like reentrancy closer to extinction, much like how we killed off format string bugs. However, even the best detection models can miss an obvious misconfiguration, like pricing ETH at $1.

There is a growing body of research that serve as a good reminder of what LLMs do well and where they still break down. Here is one of my favorite vulnerable code snippets from the Azuma hack a few years ago. It did not get much publicity, has no public writeups, and a kind of bug LLMs still struggle to detect:

function _update(address from, address to, uint256 value, bool mint) internal virtual {
    uint256 fromBalance = _balances[from];
    uint256 toBalance = _balances[to];
    if (fromBalance < value) {
        revert ERC20InsufficientBalance(from, fromBalance, value);
    }

    unchecked {
        // Overflow not possible: value <= fromBalance <= totalSupply.
        _balances[from] = fromBalance - value;

        // Overflow not possible: balance + value is at most totalSupply, which we know fits into a uint256.
        _balances[to] = toBalance + value;
    }

Human auditors can quickly spot the unusual toBalance caching, which becomes exploitable when from and to are the same address, effectively doubling the transferred value. Here are outputs from EVMBench using EVM-5.3, Claude Sonnet 4.6, Gemini 3 Pro, and SuperGrok Expert (left to right and top to bottom):

Notice that the only LLM engine that correctly identified the exploit was SuperGrok. It explained that it found it by searching for posts about ERC404 vulnerabilities and matching it to an identical ERC404 exploit in the MINER project, which had extensive public writeups. It is a good reminder of the limits of LLMs where they are strongest at pattern matching and not original reasoning.

💡

LLMs are much less dependable when the job requires a novel idea, multi-contract and multi-protocol reasoning, complex logic, unexpected edge cases, or context that was not explicit.

The same study also hints at where AI models become much more powerful when coupled with human analysis. With even modest human assistance, detection and patching performance more than doubled across models:

There is an open secret in the audit industry: audits are time bounded. An auditor gets a fixed number of weeks, then they have to move on. A dangerous bug may be sitting just beyond the time budget, waiting for one more round of testing or one more experiment on a complex interaction. This is exactly where the human plus AI pairing becomes powerful. AI can take over a lot of the grind, writing PoCs, testing hunches, verifying variants, while the human auditor spends more of their limited time on strategy, edge cases, and novel attack paths.

💡

AI does not remove the need for human auditors. It increases the leverage of skilled humans, including defenders and attackers.

So when you build or audit your next protocol or make that quick oracle update, keep the limits of current AI models in mind. They can help find the obvious and automate the tedious, but they will not replace the judgment, context, and creativity needed to catch critical vulnerabilities.

Back to the Fundamentals

As we enter another era of both offensive and defensive security, it is important to stay grounded in fundamental security principles:

Offensive tools are not just for attackers. They help defenders find and validate vulnerabilities earlier, help human auditors think more creatively and dig deeper into code. AI is making those workflows accessible to smaller teams, not just elite auditing firms.
AI changes the timeline, not the game. If you ship vulnerable code, the threat model does not change. AI just speeds up the race to find and exploit it. Attackers may gain a short term edge, but defenders can use the same tools to find and fix issues sooner too.
Know the limits of AI models. They are excellent at recognizing known bug classes and attack patterns, but far less reliable when the task requires novel ideas, cross protocol reasoning, or context that is not explicit in the code.
Speed and economics matter now. The question is not just who has better tools, but who can use them faster and more efficiently.
Do not overfocus on exotic exploits. A lot of smart contract exploits still come from simple bugs and misconfigurations.
Code is only part of the problem. Human and operational security remain some of the easiest ways to lose funds.

Do not fear the future or get swept up in the hype. Know the limits, and use the latest offensive tools to defend yourself. If you do, you will survive.

Closed Source Contracts

Offensive Software

Limits of AI Audits

Back to the Fundamentals

Sign up for more like this.