Claude Mythos signals a new era in AI-driven security, finding 271 flaws in Firefox

The Claude Mythos Preview appears to be living up to the hype, at least from a cybersecurity standpoint. The model, which Anthropic rolled out to a small group of users, including Firefox developer Mozilla, earlier this month, has discovered 271 vulnerabilities in version 148 of the browser. All have been fixed in this week’s release of Firefox 150, Mozilla emphasized.

These findings mark a new milestone in AI’s ability to unearth bugs and could turbocharge cybersecurity efforts.

“Nothing Mythos found couldn’t have been found by a skilled human,” said David Shipley of Beauceron Security. “The AI is not finding a new class of AI-exclusive super bugs. It’s just finding a lot of stuff that was missed.”

However, the news comes as Anthropic is reportedly investigating unauthorized use of Mythos by a small group that gained access via a third-party vendor environment, underscoring the double-edged nature of AI.

Closing the fuzzing gap

Mozilla has previously pointed AI tools, notably Anthropic’s Claude Opus 4.6, at its browser in a quest for vulnerabilities, but Opus discovered just 22 security-sensitive bugs in Firefox 148, while Mythos uncovered more than ten times that many.

Firefox CTO Bobby Holley described the sense of “vertigo” his team felt when they saw that number. “For a hardened target, just one such bug would have been red-alert in 2025,” he wrote in a blog post, “and so many at once makes you stop to wonder whether it’s even possible to keep up.”

Firefox uses a defense-in-depth strategy, with internal red teams applying multiple layers of “overlapping defenses” and automated analysis techniques, he explained. Teams run each website in a separate process sandbox.

However, no layer is impenetrable, Holley noted, and attackers chain bugs in the rendering code with bugs in the sandboxes in an attempt to gain privileged access. While his team has now adopted a more secure programming language, Rust, the developers can’t afford to stop and rewrite the decades’ worth of existing C++ code, “especially since Rust only mitigates certain (very common) classes of vulnerabilities.”
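The classes Holley alludes to are memory-safety bugs such as out-of-bounds reads and writes, which Rust rules out by construction. A minimal sketch (illustrative only, not Mozilla’s code) of the difference:

```rust
fn main() {
    let buf = [10u8, 20, 30, 40];

    // In C++, reading buf[9] compiles and silently reads past the end of
    // the array (undefined behavior, the root of many exploitable browser
    // bugs). Rust's checked accessor returns None instead of leaking or
    // corrupting memory.
    match buf.get(9) {
        Some(v) => println!("read {v}"),
        None => println!("out-of-bounds read rejected"),
    }

    // Plain indexing is also checked: buf[9] would panic deterministically
    // at runtime rather than read arbitrary memory.
    println!("in-bounds read: {}", buf[2]);
}
```

Note that this only covers memory safety; logic bugs, as the quote implies, remain just as possible in Rust as anywhere else.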

While automated analysis techniques like fuzzing, which feeds programs large volumes of malformed or random input to trigger crashes and expose bugs, are useful, some bits of code are more difficult to fuzz than others, “leading to uneven coverage,” Holley pointed out. Human researchers can find bugs that fuzzers miss by reasoning through source code, but this is time-consuming and bottlenecked by limited human resources.
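To make the technique concrete, here is a minimal random-input fuzzer sketch. The `parse_header` target is hypothetical, and production fuzzers such as libFuzzer or AFL are coverage-guided rather than purely random, which is precisely why hard-to-reach code paths get the uneven coverage Holley describes:

```rust
// Hypothetical parser: the first byte declares a payload length that must
// fit within the remaining buffer. A naive version that trusted `len`
// blindly would read out of bounds; fuzzing hammers exactly this logic.
fn parse_header(input: &[u8]) -> Result<usize, &'static str> {
    if input.len() < 4 {
        return Err("too short");
    }
    let len = input[0] as usize;
    input
        .get(4..4 + len)
        .map(|_| len)
        .ok_or("length field exceeds buffer")
}

fn main() {
    // Tiny deterministic PRNG (not cryptographic); real fuzzers mutate
    // inputs using coverage feedback instead of blind randomness.
    let mut state: u64 = 0x9E37_79B9_7F4A_7C15;
    let mut next_byte = move || -> u8 {
        state = state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        (state >> 33) as u8
    };

    let mut rejected = 0;
    for _ in 0..10_000 {
        let n = (next_byte() % 16) as usize;
        let input: Vec<u8> = (0..n).map(|_| next_byte()).collect();
        if parse_header(&input).is_err() {
            rejected += 1;
        }
    }
    println!("malformed inputs rejected: {rejected} / 10000");
}
```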

Now, Claude Mythos Preview is closing this gap, detecting bugs that fuzzing doesn’t surface.

“Computers were completely incapable of doing this a few months ago, and now they excel at it,” Holley noted. Mythos Preview is “every bit as capable” as human researchers, he asserted, and there is no “category or complexity” of vulnerability that humans can find that Mythos can’t.

Defenders now able to win ‘decisively’?

The gap between what fuzzing can surface and what human analysis can find favors attackers, who can afford to concentrate months of human effort on finding just one bug they can exploit, Holley noted. Closing this gap with AI can help defenders erode that long-term advantage.

The industry has largely been fighting security “to a draw,” he acknowledged, and security has been “offensively-dominant” due to the size of the attack surface, giving adversaries an “asymmetric advantage.” In the face of this, both Mozilla and security vendors have “long quietly acknowledged” that bringing exploits to zero was “unrealistic.”

But now with Mythos (and likely subsequent models), defenders have a chance to win, “decisively,” Holley asserted. “The defects are finite, and we are entering a world where we can finally find them all.”

What security teams should do now

Finding 271 flaws in a mature codebase like Firefox shows that AI-driven vulnerability discovery now operates at a scale and depth that can outpace traditional human-led review, noted Ensar Seker, CISO at cyber threat intelligence company SOCRadar.

Holley’s “vertigo,” he said, stems from defenders realizing that the attack surface is larger and “more rapidly discoverable than previously assumed.”

Security teams must respond by shifting from periodic testing to continuous validation, Seker advised. That means integrating AI-assisted code analysis into continuous integration/continuous delivery (CI/CD) pipelines, prioritizing “patch velocity over perfection,” and assuming that any externally reachable code path will eventually be discovered and weaponized.

“The goal is no longer just finding vulnerabilities first, but reducing the window between discovery and remediation,” he said.

Shipley agreed that any company building software must evaluate resourcing so it can quickly and proactively find and fix vulnerabilities. “But stuff will happen,” he acknowledged. So, in addition to doing proactive work, enterprises must regularly exercise their incident response playbooks.

“The next few years are going to be a marathon, not a sprint,” said Shipley.

Dual-use nature of AI is a challenge

However, the dual-use nature of these systems presents a big challenge: the same capability that helps defenders identify hundreds of flaws can be turned against them if the model or its outputs are exposed, Seker pointed out.

The reported unauthorized access to Mythos “reinforces that AI systems themselves are now high-value targets, effectively becoming part of the attack surface,” he said.

It’s not at all surprising that people found a way to access Mythos, Shipley agreed; it was inevitable. “Nor does Anthropic have some unique, insurmountable or exclusive AI capability for hacking,” he said, pointing out that OpenAI is already catching up in that regard, and others will “catch and surpass” Mythos.

Striking a balance requires treating AI models like privileged infrastructure, Seker noted. Enterprises need strict access controls, output monitoring, and isolation of sensitive workflows. Developers, meanwhile, must adapt by writing code that is resilient to automated scrutiny; this requires stronger input validation, safer defaults, and “fewer assumptions about obscurity.”
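A minimal sketch of what “stronger input validation, safer defaults” can look like in practice; the timeout parser and its bounds below are hypothetical illustrations, not anything Seker specified:

```rust
// Hypothetical setting parser: it never trusts raw input, enforces an
// explicit allowed range, and falls back to a conservative default
// instead of failing open when the value is missing or malformed.
fn parse_timeout_ms(raw: Option<&str>) -> u64 {
    const DEFAULT_MS: u64 = 5_000; // safe default when unset or invalid
    const MAX_MS: u64 = 60_000; // upper bound: reject absurd values

    match raw.map(|s| s.trim().parse::<u64>()) {
        Some(Ok(v)) if (1..=MAX_MS).contains(&v) => v,
        _ => DEFAULT_MS, // malformed, zero, or out of range: don't trust it
    }
}

fn main() {
    println!("{}", parse_timeout_ms(Some("250"))); // 250: within bounds
    println!("{}", parse_timeout_ms(Some("99999999"))); // 5000: out of range
    println!("{}", parse_timeout_ms(Some("drop table"))); // 5000: malformed
    println!("{}", parse_timeout_ms(None)); // 5000: unset
}
```

The point of the pattern is “fewer assumptions about obscurity”: the code behaves safely whether the input comes from a trusted operator or an automated tool probing for edge cases.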

“In this paradigm, security isn’t just about defending systems; it’s about defending the tools that are now capable of breaking them at scale,” Seker emphasized.