Anthropic's Glasswing Finds 10,000+ Software Flaws in One Month

Anthropic's Project Glasswing has uncovered more than 10,000 high- and critical-severity software vulnerabilities in its first month of operation, the company disclosed this week. The results, drawn from Anthropic's unreleased frontier model Claude Mythos Preview, include a 27-year-old bug in OpenBSD and a 16-year-old flaw in FFmpeg that automated tools had missed five million times.

Why this matters

This is the first hard, public data point on what frontier AI does to offensive security. Glasswing's output suggests that finding zero-days is no longer the bottleneck in cybersecurity — patching them is. That inverts a 30-year-old assumption about how the industry defends itself.

What Glasswing is

Anthropic launched Project Glasswing on April 7, 2026, as a controlled coalition giving roughly 40 organizations — including AWS, Apple, Cisco, Google, JPMorgan Chase, and Microsoft — early access to Claude Mythos Preview for vulnerability discovery. The company committed up to $100 million in model-usage credits to support the effort, plus an additional $4 million for open-source security work.

The premise is simple: point an extraordinarily capable AI at critical software infrastructure and let it hunt for flaws that humans have missed. The output exceeded what most security researchers expected.

The headline finds

Two discoveries have drawn particular attention from security teams.

The OpenBSD vulnerability is 27 years old. It allowed an attacker to remotely crash any machine running the operating system simply by connecting to it. OpenBSD is widely used in firewalls, routers, and security appliances precisely because of its reputation for code quality and defensive programming. The bug survived nearly three decades of human code review.

The FFmpeg vulnerability is 16 years old. FFmpeg is the multimedia processing library that quietly underpins most of the modern internet, from YouTube transcoding to video conferencing. According to Anthropic, the buggy line of code had been hit five million times by automated fuzzing tools without anyone detecting the flaw. Claude Mythos found it on its first serious pass.

Together with thousands of other zero-days, the finds illustrate a pattern: traditional automated testing — fuzzing, symbolic execution, static analysis — looks for the things humans designed it to look for. Frontier models reason about code semantics, which lets them spot bug classes the tooling wasn't built to detect.

The patching problem

The harder finding is what hasn't happened. Anthropic acknowledged in its update that at the time of its April 7 announcement, over 99% of discovered vulnerabilities remained unpatched. The publicly disclosed flaws have been fixed, but the long tail of disclosures sitting with vendors has not moved much.

"The relative ease of finding vulnerabilities compared with the difficulty of fixing them amounts to a major challenge for cybersecurity," Anthropic wrote.

In plain terms: AI-powered vulnerability discovery has decoupled from human-paced remediation. Security teams have spent decades optimizing for find-rate. Now the find-rate is essentially unlimited, and the constraint has shifted to the much slower process of code review, regression testing, and coordinated disclosure.

Industry reaction

Cybersecurity researchers have responded with a mix of awe and concern. The hopeful read is that defenders now have a tool that can find every bug an attacker might find — assuming defenders use it first. The pessimistic read is that any nation-state or well-funded criminal group with access to a comparable model now has a 10,000-vulnerability head start, regardless of what Anthropic does with disclosure.

The Hacker News and Security Affairs coverage both emphasized the second reading. As Security Affairs put it, "the patching problem has never been more obvious."

CSO Online noted that the open-source ecosystem is particularly exposed. Most critical libraries depend on a handful of volunteer maintainers, none of whom signed up to triage hundreds of AI-generated bug reports.

What's next

Anthropic says it plans to expand Glasswing's partner list and increase the credit pool. The company has not committed to a public release date for Claude Mythos, though Project Glasswing functions as a controlled deployment in everything but name.

For defenders, the immediate question is operational: how do you build a triage and patching pipeline that can absorb AI-scale vulnerability disclosure without burning out maintainers? Expect to see new tooling for automated patch generation, prioritization, and vendor coordination over the next 12 months.

For governments, the longer question is policy. If frontier models can be turned into 10,000-vulnerability-per-month hunting tools, export controls and pre-deployment red-teaming start to look very different.

Bottom line

Project Glasswing has produced the first concrete evidence that frontier AI changes the offense-defense balance in cybersecurity. The vulnerabilities are real, the methodology works, and the patching infrastructure isn't ready. Security teams should assume that the next major exploited bug is one a model already found — and start building processes that can keep up.

Anthropic's Glasswing Finds 10,000+ Software Flaws in One Month

Anthropic's Glasswing Finds 10,000+ Software Flaws in One Month

Why this matters

What Glasswing is

The headline finds

The patching problem

Industry reaction

What's next

Bottom line

Sources

Don't fall behind

Related Articles

Anthropic Launches Claude Science and Enters Drug Discovery

AI Uncovers Squidbleed, a 29-Year-Old Squid Proxy Bug

Anthropic Launches Claude Fable 5: Its Most Capable Model Yet