Anthropic said on May 26 it found more than 10,000 high-risk or critical vulnerabilities over a month through Project Glasswing, which involves 50 organisations including institutions and companies to help find software vulnerabilities using its AI model Mythos Preview.
Anthropic said this in a report summarising Project Glasswing’s results and challenges.
“In the past, progress in software security was about how quickly new vulnerabilities could be found,” Anthropic said in the report. “Now it depends on how quickly vulnerabilities found at scale by AI are verified, disclosed and patched,” it added.
The report also shared results achieved by companies participating in Project Glasswing using Mythos.
Cloudflare found 2,000 bugs, of which 400 were high-risk or critical. The false-positive rate was lower than that of human testers. Mozilla found and fixed 271 vulnerabilities in Firefox 150 using Mythos. That was 10 times the number found when it scanned Firefox 148 with Claude Opus 4.6.
Microsoft forecast the number of new patches would keep rising as it uses frontier AI models. Oracle said it is finding and fixing security flaws several times faster in its products and cloud environments. Palo Alto Networks’ latest release included more than five times as many patches as usual.
Lee Klarich (리 클라리치), Palo Alto’s chief product and technology officer, said there had been doubts about whether the model’s capabilities were being overestimated. He said repeated tests increased confidence that vulnerability detection was better than initially expected.
Anthropic said it scanned more than 1,000 open-source projects that are critical to internet operations using Mythos. It found 23,019 vulnerabilities, and 6,202 of them were estimated to be high-risk or critical. The report said evaluations by six independent security research organisations confirmed 90.6 percent of the findings were real vulnerabilities, and 62.4 percent were confirmed to be high-risk or critical.
Anthropic said the security industry needs to create processes to manage vulnerabilities found at scale by models on the level of Mythos.
The industry currently follows a practice of disclosing vulnerabilities within 90 days of discovery. Anthropic warned that long gaps between vulnerability discovery, patch creation and patch deployment extend the window in which attackers can exploit weaknesses. It said models on the level of Mythos greatly reduce the cost and time needed for vulnerability discovery and exploits, increasing the risks created by such gaps.
Anthropic launched a public beta in early May of “Claude Security” for Claude Enterprise users, which scans codebases for vulnerabilities and suggests fixes. Anthropic said it expects to be able to release a Mythos-level model publicly in the near future after developing stronger safeguards.