Google's threat intelligence division has documented what it describes as the first confirmed case of a criminal actor using artificial intelligence to discover and weaponize a zero-day vulnerability. The case is specific and well-evidenced: a cybercrime group used an AI model to identify a previously unknown flaw in a popular open-source web administration tool, wrote an exploit for it, and was preparing a mass targeting campaign when Google intervened. The campaign never launched, and the flaw has since been patched. But the forensic trail showing how the exploit was built has alarmed security researchers.
The target was a two-factor authentication system. The flaw would have allowed an attacker who already knew a victim's username and password to bypass 2FA entirely, eliminating the last meaningful line of defence for most user accounts. Zero-day vulnerabilities are, by definition, unknown to the software's developers; they leave defenders with no patch and no warning until exploitation appears in the wild. This one, Google says, would have been exploited "in a mass exploitation event" had it not been caught first.
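Google has not published the technical details of the flaw. For orientation only, one common class of 2FA bypass involves a verification step that fails open rather than closed. A minimal, hypothetical sketch in Python, with every function and parameter name invented for illustration:

```python
# Hypothetical illustration, not the actual flaw: Google has not published
# the vulnerability's details. This shows a common 2FA-bypass class, a
# verifier that "fails open" when the submitted code is missing.
import hmac

def verify_totp_buggy(submitted: str | None, expected: str) -> bool:
    if not submitted:
        # BUG: an absent or empty code skips verification entirely, so an
        # attacker who already has the password simply omits the 2FA field.
        return True
    return hmac.compare_digest(submitted, expected)

def verify_totp_fixed(submitted: str | None, expected: str) -> bool:
    if not submitted:
        return False  # fail closed: no code, no access
    return hmac.compare_digest(submitted, expected)
```

In a system with the buggy variant, an attacker holding valid credentials never needs the one-time code at all; the fail-closed variant rejects the request instead.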
The evidence that AI was involved is found in the exploit code itself. The Python script is unusually well organized and heavily documented, with educational docstrings explaining each step in the clear, structured format characteristic of text generated by large language models. More tellingly, the code includes a CVSS score: a standardized severity rating for security vulnerabilities. The number is hallucinated. No real CVSS score existed for this vulnerability, because no one had catalogued it. The AI invented one, in the manner of a student padding a bibliography with entries that look plausible but don't correspond to real sources.
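To make those markers concrete, here is a harmless mock-up in that style. This is not the recovered exploit; the function, the CVE identifier, and the CVSS score are all invented, mirroring the hallucination Google describes:

```python
# Harmless mock-up of the stylistic tells, not the recovered exploit.
# The CVE identifier and CVSS score below are fabricated on purpose,
# mirroring the hallucination Google found: no such entry existed,
# because the flaw had never been catalogued.

def bypass_two_factor(session, username: str, password: str) -> bool:
    """
    Step 1: Authenticate with the victim's known username and password.
    Step 2: Send a crafted verification request that sidesteps the 2FA check.
    Step 3: Confirm access to the authenticated session.

    Reference: CVE-2025-XXXXX (CVSS 9.8, Critical)  <-- invented rating
    """
    ...  # exploit logic deliberately omitted
```

The tidy step-by-step docstring and the confident citation of a severity rating that never existed are exactly the combination of polish and fabrication the investigators flagged.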
Google said it does not believe its own Gemini model was the AI used. The broader report, which draws on data from Gemini, the Google Threat Intelligence Group, and Mandiant, paints a picture of AI use in offensive security that extends well beyond a single incident. Chinese state-sponsored threat actors have been deploying agentic AI tools, including Strix and Hexstrike, in attacks on Japanese technology firms and East Asian cybersecurity companies. A group tracked as UNC2814, associated with Chinese intelligence and known for targeting telecommunications and government systems, used what Google describes as a "persona-driven jailbreak" to enhance its vulnerability research: the group instructed an AI to act as a senior security auditor, then queried it about embedded device firmware. North Korean groups, according to the report, are using AI to accelerate malware development and to generate convincing technical content for social engineering campaigns.
The significance of zero-day discovery is worth dwelling on. Finding an unknown vulnerability in mature software has historically required deep expertise, patience, and time: security researchers spend weeks or months reading source code, tracing execution paths, and testing edge cases. AI doesn't change the underlying mathematics of software security, but it dramatically compresses the time required to analyse large codebases. A capable model trained on millions of codebases can flag patterns that indicate likely vulnerability classes far faster than a human analyst could surface them.
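As a toy analogue of that pattern matching, even a simple mechanical scanner can flag idioms associated with a known vulnerability class; an LLM applies the same idea across the far richer set of patterns absorbed from training data. A minimal sketch in Python (the heuristic and every name are illustrative, nothing here comes from Google's report):

```python
# Toy illustration of pattern-based vulnerability triage, not Google's method.
# It flags one classic weakness class: comparing secrets with ==, which can
# leak timing information, instead of the constant-time hmac.compare_digest.
import re
import sys
from pathlib import Path

RISKY = re.compile(r"\b(token|secret|otp|code|password)\w*\s*==", re.IGNORECASE)

def scan(path: Path) -> list[tuple[int, str]]:
    """Return (line number, line) pairs matching the risky-comparison pattern."""
    hits = []
    for lineno, line in enumerate(path.read_text(errors="ignore").splitlines(), 1):
        if RISKY.search(line):
            hits.append((lineno, line.strip()))
    return hits

if __name__ == "__main__":
    for target in sys.argv[1:]:
        for lineno, line in scan(Path(target)):
            print(f"{target}:{lineno}: possible timing-unsafe comparison: {line}")
```

The difference is breadth and speed: a human reviewer applies a handful of heuristics like this by hand, while a capable model can apply and compose vastly more of them across an entire codebase in minutes.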
The constraint in this case was real: the attackers still needed to know a victim's existing credentials before the 2FA bypass was useful. That is a meaningful limitation. But it also reflects the current early state of AI-assisted offensive security, not its ceiling. John Hultquist, the chief analyst at Google's threat intelligence group, was direct in his assessment: "It's a taste of what's to come. We believe this is the tip of the iceberg."
The security industry has spent years building AI-assisted tools for defence: code scanning, anomaly detection, faster patch analysis. This is the first documented case of the same capability deployed on the other side. The asymmetry in cybersecurity has always favoured attackers, who need to find one way in while defenders must protect every surface. AI is unlikely to change that structural reality, and it may well deepen it.