{"id":14918,"date":"2025-10-08T07:09:35","date_gmt":"2025-10-08T07:09:35","guid":{"rendered":"https:\/\/newestek.com\/?p=14918"},"modified":"2025-10-08T07:09:35","modified_gmt":"2025-10-08T07:09:35","slug":"autonomous-ai-hacking-and-the-future-of-cybersecurity","status":"publish","type":"post","link":"https:\/\/newestek.com\/?p=14918","title":{"rendered":"Autonomous AI hacking and the future of cybersecurity"},"content":{"rendered":"<div>\n<div id=\"remove_no_follow\">\n<div class=\"grid grid--cols-10@md grid--cols-8@lg article-column\">\n<div class=\"col-12 col-10@md col-6@lg col-start-3@lg\">\n<div class=\"article-column__content\">\n<section class=\"wp-block-bigbite-multi-title\">\n<div class=\"container\"><\/div>\n<\/section>\n<p>AI agents are now hacking computers. They\u2019re getting better at all phases of cyberattacks, faster than most of us expected. They can chain together different aspects of a cyber operation, and hack autonomously, at computer speeds and scale. This is going to change everything.<\/p>\n<p>Over the summer, hackers proved the concept, industry institutionalized it, and criminals operationalized it. In June, AI company XBOW took the <a href=\"https:\/\/www.techrepublic.com\/article\/news-ai-xbow-tops-hackerone-us-leaderboad\">top spot<\/a> on HackerOne\u2019s US leaderboard after submitting over 1,000 new vulnerabilities in just a few months. In August, the seven teams competing in DARPA\u2019s AI Cyber Challenge <a href=\"https:\/\/www.darpa.mil\/news\/2025\/aixcc-results\">collectively found<\/a> 54 new vulnerabilities in a target system, in four hours (of compute). Also in August, Google <a href=\"https:\/\/techcrunch.com\/2025\/08\/04\/google-says-its-ai-based-bug-hunter-found-20-security-vulnerabilities\/\">announced<\/a> that its Big Sleep AI found dozens of new vulnerabilities in open-source projects.<\/p>\n<p>It gets worse. 
In July, Ukraine\u2019s CERT <a href=\"https:\/\/www.csoonline.com\/article\/4025139\/novel-malware-from-russias-apt28-prompts-llms-to-create-malicious-windows-commands.html\">discovered<\/a> a piece of Russian malware that used an LLM to automate the cyberattack process, generating both system reconnaissance and data theft commands in real time. In August, Anthropic reported that they disrupted a threat actor that used Claude, Anthropic\u2019s AI model, to <a href=\"https:\/\/www.anthropic.com\/news\/detecting-countering-misuse-aug-2025\">automate<\/a> the entire cyberattack process. It was an impressive use of the AI, which performed network reconnaissance, penetrated networks, and harvested victims\u2019 credentials. The AI was able to figure out which data to steal, how much money to extort out of the victims, and how to best write extortion emails.<\/p>\n<p>Another hacker used Claude to create and market his own ransomware, complete with \u201cadvanced evasion capabilities, encryption, and anti-recovery mechanisms.\u201d And in September, Check Point <a href=\"https:\/\/blog.checkpoint.com\/executive-insights\/hexstrike-ai-when-llms-meet-zero-day-exploitation\/\">reported<\/a> on hackers using HexStrike-AI to create autonomous agents that can scan, exploit, and persist inside target networks. Also in September, a research team <a href=\"https:\/\/arxiv.org\/abs\/2509.01835\">showed<\/a> how they can quickly and easily reproduce hundreds of vulnerabilities from public information. These tools are increasingly free for anyone to use. <a href=\"https:\/\/www.infosecurity-magazine.com\/news\/chinese-ai-villager-pen-testing\/\">Villager<\/a>, a recently released AI pentesting tool from Chinese company Cyberspike, uses the DeepSeek model to completely automate attack chains.<\/p>\n<p>This is all well beyond AI\u2019s capabilities in 2016, at DARPA\u2019s <a href=\"https:\/\/www.darpa.mil\/news\/2016\/cyber-grand-challenge-winners\">Cyber Grand Challenge<\/a>. 
The annual Chinese AI hacking challenge, <a href=\"https:\/\/www.schneier.com\/essays\/archives\/2022\/01\/robot-hacking-games.html\">Robot Hacking Games<\/a>, might be on this level, but little is known outside of China.<\/p>\n<h2 class=\"wp-block-heading\" id=\"tipping-point-on-the-horizon\">Tipping point on the horizon<\/h2>\n<p>AI agents now rival and sometimes surpass even elite human hackers in sophistication. They automate operations at machine speed and global scale. The scope of their capabilities allows these AI agents to completely automate a criminal\u2019s command to maximize profit, or structure advanced attacks to a government\u2019s precise specifications, such as to avoid detection.<\/p>\n<p><a href=\"https:\/\/www.washingtonpost.com\/technology\/2025\/09\/20\/ai-hacking-cybersecurity-cyberthreats\/?pwapi_token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJyZWFzb24iOiJnaWZ0IiwibmJmIjoxNzU4MzQwODAwLCJpc3MiOiJzdWJzY3JpcHRpb25zIiwiZXhwIjoxNzU5NzIzMTk5LCJpYXQiOjE3NTgzNDA4MDAsImp0aSI6IjEzZGE1Njk0LTMxOTItNDdkNi1hNTU3LTRkOWEzNDI5ODM0OCIsInVybCI6Imh0dHBzOi8vd3d3Lndhc2hpbmd0b25wb3N0LmNvbS90ZWNobm9sb2d5LzIwMjUvMDkvMjAvYWktaGFja2luZy1jeWJlcnNlY3VyaXR5LWN5YmVydGhyZWF0cy8ifQ.N_h4ygZ86XPjbtpR253UIbbArH7e0Tu3tN0iapl5v2k\">In this future<\/a>, attack capabilities could accelerate beyond our individual and collective capability to handle. We have long taken it for granted that we have time to patch systems after vulnerabilities become known, or that withholding vulnerability details prevents attackers from exploiting them. This is <a href=\"https:\/\/www.cybersecuritydive.com\/news\/ai-vulnerability-detection-patching-threats-mandiant-summit\/760746\/\">no longer<\/a> the case.<\/p>\n<p>The cyberattack\/cyberdefense balance has long skewed towards the attackers; these developments threaten to <a href=\"https:\/\/www.schneier.com\/essays\/archives\/2018\/03\/artificial_intellige.html\">tip the scales<\/a> completely. 
We\u2019re <a href=\"https:\/\/www.wired.com\/story\/the-era-of-ai-generated-ransomware-has-arrived\/\">potentially<\/a> <a href=\"https:\/\/www.computerworld.com\/article\/4048415\/the-ai-powered-cyberattack-era-is-here.html\">looking<\/a> at a singularity event for cyber attackers. Key parts of the attack chain are becoming automated and integrated: persistence, obfuscation, command-and-control, and endpoint evasion. Vulnerability research could potentially be carried out during operations instead of months in advance.<\/p>\n<p>The most skilled will likely retain an edge for now. But AI agents don\u2019t have to be better at a human task in order to be useful. They just have to excel in one of <a href=\"https:\/\/theconversation.com\/will-ai-take-your-job-the-answer-could-hinge-on-the-4-ss-of-the-technologys-advantages-over-humans-258469\">four dimensions<\/a>: speed, scale, scope, or sophistication. And there is every indication that they will eventually excel at all four. By reducing the skill, cost, and time required to find and exploit flaws, AI can turn rare expertise into commodity capabilities and give average criminals an outsized advantage.<\/p>\n<h2 class=\"wp-block-heading\" id=\"the-ai-assisted-evolution-of-cyberdefense\">The AI-assisted evolution of cyberdefense<\/h2>\n<p>AI technologies can benefit defenders as well. We don\u2019t know how the different technologies of cyber-offense and cyber-defense will be amenable to AI enhancement, but we can extrapolate a possible series of overlapping developments.<\/p>\n<p><strong>Phase One: The Transformation of the Vulnerability Researcher.<\/strong> AI-based hacking benefits defenders as well as attackers. In this scenario, AI empowers defenders to do more. 
It simplifies capabilities, providing <a href=\"https:\/\/www.csoonline.com\/article\/3632268\/gen-ai-is-transforming-the-cyber-threat-landscape-by-democratizing-vulnerability-hunting.html\">far more people the ability<\/a> to perform previously complex tasks, and empowers researchers previously busy with these tasks to accelerate or move beyond them, freeing time to work on problems that require human creativity. History suggests a pattern. Reverse engineering was a laborious manual process until tools such as IDA Pro made the capability available to many. AI vulnerability discovery could follow a similar trajectory, evolving through scriptable interfaces, automated workflows, and automated research before reaching broad accessibility.<\/p>\n<p><strong>Phase Two: The Emergence of VulnOps.<\/strong> Between research breakthroughs and enterprise adoption, a new discipline might emerge: VulnOps. Large research teams are already building operational pipelines around their tooling. Their evolution could mirror how DevOps professionalized software delivery. In this scenario, specialized research tools become developer products. These products may emerge as a SaaS platform, or some internal operational framework, or something entirely different. Think of it as AI-assisted vulnerability research available to everyone, at scale, repeatable, and integrated into enterprise operations.<\/p>\n<p><strong>Phase Three: The Disruption of the Enterprise Software Model.<\/strong> If enterprises adopt AI-powered security the way they adopted continuous integration\/continuous delivery (CI\/CD), several paths open up. AI vulnerability discovery could become a built-in stage in delivery pipelines. 
We can <a href=\"https:\/\/www.schneier.com\/blog\/archives\/2024\/11\/ais-discovering-vulnerabilities.html\">envision a world<\/a> where AI vulnerability discovery becomes an integral part of the software development process, where vulnerabilities are automatically patched even before reaching production, a shift we might call continuous discovery\/continuous repair (CD\/CR). Third-party risk management (TPRM) offers a natural adoption route: lower-risk vendor testing, integration into procurement and certification gates, and a proving ground before wider rollout.<\/p>\n<p><strong>Phase Four: The Self-Healing Network.<\/strong> If organizations can independently discover and patch vulnerabilities in running software, they will not have to wait for vendors to issue fixes. Building in-house research teams is costly, but AI agents could perform such discovery and generate patches for many kinds of code, including third-party and vendor products. Organizations may develop independent capabilities that create and deploy third-party patches on vendor timelines, extending the current trend of independent open-source patching. This would increase security, but having customers patch software without vendor approval raises questions about patch correctness, compatibility, liability, right-to-repair, and long-term vendor relationships.<\/p>\n<p>These are all speculations. Maybe AI-enhanced cyberattacks won\u2019t evolve in the ways we fear. Maybe AI-enhanced cyberdefense will give us capabilities we can\u2019t yet anticipate. What will surprise us most might not be the paths we can see, but the ones we can\u2019t imagine yet.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>AI agents are now hacking computers. They\u2019re getting better at all phases of cyberattacks, faster than most of us expected. They can chain together different aspects of a cyber operation, and hack autonomously, at computer speeds and scale. 
This is going to change everything. Over the summer, hackers proved the concept, industry institutionalized it, and criminals operationalized it. In June, AI company XBOW took the&#8230; <\/p>\n<p class=\"more\"><a class=\"more-link\" href=\"https:\/\/newestek.com\/?p=14918\">Read More<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"inline_featured_image":false,"footnotes":""},"categories":[1],"tags":[],"class_list":["post-14918","post","type-post","status-publish","format-standard","hentry","category-uncategorized","is-cat-link-borders-light is-cat-link-rounded"],"_links":{"self":[{"href":"https:\/\/newestek.com\/index.php?rest_route=\/wp\/v2\/posts\/14918","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/newestek.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/newestek.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/newestek.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/newestek.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=14918"}],"version-history":[{"count":0,"href":"https:\/\/newestek.com\/index.php?rest_route=\/wp\/v2\/posts\/14918\/revisions"}],"wp:attachment":[{"href":"https:\/\/newestek.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=14918"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/newestek.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=14918"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/newestek.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=14918"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}