{"id":15473,"date":"2026-01-14T20:09:54","date_gmt":"2026-01-14T20:09:54","guid":{"rendered":"https:\/\/newestek.com\/?p=15473"},"modified":"2026-01-14T20:09:54","modified_gmt":"2026-01-14T20:09:54","slug":"output-from-vibe-coding-tools-prone-to-critical-security-flaws-study-finds","status":"publish","type":"post","link":"https:\/\/newestek.com\/?p=15473","title":{"rendered":"Output from vibe coding tools prone to critical security flaws, study finds"},"content":{"rendered":"<div>\n<div id=\"remove_no_follow\">\n<div class=\"grid grid--cols-10@md grid--cols-8@lg article-column\">\n<div class=\"col-12 col-10@md col-6@lg col-start-3@lg\">\n<div class=\"article-column__content\">\n<section class=\"wp-block-bigbite-multi-title\">\n<div class=\"container\"><\/div>\n<\/section>\n<p>Popular vibe coding platforms consistently generate insecure code in response to common programming prompts, including creating vulnerabilities rated as \u2018critical,\u2019 new testing has found.<\/p>\n<p>Security startup Tenzai\u2019s top-line conclusion: the tools are good at avoiding security flaws that can be solved in a generic way, but struggle where what distinguishes safe from dangerous depends on context.<\/p>\n<p>The assessment, which it conducted in December 2025, <a href=\"https:\/\/blog.tenzai.com\/bad-vibes-comparing-the-secure-coding-capabilities-of-popular-coding-agents\/\" target=\"_blank\" rel=\"noreferrer noopener\">compared five of the best-known vibe coding tools<\/a> \u2014 Claude Code, OpenAI Codex, Cursor, Replit, and Devin \u2014 by using <a href=\"https:\/\/github.com\/TenzaiLabs\/datasets?ref=blog.tenzai.com\" target=\"_blank\" rel=\"noreferrer noopener\">pre-defined prompts<\/a> to build the same three test applications.<\/p>\n<p>In total, the code output by the five tools across 15 applications (three each) was found to contain a total of 69 vulnerabilities. 
Around 45 of these were rated \u2018low-medium\u2019 in severity, with many of the remainder rated \u2018high\u2019 and around half a dozen \u2018critical\u2019.<\/p>\n<p>While the number of low-medium vulnerabilities was the same for all five tools, only Claude Code (4 flaws), Devin (1) and Codex (1) generated critical-rated vulnerabilities.<\/p>\n<p>The most serious vulnerabilities concerned API authorization logic (checking who is allowed to access a resource or perform an action), and business logic (permitting a user action that shouldn\u2019t be possible), both important for e-commerce systems.<\/p>\n<p>\u201c[Code generated by AI] agents seems to be very prone to business logic vulnerabilities. While human developers bring intuitive understanding that helps them grasp how workflows should operate, agents lack this \u2018common sense\u2019 and depend mainly on explicit instructions,\u201d said Tenzai\u2019s researchers.<\/p>\n<p>Offsetting this, the tools did a good job of avoiding common flaws that have long plagued human-coded applications, such as SQLi or XSS vulnerabilities that are both still <a href=\"https:\/\/owasp.org\/Top10\/2025\/A05_2025-Injection\/\" target=\"_blank\" rel=\"noreferrer noopener\">prominently featured<\/a> in the <a href=\"https:\/\/owasp.org\/Top10\/2025\/\" target=\"_blank\" rel=\"noreferrer noopener\">OWASP Top 10 list<\/a> of web application security risks.<\/p>\n<p>\u201cAcross all the applications we developed, we didn\u2019t encounter a single exploitable SQLi or XSS vulnerability,\u201d said Tenzai.<\/p>\n<h2 class=\"wp-block-heading\" id=\"human-oversight\">Human oversight<\/h2>\n<p>The vibe coding sales pitch is that it automates everyday programming jobs, boosting productivity. While this is undoubtedly true, Tenzai\u2019s test shows that the idea has limits; human oversight and debugging are still needed.<\/p>\n<p>This isn\u2019t a new discovery. 
In the year since the concept of \u2018vibe coding\u2019 was developed, <a href=\"https:\/\/www.csoonline.com\/article\/4062720\/ai-coding-assistants-amplify-deeper-cybersecurity-risks.html\" target=\"_blank\">other studies<\/a> have found that, without proper supervision, these tools are prone to introducing new cybersecurity weaknesses.<\/p>\n<p>But the problem isn\u2019t just that vibe coding platforms fail to pick up security flaws in their code; in some cases, general rules or examples cannot define what counts as good or bad.<\/p>\n<p>\u201cTake SSRF [Server-Side Request Forgery]: there\u2019s no universal rule for distinguishing legitimate URL fetches from malicious ones. The line between safe and dangerous depends heavily on context, making generic solutions impossible,\u201d said Tenzai.<\/p>\n<p>The obvious solution is that, having invented vibe coding agents, the industry should now focus on vibe coding <em>checking<\/em> agents. That, of course, is where Tenzai, a small startup not long out of stealth mode, thinks it has found a gap in the market for its own technology. It said, \u201cbased on our testing and recent research, no comprehensive solution to this issue currently exists. This makes it critical for developers to understand the common pitfalls of coding agents and prepare accordingly.\u201d<\/p>\n<h2 class=\"wp-block-heading\" id=\"debugging-ai\">Debugging AI<\/h2>\n<p>The deeper question raised by vibe coding isn\u2019t how well tools work, then, but how they are used. 
Telling developers to keep an eye on vibe-coded output isn\u2019t the same as knowing they will do so, any more than it was in the days when humans made all the mistakes.<\/p>\n<p>\u201cWhen implementing vibe coding approaches, companies should ensure that secure code review is part of any Secure Software Development Lifecycle and is consistently implemented,\u201d commented <a href=\"https:\/\/www.linkedin.com\/in\/matthew-robbins-809091108\/?originalSubdomain=uk\" target=\"_blank\" rel=\"noreferrer noopener\">Matthew Robbins<\/a>, head of offensive security at security services company Talion. \u201cGood practice frameworks should also be leveraged, such as the language-agnostic OWASP Secure Coding Practices, and language-specific frameworks such as SEI CERT coding standards.\u201d<\/p>\n<p>Code should be tested using static and dynamic analysis before being deployed, Robbins added. The trick is to get debugging right. \u201cAlthough vibe coding presents a risk, it can be managed by closely adhering to industry-standard processes and guidelines that go further than traditional debugging and quality assurance,\u201d he noted.<\/p>\n<p>However, according to <a href=\"https:\/\/www.linkedin.com\/in\/erankinsbruner\/\" target=\"_blank\" rel=\"noreferrer noopener\">Eran Kinsbruner<\/a>, VP of product marketing at application testing organization Checkmarx, traditional debugging risks being overwhelmed in the AI era.<\/p>\n<p>\u201cMandating more debugging is the wrong instinct for an AI-speed problem. Debugging assumes humans can meaningfully review AI-generated code after the fact. At the scale and velocity of vibe coding, that assumption collapses,\u201d he said.<\/p>\n<p>\u201cThe only viable response is to move security <em>into<\/em> the act of creation. 
In practice, this means agentic security must become a native companion to AI coding assistants, embedded directly inside AI-first development environments, not bolted on downstream.\u201d<\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Popular vibe coding platforms consistently generate insecure code in response to common programming prompts, including creating vulnerabilities rated as \u2018critical,\u2019 new testing has found. Security startup Tenzai\u2019s top-line conclusion: the tools are good at avoiding security flaws that can be solved in a generic way, but struggle where what distinguishes safe from dangerous depends on context. The assessment, which it conducted in December 2025, compared&#8230; <\/p>\n<p class=\"more\"><a class=\"more-link\" href=\"https:\/\/newestek.com\/?p=15473\">Read More<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"inline_featured_image":false,"footnotes":""},"categories":[1],"tags":[],"class_list":["post-15473","post","type-post","status-publish","format-standard","hentry","category-uncategorized","is-cat-link-borders-light 
is-cat-link-rounded"],"_links":{"self":[{"href":"https:\/\/newestek.com\/index.php?rest_route=\/wp\/v2\/posts\/15473","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/newestek.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/newestek.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/newestek.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/newestek.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=15473"}],"version-history":[{"count":0,"href":"https:\/\/newestek.com\/index.php?rest_route=\/wp\/v2\/posts\/15473\/revisions"}],"wp:attachment":[{"href":"https:\/\/newestek.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=15473"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/newestek.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=15473"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/newestek.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=15473"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}