{"id":15294,"date":"2025-12-09T17:03:19","date_gmt":"2025-12-09T17:03:19","guid":{"rendered":"https:\/\/newestek.com\/?p=15294"},"modified":"2025-12-09T17:03:19","modified_gmt":"2025-12-09T17:03:19","slug":"gemini-for-chrome-gets-a-second-ai-agent-to-watch-over-it","status":"publish","type":"post","link":"https:\/\/newestek.com\/?p=15294","title":{"rendered":"Gemini for Chrome gets a second AI agent to watch over it"},"content":{"rendered":"<div>\n<div id=\"remove_no_follow\">\n<div class=\"grid grid--cols-10@md grid--cols-8@lg article-column\">\n<div class=\"col-12 col-10@md col-6@lg col-start-3@lg\">\n<div class=\"article-column__content\">\n<section class=\"wp-block-bigbite-multi-title\">\n<div class=\"container\"><\/div>\n<\/section>\n<p>Google is deploying a second AI model to monitor its Gemini-powered Chrome browsing agent after acknowledging the agent could be tricked into taking unauthorized actions through prompt injection attacks.<\/p>\n<p>\u201cWe\u2019re introducing a user alignment critic where the agent\u2019s actions are vetted by a separate model that is isolated from untrusted content,\u201d the company said in <a href=\"https:\/\/security.googleblog.com\/2025\/12\/architecting-security-for-agentic.html\" target=\"_blank\" rel=\"noreferrer noopener\">a blog post<\/a> about the addition. 
If the critic determines an action doesn\u2019t match what the user asked for, it blocks the action, Google said.<\/p>\n<p>\u201cThe primary new threat facing all agentic browsers is indirect prompt injection,\u201d Chrome security engineer Nathan Parker wrote in the post, describing an attack in which content the agent is asked to process carries instructions that try to override the user\u2019s original request.<\/p>\n<p>The Gemini-powered browsing agent, <a href=\"https:\/\/blog.google\/products\/chrome\/new-ai-features-for-chrome\/\" target=\"_blank\" rel=\"noreferrer noopener\">launched in September and currently in preview<\/a>, can navigate websites, click buttons, and fill forms while users are logged into email, banking, and corporate systems. Malicious instructions hidden in web pages, iframes, or user-generated content could \u201ccause the agent to take unwanted actions such as initiating financial transactions or exfiltrating sensitive data,\u201d Parker wrote.<\/p>\n<p>That\u2019s where the user alignment critic comes in: The second model reviews each proposed action before Chrome executes it, acting as what Parker called \u201ca powerful, extra layer of defense against both goal-hijacking and data exfiltration.\u201d<\/p>\n<h2 class=\"wp-block-heading\" id=\"why-prompt-injection-is-hard-to-stop\">Why prompt injection is hard to stop<\/h2>\n<p>Prompt injection has emerged as the top vulnerability in AI systems over the past year. 
<a href=\"https:\/\/genai.owasp.org\/llmrisk\/llm01-prompt-injection\/\" target=\"_blank\" rel=\"noreferrer noopener\">OWASP found it in 73% of production AI deployments<\/a> it assessed in 2024, ranking it the number one risk in its list of threats to large language model applications.<\/p>\n<p>The UK\u2019s National Cyber Security Centre <a href=\"https:\/\/www.ncsc.gov.uk\/blog-post\/prompt-injection-is-not-sql-injection\" target=\"_blank\" rel=\"noreferrer noopener\">warned Sunday<\/a> that prompt injection attacks may never be fully mitigated because LLMs can\u2019t reliably distinguish between instructions and data. The agency called it a \u201cconfused deputy\u201d vulnerability, where a trusted system is tricked into performing actions on behalf of an untrusted party.<\/p>\n<p>Researchers have already demonstrated the threat. <a href=\"https:\/\/www.obsidiansecurity.com\/blog\/prompt-injection\" target=\"_blank\" rel=\"noreferrer noopener\">In January<\/a>, attackers embedded instructions in a document that caused an enterprise AI system to leak business intelligence and disable its own safety filters. Security firm AppOmni <a href=\"https:\/\/appomni.com\/ao-labs\/ai-agent-to-agent-discovery-prompt-injection\/\" target=\"_blank\" rel=\"noreferrer noopener\">disclosed last month<\/a> that ServiceNow\u2019s AI agents could be manipulated through instructions hidden in form fields, with one agent recruiting others to perform unauthorized actions.<\/p>\n<p>For Chrome, the stakes are particularly high. A compromised browsing agent would have the user\u2019s full privileges on any logged-in site, potentially bypassing the browser\u2019s site isolation protections that normally prevent websites from accessing each other\u2019s data.<\/p>\n<h2 class=\"wp-block-heading\" id=\"googles-two-model-defense\">Google\u2019s two-model defense<\/h2>\n<p>To address these risks, Google\u2019s solution splits the work between two AI models. 
The main Gemini model reads web content and decides what actions to take. The user alignment critic sees only metadata about proposed actions, not the web content that might contain malicious instructions.<\/p>\n<p>\u201cThis component is architected to see only metadata about the proposed action and not any unfiltered untrustworthy web content, thus ensuring it cannot be poisoned directly from the web,\u201d Parker wrote in the blog. When the critic rejects an action, it provides feedback to the planning model to reformulate its approach.<\/p>\n<p>The architecture is based on existing security research, drawing from what\u2019s known as the dual-LLM pattern and CaMeL research from Google DeepMind, according to the blog post.<\/p>\n<p>Google is also limiting which websites the agent can interact with through what it calls \u201corigin sets.\u201d The system maintains lists of sites the agent can read from and sites where it can take actions like clicking or typing. A gating function, isolated from untrusted content, determines which sites are relevant to each task.<\/p>\n<p>The company acknowledged this first implementation is basic. \u201cWe will tune the gating functions and other aspects of this system to reduce unnecessary friction while improving security,\u201d Parker wrote.<\/p>\n<p>Beyond the user alignment critic and origin controls, Chrome will require user confirmation before the browsing agent navigates to banking or medical sites, uses saved passwords through Google Password Manager, or completes purchases, according to the blog post. The browsing agent has no direct access to stored passwords.<\/p>\n<p>A classifier runs in parallel checking for prompt injection attempts as the agent works. 
Google has built automated red-teaming systems generating malicious test sites, prioritizing attacks delivered through user-generated content on social media and advertising networks.<\/p>\n<h2 class=\"wp-block-heading\" id=\"grappling-with-an-unsolved-problem\">Grappling with an unsolved problem<\/h2>\n<p>The prompt injection challenge isn\u2019t unique to Chrome. <a href=\"https:\/\/openai.com\/index\/prompt-injections\/\" target=\"_blank\" rel=\"noreferrer noopener\">OpenAI has called<\/a> it \u201ca frontier, challenging research problem\u201d for its ChatGPT agent features and expects attackers to invest significant resources in these techniques.<\/p>\n<p>Gartner has gone one step further and <a href=\"https:\/\/www.computerworld.com\/article\/4102569\/keep-ai-browsers-out-of-your-enterprise-warns-gartner.html\">advised enterprises to block AI browsers<\/a> in their systems. The research firm warned that AI-powered browsing agents could expose corporate data and credentials to prompt injection attacks.<\/p>\n<p>The NCSC took a similar position, urging organizations to assume AI systems will be attacked and to limit their access and privileges accordingly. The agency said organizations should manage risk through design rather than expecting technical fixes to eliminate the problem.<\/p>\n<p>Chrome\u2019s agent features are optional and remain in preview, the blog post said.<\/p>\n<p>This article first appeared on <a href=\"https:\/\/www.computerworld.com\/article\/4103343\/gemini-for-chrome-gets-a-second-ai-agent-to-watch-over-it.html\">Computerworld<\/a>.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Google is deploying a second AI model to monitor its Gemini-powered Chrome browsing agent after acknowledging the agent could be tricked into taking unauthorized actions through prompt injection attacks. 
\u201cWe\u2019re introducing a user alignment critic where the agent\u2019s actions are vetted by a separate model that is isolated from untrusted content,\u201d the company said in a blog post about the addition. If the critic determines&#8230; <\/p>\n<p class=\"more\"><a class=\"more-link\" href=\"https:\/\/newestek.com\/?p=15294\">Read More<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"inline_featured_image":false,"footnotes":""},"categories":[1],"tags":[],"class_list":["post-15294","post","type-post","status-publish","format-standard","hentry","category-uncategorized","is-cat-link-borders-light is-cat-link-rounded"],"_links":{"self":[{"href":"https:\/\/newestek.com\/index.php?rest_route=\/wp\/v2\/posts\/15294","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/newestek.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/newestek.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/newestek.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/newestek.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=15294"}],"version-history":[{"count":0,"href":"https:\/\/newestek.com\/index.php?rest_route=\/wp\/v2\/posts\/15294\/revisions"}],"wp:attachment":[{"href":"https:\/\/newestek.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=15294"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/newestek.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=15294"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/newestek.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=15294"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}