FILTERED RESULTS
FILTERS
Ads Top
DARK MODE
CHART
    Filters
      Symbols
      Sentiment
      Impact
      Search
      FILTERED RESULTS

        

      Upgrade your plan

      The Web Is Gaslighting AI Agents and Nobody Can Tell 

      An artificial intelligence (AI) agent finds the best price on a product and completes the purchase. It browses, selects and checks out without a human visiting a single page. Researchers say the listing it processed could have been seeded with hidden instructions, ones indistinguishable from legitimate content.

      Google DeepMind has published research on a new class of threat to autonomous AI agents. Researchers called them “AI Agent Traps,” instructions hidden inside ordinary web pages that agents read as commands.

      The research covered six distinct attack categories and applies to every major model and agent architecture. Enterprises are deploying agents across procurement, finance and commerce with no standardized defenses in place.

      The Web Is No Longer Neutral Input

      The core vulnerability is architectural. It starts with a simple difference in how humans and machines read a webpage.

      When a person visits a product listing, they see the price and the description. An AI agent visiting the same page reads something different. It processes the underlying code, the hidden metadata and the scripts running in the background. Those layers are never visible on screen. Attackers are now writing to them specifically to reach agents.

      DeepMind’s first attack class is content injection. Malicious instructions are buried in the page’s code or image files, invisible to any human reviewer. The agent reads them as part of the page and acts on them. The second class is semantic manipulation. Instead of hiding commands in code, an attacker crafts product descriptions or vendor profiles worded to steer an agent’s conclusions. It exploits the same tendency to over-weight authoritative-sounding language that affects human judgment.

      Palo Alto Networks’ threat research team has documented both attack types across the web. Malicious websites are already deploying hidden instructions at scale. They use techniques that fragment or encode commands to pass automated security checks. The commands remain readable to the agent. The attack surface grows every time an agent connects to a new data source.

      From Bad Decisions to Manipulated Decisions

      The consumer purchase scenario scales directly into enterprise operations.

      A procurement agent pulling vendor pricing from a compromised supplier site may route an order to a fraudulent vendor. It does so without producing a visible error. The agent is not malfunctioning. It is following instructions it cannot identify as malicious. A customer service agent retrieving product information from a compromised page may return fabricated details. The agent then logs the interaction as resolved. In both cases, the workflow completes normally, and nothing is flagged.

      The DeepMind paper documents a case in which a single manipulated email caused an agent in Microsoft’s 365 Copilot to bypass its security classifiers. The agent then exposed its full privileged context. It handed over data it was specifically configured to protect.

      According to Anthropic, every webpage a browser agent visits is a potential attack vector. The company said that a 1% attack success rate represents significant risk at enterprise scale. Anthropic added that prompt injection is far from a solved problem, particularly as agents take more real-world actions.

      New Security Layer for Agent-Driven Workflows

      The reason this is hard to fix is the same reason agents are useful in the first place.

      AI agents are designed to ingest content from the web and act on it. They do not arrive at a page with skepticism. They read everything as input. An instruction buried in a product listing looks the same to an agent as the price and the shipping date. There is no built-in mechanism to tell the difference.

      The DeepMind researchers identified detection, attribution and adaptation as the three requirements for an effective defense. Detecting hidden instructions requires pre-ingestion scanners. Tracing which domain introduced a manipulation requires attribution infrastructure. Keeping pace with new attack techniques requires defenses that update continuously.

      The DeepMind paper called for new web standards that flag content intended for AI consumption and domain reputation systems that score site reliability for agents. It also called for adversarial training built into model development from the start. The researchers noted that many of the six attack categories currently lack standardized benchmarks. Most enterprises have no way to test whether their deployed agents would withstand them.

      For all PYMNTS AI coverage, subscribe to the daily AI Newsletter


      Source: PYMNTS.com
      .

      Terra Founder Do Kwon Sentenced to 15 Years in Prison for Fraud