Researchers Detail Prompt Injection, API and Redirect Flaws

A three-vulnerability chain in Claude could allow attackers to steal a user’s conversation history without any malware, phishing email or suspicious link, security researchers found.
Researchers at Oasis Security say the “Claudy Day” attack chains three flaws: a hidden prompt injection delivered through a URL, a data exfiltration path that abuses Anthropic’s Files API to upload files out of Claude’s environment, and an open redirect on claude.com that can send users to untrusted sites. Oasis said it reported all three issues to Anthropic; the prompt injection has been patched and the company is addressing the other two flaws.
The first flaw stems from how Claude handles the ?q= URL parameter, a feature that lets integrations pre-fill the chat box with a prompt. Oasis found that certain HTML tags placed in the parameter are invisible in the text box shown to the user but are transmitted in full to the model when the user hits send. By tucking instructions in HTML tag attributes, an attacker can send Claude a command the user never sees.
Oasis Security’s head of research Elad Luz said the simplicity of the find surprised him. “It’s the kind of thing that was probably sitting there in plain sight for a long time,” he told Information Security Media Group. “The prompt displayed to the user in the text box and the prompt actually sent to the model” are not the same thing.
With control over the model’s instructions established, Oasis investigated where the stolen data could actually go. Claude.ai runs code in a sandboxed environment that blocks most outbound connections to external servers. Direct requests to attacker-controlled infrastructure were off the table.
But the sandbox does allow connections to api.anthropic.com. Oasis found that Anthropic’s Files API, a beta feature that lets developers upload files to storage tied to their API account, was reachable from inside that sandbox. An attacker who embeds their own API key in the hidden prompt can instruct Claude to pull data from a user’s conversation history, write it to a file in the sandbox and upload it to the attacker’s Anthropic storage. The attacker then retrieves the file at their leisure.
“The exfiltration vulnerability relied only on built-in capabilities,” Luz said, which “dramatically reduces the options for the defender.” A potential mitigation would be disabling the built-in sandboxed code execution tool and replacing it with external tools that carry access controls.
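Luz did not lay out a specific design, but a minimal sketch of the idea, assuming an organization-managed execution tool and using entirely hypothetical helper names, would check every outbound destination against an allowlist before anything leaves the environment:

```python
# Hypothetical sketch only: an externally managed, access-controlled stand-in
# for a built-in code-execution tool. None of these names are Anthropic's.
from urllib.parse import urlparse

# Destinations the organization explicitly trusts for outbound uploads
# (assumption: an org-managed artifact store, not any vendor API).
EGRESS_ALLOWLIST = {"artifacts.internal.example.com"}

def guarded_upload(url: str, payload: bytes) -> None:
    """Refuse any upload whose destination host is not on the allowlist."""
    host = urlparse(url).hostname or ""
    if host not in EGRESS_ALLOWLIST:
        raise PermissionError(f"blocked egress to untrusted host: {host}")
    # ...perform the upload here with an organization-scoped credential,
    # never with a key that arrived inside the prompt itself.

# With this policy, the path Oasis described would fail closed: the
# Files API endpoint reachable from the sandbox is not an approved host.
try:
    guarded_upload("https://api.anthropic.com/v1/files", b"conversation export")
except PermissionError as exc:
    print(exc)
```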
The third piece of the chain addresses delivery. A raw link with hidden instructions looks unusual, and security-aware users might not click it. Oasis found that claude.com contained an open redirect: any URL of the form https://claude.com/redirect/ followed by a destination address would forward the visitor to that address without validation. That redirect, combined with Google Ads’ rule that an ad’s display URL must match its destination’s hostname, allowed researchers to construct a Google Search advertisement that showed a legitimate claude.com address but delivered users straight to the URL carrying the injected prompt.
Luz said traditional web-delivery techniques are relevant when the attack hinges on a clicked link. “The more elegant and familiar the delivery, the higher the success rate,” he said. Google Ads also offers interest, industry and demographic targeting, as well as a feature that lets advertisers upload specific email addresses to target named individuals, turning a broad attack into a precision strike.
Saumitra Das, vice president of engineering at Qualys, said Claudy Day redefines what an attack surface looks like for AI systems. “There’s no malware or compromised infrastructure involved – it is just carefully crafted instructions delivered to a model that trusts them by default,” Das said. Since the domains, API calls and network traffic all look like normal platform activity, conventional security monitoring has little to flag.
Das said that organizations deploying AI agents across enterprise systems need to treat prompt integrity and tool permissions as core security controls, not afterthoughts. “AI agents need to be treated like privileged service accounts, with strict controls over what they can access, what tools they can use and where data can be sent.” He also flagged a broader behavioral problem: Developers and users are increasingly skipping permission checks to avoid interrupting the agent’s workflow, which compounds the exposure.
Andrew Bolster, senior research and development manager at Black Duck, said the underlying vulnerability classes of injection, privilege escalation, data governance failures and exfiltration are what “we have spent the past 20 years chasing down.” AI agents just change the attack surface boundary: New productivity tools inherently expand the risks of the applications that adopt them.
Bolster invoked what he called the “Lethal Trifecta”: a condition where an agent is simultaneously exposed to untrusted content, has access to private data and can communicate externally. All three conditions are present in a default Claude.ai session.
Luz said the attack also points to a structural question about how AI platforms handle the very first prompt a user sends. Since the injected prompt arrives at the start of a conversation, the agent acts on it before any trust relationship has been established. Anthropic already applies a partial safeguard – the fetch tool is unavailable to the agent on a conversation’s first turn – but Luz said the broader principle of requiring explicit user approval before an agent accesses memory, tools or APIs on first interaction deserves wider application.
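Neither Luz nor Anthropic has described how a broader first-interaction gate would be built. The sketch below, using entirely hypothetical names, only illustrates the shape of such a check: capabilities stay off until the user explicitly opts in.

```python
# Hypothetical sketch only: a first-turn gate that withholds memory, tool and
# API access until the user explicitly approves it. Names are illustrative.
from dataclasses import dataclass, field

@dataclass
class ConversationPolicy:
    turn: int = 0
    approved: set = field(default_factory=set)

    def allow(self, capability: str, user_approved: bool) -> bool:
        """Grant a capability only with explicit approval on the first turn."""
        if self.turn == 0 and not user_approved:
            # A pre-filled or injected opening prompt gets no capabilities.
            return False
        if user_approved:
            self.approved.add(capability)
        return capability in self.approved

policy = ConversationPolicy()
# An injected first prompt that asks to read memory is denied outright.
print(policy.allow("memory.read", user_approved=False))   # False
# Once the user explicitly opts in, the capability becomes available.
print(policy.allow("memory.read", user_approved=True))    # True
```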
