ShadowLeak: Zero‑Click Data Exfiltration via ChatGPT Connectors
ShadowLeak is a newly disclosed zero‑click exploitation technique against ChatGPT’s connector ecosystem that enables attackers to exfiltrate sensitive data directly from cloud‑connected services such as Gmail, Outlook, Google Drive, and GitHub. The attack abuses indirect prompt injection and the Deep Research / agentic capabilities inside OpenAI’s infrastructure, bypassing traditional endpoint and enterprise defenses by performing exfiltration from the service side rather than the user’s device.
Overview of the ShadowLeak Attack Chain
ShadowLeak targets ChatGPT’s extended capabilities, particularly connectors that integrate with third‑party services and allow autonomous agents to read, analyze, and manipulate user data. The exploitation chain relies on the fact that the agent processes untrusted content from external systems and treats parts of that content as instructions. By embedding hidden instructions in data that the connector fetches, the attacker can cause the agent to execute unauthorized actions without further user interaction.
Abuse of ChatGPT Connectors and Agentic Features
With the introduction of connectors to systems like Gmail, Outlook, Google Drive, GitHub, and live web browsing, ChatGPT can programmatically retrieve emails, documents, source code, and other artifacts. The Deep Research agent can autonomously browse, follow links, and run multi‑step workflows to answer complex user questions. This creates a powerful but high‑risk execution environment: any data ingested from external systems becomes part of the model’s context and can potentially trigger tool calls or outbound requests if crafted adversarially.
ShadowLeak exploits this behavior by ensuring that the malicious content is located in a place the connector or agent will read as part of normal operation, such as the HTML body of an email, a document, or a webpage. When the agent scans the content for relevant information, it simultaneously ingests hidden instructions that direct it to perform data exfiltration tasks through its available tools and connectors.
Indirect Prompt Injection via HTML Steganography
The core technique behind ShadowLeak is indirect prompt injection using HTML steganography. Instead of sending explicit instructions to ChatGPT, the attacker sends content to an external account or repository that the victim’s connector will later access. Within this content, the attacker embeds natural‑language instructions aimed at the agent, but visually obscured so the human recipient does not notice. This can include white‑on‑white text, microscopic fonts, off‑screen positioned elements, or other CSS tricks that keep the instructions invisible in standard mail or document clients.
From a rendering perspective, this text is present in the DOM and clearly readable as plain text, so when the Deep Research agent or connector fetches and parses the content, the model receives the hidden instructions in its context. Because the agent does not distinguish between human‑visible and hidden text, it interprets the injected instructions as legitimate system or user prompts and executes them with its available capabilities.
Service‑Side Exfiltration and Defense Evasion
Unlike classic phishing or endpoint malware campaigns, ShadowLeak performs exfiltration on the service side within OpenAI’s cloud infrastructure and the connected third‑party APIs. The agent, acting under the user’s delegated permissions, uses its connectors to read data from Gmail, Outlook, Google Drive, GitHub, or other services, then sends that data to attacker‑controlled destinations. Since data never leaves via the victim’s local browser or device in a traditional way, endpoint detection and response systems, data loss prevention agents, and network firewalls on the endpoint are unlikely to detect the malicious activity.
To transmit stolen data, the injected instructions can direct the agent to embed sensitive content in URL parameters, for example by generating Markdown image tags that reference attacker domains with query strings containing exfiltrated text. Alternatively, the instructions can request the agent to call browser‑like tools such as an open‑URL function, again encoding sensitive artifacts in the URL. Both techniques make the data flow appear as normal outbound HTTP requests initiated by the agent platform and not by the user’s local environment.
ZombieAgent: Automated Propagation and Targeting
Researchers demonstrated that ShadowLeak can be combined with a propagation mechanism referred to as ZombieAgent, which allows the attack to spread and reach new victims. In this model, once a victim’s account is accessed through a connector, the agent can be instructed to generate and send new malicious messages that themselves contain hidden indirect prompt injections. These emails or messages can be tailored to specific targets inside an organization or across multiple organizations, effectively turning each compromised account into a launchpad for further exploitation.
The propagation vector leverages the agent’s ability to draft, format, and send content using the same connectors or associated APIs. Each new malicious communication carries its own embedded instructions, enabling a self‑replicating pattern in which every newly affected environment can independently exfiltrate data and continue the spread. This amplifies impact and makes traditional incident scoping more complex, as investigators must identify not only the initial compromise but also all second‑order and third‑order propagated messages.
Potential Impact on Cloud‑Connected Data Stores
Since connectors can reach a wide range of systems, the blast radius of a successful ShadowLeak campaign is determined by the scope of tokens and permissions granted to ChatGPT. For mail connectors, this can include the full contents of inboxes, sent items, and archives; for document stores, access can span corporate documents, contracts, internal strategy papers, and personal files; for developer platforms, the agent may gain read access to private repositories, infrastructure‑as‑code, and secrets accidentally committed to version control.
The exfiltrated data may contain personal information, regulated data such as health or financial records, internal trade secrets, credentials embedded in documents, and API keys. Because the agent acts with legitimate OAuth access on behalf of the user, logs at the third‑party service appear consistent with normal usage, complicating detection and forensic reconstruction. Moreover, if multiple users in an organization have enabled connectors, a campaign can harvest a cross‑section of sensitive information from different departments, enabling more advanced follow‑on attacks such as business email compromise, supply‑chain intrusions, or highly tailored spear‑phishing.
Limitations and Preconditions for Exploitation
ShadowLeak requires that the victim has enabled one or more connectors and granted the necessary scopes for the agent to read or manipulate their data. Users who primarily use ChatGPT in a standalone configuration without external integrations face a much smaller risk from this specific technique, though other prompt injection vectors may still exist. Additionally, the attack assumes that agents will process content from untrusted sources that the attacker can control, such as inbound email or shared documents.
Another precondition is that the agent platform allows sufficient flexibility in tool usage to embed data into outbound URLs or similar channels. Any guardrails that strictly constrain tool arguments or limit the length and encoding of parameters can reduce the practicality of exfiltration. However, even under somewhat constrained environments, an attacker may still succeed by batching data into multiple requests or focusing on particularly valuable artifacts that fit within imposed limits.
Mitigation Strategies for Organizations
Organizations adopting ChatGPT connectors or similar generative AI integrations should treat agents as privileged automated users and apply the same least‑privilege and governance principles used for human accounts. This begins with careful auditing of OAuth scopes, ensuring that connectors only have access to the minimum mailboxes, repositories, and document folders needed for specific workflows. Revoking unused connectors and regularly rotating tokens reduce the potential damage from agent‑side compromise.
Mail and web security teams can introduce content sanitization steps that remove or neutralize hidden HTML elements that may carry adversarial instructions, such as white‑on‑white fonts, excessively small text, or off‑screen positioned content. While this will not eliminate all forms of indirect prompt injection, it can reduce the attack surface by stripping common steganographic patterns before content reaches the agent. Additional filters can flag or quarantine messages with unusually dense hidden text layers or suspicious CSS constructs.
On the agent platform side, robust input validation and policy‑driven tool usage controls are essential. Agents should be constrained to follow platform‑defined instructions that prioritize user safety over arbitrary content in prompts, and should treat untrusted content as data, not as executable instructions. Implementing strong output filters, URL allow‑lists, and rate limits on outbound network calls can also help detect or block suspicious exfiltration attempts that involve large volumes of encoded data or repeated requests to unknown domains.
Detection, Monitoring, and Incident Response
Monitoring for ShadowLeak‑style activity requires visibility into agent‑initiated actions across both the AI platform and connected third‑party services. Security teams should ensure that logs differentiate between human‑initiated operations and agent‑initiated operations wherever possible, and that security information and event management systems ingest this telemetry. Anomalous patterns, such as sudden spikes in read operations followed by sequences of outbound requests to unfamiliar domains, can serve as indicators of compromise.
In the event of suspected exploitation, incident responders must reconstruct the chain of injected content, affected accounts, and propagated messages. This entails reviewing inbound and outbound communications for embedded hidden instructions, revoking tokens for affected connectors, and assessing which data sets were accessible to the agent during the time window in question. Given the likelihood of multi‑stage propagation, responders should treat each compromised mailbox or repository as a potential origin for additional malicious content and perform recursive analysis until no further injected artifacts are found.
Implications for AI Safety and Governance
ShadowLeak highlights structural risks introduced by agentic AI systems that combine powerful reasoning with broad external integrations. Traditional prompt injection defenses that focus on direct interactions are insufficient once models can autonomously traverse and interpret untrusted content across a wide set of services. Governance frameworks must evolve to treat model context as an execution environment analogous to code, where untrusted input can change control flow and trigger side‑effect‑bearing operations.
For security architects and AI developers, this incident underscores the importance of designing agent frameworks with explicit trust boundaries, contextual privilege separation, and policy‑enforced behavior hierarchies. Without secure defaults and rigorous testing against adversarial content, every new connector or tool integrated into an agentic system can become an additional attack surface, enabling attackers to convert hidden text into powerful, invisible instructions that operate at cloud scale.