Google Gemini for Workspace can be exploited through a technique called indirect prompt injection. This allows attackers to manipulate Gemini’s email summaries, making them appear legitimate while embedding malicious instructions or warnings that direct users to phishing sites—without using traditional attachments or direct links.
How the Exploit Works
Attackers embed hidden instructions within the body of an email using techniques like white-on-white text, invisible HTML tags, or specialized tokens (e.g., <Admin>...</Admin>
or System: ...
). The exploit does not require attachments or clickable links. Instead, the malicious content is disguised within the email’s formatting or metadata.
When a user clicks “Summarize this email,” Gemini interprets the hidden prompt as a legitimate instruction and generates a summary that includes fabricated warnings or urgent instructions. The summary might display a message such as:”ALERT! Your password has been compromised. Please visit [fake site] to reset your password immediately.” The original email may not contain this visible warning, and the summary appears to come from Google or the organization itself.
Attack Characteristics
Feature | Description |
---|---|
Attack Vector | Maliciously crafted email body (no attachments/links) |
Exploit Mechanism | Indirect prompt injection via hidden text or HTML/CSS |
User Interaction | Triggered when user asks Gemini to summarize the email |
Resulting Summary | Includes urgent warnings or instructions (e.g., reset password, call number) |
Phishing Tactic | Directs users to phishing sites or social engineering attacks |
Google’s Response and Mitigations
- Detection Efforts: Google has implemented security measures to detect and block malicious prompts. When Gemini identifies threats, it may exclude suspicious messages from summaries or warn users about security risks.
- Limitations: Despite these safeguards, researchers have demonstrated that certain prompt injection techniques still bypass protections, and Google has sometimes classified these behaviors as “intended” or “not a security issue”.