LegalPwn exploits AI models by using legitimate legal language to trick them into misclassifying malicious software as safe code.

August 4, 2025No CommentsCybersecurity News

TL;DR

A recent cybersecurity breakthrough has revealed a significant vulnerability in several of today’s most popular generative AI tools. The novel “LegalPwn” attack, developed by researchers at Pangea Labs, demonstrates how attackers can trick artificial intelligence models like ChatGPT, Google Gemini, GitHub Copilot, Meta’s Llama, and xAI’s Grok into misclassifying malicious software as safe code by cleverly disguising it within seemingly legitimate legal language .

How LegalPwn Works

The LegalPwn technique leverages an advanced form of prompt injection—an adversarial tactic wherein malicious actors embed harmful code within text crafted to appear as legal disclaimers, confidentiality notices, compliance mandates, or terms of service agreements. Because generative AI systems are programmed to treat such legitimate-sounding language with a high degree of trust, these attacks are particularly effective at bypassing built-in safety and security filters. Even when explicit prompts are employed to enhance vigilance, the models frequently fail to identify the underlying threat if it is cloaked within convincing legal jargon.

Research Findings and AI Vulnerabilities

During their assessments, researchers tested twelve leading AI platforms and found the majority were susceptible to LegalPwn. Notably, only a few—including Anthropic’s Claude 3.5 Sonnet and Microsoft’s Phi 4—were able to withstand this type of attack. In practical demonstrations, attackers successfully used LegalPwn to have AI-powered developer tools and code assistants misclassify malicious payloads, such as backdoors or reverse shells, as benign utilities like calculators. While human analysts spotted the threats correctly, the AI solutions were consistently deceived by the legal wrapper.

LegalPwn

Last updated on August 4, 2025

Comments

No comments yet. Why don’t you start the discussion?

Leave a ReplyCancel reply

United States	~59% of ransomware attacks globally Thousands per year
Poland	1,000+ per week
Russia	Highest cybercrime threat level
China	Thousands per year
India	115% surge in attacks Q2 2024
Ukraine	Significant surge since 2022
Brazil	Among top countries for blocked attacks
Mexico	65% of businesses hit in 2024
Germany	High targeted rate (EU)
France	High targeted rate (EU)

AS Name	ASN
Bharat Sanchar Nigam Ltd	9829
No.31,Jin-rong Street	4134
CHINA UNICOM China169 Backbone	4837
DigitalOcean, LLC	14061
HUAWEI INTERNATIONAL PTE. LTD.	136907
Amazon.com, Inc.	14618
Alibaba (US) Technology Co., Ltd.	45102
Google LLC	396982
Amazon.com, Inc.	16509
3xK Tech GmbH	200373

IP Address	Notable Exploits/Context
104.238.159.149	SharePoint zero-day, broad exploitation
107.191.58.76	SharePoint zero-day, government targets
96.9.125.147	SharePoint, previously Ivanti exploits
139.162.47.194	Exploits on CitrixBleed 2
38.180.148.215	CitrixBleed 2 campaigns
185.224.128.17	High activity, Netherlands
89.248.163.200	High activity, Netherlands
15.235.218.150	Associated with APT, active C2
45.9.148.114	Associated with C2, malicious netflow
91.107.150.184	C2 infrastructure, recent IoC

How LegalPwn Works

Research Findings and AI Vulnerabilities

Share this:

Related

Comments

Leave a ReplyCancel reply