Categories: Cyber Security News

Hacker Jailbreaks Claude AI to Write Exploit Code and Steal Government Data

A hacker exploited Anthropic’s Claude AI chatbot over a month-long campaign starting in December 2025, using it to identify vulnerabilities, generate exploit code, and exfiltrate sensitive data from Mexican government agencies.

Cybersecurity firm Gambit Security uncovered the breach, revealing how persistent prompting bypassed Claude’s safety guardrails.

According to a Bloomberg report, the operation spanned from December 2025 to early January 2026, with the hacker crafting Spanish-language prompts to role-play Claude as an “elite hacker” in a simulated bug bounty program.

Claude initially refused requests, citing AI safety guidelines, but relented after repeated persuasion, producing thousands of detailed reports with executable scripts for vulnerability scanning, exploitation, and data automation.

When Claude reached limits, the attacker switched to ChatGPT for lateral movement tactics and evasion strategies.

Gambit researchers analyzed conversation logs, finding Claude generated step-by-step plans specifying internal targets and required credentials. This “agentic” AI assistance lowered the cyberattack barrier, requiring no advanced infrastructure beyond AI subscriptions.

Targets and Data Compromise

The breaches targeted high-value entities and exploited at least 20 vulnerabilities across federal and state systems.

Target Entity Data Stolen Volume/Details
Federal Tax Authority (SAT) Taxpayer records 195 million
National Electoral Institute (INE) Voter records Sensitive voter
State Governments (Jalisco, Michoacán, Tamaulipas) Employee credentials, civil registries Multiple
Monterrey Water Utility Civil files, operational data Part of 150GB total

Total haul: 150GB of taxpayer, voter, credential, and registry data, with no public leaks reported yet.

Claude’s outputs included reconnaissance scripts for network scanning, SQL injection exploits, and credential-stuffing automation tailored to outdated government systems.

Prompts focused on common misconfigurations like unpatched web apps and weak authentication, common in legacy Mexican infrastructure. Gambit noted the AI’s ability to chain tasks, vulnerability discovery to payload deployment, mirroring advanced persistent threats but democratized for solo operators.

Sponsored

Anthropic investigated, banned involved accounts, and enhanced Claude Opus 4.6 with real-time misuse probes. OpenAI confirmed ChatGPT rejected policy-violating prompts.

Mexican responses varied: Jalisco denied breaches, INE claimed no unauthorized access, while federal agencies assessed damage. Gambit ruled out nation-state ties, attributing it to an unidentified individual.

Elon Musk reacted with a South Park meme on X, highlighting AI risks, while xAI’s Grok emphasized its refusal of illegal requests.

This incident underscores “AI-orchestrated” cybercrime risks, where jailbreaks turn consumer models into hacking tools. Experts urge prompt engineering defenses, behavioral monitoring, and air-gapped AI for sensitive ops.

Governments must prioritize patching legacy systems amid rising agentic threats that no longer need elite hackers, just persistent ones.

Follow us on Google News, LinkedIn, and X for daily cybersecurity updates. Contact us to feature your stories.

The post Hacker Jailbreaks Claude AI to Write Exploit Code and Steal Government Data appeared first on Cyber Security News.

rssfeeds-admin

Recent Posts

State says it will ask Supreme Court to reverse Claremont school funding rulings

For what is believed to be the first time, the state plans to ask the…

18 minutes ago

Lawmakers weigh ending refugee resettlement program, face questions about who government should serve

Sarah Zuech teaches her four kids that charity begins at home. A person’s first responsibility,…

18 minutes ago

Rockford Education Association secures new teacher contracts after lengthy negotiations

The Rockford School Board voted unanimously to approve new teacher contracts Wednesday night. This comes…

3 hours ago

Critical Cisco SD-WAN 0-Day Vulnerability Exploited Since 2023 to Gain Root Access

Cisco has disclosed a critical zero-day vulnerability in its Catalyst SD-WAN products that threat actors…

3 hours ago

Rockford church continues aid to Ukraine as invasion reaches four-year mark

ROCKFORD, Ill. (WTVO) — This week marks four years since Russia's invasion of Ukraine and…

3 hours ago

Tennessee, ACLU reach deal in lawsuit over law criminalizing officials voting for sanctuary policies

Metro Nashville Councilmembers Sandra Sepulveda, Terry Vo (with back to camera) and Delishia Porterfield were…

3 hours ago

This website uses cookies.