Categories: Cyber Security News

Hackers Can Manipulate Claude AI APIs with Indirect Prompts to Steal User Data

Hackers can exploit Anthropic’s Claude AI to steal sensitive user data. By leveraging the model’s newly added network capabilities in its Code Interpreter tool, attackers can use indirect prompt injection to extract private information, such as chat histories, and upload it directly to their own accounts.

This finding, detailed in security researcher Johann Rehberger’s October 2025 blog post, underscores the growing risks as AI systems become increasingly connected to the outside world.


According to Johann Rehberger, the flaw hinges on Claude’s default “Package managers only” setting, which permits network access to a limited list of approved domains, including api.anthropic.com.

While intended to let Claude install software packages securely from sites like npm, PyPI, and GitHub, this allowlist also opens an exfiltration path, because api.anthropic.com accepts requests made with any valid API key, not only the victim’s. Rehberger showed that malicious prompts hidden in documents or user inputs can trick the AI into executing code that accesses user data.
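To illustrate why a hostname-only allowlist is not enough here, the following is a minimal sketch, not Anthropic’s actual implementation. The `ALLOWED_HOSTS` set, the `host_allowed` helper, the package-registry hostnames, and the request dictionaries are all hypothetical stand-ins chosen for illustration; only api.anthropic.com comes from the report.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist for illustration only; the real list may differ.
ALLOWED_HOSTS = {"api.anthropic.com", "pypi.org", "registry.npmjs.org", "github.com"}


def host_allowed(url: str) -> bool:
    """Return True when the URL's hostname is on the allowlist."""
    host = urlsplit(url).hostname or ""
    return host in ALLOWED_HOSTS


# Two requests to the same allow-listed host that differ only in the
# credential they carry. The path and key values are placeholders.
victim_request = {"url": "https://api.anthropic.com/v1/files", "api_key": "victim-key"}
attacker_request = {"url": "https://api.anthropic.com/v1/files", "api_key": "attacker-key"}

# The check only sees the hostname, so it cannot tell the accounts apart.
victim_passes = host_allowed(victim_request["url"])
attacker_passes = host_allowed(attacker_request["url"])
```

Both requests pass the check, since the decision never looks at which account the credential belongs to.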

Indirect Prompts Attack Chain

Rehberger’s proof-of-concept attack begins with indirect prompt injection, where an adversary embeds harmful instructions in seemingly innocuous content, like a file the user asks Claude to analyze.

Leveraging Claude’s recent “memory” feature, which lets the AI reference past conversations, the payload instructs the model to extract recent chat data and save it as a file in the Code Interpreter’s sandbox, specifically at /mnt/user-data/outputs/hello.md.

Next, the exploit forces Claude to run Python code using the Anthropic SDK. This code sets the SDK’s API key environment variable to the attacker’s key and uploads the file via Anthropic’s Files API.

Crucially, because the request authenticates with the attacker’s key, the upload lands in the attacker’s account, not the victim’s, and no victim credentials are needed. “This worked on the first try,” Rehberger noted, though Claude later grew wary of obvious API keys, so the payload had to be padded with benign code, such as simple print statements, to evade detection.
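On the defensive side, one naive pre-check is to scan untrusted content for credential-like strings before handing it to the model. The sketch below is an illustration only: the `sk-ant-` prefix is an assumption about how Anthropic keys are formatted, and `KEY_PATTERN` and `find_key_like_strings` are hypothetical names. As the obfuscation described above suggests, a pattern match like this is easy to evade.

```python
import re

# Assumed key shape: an "sk-ant-" prefix followed by at least eight
# URL-safe characters. This is an illustrative guess, not a specification.
KEY_PATTERN = re.compile(r"sk-ant-[A-Za-z0-9_\-]{8,}")


def find_key_like_strings(text: str) -> list[str]:
    """Return every substring of text that looks like an embedded API key."""
    return KEY_PATTERN.findall(text)


# A document with a literal placeholder key is flagged...
plain_doc = "Please run: export KEY=sk-ant-EXAMPLE12345 and continue."
plain_hits = find_key_like_strings(plain_doc)

# ...but splitting the key across string fragments slips past the pattern.
split_doc = 'key = "sk-" + "ant-" + "EXAMPLE12345"'
split_hits = find_key_like_strings(split_doc)
```

The second case shows the limit of this approach: content scanning can catch careless payloads, but it does not replace restrictions on where the sandbox can send data.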

A demo video and screenshots illustrate the process: an attacker views their empty console, the victim processes a tainted document, and moments later the stolen file appears in the attacker’s dashboard. Each upload can carry up to 30MB, and multiple uploads are possible. This “AI kill chain” could extend to other allow-listed domains, amplifying the threat.


Rehberger responsibly disclosed the issue to Anthropic on October 25, 2025, via HackerOne. Anthropic initially dismissed the report as a “model safety issue” and out of scope, then acknowledged it as a valid vulnerability on October 30, citing a process error.

The company’s documentation already warns of data exfiltration risks from network egress, advising users to monitor sessions closely and halt suspicious activity.

Experts like Simon Willison describe this as an instance of the “lethal trifecta” in AI security: access to private data, exposure to untrusted content, and the ability to communicate externally.

For mitigation, Anthropic could enforce sandbox rules limiting API calls to the logged-in user’s account. Users should disable network access or whitelist domains sparingly, avoiding the false security of defaults.
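The account-binding idea can be sketched as an egress policy that checks the credential as well as the hostname. This is a minimal sketch under stated assumptions, not a description of how Anthropic’s sandbox works: `OutboundRequest`, `egress_permitted`, and the notion of a per-session key available to the policy are hypothetical. The `x-api-key` header is the one the Anthropic API uses for key authentication.

```python
import hmac
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class OutboundRequest:
    """Hypothetical view of a request leaving the sandbox."""
    url: str
    headers: dict[str, str]


def egress_permitted(request: OutboundRequest,
                     allowed_hosts: set[str],
                     session_api_key: str) -> bool:
    """Allow a request only if its host is allow-listed and, for the
    Anthropic API, its key matches the logged-in user's session key."""
    host = urlsplit(request.url).hostname or ""
    if host not in allowed_hosts:
        return False
    if host == "api.anthropic.com":
        # Header names are case-insensitive, so normalize before lookup.
        headers = {name.lower(): value for name, value in request.headers.items()}
        presented = headers.get("x-api-key", "")
        # Constant-time comparison avoids leaking key contents via timing.
        return hmac.compare_digest(presented.encode(), session_api_key.encode())
    return True


# Placeholder hosts and keys for illustration.
allowed = {"api.anthropic.com", "pypi.org"}
own_upload = OutboundRequest("https://api.anthropic.com/v1/files",
                             {"X-Api-Key": "session-key"})
foreign_upload = OutboundRequest("https://api.anthropic.com/v1/files",
                                 {"X-Api-Key": "attacker-key"})
```

Under this policy, `own_upload` is allowed while `foreign_upload` is rejected, even though both target the same allow-listed host.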

As AI tools like Claude integrate deeper into workflows, such exploits remind us that connectivity breeds danger. Without robust safeguards, what starts as helpful automation could become a hacker’s playground.


