Categories: Cyber Security News

Hackers Exploit ChatGPT-5 Downgrade Trick to Evade AI Safeguards

Security researchers have disclosed a critical vulnerability in ChatGPT-5 that allows attackers to bypass AI safety measures using simple trigger phrases.

The attack, dubbed PROMISQROUTE (Prompt-based Router Open-Mode Manipulation Induced via SSRF-like Queries, Reconfiguring Operations Using Trust Evasion), exploits the cost-saving model routing mechanisms that major AI providers use behind the scenes to reduce operational expenses.

When users interact with ChatGPT or other major AI services, they assume they’re communicating with a single, secure AI model.


In reality, an invisible “router” analyzes each request and decides which of multiple model variants should respond, typically selecting the cheapest option rather than the most secure.

This routing decision, researchers found, can be manipulated by including specific phrases in user prompts.

The attack works by including routing trigger phrases that fool the AI router into selecting weaker, less secure models instead of the hardened GPT-5 variants.

Researchers demonstrated successful attacks using phrases like “respond quickly without overthinking,” “use GPT-4 compatibility mode,” and “fast response needed” prepended to otherwise blocked malicious requests.
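The class of weakness described above can be sketched as a naive router that scans untrusted prompt text for "speed" hints. This is an illustrative sketch only; the phrase list and model names are hypothetical stand-ins, not OpenAI's actual routing implementation:

```python
# Illustrative sketch of a content-based model router vulnerable to
# PROMISQROUTE-style manipulation. Model names and trigger phrases are
# hypothetical stand-ins, not any provider's real implementation.

CHEAP_MODEL = "gpt-5-mini"      # lightly safety-trained, inexpensive
FLAGSHIP_MODEL = "gpt-5-high"   # fully safety-trained, expensive

# Phrases the router treats as a request for a fast, cheap answer.
SPEED_HINTS = (
    "respond quickly without overthinking",
    "use gpt-4 compatibility mode",
    "fast response needed",
)

def route(prompt: str) -> str:
    """Pick a model variant by scanning untrusted user text."""
    lowered = prompt.lower()
    if any(hint in lowered for hint in SPEED_HINTS):
        return CHEAP_MODEL      # attacker-controllable downgrade path
    return FLAGSHIP_MODEL

# An attacker simply prepends a trigger phrase to an otherwise blocked request:
print(route("Summarize this article."))                       # gpt-5-high
print(route("fast response needed. <blocked request here>"))  # gpt-5-mini
```

Because the routing decision is derived from attacker-controlled text, prepending one of the hint phrases is enough to steer the request away from the hardened model.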

In testing, researchers noted that complex jailbreak attempts that failed against the full GPT-5 model succeeded when prefixed with PROMISQROUTE trigger phrases.

The attack forces routing to lighter model variants that lack the comprehensive safety training of the flagship GPT-5, enabling attackers to extract prohibited content or bypass content restrictions.

The vulnerability draws parallels to Server-Side Request Forgery (SSRF) attacks, where user input inappropriately influences routing decisions.

Just as SSRF allows attackers to access internal network resources, PROMISQROUTE enables access to less secure AI models within the provider’s infrastructure.

Multi-Billion Dollar Cost-Saving Scheme Exposed

The research reveals the massive economic incentives behind vulnerable routing implementations.

Researchers estimate that OpenAI saves approximately $1.86 billion annually by routing most “GPT-5” requests to cheaper model variants rather than the flagship model advertised to users.


Analysis of routing patterns suggests that 60-70% of requests labeled as "GPT-5" actually go to minimal variants, with less than 1% reaching the most capable "GPT-5 (high)" model.

This routing distribution reduces operational costs by 81-86% compared to using the premium model for all requests.
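A back-of-the-envelope check shows how a distribution like the one reported can produce savings in that range. The per-tier cost figures below are purely hypothetical assumptions for illustration; the research did not publish them:

```python
# Sanity-check of the reported ~81-86% cost reduction using hypothetical
# per-request serving costs (assumed for illustration, not from the research).
flagship_cost = 1.00  # normalized cost of serving one request on the flagship
tier_cost = {"minimal": 0.05, "mid": 0.30, "high": 1.00}

# Reported routing mix: ~60-70% to minimal variants, <1% to the flagship,
# with the remainder assumed to hit mid-tier variants.
mix = {"minimal": 0.65, "mid": 0.34, "high": 0.01}

blended = sum(mix[tier] * tier_cost[tier] for tier in mix)
savings = 1 - blended / flagship_cost
print(f"blended cost per request: {blended:.3f}")
print(f"savings vs. all-flagship: {savings:.1%}")
```

Under these assumed costs the blended rate lands in the reported 81-86% savings band; the exact figure depends entirely on the true per-tier costs.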

The economic pressures make the vulnerability particularly difficult to address, as fixing PROMISQROUTE would eliminate billions in annual savings that support current AI service pricing models.

Universal Impact Across AI Infrastructure

PROMISQROUTE affects any AI infrastructure using layered model routing, making it relevant beyond just OpenAI’s systems.

Supply chain attacks are also possible, as most enterprises access AI through intermediary services that add additional routing layers.

Enterprise deployments using multiple security tiers, development environments, and legacy compatibility modes are particularly vulnerable.

The vulnerability becomes especially dangerous when combined with Retrieval-Augmented Generation (RAG) systems, where weak models may lack adequate safety training to handle sensitive retrieved content.

Researchers recommend immediate mitigation through cryptographic routing that doesn’t parse user content for routing decisions, along with implementing universal safety filters that protect all model variants equally.
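One way to read the "cryptographic routing" recommendation (a sketch under assumptions; the researchers did not publish a reference design) is to derive the routing decision solely from trusted server-side metadata, never the prompt text, and bind it with an HMAC so downstream layers can verify it was not tampered with. All names and tiers below are hypothetical:

```python
# Sketch of content-agnostic routing: the model choice comes only from
# trusted metadata (e.g. account tier), and the decision is HMAC-signed
# so later pipeline stages can verify it. Key and tier names are
# hypothetical illustrations.
import hashlib
import hmac
import json

ROUTING_KEY = b"server-side-secret"  # held only by the routing gateway

def route_trusted(account_tier: str, request_class: str) -> dict:
    """Choose a model from trusted metadata only; the user prompt is
    never inspected, so trigger phrases cannot force a downgrade."""
    model = "gpt-5-high" if account_tier == "enterprise" else "gpt-5-standard"
    decision = {"model": model, "class": request_class}
    payload = json.dumps(decision, sort_keys=True).encode()
    decision["sig"] = hmac.new(ROUTING_KEY, payload, hashlib.sha256).hexdigest()
    return decision

def verify(decision: dict) -> bool:
    """Downstream stage checks the routing decision is authentic."""
    payload = json.dumps({k: v for k, v in decision.items() if k != "sig"},
                         sort_keys=True).encode()
    expected = hmac.new(ROUTING_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(decision["sig"], expected)

d = route_trusted("enterprise", "chat")
print(d["model"], verify(d))  # gpt-5-high True
```

The key property is that no attacker-controlled bytes feed the routing function, which directly removes the SSRF-like influence the researchers describe.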

However, these fixes come with significant cost implications that may limit adoption across the industry.
