Researchers demonstrated that carefully crafted requests across different output modalities (audio transcripts, encoded video frames, and text renderings) can systematically extract the sensitive instructions that guide the model’s behavior, raising serious questions about the security posture of production AI systems.
The vulnerability exploits Sora 2’s ability to generate content across multiple modalities: text, images, video, and audio.
Researchers discovered that while traditional text-to-text prompt injection defenses are relatively robust, the model’s cross-modal capabilities create unexpected weaknesses.
The attack leverages the principle that information can be progressively recovered by requesting that the system render or speak the target content in different formats.
Audio transcription proved remarkably effective, as speech-to-text conversion maintains higher fidelity than image-based text rendering, which suffers from character distortion and semantic drift.
The extraction process involved fragmentary requests spread across multiple 15-second video clips, with researchers iteratively refining their approach based on successfully recovered portions.
This stepwise methodology transformed seemingly impossible extraction into a practical attack, demonstrating how temporal and format constraints can be circumvented through persistence and multi-modal chaining.
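To make the reassembly step concrete, here is a minimal, hypothetical sketch of how fragments recovered from separate clips could be stitched back together by matching overlapping text. This is an illustration of the general technique only, not the researchers’ actual tooling; the fragment strings and the `min_overlap` threshold are invented for the example.

```python
def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def reassemble(fragments: list[str], min_overlap: int = 5) -> str:
    """Greedily stitch recovered fragments by their longest pairwise overlap.

    Fragments that cannot be confidently joined are kept as separate parts,
    mirroring the iterative refinement described in the research.
    """
    parts = list(fragments)
    while len(parts) > 1:
        best = None  # (overlap_length, left_index, right_index)
        for i, a in enumerate(parts):
            for j, b in enumerate(parts):
                if i == j:
                    continue
                k = overlap(a, b)
                if k >= min_overlap and (best is None or k > best[0]):
                    best = (k, i, j)
        if best is None:
            break  # no confident joins remain
        k, i, j = best
        merged = parts[i] + parts[j][k:]
        parts = [p for idx, p in enumerate(parts) if idx not in (i, j)]
        parts.append(merged)
    return "\n---\n".join(parts)
```

In practice an attacker would feed transcripts or OCR output from each clip into a routine like this, then target the remaining gaps with follow-up prompts.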
OpenAI acknowledged the vulnerability on November 4, 2025, noting that system prompt extraction was already a known possibility across multimodal systems.
The research team responsibly coordinated with OpenAI’s security team before publication, with full disclosure occurring on November 12, 2025.
While Sora 2’s exposed system prompt itself contains no highly sensitive data, researchers emphasize that system prompts function as security boundaries equivalent to firewall rules and should be protected as confidential configuration, not harmless metadata.
| Vulnerability Type | Attack Vector | Severity | Status |
|---|---|---|---|
| System Prompt Extraction | Multi-Modal Input (Audio/Video/Image) | Medium | Acknowledged |
| Audio Transcript Leakage | Speech-to-Text Transcription | Medium | Acknowledged |
| Cross-Modal Data Exfiltration | Encoded Image/Video Generation | Low-Medium | Acknowledged |
This research highlights an emerging gap in AI security: while text-based safeguards have matured through years of red-teaming, multi-modal systems remain vulnerable to creative circumvention strategies.
The vulnerability demonstrates how the same semantic content, when transformed across different output formats, can expose protected information.
As AI systems become increasingly complex and multi-modal, security teams must evolve their threat models beyond single-modality assumptions to account for cross-channel information leakage and indirect exfiltration pathways.
The post OpenAI Sora 2 Vulnerability Exposes System Prompts Through Audio Transcripts appeared first on Cyber Security News.