Categories: Cyber Security News

vLLM Vulnerability Enables Remote Code Execution Via Malicious Payloads

A high-severity memory corruption vulnerability in vLLM versions 0.10.2 and later allows attackers to achieve remote code execution through the Completions API endpoint by sending maliciously crafted prompt embeddings.

The vulnerability resides in the tensor deserialization process within vLLM’s entrypoints/renderer.py at line 148.

When processing user-supplied prompt embeddings, the system loads serialized tensors using torch.load() without adequate validation checks.

The Vulnerability Explained

A change introduced in PyTorch 2.8.0 disabled sparse tensor integrity checks by default, creating an attack vector for malicious actors.

With those checks disabled and no validation of its own, vLLM accepts attacker-crafted tensors that bypass internal bounds checks and trigger an out-of-bounds memory write during the to_dense() conversion.

This memory corruption can cause the vLLM server to crash and potentially enable arbitrary code execution within the server process.
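
The mechanism above suggests one defensive loading pattern, sketched here under stated assumptions: re-enable PyTorch's sparse tensor invariant checks for the duration of the load, then accept only dense tensors. The helper names load_prompt_embedding and serialize are hypothetical and are not taken from vLLM's code; torch.sparse.check_sparse_tensor_invariants is PyTorch's own context manager. This is an illustration of the idea, not the project's actual patch.

```python
import io

import torch


def load_prompt_embedding(raw: bytes) -> torch.Tensor:
    """Deserialize an untrusted tensor and reject anything that is not dense.

    Hypothetical helper for illustration; not vLLM's actual code.
    """
    try:
        # Turn the sparse invariant checks back on while loading, so that
        # malformed sparse tensors are rejected at load time.
        with torch.sparse.check_sparse_tensor_invariants():
            tensor = torch.load(
                io.BytesIO(raw),
                weights_only=True,
                map_location="cpu",
            )
    except Exception as exc:
        # Any failure to load is treated as a rejected payload.
        raise ValueError("payload is not a loadable tensor") from exc

    if not isinstance(tensor, torch.Tensor):
        raise ValueError("payload is not a tensor")
    if tensor.layout != torch.strided:
        # Assumption: prompt embeddings never need a sparse layout,
        # so to_dense() is never called on untrusted data.
        raise ValueError("only dense tensors are accepted")
    return tensor


def serialize(tensor: torch.Tensor) -> bytes:
    """Serialize a tensor the way a client might before sending it."""
    buffer = io.BytesIO()
    torch.save(tensor, buffer)
    return buffer.getvalue()
```

Refusing sparse layouts outright is a stricter choice than validating them. It assumes legitimate clients only ever send dense embeddings.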

Attribute	Details
CVE ID	CVE-2025-62164
Severity	High
CVSS Score	8.8/10
Affected Product	vLLM (pip)
Affected Versions	≥ 0.10.2

This vulnerability affects all deployments running vLLM as a server, particularly those deserializing untrusted or model-provided payloads.

Any user with API access can exploit this flaw to achieve denial-of-service conditions and potentially gain remote code execution capabilities.

The attack requires no special privileges. Depending on how the API is configured, it may be reachable by unauthenticated users as well as authenticated ones.

Organizations using vLLM in production environments, cloud deployments, or shared infrastructure face significant risk, as successful exploitation could compromise the entire server and adjacent systems.

The vLLM project has addressed this vulnerability in pull request #27204. Users should immediately upgrade to the patched version.

As a temporary mitigation, administrators should restrict API access to trusted users only and implement input validation layers that inspect prompt embeddings before they reach the vLLM processing pipeline.
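
A minimal sketch of such a validation layer, assuming it runs in a gateway in front of the server and uses only the Python standard library. The field name prompt_embeds, the base64 encoding, the size cap, and the helper check_completion_request are assumptions for illustration; check them against your deployment. The filter does not parse the tensor itself. It only blocks embeddings from untrusted callers and rejects malformed or oversized payloads.

```python
import base64
import json

# Assumed cap on a decoded embedding payload; tune for your models.
MAX_EMBED_BYTES = 8 * 1024 * 1024


def check_completion_request(body: bytes, caller_is_trusted: bool) -> dict:
    """Return the parsed request, or raise ValueError if it should be blocked."""
    try:
        request = json.loads(body)
    except ValueError as exc:  # covers JSONDecodeError and UnicodeDecodeError
        raise ValueError("body is not valid JSON") from exc
    if not isinstance(request, dict):
        raise ValueError("body must be a JSON object")

    embeds = request.get("prompt_embeds")  # assumed field name
    if embeds is None:
        return request  # text-only request; nothing to inspect

    if not caller_is_trusted:
        raise ValueError("prompt embeddings are not accepted from this caller")

    # Assumption: the field holds one base64 string or a list of them.
    items = embeds if isinstance(embeds, list) else [embeds]
    for item in items:
        if not isinstance(item, str):
            raise ValueError("prompt embedding must be a base64 string")
        try:
            decoded = base64.b64decode(item, validate=True)
        except ValueError as exc:  # binascii.Error subclasses ValueError
            raise ValueError("prompt embedding is not valid base64") from exc
        if len(decoded) > MAX_EMBED_BYTES:
            raise ValueError("prompt embedding exceeds the size limit")
    return request
```

A gateway would call check_completion_request before forwarding the body and return an HTTP 4xx response when it raises.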

The vulnerability was discovered and responsibly disclosed by the AXION Security Research Team, highlighting the importance of coordinated vulnerability disclosure in the AI infrastructure ecosystem.


