The flaw affects the widely used parquet-avro module in all versions up to and including 1.15.1, threatening data pipelines that rely on this columnar storage format.
According to the report, the vulnerability stems from insecure deserialization during Avro schema parsing when using the specific or reflect data models.
Attackers can craft Parquet files containing schemas that trigger execution of arbitrary Java classes from trusted packages during metadata processing.
While Apache Parquet 1.15.1 introduced package restrictions, its default allowlist (org.apache.parquet.avro.SERIALIZABLE_PACKAGES) remained permissive enough for exploitation.
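To make the allowlist idea concrete, here is a minimal sketch of a package allowlist check. It is an illustration only and is not taken from the parquet-avro source: the class `PackageAllowlist`, its methods, and the example package lists are hypothetical, and the matching rule (exact package or sub-package) is an assumption.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class PackageAllowlist {

    // Parse a comma-separated property value into a list of package names.
    // A null or blank value yields an empty list, meaning nothing is trusted.
    static List<String> parse(String property) {
        if (property == null || property.trim().isEmpty()) {
            return List.of();
        }
        return Arrays.stream(property.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
    }

    // Hypothetical rule: a class is allowed when its package equals an
    // allowlisted package or is a sub-package of one.
    static boolean isAllowed(String className, List<String> allowedPackages) {
        int lastDot = className.lastIndexOf('.');
        String pkg = lastDot < 0 ? "" : className.substring(0, lastDot);
        for (String allowed : allowedPackages) {
            if (pkg.equals(allowed) || pkg.startsWith(allowed + ".")) {
                return true;
            }
        }
        return false;
    }
}
```

With a hypothetical allowlist of `java.lang,java.math`, the class `java.lang.String` passes and `javax.swing.JEditorPane` does not. With an empty allowlist, no class passes, which is why a permissive default matters: every package left on the list widens the set of classes a crafted schema can reach.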
Exploitation prerequisites:
parquet-avro with specific/reflect models (generic model unaffected)
    // Vulnerable configuration example (pre-1.15.2)
    // Needs org.apache.parquet.avro.AvroParquetReader and org.apache.avro.reflect.ReflectData
    AvroParquetReader.Builder<GenericRecord> builder = AvroParquetReader
        .<GenericRecord>builder(inputFile)
        .withDataModel(ReflectData.get()); // Vulnerable model
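Apache Parquet’s integration with big data frameworks like Spark and Flink makes this a high-severity issue for data pipelines built on those frameworks.
Successful exploitation could enable arbitrary code execution on systems that read attacker-supplied Parquet files.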
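The Apache Software Foundation recommends two solutions:
Upgrade to Apache Parquet 1.15.2, which fixes the issue.
On 1.15.1, set the system property org.apache.parquet.avro.SERIALIZABLE_PACKAGES to an empty string:
    -Dorg.apache.parquet.avro.SERIALIZABLE_PACKAGES=""
Operational safeguards: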
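One possible safeguard for deployments pinned to 1.15.1 is a startup check that fails fast when the mitigation property is missing. The sketch below is illustrative only: `ParquetAvroGuard` and its methods are hypothetical names, and it assumes the check runs before any parquet-avro class is loaded, since the library may read the property only once.

```java
public class ParquetAvroGuard {

    // Property name as given in the advisory text above.
    static final String PROPERTY = "org.apache.parquet.avro.SERIALIZABLE_PACKAGES";

    // The mitigation counts as applied only when the property is explicitly
    // set to an empty (or blank) string. An unset property means the
    // library's default allowlist is in effect.
    static boolean isMitigated(String value) {
        return value != null && value.trim().isEmpty();
    }

    // Hypothetical startup hook: throw if the JVM was started without
    // -Dorg.apache.parquet.avro.SERIALIZABLE_PACKAGES=""
    static void requireMitigation() {
        if (!isMitigated(System.getProperty(PROPERTY))) {
            throw new IllegalStateException(
                "Set -D" + PROPERTY + "=\"\" or upgrade parquet-avro to 1.15.2");
        }
    }
}
```

Calling `ParquetAvroGuard.requireMitigation()` at the top of `main` turns a silently missing JVM flag into an immediate, visible failure. After an upgrade to 1.15.2 the check is no longer needed.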
Security researchers emphasize that this vulnerability highlights persistent risks in data serialization architectures.
“This CVE shows how seemingly narrow API choices in data processing libraries can create systemic security risks,” noted David Handermann, one of the vulnerability reporters.
As of May 2025, there are no confirmed exploitations in the wild, but the disclosure has prompted urgent patching efforts across cloud providers and data platform vendors.
Users are advised to complete mitigations before May 15, 2025, when proof-of-concept exploit code is expected to become publicly available.
This incident underscores the critical need for defense-in-depth strategies in data processing systems, including regular dependency updates and strict input validation for complex file formats.
Organizations using Apache Parquet should prioritize vulnerability scanning of their data infrastructure and review deserialization practices across all big data components.
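As a rough illustration of what a scan might check, the sketch below flags parquet-avro version strings older than 1.15.2, the release named above as containing the fix. The class `ParquetVersionCheck` is hypothetical, it assumes plain `major.minor.patch` version strings, and it ignores qualifiers such as `-SNAPSHOT`. A real dependency scanner would handle more cases.

```java
public class ParquetVersionCheck {

    // First release reported to contain the fix.
    static final int[] FIXED = {1, 15, 2};

    // Parse "major.minor.patch", dropping any "-qualifier" suffix.
    // Missing components default to 0.
    static int[] parse(String version) {
        String core = version.split("-", 2)[0];
        String[] parts = core.split("\\.");
        int[] nums = new int[3];
        for (int i = 0; i < 3 && i < parts.length; i++) {
            nums[i] = Integer.parseInt(parts[i]);
        }
        return nums;
    }

    // True when the version sorts strictly before the fixed release.
    static boolean isAffected(String version) {
        int[] v = parse(version);
        for (int i = 0; i < 3; i++) {
            if (v[i] != FIXED[i]) {
                return v[i] < FIXED[i];
            }
        }
        return false; // exactly the fixed release
    }
}
```

For example, `isAffected("1.15.1")` and `isAffected("1.14.4")` return true, while `isAffected("1.15.2")` and `isAffected("1.16.0")` return false.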
The post Apache Parquet Java Vulnerability Exposes Systems to Arbitrary Code Execution appeared first on Cyber Security News.