Categories: Cyber Security News

Cloudflare Claims Perplexity AI Skirts Firewalls and Crawls Sites Using User-Agent Manipulation

Cloudflare, a leading internet security and infrastructure provider, has accused Perplexity AI, an emerging AI-powered answer engine, of circumventing web protections to crawl content from millions of websites, despite explicit blocks.

According to a detailed technical report published by Cloudflare, Perplexity’s crawling behavior includes manipulating user-agent strings, ignoring robots.txt directives, and rotating IP addresses and autonomous system numbers (ASNs) to skirt network restrictions.

AI Crawler Found Using Stealth Tactics

Cloudflare’s investigation was prompted by customer complaints, including reports that Perplexity was accessing restricted content even after its bots PerplexityBot and Perplexity-User were explicitly blocked using both robots.txt files and Web Application Firewall (WAF) rules.
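To illustrate the robots.txt half of such a block, the sketch below uses Python's standard urllib.robotparser to check what a policy naming the two declared bots would forbid. The robots.txt contents, the example.com URL, and the is_allowed helper are assumptions made for illustration and do not come from Cloudflare's report. Note that robots.txt is advisory: it only stops crawlers that choose to honor it, which is why the customers paired it with WAF rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt (an assumption, not quoted from any real site):
# it disallows Perplexity's two declared bots and allows everyone else.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: Perplexity-User
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if the policy above permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/article"  # placeholder URL
    for agent in ("PerplexityBot", "Perplexity-User", "SomeOtherBot"):
        print(agent, is_allowed(agent, url))
```

A crawler that sends a different user-agent string is matched against the wildcard entry instead, so a policy like this depends on the crawler identifying itself honestly.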

In tightly controlled tests, Cloudflare created new, non-indexed domains with restrictive crawling policies and attempted to access content via Perplexity.

Surprisingly, the platform was able to retrieve and summarize protected content from these domains, which had no public discoverability and forbade all bot access.

Technical analysis revealed that Perplexity’s crawling infrastructure initially used its declared user-agents, which identify themselves as bots.

When these were blocked, however, the company allegedly deployed crawlers impersonating generic browsers such as Chrome on macOS, using user-agent strings like:

Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36
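A minimal sketch of why this matters for site operators: a rule that blocks by user-agent token catches the declared bots but not a request carrying the browser-like string above. The blocked_by_ua_rule function and the shortened declared string are hypothetical stand-ins, not Cloudflare's actual WAF logic or Perplexity's full header.

```python
# Tokens of the declared bots named in the report.
DECLARED_BOT_TOKENS = ("PerplexityBot", "Perplexity-User")

# Shortened, illustrative declared user-agent (an assumption; the real header is longer).
DECLARED_UA = "PerplexityBot/1.0"

# The generic browser string quoted in the article.
STEALTH_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/124.0.0.0 Safari/537.36"
)


def blocked_by_ua_rule(user_agent: str) -> bool:
    """Hypothetical rule: block when the header contains a declared bot token."""
    ua = user_agent.lower()
    return any(token.lower() in ua for token in DECLARED_BOT_TOKENS)


if __name__ == "__main__":
    print(blocked_by_ua_rule(DECLARED_UA))  # declared bot is caught
    print(blocked_by_ua_rule(STEALTH_UA))   # browser-like string is not
```

Because the second string is indistinguishable from ordinary Chrome traffic at the header level, detection has to rely on other signals, which is the fingerprinting approach Cloudflare describes.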

Cloudflare observed 3–6 million daily requests from these stealth agents, alongside the 20–25 million daily requests from Perplexity’s declared bots.

In addition to user-agent obfuscation, Perplexity reportedly rotated through multiple IP addresses and ASNs that are not listed in its public documentation. This IP churn made it difficult for static block lists and firewall rules to keep pace, allowing the crawlers to bypass standard anti-bot protections.
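The weakness of a static IP block list can be sketched with Python's standard ipaddress module. The ranges below are RFC 5737 documentation prefixes used as placeholders for an operator's published ranges; they are assumptions for illustration, not Perplexity's real addresses, and blocked_by_ip_rule is a hypothetical helper.

```python
import ipaddress

# Placeholder "published" crawler ranges (RFC 5737 documentation prefixes).
# These are assumptions for illustration only.
PUBLISHED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def blocked_by_ip_rule(client_ip: str) -> bool:
    """Hypothetical rule: block requests whose source IP is in a listed range."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in network for network in PUBLISHED_RANGES)


if __name__ == "__main__":
    print(blocked_by_ip_rule("192.0.2.10"))   # inside a listed range
    print(blocked_by_ip_rule("203.0.113.7"))  # rotated to an unlisted range
```

A request from any address outside the listed prefixes passes the rule, so a crawler that moves to unlisted IPs or a different ASN stays ahead of the list until the list is updated.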

Industry Standards and Content Protection

Cloudflare contrasted Perplexity’s tactics with those of other AI companies, such as OpenAI, which are said to follow internet norms: using unique and declared user-agents, fetching and respecting robots.txt rules, and halting all attempts to crawl when disallowed.

Cloudflare’s experiment with OpenAI’s ChatGPT showed full compliance with these expectations, while Perplexity continued to probe blocked sites via alternate means.

To mitigate such stealth activity, Cloudflare has upgraded its managed rules to fingerprint and block Perplexity’s obfuscated crawlers, providing these protections even to free-tier customers. Over 2.5 million websites now use Cloudflare’s managed robots.txt feature or AI Crawler block rules.

As the web shifts toward more explicit controls over AI-driven scraping and the use of content for model training, Cloudflare urges greater transparency and technical accountability from bot operators. The company also signals ongoing collaboration with standards groups to enforce responsible data access.

Cloudflare’s findings serve as a warning for AI companies building on internet data: transparency, compliance with robots.txt, and respect for content creator preferences remain non-negotiable terms for a trustworthy and sustainable web.

The post Cloudflare Claims Perplexity AI Skirts Firewalls and Crawls Sites Using User-Agent Manipulation appeared first on Cyber Security News.

Published by

rssfeeds-admin

9 months ago