New Rules Default to Blocking Bots, Offer Pay-Per-Crawl Option

Cloudflare, an internet infrastructure firm that routes roughly 16% of global web traffic, will by default block AI bots from new domains it hosts unless the site owner grants explicit permission. The company on Tuesday also launched “pay per crawl,” a platform that lets publishers charge AI crawlers each time they access a site.
“AI crawlers have been scraping content without limits,” Cloudflare CEO Matthew Prince said. “Our goal is to put the power back in the hands of creators, while still helping AI companies innovate.”
Unlike traditional search engines – which offered the implicit bargain of indexing content in exchange for directing eyeballs – AI crawlers mostly harvest material without credit or compensation.
Will Allen, Cloudflare’s head of AI privacy, control and media products, said AI bot scraping amounts to a fundamental change. “Traditionally, the unspoken agreement was that a search engine could index your content, then they would show the relevant links to a particular query and send you traffic back to your website,” he told MIT Technology Review. “That is fundamentally changing.”
Cloudflare found that in June, Google’s crawler scraped websites 14 times for every visit it referred, OpenAI’s crawler performed 17,000 scrapes per referral and Anthropic’s ratio reached 73,000 to one.
Pay per crawl introduces a choice for publishers. Site owners can block crawlers entirely, allow them at no cost or set a fee for each crawl, putting a price tag on training data. Crawlers are expected to identify themselves through Cloudflare’s bot verification system, declaring which company they work for and what purpose the data will serve – whether it is training, fine-tuning or responding to user queries.
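Cloudflare has described pay per crawl in terms of the long-dormant HTTP 402 “Payment Required” status code. A minimal sketch of the per-crawler decision logic might look like the following; the policy table, rates and header names here are all hypothetical illustrations, not Cloudflare’s actual API:

```python
# Hypothetical per-crawler pricing policy; rates and header names are
# illustrative only, not Cloudflare's published interface.
POLICIES = {
    "GPTBot": 0.01,        # charge this many USD per crawl (made-up rate)
    "ClaudeBot": "allow",  # free access
}

def handle_crawl(user_agent: str, offers_payment: bool):
    """Return (HTTP status, response headers) for a verified crawler."""
    policy = POLICIES.get(user_agent, "block")  # unknown bots blocked by default
    if policy == "allow":
        return 200, {}
    if policy == "block":
        return 403, {}
    # Paid tier: HTTP 402 quotes a price; a crawler that agrees to pay
    # on a retry is served the content and billed for the request.
    if offers_payment:
        return 200, {"crawler-charged": f"USD {policy}"}
    return 402, {"crawler-price": f"USD {policy}"}
```

The key design point is the default: a crawler the publisher has said nothing about is refused, matching the permission-based stance the article describes.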
Cloudflare said the system could become more relevant as AI-powered agents that visit websites on behalf of users become common. “What if an agentic paywall could operate entirely programmatically?” the company speculated in a blog post.
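Such a programmatic paywall could, in principle, let an agent decide on its own whether a page is worth its quoted price. A minimal sketch, assuming a server that answers with HTTP 402 “Payment Required” and a quoted price in a response header (the `fetch` callable, header names and price format are all invented for illustration):

```python
def agent_fetch(fetch, url: str, budget_usd: float):
    """Fetch a page, paying a quoted per-crawl fee if it fits the budget.

    `fetch(url, headers)` is any callable returning (status, headers, body);
    the header names used here are hypothetical, not a published protocol.
    """
    status, headers, body = fetch(url, {})
    if status == 200:
        return body   # free access, nothing to negotiate
    if status != 402:
        return None   # blocked or failed
    # Parse a quote like "USD 0.01" out of the 402 response.
    price = float(headers["crawler-price"].split()[-1])
    if price > budget_usd:
        return None   # too expensive for this agent; skip the site
    # Retry, signalling agreement to pay the quoted price.
    status, _, body = fetch(url, {"crawler-exact-price": headers["crawler-price"]})
    return body if status == 200 else None
```

The agent never needs human intervention: the price is machine-readable, and the accept-or-skip decision reduces to a budget comparison.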
A blanket approach may also risk unintended consequences. “Not all AI systems compete with all web publishers. Not all AI systems are commercial,” Shayne Longpre, a PhD candidate at MIT Media Lab who studies data provenance, told MIT Technology Review. “Personal use and open research shouldn’t be sacrificed here.”
Cloudflare said giving website owners more control will lead to a healthier web ecosystem. The company introduced tools to help detect and deter bad actors, including routing unverified crawlers to AI-generated decoy pages. Allen said that Cloudflare’s experience mitigating malicious bots informed its strategy for identifying and managing crawlers. “A web crawler that is going across the internet looking for the latest content is just another type of bot, so all of our work to understand traffic and network patterns for the clearly malicious bots helps us understand what a crawler is doing,” he wrote.
Several publishers have already signed up for Cloudflare’s model. Media outlets such as the Associated Press, Time and Condé Nast, along with platforms like Stack Overflow and Quora, have endorsed a permission-based approach to crawling. “Community platforms that fuel LLMs should be compensated for their contributions so they can invest back in their communities,” said Stack Overflow CEO Prashanth Chandrasekar in Cloudflare’s announcement.
The pay per crawl market is still experimental. Publishers and AI companies must use Cloudflare’s platform to set rates and process transactions, and the company has not specified a price range so far.
Cloudflare acknowledged the challenge of persuading AI companies accustomed to free access to start paying, but said it sees itself as one of the few players with the scale to broker agreements that could reshape how content powers the next generation of AI.