Allowing or Blocking AI (Only 6% of bloggers block AI across 41,909 blogs)
This study provides accurate, reliable figures based on real-world data collected from over 41,000 WordPress blogs and 9 display networks.

While the global average is less than 6%, Raptive is significantly higher at 36%. This outlier skews the overall average, and it’s clear that most owners are NOT blocking AI.
With Raptive removed, the total drops to 3.05% blocking AI.
In the last two weeks alone, I’ve met with numerous owners who lost traffic shortly after blocking AI.
- one through Raptive
- one through Cloudflare
Raptive recommends and advocates blocking AI. They’ve even built this into their WordPress plugin.
- Help Raptive advocate for you.
- Make your position clear.
Yet, to my knowledge, none of the top display networks have followed Raptive’s path. Perhaps they don’t agree with this strategy, or perhaps it’s just another Raptive marketing/lobbying strategy.
This echoes Raptive’s persistent push for owners to enable ‘auto-optimize’ within their dashboard, allowing Raptive to do whatever they want, whenever they want.

I’ve been helping owners for nearly a decade and have not seen a successful site using this feature. I often find it enabled while auditing sites facing traffic declines, so I recommend turning this setting off immediately.
I advocate for owners every day, and it’s more than likely that Raptive is once again prioritizing its own interests over bloggers. This isn’t uncommon; display networks are known for this across the board. It wasn’t that long ago that Mediavine released a fabricated HCU case study.
Like auto-optimize, their advice generally hurts owners in the long term, so it’s likely that their recommendation to block AI continues this pattern.
I was discussing this with Grayson Bell from iMark Interactive, and I believe Raptive enabled AI blocking through their plugin by default in mid-to-late 2024.
However, if you have a physical robots.txt file in your root folder, Raptive had to ask you to update it personally.
You may not have personally consented, or even be aware that you consented, like the owner below. Raptive built automatic consent into their plugin: when you update the plugin, it displays a brief WordPress notice indicating AI blocking was enabled.

Raptive contacted this owner in Feb. 2025, requesting she modify her local robots.txt file. I helped her review it, and the moment we removed her local robots.txt and the default WordPress robots.txt took over, we found that all of the Raptive AI blocking was already enabled.
Internal Data
Regarding data and analysis, I love collecting large data sets and digging as deep as possible. The 41,909 blogs in this study break down as follows:
| Display Network | Monthly Traffic | Blocking AI |
| --- | --- | --- |
| (name not recovered) | 732.3k | 3.3% |
| (name not recovered) | 721.5k | 5.1% |
| (name not recovered) | 476.6k | 5.8% |
| (name not recovered) | 325.7k | 6.1% |
| (name not recovered) | 157.1k | 1.3% |
| Raptive | 127.4k | 36% |
| (name not recovered) | 90.7k | 0.7% |
| (name not recovered) | 40.8k | 1.8% |
| (name not recovered) | 34.2k | 0.6% |
* Mediavine Journey, combined with Mediavine
* Raptive Rise, combined with Raptive
* Traffic data is from Ahrefs Enterprise API
Path One: Allow AI
If it isn’t apparent yet, AI isn’t going anywhere. Owners following this path often generate meaningful traffic from AI overviews, ChatGPT, and AI in general.

Link to this owner’s qualifying search!
July, 2024
At present, ChatGPT is still the most used option. I tested it back in July 2024.

Quality sites are referenced in ChatGPT, with source links. To investigate further, I checked each of the above.
- https://heygrillhey.com/robots.txt (Raptive, not blocking AI)
- https://theonlinegrill.com/robots.txt (Mediavine, not blocking AI)
- https://www.angrybbq.com/robots.txt (Raptive, not blocking AI)
- https://amazingribs.com/robots.txt (not blocking AI)
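If you'd like to run the same kind of check yourself, here is a minimal sketch using Python's standard `urllib.robotparser`. The `AI_AGENTS` list and the sample robots.txt text are my own assumptions for illustration; they are not an exhaustive list of AI crawlers and not the contents of any site listed above.

```python
from urllib.robotparser import RobotFileParser

# Assumed, non-exhaustive list of AI-related user-agent tokens.
AI_AGENTS = ["GPTBot", "CCBot", "Amazonbot", "GoogleOther"]


def blocked_ai_agents(robots_txt, agents=AI_AGENTS, path="/"):
    """Return the agents that this robots.txt disallows from fetching `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, path)]


# Hypothetical robots.txt contents, used only to exercise the function.
sample = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /wp-admin/
"""

print(blocked_ai_agents(sample))
```

To check a live site instead, `RobotFileParser` also offers `set_url()` and `read()`, which fetch the file over the network. Keep in mind this only tells you what the file asks for, not whether a given bot honors it.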
Feb, 2025
Testing today, I see a vast improvement. Not only are there two options presented at the top and links directly to the owners’ sites, but the source links are also expandable and include extended information about the business and brand. Every aspect of this has improved in the last six months.

Again, I reviewed robots.txt for confirmation:
- https://barbecuebible.com/robots.txt (Mediavine, Not blocking AI)
- https://www.atbbq.com/robots.txt (eCommerce, Not blocking AI)
- https://amazingribs.com/robots.txt (Raptive, Not blocking AI)
- https://heygrillhey.com/robots.txt (Raptive, Not blocking AI)
- https://www.bbqguys.com/robots.txt (eCommerce, Not blocking AI)
Does it make sense for these owners to block AI outright?
Probably Not!
Path Two: Block AI
This path takes things in an entirely different direction! And it’s hard to say how this strategy will play out in the long term.
- Blocking AI (server, robots.txt) – basic
- Blocking AI (network, Cloudflare) – advanced
Note:
https://web.archive.org/web/20240511061210/https://heygrillhey.com/
Suppose your sole intention is to block AI and disallow training. AI can still train on archived data, undermining any attempt to block locally.
Note:
You decide to block AI, and suddenly, instead of your brand, random third parties, like Feedspot’s Recipe Roundup, are referenced.

So, no matter how hard we try to block AI locally, using robots.txt or even Cloudflare, your business, brand, and content are easily found and used for training elsewhere.
Blocking AI (robots.txt)
Raptive is heavily pushing this and has included it in their Raptive Plugin. You can enable it to disallow AI bots from crawling your site; however, this only impacts bots that respect robots.txt.
- There are numerous reports of blocking known bots like ChatGPT, yet they still manage to crawl sites.
- Cloudflare’s report documented AI bots attempting to disguise themselves as regular traffic, which calls the effectiveness of robots.txt into question.
There’s not much else to add here; this is pretty basic.
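For reference, a robots.txt block of this kind is just a `User-agent` line followed by `Disallow: /` for each bot. The sketch below builds such a block and round-trips it through Python's standard parser. The helper name `ai_block_rules` and the agent list are hypothetical, and this is not the output of the Raptive plugin, which I have not reproduced here.

```python
from urllib.robotparser import RobotFileParser


def ai_block_rules(agents):
    """Build robots.txt groups that disallow the whole site for each agent."""
    groups = [f"User-agent: {agent}\nDisallow: /" for agent in agents]
    return "\n\n".join(groups) + "\n"


# Hypothetical agent list for illustration only.
rules = ai_block_rules(["GPTBot", "CCBot"])
print(rules)

# Round-trip check with a compliant parser: the named bots are refused,
# while a crawler that is not named (here Googlebot) is unaffected.
parser = RobotFileParser()
parser.parse(rules.splitlines())
print(parser.can_fetch("GPTBot", "/"))
print(parser.can_fetch("Googlebot", "/"))
```

The round trip only models a well-behaved crawler. A bot that ignores robots.txt never consults these rules at all, which is the limitation described above.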
Blocking AI (Cloudflare)
Cloudflare is riding the AI wave by publicly announcing a network-level feature to block AI bots and crawlers.
As a Cloudflare Enterprise client, we host hundreds of websites and take a data-driven approach to improving performance, security, and reliability for owners. In an average month, we sift through mountains of data, and in this case, I’m focusing specifically on the Block AI Scrapers and Crawlers feature.
User-Agent Sample:
- Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot; – 15.85k requests
- Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/600.2.5 (KHTML, like Gecko) Version/8.0.2 Safari/600.2.5; Amazonbot; – 7.49k requests
- Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.6422.175 Mobile Safari/537.36 (compatible; GoogleOther) – 7.13k requests
- Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot; – 988 requests
- Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GoogleOther) Chrome/125.0.6422.175 Safari/537.36 – 383 requests
- CCBot (https://commoncrawl.org/faq/; info@commoncrawl.org) – 5 requests
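Several of these user agents belong to the same bot, so it helps to roll them up. Here is a rough sketch that groups the sample by bot token and sums the requests. Matching on a substring is my own simplification, not how Cloudflare classifies bots, and the "k" figures are rounded in the sample, so the totals are approximate.

```python
from collections import Counter

# Tokens taken from the sample above; substring matching is an assumption.
BOT_TOKENS = ["GPTBot", "Amazonbot", "GoogleOther", "CCBot"]


def bot_token(user_agent):
    """Return the first known bot token found in the user agent, else None."""
    lowered = user_agent.lower()
    for token in BOT_TOKENS:
        if token.lower() in lowered:
            return token
    return None


def requests_by_bot(rows):
    """Sum request counts per bot token from (user_agent, requests) rows."""
    totals = Counter()
    for user_agent, requests in rows:
        token = bot_token(user_agent)
        if token is not None:
            totals[token] += requests
    return totals


# The sample above, with "k" values expanded (15.85k -> 15850, and so on).
sample_rows = [
    ("Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot;", 15850),
    ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/600.2.5 "
     "(KHTML, like Gecko) Version/8.0.2 Safari/600.2.5; Amazonbot;", 7490),
    ("Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 "
     "(KHTML, like Gecko) Chrome/125.0.6422.175 Mobile Safari/537.36 "
     "(compatible; GoogleOther)", 7130),
    ("Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot;", 988),
    ("Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GoogleOther) "
     "Chrome/125.0.6422.175 Safari/537.36", 383),
    ("CCBot (https://commoncrawl.org/faq/; info@commoncrawl.org)", 5),
]

for token, total in requests_by_bot(sample_rows).most_common():
    print(f"{token}: {total}")
```

Grouped this way, GPTBot accounts for the largest share of the sample, followed by GoogleOther and Amazonbot.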
Observation One:
In the past, enabling Bot Fight Mode (Super Bot Fight Mode) has blocked legitimate advertising bots. When using this feature, I only recommend the following:
Cloudflare > Security > Bots

In all of our tests, enabling Bot Fight Mode blocked legitimate advertising bots:
- peer39_crawler/1.0 – 38.22k requests
- ias-va/3.3 – 13.9k requests
- ias-or/3.3 – 8.36k requests
- and many more
Blocking any of these will result in an immediate decline in RPM and Revenue. Many brand safety and ad-tech bots purposefully hide their identities. Their core purpose is to evaluate the quality and brand safety of your content and score it.
- high scores, high RPM
- low scores, low RPM
Display networks use this data to qualify advertisers and target them, which is why you’ll often see premium ads on one site and junk on another.
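If you can export your firewall or security events, a quick audit like the sketch below can flag when known ad-tech bots are being blocked. The `(user_agent, action)` row shape and the `"block"` value are hypothetical stand-ins, not Cloudflare's actual export schema, and the prefixes are only the three bots named above. Bots that hide their identity will not be caught by a check like this.

```python
# Prefixes of the ad-tech bots named above; the real list is longer.
AD_BOT_PREFIXES = ("peer39_crawler", "ias-va", "ias-or")


def blocked_ad_bots(events):
    """Return the user agents of known ad-tech bots whose requests were blocked.

    `events` is an iterable of (user_agent, action) pairs. This row shape is
    a hypothetical export format used for illustration.
    """
    hits = set()
    for user_agent, action in events:
        if action == "block" and user_agent.lower().startswith(AD_BOT_PREFIXES):
            hits.add(user_agent)
    return sorted(hits)


# Hypothetical events, not real log data.
sample_events = [
    ("peer39_crawler/1.0", "block"),
    ("ias-va/3.3", "allow"),
    ("ias-or/3.3", "block"),
    ("Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot;", "block"),
]

print(blocked_ad_bots(sample_events))
```

Any user agent this returns is worth a closer look before you see it in your RPM.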
Observation Two:
The data suggests that Google uses GoogleOther to separate its AI crawler from its search engine crawler, Googlebot.


We documented a significant number of GoogleOther requests, and since this appears to be a dedicated Google AI crawler, blocking it may be worthwhile if you’re fed up with AI overviews.
Unless, of course, said overviews actually generate traffic, or you’d rather not lose your share of voice to a competitor.
Cloudflare Radar – List of Verified Bots
Path Forward
While there are two distinct paths, this is not a one-size-fits-all scenario. Of course, many business and personal factors are worth considering.
The joy of owning a business is that you’re the boss; ultimately, you can do whatever is best for you and your business.
Bookmark this page for future updates, and drop a comment below; I’d love to hear from you!