2025-04-25

So, here's a fun, dissenting opinion. I do not think people should be attempting to block AI scrapers from scraping publicly available content. But I will stress that last point about public data. If you don't want your content consumed by anyone and anything on the web, don't put it on the web. The end. This is how the Internet has always functioned. The fact that the consumer of your public website is now a machine learning algorithm doesn't change anything. That said, I can sympathize with some of the posts I've seen from people about LLM crawlers behaving poorly and wanting to block that type of behavior. But as I see it, that's no different from wanting to block someone attempting to DoS your site, or blocking any other poorly behaving web crawler. That has nothing to do with the crawler being an LLM. If you don't want your content consumed, don't publish it.