News organizations block AI crawler bots amid copyright concerns

September 1, 2023

Twenty percent of the world’s top 1,000 websites now block AI crawler bots, which collect web data for AI services, according to Originality.AI.

The move comes after OpenAI introduced its GPTBot crawler in early August. GPTBot was designed to collect data from the web to improve future AI models. However, many websites, including the New York Times, Reuters, and CNN, have blocked GPTBot out of concern that it could be used to scrape their content without permission.
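For readers wondering what such a block looks like in practice: sites commonly opt out of crawlers through a robots.txt file, and OpenAI has said GPTBot honors a "User-agent: GPTBot" rule there. The sketch below is illustrative only. The robots.txt content, the example.com URL, the "SomeOtherBot" name, and the is_blocked helper are assumptions made for this example, not taken from any of the sites named above. It uses Python's standard urllib.robotparser module to check whether a given crawler is disallowed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: blocks GPTBot site-wide, allows everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_blocked(robots_txt: str, user_agent: str,
               url: str = "https://example.com/") -> bool:
    """Return True if robots_txt disallows user_agent from fetching url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print("GPTBot blocked:", is_blocked(ROBOTS_TXT, "GPTBot"))
    print("SomeOtherBot blocked:", is_blocked(ROBOTS_TXT, "SomeOtherBot"))
```

Note that robots.txt is a voluntary convention: it only keeps out crawlers that choose to respect it.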

The blocking of AI crawler bots is a sign of the growing tension between websites and AI companies over the use of data. While AI companies argue that they need to collect data to train their models, websites are concerned about protecting their content and intellectual property.

According to Originality.AI, the share of the top 1,000 websites blocking OpenAI’s GPTBot rose from 9.1% on August 22 to 12% on August 29.

The situation is further complicated by the lack of clear legal guidelines governing the use of AI crawler bots. As a result, websites are taking matters into their own hands by blocking these bots.

The sources for this piece include an article in Axios.

Jim Love

Jim is an author and podcast host with over 40 years in technology.
