Amazon and OpenAI Web Spiders Cause Concern with Excessive Visits to Content Farm
In a surprising turn of events, one of the world’s most unremarkable content farms has captured the attention of web spiders from Amazon and OpenAI. The website’s six billion pages hold little interest for human readers, yet the bots keep coming back. Over the past week, Amazon’s amazonbot has visited the site 6 million times, and OpenAI’s gptbot has made 2.6 million visits. The traffic suggests that pages like these may be part of what OpenAI collects for ChatGPT training, an aspect of the process that has not previously been disclosed.
The volume of visits has raised concerns for the website’s operator, John. He considers the frequent visits from Google’s googlebot inconsequential, but the activity of gptbot and amazonbot has become overwhelming. In an attempt to address the issue, John wrote to Amazon at the contact address it provides. He has yet to receive a response. The situation is further complicated by the fact that OpenAI’s gptbot offers no contact address at all.
This is not the first time John has experienced such a situation. A few years ago, bingbot became trapped on the site in a similar manner. Fortunately, John had a contact at Microsoft who relayed the issue to the appropriate people, and the excessive visits from bingbot ceased shortly afterward. No details were provided, but John’s contact said that animated internal discussions took place at Microsoft before action was taken.
Seeking a resolution, John is now reaching out to anyone who may have contacts at Amazon or OpenAI, in the hope that someone can help stop the traffic. The continuous hammering from these web spiders, particularly gptbot, has placed a significant strain on the server. John has set up packet filters to reduce the impact, but the volume of requests remains a challenge.
The user agent strings of both web spiders contain URLs, which is how John was able to identify them. Amazon’s page lists a contact address, while the page for OpenAI’s gptbot does not. As John awaits a response from Amazon, he remains hopeful that someone can step in and bring the situation under control. His earlier experience with Microsoft showed him how much a personal connection can help in resolving problems like this.
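As a rough illustration of how a crawler can be identified from the URL in its user agent string, the following Python sketch pulls out the "+https://..." token that many crawlers include and tallies requests by that URL. The function names and the sample strings are assumptions made for this example; they are simplified and are not taken from John's logs.

```python
import re
from collections import Counter

# Matches a "+http(s)://..." token, a common convention crawlers use to
# advertise an information page inside their User-Agent string.
INFO_URL = re.compile(r"\+(https?://[^\s;)]+)")


def crawler_info_url(user_agent):
    """Return the advertised info URL from a User-Agent string, or None."""
    match = INFO_URL.search(user_agent)
    return match.group(1) if match else None


def count_by_info_url(user_agents):
    """Tally requests per advertised info URL (agents without one count under None)."""
    return Counter(crawler_info_url(ua) for ua in user_agents)


# Illustrative, simplified sample strings (assumed for this sketch).
SAMPLE_AGENTS = [
    "Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)",
    "Mozilla/5.0 (compatible; Amazonbot/0.1; +https://developer.amazon.com/support/amazonbot)",
    "Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)",
    "Mozilla/5.0 (X11; Linux x86_64) Firefox/115.0",
]

if __name__ == "__main__":
    # Print each advertised URL with its request count, busiest first.
    for url, hits in count_by_info_url(SAMPLE_AGENTS).most_common():
        print(hits, url)
```

In practice the user agent values would come from the web server's access log rather than a hard-coded list, and a user agent string can be spoofed, so a tally like this is only a first approximation.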
The excessive visits from amazonbot and gptbot show how heavily automated crawlers can weigh on a site, even one that human readers find uninteresting. Because web spiders operate autonomously, they can produce consequences their operators did not foresee. John hopes that, by finding the right contacts and addressing the issue at its source, he can regain control over his content farm and relieve the strain on his server.