If there’s one thing most ChatGPT users want, it’s more up-to-date and accurate artificial intelligence (AI) content. Well, OpenAI has been hard at work to make this happen. The tech giant recently released documentation about its own web crawler.
Called GPTBot, OpenAI plans to use this crawler to help “AI models become more accurate and improve their general capabilities and safety.”
The release comes amid concerns over personal privacy, paywalled content being bypassed and harmful text generation. To address these concerns, OpenAI says the pages GPTBot crawls are filtered to exclude these types of content. (It hasn’t divulged how the filtering works.)
GPTBot identifies itself to your website with the following user agent token and string. The token is the name you use to address it in your robots.txt file:
User agent token: GPTBot
Full user-agent string: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0; +https://openai.com/gptbot)
However, if you don’t want GPTBot snooping around your content, you can block it by adding these lines to your robots.txt file:
User-agent: GPTBot
Disallow: /
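A blocking rule can be sanity-checked before it goes live. The sketch below feeds a robots.txt that disallows GPTBot site-wide to Python’s standard-library urllib.robotparser and asks what each crawler may fetch. The example.com URL and the can_crawl helper are placeholders for illustration, and the result reflects how Python’s parser reads the rules, which may not match GPTBot’s own parser in every edge case.

```python
from urllib.robotparser import RobotFileParser

# Site-wide block for the GPTBot user agent token.
BLOCK_GPTBOT = """\
User-agent: GPTBot
Disallow: /
"""


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if `user_agent` may fetch `url` under `robots_txt`.

    Hypothetical helper for illustration; it only wraps the stdlib parser.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/blog/post"  # placeholder URL
    print("GPTBot allowed:", can_crawl(BLOCK_GPTBOT, "GPTBot", page))
    print("Googlebot allowed:", can_crawl(BLOCK_GPTBOT, "Googlebot", page))
```

Because the rule names only GPTBot, other crawlers such as Googlebot are unaffected by it.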
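You can also customize how GPTBot accesses your website by combining Allow and Disallow rules:
User-agent: GPTBot
Allow: /directory-1/
Disallow: /directory-2/
These rules let GPTBot crawl some parts of your website while keeping it out of others. Replace the example directory names with your own paths.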
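To see how partial access plays out path by path, here is a minimal sketch that again uses Python’s urllib.robotparser. The directory names are example placeholders, and gptbot_access and example.com are hypothetical names introduced only for this illustration.

```python
from urllib.robotparser import RobotFileParser

# Allow one directory and block another for GPTBot (example paths only).
PARTIAL_ACCESS = """\
User-agent: GPTBot
Allow: /directory-1/
Disallow: /directory-2/
"""


def gptbot_access(robots_txt: str, paths: list[str],
                  site: str = "https://example.com") -> dict[str, bool]:
    """Map each path to whether GPTBot may fetch it (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {path: parser.can_fetch("GPTBot", site + path) for path in paths}


if __name__ == "__main__":
    report = gptbot_access(
        PARTIAL_ACCESS,
        ["/directory-1/page", "/directory-2/page", "/about"],
    )
    for path, allowed in report.items():
        print(f"{path}: {'allowed' if allowed else 'blocked'}")
```

Note that a path matched by neither rule, such as /about here, stays crawlable by default. To confine GPTBot to specific directories only, the remaining paths would need their own Disallow rules.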
All calls to websites will be made from the IP addresses OpenAI lists in its separate documentation. We believe these IP addresses will grow in number as GPTBot crawls more websites across the Internet, so check OpenAI’s documentation for the list currently in effect.
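With the published addresses in hand, a site owner could confirm that a request claiming to be GPTBot really comes from one of them. The following is only a sketch: the ranges shown are placeholder documentation networks, not OpenAI’s real addresses, and the function names are hypothetical. Swap in the values from OpenAI’s documentation before relying on it.

```python
import ipaddress

# PLACEHOLDER ranges (reserved documentation networks), NOT OpenAI's addresses.
# Replace with the list from OpenAI's documentation.
PLACEHOLDER_RANGES = ["192.0.2.0/24", "198.51.100.0/24"]


def claims_gptbot(user_agent: str) -> bool:
    """True if the user-agent header contains the GPTBot token."""
    return "gptbot" in user_agent.lower()


def is_from_listed_range(ip: str, ranges: list[str]) -> bool:
    """True if `ip` falls inside any of the given addresses or CIDR ranges."""
    address = ipaddress.ip_address(ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in ranges)


def is_verified_gptbot(user_agent: str, ip: str, ranges: list[str]) -> bool:
    """A request counts as GPTBot only if both the token and the IP match."""
    return claims_gptbot(user_agent) and is_from_listed_range(ip, ranges)


if __name__ == "__main__":
    ua = ("Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; "
          "compatible; GPTBot/1.0; +https://openai.com/gptbot)")
    print(is_verified_gptbot(ua, "192.0.2.15", PLACEHOLDER_RANGES))
    print(is_verified_gptbot(ua, "203.0.113.9", PLACEHOLDER_RANGES))
```

Checking the IP as well as the user-agent string matters because the string alone can be set by any client.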
So, what are the perks of allowing access to GPTBot? Aside from better, safer AI models – GPT-5 is already in the works – there’s really no benefit in letting your content be OpenAI’s training ground. But this should be good news if you’re leveraging AI content and want it to be more accurate and robust.
With Google’s recent moves to use publicly-accessible data for AI training, OpenAI will not stand by and watch its competitive advantage get eaten away.
Of course, there’s no guarantee that these tech companies will escape scrutiny over the ethics and responsible use of online content. But we’re excited to see how far AI will go and how search engine optimization (SEO) will benefit from these advancements.
More SEO News You Can Use
Google Announces New, Easier and More Convenient Ranking Framework: If you’re tired of the constant and complex algorithm updates, they won’t stop anytime soon. But this time, Google promises better search ranking updates without significantly overhauling its algorithms. Google’s researchers recently released a paper detailing a new term-weighting framework called TW-BERT. This framework makes it easier to find query-relevant documents and to handle query expansion. Based on their findings, TW-BERT combines the efficiency of statistics-based retrieval methods with the more context-oriented deep-learning models. This breakthrough could allow Google to bring more relevant results to search queries. The framework is also easy to deploy, meaning Google could drop it into its existing system without hassle. How this affects search rankings remains to be seen. In fact, we have yet to learn whether Google plans to use it at all. Read the full story from Search Engine Journal, which also links to the entire research paper.
Pets Allowed? Answer That Question With This New GBP Attribute: Twitter user Claudia Tomina shared a neat new feature inside Google Business Profiles (GBP). Called “Pets,” this attribute lets you display your pet policy. Tomina’s screenshot shows you can only indicate whether dogs are allowed inside and outside your establishment. If you have separate policies for cats, chickens and other non-human companions, you’ll have to watch out for another GBP update. Tomina also shared how your pet policy will be shown once you provide that information. We’ve covered several helpful GBP updates over the past few weeks. We speculate they have to do with helping people distinguish real businesses from fake ones. Read more from Search Engine Roundtable.
You Shouldn’t Delete Older Content – Google: The SEO world is buzzing with the recent Gizmodo reveal of CNET’s massive content pruning to improve its search rankings. According to Gizmodo, “the company deleted small batches of articles prior to the second half of July, but then the pace increased.” In an internal memo, CNET claims this is a periodic strategic initiative to improve its domain authority. Google Search Liaison also tweeted about this: “Are you deleting content from your site because you somehow believe Google doesn’t like ‘old’ content? That’s not a thing!” Basically, you shouldn’t delete old content just because of its age. Instead, you should look at the content’s quality. Is it still helpful? Does it still provide value for your readers? Sure, some old content may no longer be beneficial, but some still is. In other words, Google’s ranking algorithms will not look at your content’s age as an indicator of its value. So, it’s best to assess your content to see if it’s still relevant and complies with Google’s E-E-A-T guidelines and other best practices. Read this Search Engine Land article for more information.
You Can Now Measure Brand Authority With Moz: Have you ever been curious about how strong your brand is in the online space? Moz lets you take a peek with its new metric called Brand Authority. Launched for beta testing on August 7, 2023, Brand Authority “can help you expand your vision beyond SEO,” meaning you can now quantify how your other campaigns, such as PR, impact your brand beyond just search rankings. According to Dr. Pete Meyers, a Marketing Scientist at Moz, “With Brand Authority, we can finally understand how much they matter and put that power to work.” Will this be a useful metric for website owners? We’ll wait and see. Read the full story from Search Engine Land.
Editor’s Note: “SEO News You Can Use” is a weekly blog post published every Monday morning only on SEOblog.com, rounding up all the top SEO news from around the world. Our goal is to make SEOblog.com a one-stop shop for everyone looking for SEO news and education, or looking to hire an SEO expert through our comprehensive SEO agency directory.