Dec 08, 2023
Crawl-delay in robots.txt might sound like something out of a sci-fi action movie, but if you’re in the world of search engines and SEO in general, you know that crawling isn’t something straight out of Skynet.
As you may know, search engines like Google, Bing, and Yahoo! “crawl” your website regularly to look for and index fresh content and to update their systems about the changes your website has undergone. This can mean many things, from deleting old or irrelevant content to restructuring the website.
And to make things easier for you and to help your website rank better, you can use specific directives in the robots.txt file to guide these crawlers (often referred to as spiders) to crawl your site in a particular manner.
As you might have guessed, in this article we’ll discuss what crawl delay in robots.txt files is and why it can make sense for some websites. We’ll also touch on how you can create a better “crawling atmosphere” for these robots with crawl delay and professional search engine optimization services.
By now, you’ve probably guessed that one of the directives you can add to this text file lets you control the rate at which these crawlers index your website and pages.
Typically, most site owners won’t need to learn directives like Crawl-delay: 1, 5, or 10, but in other circumstances, knowing how to configure crawl delay in robots.txt files can be a handy asset in your SEO arsenal.
Robots.txt files are also beneficial because they can tell search engine crawlers which places they can and cannot go on your website. In essence, these are simple text files containing several directives that the crawlers read.
With these files, you can communicate your website’s basic rules of engagement to search engines. They matter because if your site is missing these directives, crawlers will crawl the entire website. This makes robots.txt a crucial SEO component: it tells the spiders how best to crawl any given site. Its directives can keep spiders from crawling duplicate content or accessing certain parts of your website and, generally, help them do a more efficient job.
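As a hypothetical illustration (the paths here are invented, not recommendations for any particular site), a minimal robots.txt with such directives might look like this:

```txt
# Rules for all crawlers
User-agent: *
Disallow: /admin/
Disallow: /search/

# Optional pointer to the sitemap
Sitemap: https://www.example.com/sitemap.xml
```

The `Disallow` lines tell every crawler to skip those paths, while everything else remains open for crawling.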
For starters, the robots.txt crawl-delay directive can help prevent server overloads when there is a massive number of requests. It slows crawlers down, and the crawl rate is best defined as the time between any two consecutive requests a bot makes to your platform.
In other words, it describes the rate at which these spiders are allowed to crawl your site. With a directive such as “Crawl-delay: 20,” for example, these robots will wait 20 seconds between two requests. Needless to say, this is an effective way to keep crawlers from eating away at the site’s hosting resources.
Still, there’s a catch. Take the robots.txt crawl-delay example of “Crawl-delay: 10,” where the delay is, well, ten seconds: you also limit the number of pages search engines can access per day, in this case 8,640 (86,400 seconds in a day divided by 10). While this might not be a problem for small platforms, it can be a headache for massive websites.
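The page-budget math above can be sketched in a few lines (the function name is ours, used purely for illustration):

```python
# With a fixed Crawl-delay, a single bot can fetch at most one page
# per delay interval, so the daily ceiling is seconds-per-day / delay.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def max_pages_per_day(crawl_delay_seconds: int) -> int:
    """Upper bound on pages one crawler can request per day at a given delay."""
    return SECONDS_PER_DAY // crawl_delay_seconds

print(max_pages_per_day(10))  # 8640  -- Crawl-delay: 10
print(max_pages_per_day(5))   # 17280 -- Crawl-delay: 5
print(max_pages_per_day(20))  # 4320  -- Crawl-delay: 20
```

So a 100,000-page site under “Crawl-delay: 10” could not be fully recrawled in a single day by one bot.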
For this article, we’ll take the cPanel File Manager route to create our robots.txt file. Open the client area, log in to the relevant portal, look for the cPanel icon, and open the File Manager. Here, locate your public_html folder, open it, and create a robots.txt file in File Manager.
Please note that this is only a rough outline of the process. Creating the text file is more intricate than this and should be left to those experienced in search engine optimization and web development.
Now, select the text file, click “Edit,” and insert the necessary text into the editor tab that appears.
In this case, another robots.txt crawl-delay example would be:
Crawl-delay: 5
As we’ve mentioned before, in this case crawlers will wait five seconds between crawls. Once the necessary crawl rate is set, save your changes and make sure the .txt file has been updated.
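Because each bot reads the group of rules addressed to its own user agent, you can also set different delays per crawler. A hypothetical example (the values are illustrative, not recommendations):

```txt
# Bing's crawler
User-agent: bingbot
Crawl-delay: 5

# Yahoo!'s crawler
User-agent: Slurp
Crawl-delay: 10

# Everyone else
User-agent: *
Crawl-delay: 20
```

A bot uses the most specific group that matches its user agent, so bingbot would wait five seconds here while unlisted crawlers fall under the catch-all rule.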
For the most part, this will help with Yahoo! and Bingbot crawl delays, but Google will simply ignore it.
The thing is, Google doesn’t consider the crawl-delay directive in the .txt file. Still, it can be useful for other search engine spiders that SEO experts consider “more aggressive.”
On the other hand, there’s very little chance you’ll experience any major problems with Google’s bots; you can slow down their crawl rate, too, via Google Search Console. Just log in, select the website, click the gear icon, choose Site Settings, and simply pick your preferred setting.
Still, why does Google ignore robots.txt crawl-delay directives while Bing, Baidu, and others adhere to them? The whole idea behind the delay was to lessen the load crawler requests put on servers. Google ignores it because it can: the company operates dynamic server infrastructure, and a fixed crawl delay simply doesn’t make sense in its case. Furthermore, the search giant automatically adjusts its crawling speed based on how the server responds.
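If you want to see how a crawl-delay rule is read programmatically, Python’s standard-library `urllib.robotparser` can parse a robots.txt and report the delay for a given user agent. A small sketch (the robots.txt content is invented for the demo):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, fed to the parser line by line
robots_txt = """\
User-agent: *
Crawl-delay: 5
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# crawl_delay() returns the Crawl-delay value for the matching
# user-agent group, or None if no delay is specified
print(parser.crawl_delay("*"))                 # 5
print(parser.can_fetch("*", "/private/page"))  # False
print(parser.can_fetch("*", "/blog/post"))     # True
```

A well-behaved custom crawler could combine `crawl_delay()` with a `time.sleep()` between requests to honor the site’s wishes.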
For websites with massive numbers of pages (especially when they are linked from the index), bots can make too many requests in a relatively short period, and the traffic load can deplete hosting resources. Crawl-delay can help you avoid major traffic peaks and keep crawlers from taking up too much bandwidth and other server resources while they’re on your site.
As such, having effective guidelines in your robots.txt file can help you reserve your resources for your visitors instead of “wasting” them on crawlers.