What is Crawl Budget and How to Optimize It?

Crawl budget refers to the number of your pages that Google's bots visit each day. Although it is usually expressed as an average figure, it is dynamic and can vary from day to day under the influence of several factors.

How Do Google Spiders Work?

When Google's spiders arrive at a site, the first thing they look for is the robots.txt file, if one exists. They then crawl every link except those in the directories that robots.txt blocks. Each crawled page is also rendered: the scripts and style files it references are downloaded and executed. Any further links detected on the page are queued and crawled in turn.
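To make that first robots.txt check concrete, here is a minimal sketch using Python's standard-library robotparser; the URLs are placeholders for your own site:

```python
from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.set_url("https://www.example.com/robots.txt")
rp.read()  # fetch and parse the robots.txt file

# The same yes/no decision a crawler makes before fetching each URL:
print(rp.can_fetch("Googlebot", "https://www.example.com/some-page"))
print(rp.can_fetch("Googlebot", "https://www.example.com/private/page"))
```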

During a visit, the bots also recrawl links they have crawled before. There can be many reasons for these recrawls: the lastmod value in your sitemap may signal that the content has been updated, or someone may have linked to your page. The algorithm cannot be known in full from the outside. In short, Google crawls whatever arrives at your page, adds the links it detects to a queue, and crawls those links in turn.
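The lastmod signal mentioned above lives in your sitemap. Here is a short sketch of how those values can be read, assuming a standard sitemap.xml at a placeholder URL:

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Placeholder URL; point this at your own sitemap.
with urlopen("https://www.example.com/sitemap.xml") as resp:
    tree = ET.parse(resp)

for url in tree.findall("sm:url", NS):
    loc = url.findtext("sm:loc", namespaces=NS)
    lastmod = url.findtext("sm:lastmod", namespaces=NS)  # may be absent
    print(loc, "last modified:", lastmod)
```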

What is Crawl Budget?

The number of pages Google visits, based on the importance and value it assigns to your site, is called the crawl budget. When it falls below a certain level, your site's index stops being refreshed in the search results and the site's value declines day by day. So how is this ratio calculated? First open Search Console, then go to the Google Index > Index Status tab in the menu. The screen that appears shows the total number of pages added to the index.

If you want to see how many pages Google's spiders crawl per day, go to Search Console > Crawl > Crawl Stats and look at the figure labeled "Average" in the "Number of Pages Crawled Daily" section.

How to Calculate Crawl Budget?

For example: 871 total indexed pages / 206 pages crawled per day on average ≈ 4.23 crawl budget ratio. So what does this figure tell us? (A short script after the list below sketches the calculation.)

  • If the ratio is above 10, you need to optimize your crawl budget seriously.
  • If the ratio is below 3, you don't need to worry.
  • If it falls between those two values, check the "Average number of pages crawled per day" figure more often and act quickly when it declines.
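Here is a worked version of the calculation, with both inputs taken straight from Search Console as described above:

```python
total_indexed_pages = 871          # Google Index > Index Status
avg_pages_crawled_per_day = 206    # Crawl > Crawl Stats

ratio = total_indexed_pages / avg_pages_crawled_per_day
print(f"Crawl budget ratio: {ratio:.2f}")  # -> 4.23

if ratio > 10:
    print("Optimize your crawl budget urgently.")
elif ratio < 3:
    print("Nothing to worry about.")
else:
    print("Watch the daily crawl average and act quickly if it drops.")
```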

What Determines the Crawl Budget?

The crawl budget is determined by four main factors: the total number of pages added to Google's index, the site's size, the site's speed, and the referral links (backlinks) the site receives. Let's examine these factors together.

  1. Total Number of Pages Added to Google Index

When Google's bots visit your site, they examine your sitemap and your existing indexed pages. Any new links identified during this review are crawled, categorized and shown in the search results. This doesn't mean that every crawled link will rank well; more than 200 factors affect rankings. Your indexed pages are revisited and recrawled by Google's bots at certain intervals. You can see the number of indexed pages in Search Console.

  2. Site Size

The images, HTML files, CSS and JavaScript files on your site are downloaded and scanned by Google's bots, and each of those files adds to the resources and time the bots must allocate to your site. If your site is large but not valuable enough, the bots will visit it less often, because crawling it demands more of their resources.
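As a rough illustration of that download cost, the sketch below counts the extra files a single page asks a crawler to fetch. It assumes the third-party requests library and uses a placeholder URL:

```python
import requests
from html.parser import HTMLParser

class ResourceCounter(HTMLParser):
    """Counts the extra files a page references: images, scripts, stylesheets."""
    def __init__(self):
        super().__init__()
        self.counts = {"images": 0, "scripts": 0, "stylesheets": 0}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and attrs.get("src"):
            self.counts["images"] += 1
        elif tag == "script" and attrs.get("src"):
            self.counts["scripts"] += 1
        elif tag == "link" and "stylesheet" in (attrs.get("rel") or ""):
            self.counts["stylesheets"] += 1

# Placeholder URL; use a page from your own site.
resp = requests.get("https://www.example.com/", timeout=10)
counter = ResourceCounter()
counter.feed(resp.text)
print(f"HTML size: {len(resp.content)} bytes, referenced files: {counter.counts}")
```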

  3. Site Speed

Page load speed matters as much to Google's bots as it does to users. Sites that load fast can be crawled just as fast, so the bots use their allocated resources more efficiently and reward such sites by visiting them more often. Site speed is also a ranking factor, which makes it an important part of SEO work in its own right.
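A quick way to spot-check how fast a page answers, as a minimal sketch assuming the requests library and a placeholder URL:

```python
import requests

# Placeholder URL; use a page from your own site.
resp = requests.get("https://www.example.com/", timeout=10)

# `elapsed` measures the time from sending the request until the response
# headers arrive, a rough proxy for how quickly a bot can fetch the page.
print(f"Status {resp.status_code}, answered in {resp.elapsed.total_seconds():.2f}s")
```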

  4. Number of Reference Links (Backlinks)

As they crawl the web, Google's bots index each page one by one and follow the links on the pages they index, whether those links point inside or outside the site. If another site links to your page, the bots crawling that site will follow the link and visit you as well. Your number of referral links is therefore tied to your crawl budget.

How to Optimize Crawl Budget?

In the first phase, eliminate the redirect problems, broken links and links to irrelevant pages on your site. Google's bots may score a website negatively in such cases, because they don't know what to do with those links. Also make sure that the links from your home page and sub-pages return a 200 status code and point to relevant pages.

If an internal link is 301- or 302-redirected to another internal URL, Google's bots are entitled not to follow it; when such a link is encountered, it goes onto a list to be reviewed later. Pages that respond directly with a 200 status code, by contrast, are treated not as links to evaluate later but as links to crawl immediately.
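Here is a minimal sketch for auditing those status codes yourself, assuming the requests library; the link list is hypothetical:

```python
import requests

internal_links = [
    "https://www.example.com/",
    "https://www.example.com/old-page",
]

for url in internal_links:
    # HEAD request with redirects disabled, so 301/302 responses stay
    # visible instead of being silently followed to their target.
    resp = requests.head(url, allow_redirects=False, timeout=10)
    if resp.status_code == 200:
        print(f"OK        {url}")
    elif resp.status_code in (301, 302):
        print(f"REDIRECT  {url} -> {resp.headers.get('Location')}")
    else:
        print(f"CHECK     {url} ({resp.status_code})")
```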

Block Unnecessary Directories

Block every unnecessary page on the site that Google's bots can crawl. Size-guide pages on e-commerce sites and information pages about shipping companies are among the best examples of this kind of page. When Google's bots crawl unnecessary pages on every visit, they create needless indexing workload; a robots.txt block like the one sketched below prevents this.
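For example, rules along these lines in robots.txt would keep the bots out of such pages; the directory names are hypothetical and should match your own site's structure:

```
User-agent: *
Disallow: /size-guide/
Disallow: /shipping-info/
Disallow: /cart/
```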

Multiply Your Reference Links

If you think that Google's bots still don't like you despite applying all of these steps one by one, take care to obtain referral links from sites that receive frequent and plentiful updates. The Google bots that visit those pages often will follow the links and visit you as well.
