Google Help Doc On Advanced Crawl Budget Management

Dec 2, 2020 - 7:11 am

Google has released a help document titled "Large site owner's guide to managing your crawl budget." It is an advanced guide meant to help developers manage how Googlebot crawls their websites. It reminds me of the blog post Gary Illyes of Google wrote in 2017 about crawl budget.

Google first defines who should think about managing crawl budget:

  • Large sites (1 million+ unique pages) with content that changes moderately often (once a week), or
  • Medium or larger sites (10,000+ unique pages) with very rapidly changing content (daily).

For everyone else, crawl budget is overrated.
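
To make those two thresholds concrete, here is a minimal sketch in Python. The function name and parameters are hypothetical, and reading "once a week" as seven days or fewer between changes and "daily" as one day or fewer is my assumption. Google presents these numbers as rough guidance, not hard cutoffs.

```python
def should_manage_crawl_budget(unique_pages: int, days_between_changes: float) -> bool:
    """Rough check against the two site profiles quoted above.

    Hypothetical helper: the numeric cutoffs come from the quoted guidance,
    but treating them as strict comparisons is an assumption.
    """
    # Large sites: 1 million+ unique pages, content changing about weekly.
    large_site = unique_pages >= 1_000_000 and days_between_changes <= 7
    # Medium or larger sites: 10,000+ unique pages, content changing daily.
    fast_changing_site = unique_pages >= 10_000 and days_between_changes <= 1
    return large_site or fast_changing_site
```

For example, a 50,000-page news site that updates daily would qualify, while a 5,000-page blog would not, no matter how often it changes.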

The document is then broken into these sections:

  • General theory of crawling
  • Best practices
  • Monitor your site's crawling and indexing
  • Emergency crawl reduction
  • Myths and facts about crawling

My favorite part is the myths section. Here are a few that caught my eye:

(1) Crawling is a ranking factor: False: Improving your crawl rate will not necessarily lead to better positions in Search results. Google uses many signals to rank the results, and while crawling is necessary for a page to be in search results, it's not a ranking signal.

(2) The nofollow directive affects crawl budget: Partly true: Any URL that is crawled affects crawl budget, so even if your page marks a URL as nofollow it can still be crawled if another page on your site, or any page on the web, doesn't label the link as nofollow.

(3) The closer your content is to the homepage the more important it is to Google: Partly true: Your site's homepage is often the most important page on your site, and so pages linked directly to the homepage may be seen as more important, and therefore crawled more often. However, this doesn't mean that these pages will be ranked more highly than other pages on your site.

(4) Alternate URLs and embedded content count in the crawl budget: True: Generally, any URL that Googlebot crawls will count towards a site's crawl budget. Alternate URLs, like AMP or hreflang, as well as embedded content, such as CSS and JavaScript, including XHR fetches, may have to be crawled and will consume a site's crawl budget.

Yeah, none of this is really new, but the document can be super helpful to those working on large sites who are concerned with crawl budget.

Nice find, Adam Gent!

Forum discussion at Twitter.

 
