Google Clarifies Using Noindex & 404 Status Codes For Crawl Budget Optimization

Dec 5, 2022 • 7:21 am | Filed Under Google Search Engine Optimization
 


On Friday, Lizzi Sassman from Google updated the crawl budget management help document, adding two new myths to it.

(1) Using noindex isn't a good way to control crawl budget, Google said, but it can indirectly free up crawl budget in the long run.

(2) Pages that serve 4xx status codes (except 429) do not waste crawl budget, Google wrote.

Here is where Google added them, right at the bottom:

Any URL that is crawled affects crawl budget, and Google has to crawl the page in order to find the noindex rule.

However, noindex is there to help you keep things out of the index. If you want to ensure that those pages don't end up in Google's index, continue using noindex and don't worry about crawl budget. It's also important to note that if you remove URLs from Google's index with noindex or otherwise, Googlebot can focus on other URLs on your site, which means noindex can indirectly free up some crawl budget for your site in the long run.

Pages that serve 4xx HTTP status codes (except 429) don't waste crawl budget. Google attempted to crawl the page, but received a status code and no other content.
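
To make the two points concrete, here is a rough Python sketch that labels a fetched response using the statements quoted above. It is only an illustration, not anything Google publishes: the helper names (`has_noindex`, `crawl_budget_note`) are made up for this example, and it assumes noindex is delivered through either an `X-Robots-Tag` response header or a robots meta tag. It does not handle user-agent-prefixed header values such as `googlebot: noindex`.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr = {k.lower(): (v or "") for k, v in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for part in attr.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def has_noindex(headers, html):
    """True if noindex appears in an X-Robots-Tag header or a robots meta tag."""
    for name, value in headers.items():
        if name.lower() == "x-robots-tag":
            # Simplification: plain comma-separated directives only.
            if "noindex" in [p.strip().lower() for p in value.split(",")]:
                return True
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


def crawl_budget_note(status, headers, html):
    """Hypothetical helper: label a response per the two quoted statements."""
    if status == 429:
        return "429: excluded from the 4xx statement"
    if 400 <= status < 500:
        return "4xx (not 429): does not waste crawl budget"
    if status == 200 and has_noindex(headers, html):
        return "noindex: still crawled, kept out of the index"
    return "not covered by the two statements"


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex"></head></html>'
    print(crawl_budget_note(404, {}, ""))
    print(crawl_budget_note(200, {}, page))
```

The ordering of the checks mirrors the quotes: a 4xx response (other than 429) returns a status code and no other content, so there is nothing to inspect for noindex, while a noindex page has to be fetched before the rule can be seen at all.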


Forum discussion at Mastodon.
