Google: Crawl Budgets & Delays Not About Page Size

May 13, 2019 - 8:35 am


This past Friday, while at the GooglePlex, John Mueller, Martin Splitt and Lizzi Harvey from the Google team hosted an office hours session, and Martin MacDonald was there to represent the SEOs. He asked a question about crawl budget: specifically, whether a page that is 10MB in size reduces crawl budget compared to a page that is only 400KB. All the Googlers shook their heads no.

What does matter is the number of requests your server can handle. If Google detects that your server is slowing down under the volume of requests, Google will back off a bit to make sure GoogleBot isn't the reason your server crashes. But page size really isn't a direct factor in Google slowing down the crawl of your web site.
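To make that back-off behavior concrete, here is a minimal, purely illustrative Python sketch of a crawler that slows down when a host's response times climb. This is not Googlebot's actual code; the threshold, window size and delay values are made-up assumptions for the example.

```python
import time
import urllib.request
from collections import deque

# Assumed, illustrative values -- not Google's real thresholds.
SLOW_RESPONSE_SECS = 2.0   # responses slower than this are treated as a "server is struggling" signal
WINDOW = 10                # how many recent responses to consider
MIN_DELAY, MAX_DELAY = 0.5, 30.0

recent_times = deque(maxlen=WINDOW)
delay = MIN_DELAY

def fetch(url):
    """Fetch a URL and record how long the server took to respond."""
    start = time.monotonic()
    with urllib.request.urlopen(url, timeout=60) as resp:
        body = resp.read()
    recent_times.append(time.monotonic() - start)
    return body

def next_delay(current):
    """Back off when the host looks slow, speed up gently when it recovers."""
    if not recent_times:
        return current
    avg = sum(recent_times) / len(recent_times)
    if avg > SLOW_RESPONSE_SECS:
        return min(current * 2, MAX_DELAY)   # host is slow: cut the request rate
    return max(current / 1.5, MIN_DELAY)     # host is healthy: crawl a bit faster

def crawl(urls):
    global delay
    for url in urls:
        fetch(url)
        delay = next_delay(delay)
        time.sleep(delay)   # note: page *size* never enters the decision, only response time
```

The point of the sketch is the last comment: the throttle keys off how the server is responding, not off how many kilobytes each page weighs.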

Here is the transcript; it starts at the 31:11 mark in the video (you can also scroll back a bit more to hear the full question):

Martin MacDonald: Is that [crawl budget] tied to a hard number of URLs... or to transfer size, so much so that if a website reduced its pages from 10MB to 300KB, would that dramatically increase the number of pages they can crawl?

John Mueller: I don't think that would change anything.

Martin Splitt: It's requests.

John Mueller: I mean what happens, what sometimes happens, is if you have a large response then it just takes longer for us to get that, and with that we'll probably crawl less because we're really trying to avoid having too many simultaneous connections to a server. So if you have a smaller response size then obviously we can have more simultaneous requests and we could theoretically get more. But it's not the case that if you reduce the size of your pages you suddenly solve problems.

Martin Splitt: Also, when the response takes a long time it's not just the size of the page, it is also the response time; the server tends to respond slower if it is overloaded or about to be overloaded. So that's also a signal that we're picking up. Like, this takes a really long time to get data from the server, maybe we should look into the crawl limits of the host load on this particular server so that we're not taking down the server.

John Mueller: We look at it on a per-server level. So if you have content from a CDN or from other networks, other places, then that would apply to their server. Essentially because how slow an embedded resource is doesn't really affect the rest of the content on the site.
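That "per-server level" point can be pictured with a small, hypothetical sketch: URLs are grouped by host before crawling, so a slow CDN host only affects its own queue and delay, not the rest of the site. The hostnames and delay values below are placeholders, not anything Google actually uses.

```python
from urllib.parse import urlparse
from collections import defaultdict

# Hypothetical per-host delays (seconds); in practice these would adapt to each host's response times.
host_delay = defaultdict(lambda: 1.0)

def host_of(url):
    return urlparse(url).netloc

def schedule(urls):
    """Group URLs by host so a slow CDN host only throttles its own queue."""
    queues = defaultdict(list)
    for url in urls:
        queues[host_of(url)].append(url)
    return queues

queues = schedule([
    "https://www.example.com/page-1",          # example hostnames only
    "https://www.example.com/page-2",
    "https://cdn.example-static.net/app.js",   # embedded resource served from another host
])
for host, urls in queues.items():
    print(host, "->", len(urls), "URLs, delay", host_delay[host], "s")
```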

Here is the video embed:

Forum discussion at YouTube.

 
