Google: Crawl Budgets & Delays Not About Page Size

May 13, 2019 - 8:35 am


This past Friday, while at the GooglePlex, John Mueller, Martin Splitt and Lizzi Harvey from the Google team hosted an office hours session, and Martin MacDonald was there to represent the SEOs. He asked a question about crawl budget: can a page the size of 10MB reduce crawl budget compared to a page that is only 400KB? All the Googlers shook their heads no.

What does matter is the number of requests your server can handle. If Google detects that your server is slowing down under the volume of requests, Google will back off a bit to make sure GoogleBot isn't the reason your server crashes. But page size isn't a direct factor in Google slowing down the crawl of your web site.
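To make the behavior described above concrete, here is a minimal, purely illustrative sketch of an adaptive crawler budget: it halves its allowed concurrency for a host when responses get slow and recovers gradually when they speed up. The class name, thresholds, and halving strategy are assumptions for illustration, not Google's actual implementation.

```python
class HostCrawlBudget:
    """Illustrative per-host crawl throttle (hypothetical, not Google's code)."""

    def __init__(self, max_connections=10, slow_threshold_s=2.0):
        self.max_connections = max_connections      # hard ceiling per host
        self.current_connections = max_connections  # what we allow right now
        self.slow_threshold_s = slow_threshold_s    # "server is struggling" cutoff

    def record_response(self, elapsed_s):
        """Adjust allowed concurrency based on the last response time."""
        if elapsed_s > self.slow_threshold_s:
            # Back off: the server looks overloaded, so halve our parallelism.
            self.current_connections = max(1, self.current_connections // 2)
        else:
            # Recover gradually: add one connection back, up to the ceiling.
            self.current_connections = min(self.max_connections,
                                           self.current_connections + 1)

budget = HostCrawlBudget()
budget.record_response(5.0)        # slow response -> back off
print(budget.current_connections)  # 5
budget.record_response(0.3)        # fast response -> recover slightly
print(budget.current_connections)  # 6
```

The point of the sketch is that the trigger is response time and connection load, not the byte size of any one page, which matches what the Googlers say in the transcript below.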

Here is the transcript; it starts at the 31:11 mark in the video (you can also scroll back a bit more to hear more of the question):

Martin MacDonald: Is that [crawl budget] tied to a hard number of URLs... or to transfer size? Such that if a website reduced its pages from 10MB to 300KB, would that dramatically increase the number of pages you can crawl?

John Mueller: I don't think that would change anything.

Martin Splitt: It's requests.

John Mueller: I mean, what sometimes happens is if you have a large response then it just takes longer for us to get that, and with that we'll probably crawl less, because we're really trying to avoid having too many simultaneous connections to a server. So if you have a smaller response size then obviously we can have more simultaneous requests and we could theoretically get more. But it's not the case that if you reduce the size of your pages you suddenly solve problems.

Martin Splitt: Also, when the response takes a long time, it's not just the size of the page, it's also the response time; servers tend to respond slower if they are overloaded or about to be overloaded. So that's also a signal that we're picking up. Like, this takes a really long time to get data from the server, maybe we should look into the crawl limits for the host load on this particular server so that we're not taking down the server.

John Mueller: We look at it on a per-server level. So if you have content from a CDN or from other networks, other places, then that would apply to them separately. Essentially because how slow an embedded resource is doesn't really affect the rest of the content on the site.
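Mueller's point about per-server limits can be sketched as keeping a separate budget per hostname, so a slow CDN or embedded-resource host doesn't throttle crawling of the main site. This is a hypothetical illustration; the class, defaults, and back-off rule are assumptions, not Google's actual system.

```python
from urllib.parse import urlparse

class PerHostLimiter:
    """Illustrative per-hostname crawl limits (hypothetical sketch)."""

    def __init__(self, default_connections=10):
        self.default = default_connections
        self.budgets = {}  # hostname -> currently allowed connections

    def allowed_connections(self, url):
        """Return the connection budget for the host serving this URL."""
        host = urlparse(url).netloc
        return self.budgets.setdefault(host, self.default)

    def back_off(self, url):
        """Halve the budget for one host only, leaving other hosts untouched."""
        host = urlparse(url).netloc
        self.budgets[host] = max(1, self.budgets.get(host, self.default) // 2)

limiter = PerHostLimiter()
limiter.back_off("https://cdn.example.com/big.js")  # only the CDN is slow
print(limiter.allowed_connections("https://cdn.example.com/img.png"))  # 5
print(limiter.allowed_connections("https://www.example.com/page"))     # 10
```

Because each hostname is keyed independently, throttling the CDN's budget leaves the main site's budget at its default, which is the separation Mueller describes.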

Here is the video embed:

Forum discussion at YouTube.

