
GoogleBot Not Sending IF_MODIFIED_SINCE Request?

A WebmasterWorld thread describes a problem with how Google's spider, GoogleBot, crawls some pages: it does not always send conditional requests, and it appears to ignore expiry headers. Here is the poster's explanation:

I've tried: Checking for the HTTP_IF_MODIFIED_SINCE header and returning "304 Not Modified" if possible.

Problem: Googlebot doesn't always send this header. Even if they already know about a page, they don't always send the header.

I've tried: Using the Expires header to tell Google that each page should expire a month from the request.

Problem: Googlebot keeps requesting the pages. They seem to ignore this header.
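
For readers unfamiliar with the check described in the quote, here is a minimal Python sketch of conditional-request handling. It is not the poster's code: the function name `conditional_response` and its parameters are hypothetical, and it assumes the page's last-modified time is available as a UTC-aware datetime.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Dict, Optional, Tuple


def conditional_response(
    if_modified_since: Optional[str], last_modified: datetime
) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers) for a GET, given the raw If-Modified-Since value.

    Hypothetical helper: last_modified is assumed to be UTC-aware.
    """
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparseable date: treat as if the header were absent
        if since is not None:
            if since.tzinfo is None:
                since = since.replace(tzinfo=timezone.utc)
            # HTTP dates have one-second resolution, so drop microseconds.
            if last_modified.replace(microsecond=0) <= since:
                return 304, headers  # client copy is current; send no body
    return 200, headers  # header missing or page changed: send the full page
```

When the crawler omits the header, as the poster reports, a handler like this has no choice but to fall through to the full 200 response.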

Brett Tabke, founder of WebmasterWorld, said he noticed these issues as well. jdMorgan, a WebmasterWorld moderator, tried to offer some advice:

Check that the 'expires' header is relative -- Expires after so much time, rather than Expires at a certain time.

You should check your Cache-control server response headers as well.
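
One way to read that advice is to compute the expiry relative to each request and to send a matching Cache-Control header alongside it. The sketch below is an assumption-laden illustration, not jdMorgan's configuration: `expiry_headers` is a hypothetical helper, and the 30-day default simply mirrors the "a month" mentioned in the thread. Whether a given crawler honors these headers is up to the crawler.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime
from typing import Dict


def expiry_headers(
    now: datetime, lifetime: timedelta = timedelta(days=30)
) -> Dict[str, str]:
    """Build caching headers relative to the request time.

    Hypothetical helper: `now` is assumed to be a UTC-aware datetime.
    """
    return {
        # max-age is a relative lifetime in seconds from the response time.
        "Cache-Control": f"max-age={int(lifetime.total_seconds())}",
        # Expires is an absolute date, so it is recomputed for every request.
        "Expires": format_datetime(now + lifetime, usegmt=True),
    }
```

Calling `expiry_headers(datetime.now(timezone.utc))` on each request keeps the expiry a month ahead of the request, rather than pinned to one fixed date.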

Is this a webmaster issue or a GoogleBot issue?

Forum discussion at WebmasterWorld.




Posted by rustybrick in Google Optimization on October 9, 2007 at 7:46 AM. Comments (1)

Comments

A colleague asked about this a little bit. I think that if we're following a chain of redirects, then we might not send the "If-Modified-Since:" header. I'm not 100% sure, but I don't think we make use of the "Expires:" HTTP header right now. But you might be interested in this post:
http://googleblog.blogspot.com/2007/07/robots-exclusion-protocol-now-with-even.html

It mentions how you can use an HTTP header like "X-Robots-Tag: unavailable_after: 7 Jul 2007 16:30:00 GMT", but that's more to remove a page after a certain time.

 
