Bing's MSNBot Crawls Twice, Once For Compressed HTML & Again For Uncompressed

Dec 16, 2009 • 8:38 am | comments (1) | Filed Under Bing Search
 

Here is one more oddity to add to Microsoft Bing's web crawler, MSNBot. Why on earth are people reporting that MSNBot is crawling the same page twice, once for the compressed version and then again for the uncompressed version? It should crawl each page only once, opting for the compressed, gzipped version - don't you think?

We have two threads complaining about this, one oldish one at WebmasterWorld and another at Bing Forums. Let me quote the Bing thread:

I've noticed that Bing is crawling each page of my website twice, first making an HTTP 1.1 request and getting a compressed response, then immediately issuing an HTTP 1.0 request to receive the same page without gzip compression. The following lines from my log show the issue (there are thousands more similar occurrences):

65.55.207.74 - - [13/Dec/2009:14:58:42 +0000] "GET /specimen/235698/ HTTP/1.1" 200 1742 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)"
65.55.207.74 - - [13/Dec/2009:14:59:06 +0000] "GET /specimen/235698/ HTTP/1.0" 200 4259 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)"
65.55.106.209 - - [13/Dec/2009:15:03:08 +0000] "GET /specimen/250262/ HTTP/1.1" 200 1733 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)"
65.55.106.209 - - [13/Dec/2009:15:03:14 +0000] "GET /specimen/250262/ HTTP/1.0" 200 4164 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)"

This seems a waste of bandwidth and completely defeats the point of supporting HTTP compression.
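If you want to check your own access logs for this pattern, it can be detected mechanically. Here is a rough sketch (the `find_double_crawls` helper and the log regex are mine, not from the thread) that flags any path msnbot fetched over both HTTP/1.1 and HTTP/1.0:

```python
import re
from collections import defaultdict

# Apache "combined" log format: ip ident user [time] "request" status bytes "referer" "ua"
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"GET (?P<path>\S+) HTTP/(?P<proto>1\.[01])" '
    r'(?P<status>\d{3}) (?P<bytes>\d+) "(?P<ref>[^"]*)" "(?P<ua>[^"]*)"'
)

def find_double_crawls(lines):
    """Return (path, http11_bytes, http10_bytes) for every path that
    msnbot requested over both protocol versions."""
    seen = defaultdict(dict)  # path -> {protocol version: response bytes}
    for line in lines:
        m = LOG_LINE.match(line)
        if m and "msnbot" in m.group("ua"):
            seen[m.group("path")][m.group("proto")] = int(m.group("bytes"))
    return [
        (path, protos["1.1"], protos["1.0"])
        for path, protos in seen.items()
        if "1.1" in protos and "1.0" in protos
    ]

# Two of the lines quoted in the thread:
sample = [
    '65.55.207.74 - - [13/Dec/2009:14:58:42 +0000] "GET /specimen/235698/ HTTP/1.1" 200 1742 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)"',
    '65.55.207.74 - - [13/Dec/2009:14:59:06 +0000] "GET /specimen/235698/ HTTP/1.0" 200 4259 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)"',
]

print(find_double_crawls(sample))
# -> [('/specimen/235698/', 1742, 4259)]
```

The byte counts in the output show the same page served at 1742 bytes compressed and 4259 bytes uncompressed, exactly the waste the poster describes.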

Indeed a waste of bandwidth and yes, it defeats the point of supporting HTTP compression.
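For a sense of how much bandwidth is at stake: gzip typically shrinks HTML to well under half its size, since markup is highly repetitive. A quick illustration with Python's standard `gzip` module (the sample markup here is made up, not the actual page from the logs):

```python
import gzip

# Repetitive HTML, like a table of specimen rows, compresses very well --
# which is why the HTTP/1.1 responses in the logs above are ~1.7 KB while
# the identical uncompressed HTTP/1.0 responses are ~4.2 KB.
html = b"<html><body>" + b"<p>specimen row</p>" * 200 + b"</body></html>"
compressed = gzip.compress(html)

print(len(html), "bytes uncompressed")
print(len(compressed), "bytes gzipped")
```

Fetching the uncompressed copy as well roughly triples the bytes transferred per page, multiplied across every URL the bot crawls.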

A Bing representative, Brett Yount, said:

could you please mail this information to bwmc@microsoft.com and I will get our crawling team to check it out?

But we have no confirmation from Bing on why this issue is occurring or when it will be fixed. Like I said, just one more oddity to add to MSNBot's crawl behavior.

Forum discussion at WebmasterWorld and Bing Forums.


Comments:

Alistair Lattimore

12/17/2009 11:46 pm

Could this be some form of cloaking detection? Spammy webmasters delivering one set of content compressed, a different set of content uncompressed?
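If Bing were doing that, the check itself would be simple: decompress the gzipped response and compare it byte-for-byte with the plain one. A toy sketch of the idea (purely hypothetical - this is not Bing's confirmed method, and the two "responses" are simulated locally rather than fetched over HTTP):

```python
import gzip

def bodies_match(plain_body: bytes, gzipped_body: bytes) -> bool:
    """Return True if the gzipped response decompresses to exactly
    the same bytes as the plain (uncompressed) response."""
    return gzip.decompress(gzipped_body) == plain_body

# Simulated responses from an honest server vs. a cloaking one.
page = b"<html><body>Hello</body></html>"
honest = gzip.compress(page)
cloaked = gzip.compress(b"<html><body>Keyword stuffing</body></html>")

print(bodies_match(page, honest))   # True: same content both ways
print(bodies_match(page, cloaked))  # False: possible cloaking
```

A mismatch would mean the server serves different content depending on how the page is requested, which is consistent with the cloaking-detection theory.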
