Yahoo's Crawler Not Listening To Robots.txt Directive?

Sep 19, 2011 • 9:04 am | Filed Under Yahoo Search Engine Optimization
 

A WebmasterWorld thread reports that Yahoo may not be fully honoring the robots.txt directive to block its spider, Yahoo Slurp.
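For context, here is a rough sketch of what such a directive looks like and how a compliant crawler would interpret it. The thread does not quote the webmaster's actual robots.txt, so the rules, the example.com URL, and the `is_allowed` helper below are all assumptions for illustration. The check uses Python's standard `urllib.robotparser`, which matches on the crawler's name token (here `Slurp`) rather than the full user-agent string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the thread does not show the site's real file.
# It blocks Yahoo's crawler everywhere and leaves other crawlers unrestricted.
ROBOTS_TXT = """\
User-agent: Slurp
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(agent_token: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if a compliant crawler with this name token may fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent_token, url)


if __name__ == "__main__":
    # example.com is a placeholder, not the webmaster's site.
    image_url = "http://example.com/images/logo.png"
    for token in ("Slurp", "bingbot"):
        print(token, is_allowed(token, image_url))
```

Under these assumed rules, a well-behaved Slurp would fetch nothing at all, so any image requests from it would suggest the file is being ignored or was never read.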

That said, Yahoo's spider isn't all that active these days: Bing now powers much of Yahoo's search, so BingBot does most of the crawling.

The webmaster said:

Depending on the Host and UA, the official Yahoo! Slurp apparently does whatever it wants to. Note the subtle differences in the subdomains and UAs...

This morning, the only Host to read/heed robots.txt was:

b3091154.crawl.yahoo.net [67.195.112.189]
Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)

These retrieved graphics by the pageful, over 60 total:

b5101137.yst.yahoo.net [98.137.72.218]
b5101139.yst.yahoo.net [98.137.72.228]
Mozilla/5.0 (compatible; Yahoo! Slurp/3.0; http://help.yahoo.com/help/us/ysearch/slurp)
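The "subtle differences" the webmaster mentions can be pulled apart mechanically. The sketch below is one possible way to do it, assuming each log entry lists one or more `host [ip]` pairs followed by the user-agent string, as in the quoted entries. The `SlurpHit` type, the `parse_hits` function, and both regular expressions are hypothetical names written for this example, not part of any Yahoo or server tooling.

```python
import re
from typing import NamedTuple, Optional


class SlurpHit(NamedTuple):
    host: str
    ip: str
    subdomain: str                 # label just before yahoo.net, e.g. "crawl"
    slurp_version: Optional[str]   # e.g. "3.0", or None when the UA has no version


# Matches "name.something.yahoo.net [1.2.3.4]".
HOST_RE = re.compile(r"(?P<host>[\w.-]+\.yahoo\.net) \[(?P<ip>[\d.]+)\]")
# Matches "Yahoo! Slurp" with an optional "/version" suffix.
VERSION_RE = re.compile(r"Yahoo! Slurp(?:/(?P<version>[\d.]+))?")


def parse_hits(entry: str) -> list[SlurpHit]:
    """Extract every host in a log entry, tagged with the entry's Slurp version."""
    ua_match = VERSION_RE.search(entry)
    version = ua_match.group("version") if ua_match else None
    hits = []
    for match in HOST_RE.finditer(entry):
        host = match.group("host")
        # "b3091154.crawl.yahoo.net" -> ["b3091154", "crawl", "yahoo", "net"]
        subdomain = host.split(".")[-3]
        hits.append(SlurpHit(host, match.group("ip"), subdomain, version))
    return hits


if __name__ == "__main__":
    entry = (
        "b5101137.yst.yahoo.net [98.137.72.218] "
        "b5101139.yst.yahoo.net [98.137.72.228] "
        "Mozilla/5.0 (compatible; Yahoo! Slurp/3.0; "
        "http://help.yahoo.com/help/us/ysearch/slurp)"
    )
    for hit in parse_hits(entry):
        print(hit)
```

Applied to the entries quoted above, this separates the `crawl` host, whose user agent carries no version, from the two `yst` hosts identifying as Slurp/3.0. Grouping a full access log this way would show whether the robots.txt violations track one subdomain or one user-agent version.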

I am not sure if this is a widespread issue or just a minor bug.

The main question is, should you care if Yahoo is crawling your site when Bing is? That discussion is also taking place in the forum thread. The answer is, it depends.

Forum discussion at WebmasterWorld.

 

Comments:

Michael Martinez

09/19/2011 05:20 pm

Since Yahoo! has chosen not to be a search engine, they don't need to be crawling my sites.

mobile crusher

09/20/2011 09:00 am

Since Yahoo was bought, I only pay attention to Bing now.

Ashley Pearson

07/12/2012 04:34 pm

Yahoo just got hacked. We covered the article on our site: http://suckmytrend.com
