del.icio.us Blocks Search Engine Spiders

Feb 18, 2008 • 9:21 am | comments (4) | Filed Under Social Search Engines & Optimization
 

Colin Cochrane noticed that del.icio.us has blocked search engine spiders. He believes that it's not a simple robots.txt exclusion; instead, del.icio.us is serving 404 errors based on the User-Agent. Barry Welford confirmed this by changing the User-Agent himself.
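
For anyone who wants to reproduce the check, here is a minimal sketch of the kind of test Welford describes: request the same page with a browser User-Agent and with a Googlebot User-Agent and compare the status codes. The URL and User-Agent strings below are illustrative, not the exact ones he used.

```python
# Request the same page with two different User-Agent headers and
# compare the HTTP status codes returned.
from urllib.request import Request, urlopen
from urllib.error import HTTPError

URL = "http://del.icio.us/popular/"  # illustrative page

USER_AGENTS = {
    "browser": "Mozilla/5.0 (Windows; U; Windows NT 5.1) Firefox/2.0.0.12",
    "googlebot": "Mozilla/5.0 (compatible; Googlebot/2.1; "
                 "+http://www.google.com/bot.html)",
}

for name, ua in USER_AGENTS.items():
    req = Request(URL, headers={"User-Agent": ua})
    try:
        status = urlopen(req).getcode()
    except HTTPError as err:
        status = err.code  # urllib raises on 4xx/5xx responses
    # Per the report above: a 404 for the bot UA while the browser UA
    # gets a 200 would indicate User-Agent-based blocking.
    print(name, "->", status)
```

As the commenters below point out, a differing response only demonstrates User-Agent-based treatment; it can't show exactly what Google sees, since a site can also verify the crawler's IP address.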

How did he come across this? He was using a Firefox del.icio.us add-on and couldn't locate a page he had referenced before. It was only when he did the search directly on del.icio.us that he found it.

Few people believe this approach is a good idea. If the goal is to refuse crawlers, a 403 (Forbidden) response code would be more accurate than a 404, says Pierre aka eKstreme. Pierre has also noticed a bunch of errors within Yahoo lately, including JavaScript errors and pop-up alerts that indicate something has broken. Some folks have turned the thread into a rant about Yahoo's competence at this point. Barry Welford puts it this way: "this may be a sign of a debilitating decline for del.icio.us and Yahoo! is in no position to invest massively in a property that has uncertain monetization."

I honestly hope that is not the case.

But EGOL offers another explanation: it is possible that many people are gaming their way onto the del.icio.us front page (heck, I've seen some pretty bad-quality sites there myself), and blocking spiders is a way to avoid passing link juice to them. It's not an ideal solution, though, and doing it without being forthright is a mistake.

Most people believe this is simply a mistake on Yahoo's part. Yahoo has been having a difficult time lately, and this doesn't help matters at all.

Forum discussion continues at Cre8asite Forums.

Comments:

Michael Jensen

02/18/2008 06:54 pm

You may want to keep in mind that Google and other robots have methods for identifying the actual robot that is crawling based on the IP address, so changing your user-agent wouldn't be enough to see what Google sees for a well-engineered site like del.icio.us.
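
For context, the verification Jensen refers to is the one Google itself documents: a reverse DNS lookup on the requesting IP, a check that the hostname falls under googlebot.com or google.com, and a forward lookup to confirm the hostname resolves back to the same IP. A minimal sketch, with an illustrative IP:

```python
import socket

def is_real_googlebot(ip: str) -> bool:
    # Reverse DNS: does the IP map to a hostname under Google's crawl domains?
    try:
        host = socket.gethostbyaddr(ip)[0]
    except OSError:
        return False
    if not host.endswith((".googlebot.com", ".google.com")):
        return False
    # Forward-confirm: the hostname must resolve back to the same IP,
    # otherwise the reverse DNS record itself could be forged.
    try:
        return ip in socket.gethostbyname_ex(host)[2]
    except OSError:
        return False

print(is_real_googlebot("66.249.66.1"))  # illustrative crawler IP
```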

Greg

02/18/2008 10:27 pm

I noticed about two months ago that my del.icio.us account seemed to have suffered some kind of PR cull, going from PR 3 to 0. I hadn't heard anyone else reporting issues, but perhaps this explains it.

jonah stein

02/19/2008 03:42 am

I would guess that del.icio.us is more likely to have set up security that blocks anyone SPOOFING an engine than to be blocking Googlebot. This seems like a reasonable defense against scrapers, and something that IncrediBill has been trying to engineer over at CrawlWall.
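
If Stein's guess is right, the server-side logic would look something like the sketch below (a hypothetical illustration, not del.icio.us's actual code): serve the 404 only when the User-Agent claims to be a crawler but the requesting IP fails the verification described in Michael Jensen's comment.

```python
import socket

def is_verified_googlebot(ip: str) -> bool:
    # Condensed version of the reverse-plus-forward DNS check sketched above.
    try:
        host = socket.gethostbyaddr(ip)[0]
        return (host.endswith((".googlebot.com", ".google.com"))
                and ip in socket.gethostbyname_ex(host)[2])
    except OSError:
        return False

def handle_request(user_agent: str, remote_ip: str) -> int:
    # Hypothetical filter: a request claiming to be Googlebot from an
    # unverified IP is treated as a scraper and served a 404.
    if "Googlebot" in user_agent and not is_verified_googlebot(remote_ip):
        return 404
    return 200

print(handle_request("Mozilla/5.0 Firefox", "203.0.113.9"))  # 200: ordinary browser
print(handle_request("Googlebot/2.1", "203.0.113.9"))        # 404: spoofed bot UA
```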

Farhad Divecha

02/21/2008 08:36 pm

Ugh! I’m tired of misinformation being spread by self-proclaimed “SEO Analysts”. This is just plain false. The only things being blocked are admin-type pages and users spoofing search engine bots.
