DuckDuckGo Not Respecting Robots.txt Directives?

Nov 27, 2012 - 9:07 am

There is an interesting thread at Hacker News on DuckDuckGo being toyed with by Google. That part honestly doesn't interest me as much as the core SEO topic in the thread.

Google's Matt Cutts is very active on Hacker News and he questioned Gabriel Weinberg, the founder of DuckDuckGo, about the DuckDuckGo spiders and crawlers. Some folks are asking whether DuckDuckGo's spider, aka DuckDuckBot, respects the robots.txt directives.

Some noticed DuckDuckGo crawling from an IP range it owns without declaring the DuckDuckBot user agent, and thus not respecting the robots.txt directives set by the webmaster.
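To illustrate why an undeclared user agent matters here, the following is a minimal sketch using Python's standard `urllib.robotparser` module. The robots.txt rules, the `example.com` URLs, and the `is_allowed` helper are all assumptions made up for illustration; they are not taken from the Hacker News thread or from DuckDuckGo. The sketch only shows that rules addressed to `DuckDuckBot` apply when a request identifies itself by that name, and do not apply to a request sent under a generic browser-style user agent.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: a webmaster blocks DuckDuckBot from one directory
# and leaves everything open to all other user agents.
ROBOTS_TXT = """\
User-agent: DuckDuckBot
Disallow: /private/

User-agent: *
Disallow:
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    blocked_url = "http://example.com/private/page"
    # A request that declares itself as DuckDuckBot is covered by the block.
    print(is_allowed("DuckDuckBot", blocked_url))
    # The same request under a browser-style user agent only matches the
    # wildcard group, so the DuckDuckBot rule never comes into play.
    print(is_allowed("Mozilla/5.0", blocked_url))
```

Robots.txt groups are matched by user agent name, not by IP address, which is why webmasters in the thread cared about whether the requests were labeled.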

Matt Cutts asked Gabriel:

Gabriel, does DuckDuckGo's crawler have a distinct user agent? Can you talk more about how DuckDuckGo observes/respects robots.txt?

I emailed Gabriel and he explained that in this case, they are only checking for parked domains. He wrote, "what they're seeing there is not a crawler but a parked domain checker." He added, "it doesn't crawl through a site. In fact, it only checks the front page." When I asked why they can't do this using the DuckDuckBot user agent, he said, "some parked domain networks show different things based on the user agent, and we want to find out what is really shown to the user."

He also added:

We don't believe it needs to be identified as anything else as it only makes one request very infrequently and doesn't index any information.

Forum discussion at Hacker News.

 
