Craigslist Blocks Most Spiders: Millions of Pages Delisted

Jan 3, 2006 - 8:10 am 0 by

A thread started at our forums named Craigslist Delists Millions of Pages from Search Engine Indexes uncovers the new robots.txt file in place over at Craigslist. It basically reads;

############################## # Exclude robots from these

User-agent: YahooFeedSeeker Disallow: /forums Disallow: /res/ Disallow: /post Disallow: /email.friend Disallow: /?flagCode Disallow: /ccc Disallow: /hhh Disallow: /sss Disallow: /bbb Disallow: /ggg Disallow: /jjj

User-agent: * Disallow: /cgi-bin Disallow: /cgi-secure Disallow: /forums Disallow: /search Disallow: /res/ Disallow: /post Disallow: /email.friend Disallow: /?flagCode Disallow: /ccc Disallow: /hhh Disallow: /sss Disallow: /bbb Disallow: /ggg Disallow: /jjj

#####################################

They supposedly had millions, 3.6 Million to be exact, of pages indexed at Google and millions at the other search engines. Now? 211,000 at Google, 280,000 at Yahoo and 4,695 atMSN.

Forum discussion at Search Engine Roundtable Forums.

 

Popular Categories

The Pulse of the search community

Follow

Search Video Recaps

 
Google Core Update Flux, AdSense Ad Intent, California Link Tax & More - YouTube
Video Details More Videos Subscribe to Videos

Most Recent Articles

Search Forum Recap

Daily Search Forum Recap: April 23, 2024

Apr 23, 2024 - 4:00 pm
Link Building

Google: Ignore Link Spam Especially To 404 Pages

Apr 23, 2024 - 7:51 am
Google Search Engine Optimization

Google: We Have Taken Action On Some Parasite SEO In Recent Update

Apr 23, 2024 - 7:41 am
Bing Search

Mikhail Parakhin Breaks Silence On Mustafa Suleyman Of Microsoft (Kinda...)

Apr 23, 2024 - 7:31 am
Google Maps

Google Business Profiles Gains Select Preferred Menu Source

Apr 23, 2024 - 7:21 am
Google Search Engine Optimization

Google: Crawl Budget Goes Across All Googlebot Crawling, Not Just Web Search

Apr 23, 2024 - 7:11 am
Previous Story: Google's 2006 New Year Graphic; "Weird"?