Google Stops Indexing Craigslist; Matt Cutts Fixes

Mar 18, 2013 - 8:27 am

A HackerNews thread highlights a blog post by Tempest Nathan, who said Google stopped indexing Craigslist.

It was true: Google did stop indexing Craigslist. But why?

Did Craigslist spam Google? Did they violate Google's webmaster guidelines? Did they add the noindex directive to their pages? Nope. None of this.

It was a technical quirk.

Matt Cutts, Google's head of search spam, explained in the HackerNews thread that Google is fixing the issue on its end. Here is what technically happened, in his words:

To understand what happened, you need to know about the “Expires” HTTP header and Google’s “unavailable_after” extension to the Robots Exclusion Protocol. As you can see at , Google’s “unavailable_after” lets a website say “after date X, remove this page from Google’s main web search results.” In contrast, the “Expires” HTTP header relates to caching, and gives the date when a page is considered stale.

A few years ago, users were complaining that Google was returning pages from Craigslist that were defunct or where the offer had expired a long time ago. And at the time, Craigslist was using the “Expires” HTTP header as if it were “unavailable_after”–that is, the Expires header was describing when the listing on Craigslist was obsolete and shouldn’t be shown to users. We ended up writing an algorithm for sites that appeared to be using the Expires header (instead of “unavailable_after”) to try to list when content was defunct and shouldn’t be shown anymore.

You might be able to see where this is going. Not too long ago, Craigslist changed how they generated the “Expires” HTTP header. It looks like they moved to the traditional interpretation of Expires for caching, and our indexing system didn’t notice. We’re in the process of fixing this, and I expect it to be fixed pretty quickly. The indexing team has already corrected this, so now it’s just a matter of re-crawling Craigslist over the next few days.

So we were trying to go the extra mile to help users not see defunct pages, but that caused an issue when Craigslist changed how they used the “Expires” HTTP header. It sounded like you preferred Google’s Custom Search API over Bing’s so it should be safe to switch back to Google if you want. Thanks again for pointing this out.
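To make the distinction Cutts describes concrete, here is a minimal Python sketch. It is not Google's code: the function names, the `site_uses_expires_as_removal_date` flag, and the sample dates are all hypothetical, and the date format used for "unavailable_after" is an assumption chosen so the standard library can parse it. It only illustrates how reading "Expires" as a removal date could hide pages once a site starts using the header for ordinary short-lived caching.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def cache_is_stale(expires_value: str, now: datetime) -> bool:
    """Traditional meaning of Expires: a cached copy is stale after this time."""
    return now > parsedate_to_datetime(expires_value)


def removal_requested(x_robots_tag_value: str, now: datetime) -> bool:
    """Meaning of unavailable_after: drop the page from results after this date."""
    name, _, date_text = x_robots_tag_value.partition(":")
    if name.strip().lower() != "unavailable_after":
        return False
    return now > parsedate_to_datetime(date_text.strip())


def shown_in_results(expires_value: str, now: datetime,
                     site_uses_expires_as_removal_date: bool) -> bool:
    """Hypothetical toy model of the special case, not Google's actual logic."""
    if site_uses_expires_as_removal_date:
        # Expires is read as "the listing is defunct after this time".
        return not cache_is_stale(expires_value, now)
    # Otherwise Expires only affects caching, never visibility.
    return True


# Assumed scenario: a page crawled at 08:00 UTC with a 15-minute cache lifetime.
SHORT_CACHE_EXPIRES = "Mon, 18 Mar 2013 08:15:00 GMT"
ONE_HOUR_LATER = datetime(2013, 3, 18, 9, 0, tzinfo=timezone.utc)

# Under the old special case the page is hidden an hour after the crawl.
print(shown_in_results(SHORT_CACHE_EXPIRES, ONE_HOUR_LATER, True))   # False
# With the traditional caching interpretation it stays visible.
print(shown_in_results(SHORT_CACHE_EXPIRES, ONE_HOUR_LATER, False))  # True
```

In this toy model the only thing that changes between the two calls is how the site's "Expires" header is interpreted, which mirrors the mismatch described in the quote.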


Forum discussion at HackerNews.

