Google Shares Insight Into Content Removal: 404s Confirmation & More

May 29, 2013 • 8:39 am | Filed Under Google Search Engine Optimization
 

A Google Webmaster Help thread has some interesting details from Google's John Mueller about content removal from Google's index and/or search results.

You may already know some of these points, but every SEO and webmaster should understand them. Heck, some were even eye-opening to me.

Here are the raw points John made and then I'll share what I think is revealing:

  • The URL removal tool is not meant to be used for normal site maintenance like this. This is part of the reason why we have a limit there.
  • The URL removal tool does not remove URLs from the index, it removes them from our search results. The difference is subtle, but it's a part of the reason why you don't see those submissions affect the indexed URL count.
  • The robots.txt file doesn't remove content from our index, but since we won't be able to recrawl it and see the content there, those URLs are generally not as visible in search anymore.
  • In order to remove the content from our index, we need to be able to crawl it, and we should see a noindex robots meta tag, or a 404/410 HTTP result code (or a redirect, etc). In order to crawl it, the URL needs to be "not disallowed" by the robots.txt file.
  • We generally treat 404 the same as 410, with a tiny difference in that 410 URLs usually don't need to be confirmed by recrawling, so they end up being removed from the index a tiny bit faster. In practice, the difference is not critical, but if you have the ability to use a 410 for content that's really removed, that's a good practice.
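
To make the crawl-then-remove logic in John's points concrete, here is a minimal Python sketch. It is not how Google works internally; it only mirrors the rule of thumb above: a URL can drop out of the index only if the crawler is allowed to fetch it and then sees a 404/410 or a noindex robots meta tag. The function names, the sample robots.txt, and the example.com URLs are all made up for illustration, the noindex check is a deliberately simple regex, and redirects are left out.

```python
import re
from urllib.robotparser import RobotFileParser

# Simplistic check for <meta name="robots" content="...noindex...">.
# Assumes the name attribute comes before content; a real parser would not.
NOINDEX = re.compile(
    r'<meta[^>]+name=["\']robots["\'][^>]*content=["\'][^"\']*noindex',
    re.IGNORECASE,
)


def crawlable(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text does not disallow the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


def removal_signal(robots_txt: str, url: str, status: int, html: str = "") -> str:
    """Describe which removal signal (if any) a crawler could see (hypothetical helper)."""
    if not crawlable(robots_txt, url):
        # Disallowed URLs are never recrawled, so a 404/410/noindex goes unseen.
        return "blocked: crawler cannot see any removal signal"
    if status in (404, 410):
        return f"removable: HTTP {status}"
    if status == 200 and NOINDEX.search(html):
        return "removable: noindex meta tag"
    return "no removal signal"


robots = "User-agent: *\nDisallow: /private/\n"
print(removal_signal(robots, "https://example.com/private/old.html", 410))
print(removal_signal(robots, "https://example.com/gone.html", 410))
```

The first call shows the trap John describes: the page returns a 410, but because robots.txt disallows it, the crawler never gets to see that.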

I find the 404 versus 410 point very interesting. With a 404 result code, Google will typically recrawl to verify the page is really not found. But if you serve a 410, Google may not need to recrawl to confirm the page is gone. This is an important thing for webmasters to know: a 404 is the safer choice, since Google tends to double-check before dropping the page, while a 410 seems to get the page removed a bit quicker.
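
If you want to act on the 404 versus 410 distinction, one way to do it is to keep an explicit list of pages you removed on purpose and answer 410 only for those, leaving 404 for everything else that is not found. The Python sketch below uses only the standard library; the paths, the `status_for` helper, and the `serve` function are hypothetical names for illustration, not anything Google or a particular web server prescribes.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical site content and a list of pages removed on purpose.
LIVE_PAGES = {"/": b"home page"}
REMOVED_FOR_GOOD = {"/old-promo"}


def status_for(path: str) -> int:
    """Pick the HTTP status code for a request path."""
    if path in LIVE_PAGES:
        return 200
    if path in REMOVED_FOR_GOOD:
        return 410  # Gone: the page was deliberately and permanently removed
    return 404      # Not Found: unknown URL, possibly a temporary mistake


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Note: self.path includes any query string; this sketch ignores that.
        body = LIVE_PAGES.get(self.path, b"")
        self.send_response(status_for(self.path))
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Start a local test server (call this yourself; it blocks until stopped)."""
    HTTPServer(("127.0.0.1", port), Handler).serve_forever()
```

Keeping the 410 list explicit means a typo or a broken deploy still produces a 404, which is the status Google is more likely to recheck before dropping a page.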

The second item is that, according to Google, the URL removal tool does not remove URLs from the index; it only removes them from the Google search results. Many know this, but it is worth pointing out as well.

Forum discussion at Google Webmaster Help.


Comments:

Kenny Shafer

05/29/2013 02:08 pm

Thanks Barry...this is goooood stuff right here.

Kyle Risley

05/29/2013 02:50 pm

Interesting re: 410 vs. 404. Thanks Barry!

LarryEngel

05/29/2013 03:16 pm

I use 404s and 410s for two different and distinct purposes... however, I never knew about the potential for 410s to get removed from the index quicker. That makes perfect sense to me. I see the same URLs 404 several times over a 6-week period before they disappear for good... not the same for the 410s.

Dani Zehra

05/29/2013 10:52 pm

Wow, that is something really interesting... never knew 410 can do this too.

Soni Sharma

05/30/2013 06:02 am

Thanks for sharing 410 information... It is really helpful.

Dhiraj Rawat

05/30/2013 06:12 am

410 (deleted) is better than 404 for SEOs if webmasters are not able to manage 301s for URLs that were changed.

Zach

05/30/2013 08:36 am

The same post last year clarified this, with John Mueller saying that they treat 404 slightly differently than 410: http://www.seroundtable.com/404-410-google-15225.html But what confuses me is this: https://twitter.com/mattcutts/status/168136014290890752

Duran Drake

05/30/2013 10:51 am

Either going to Google Webmaster, I think you can do the same with robots.txt.

Harsh Agrawal

05/30/2013 02:44 pm

Some very interesting points... especially using 410 over 404.

Lisa Hosman

06/04/2013 03:23 pm

If you have removed a domain or pages through the url removal tool, will that keep Google from re-crawling those pages and finding meta noindex/404s/redirects from those pages? I know it removes the url from the search results - not the index, but if you want something permanently removed from the index quickly and remove the url from search results with Webmaster tools - then add a noindex meta tag to that page, will Googlebot re-crawl that same page if it is still removed within the tool?

Spook SEO

02/01/2014 05:32 pm

Still, if you remove a bit of indexed content or pages, it will return a "404 error" or "sorry we couldn't find the url/content".
