Should Google Not Index Robots.txt Files in Search Results?

May 15, 2008 • 8:02 am | comments (4) | Filed Under Google Search Engine Optimization
 

An interesting discussion is taking place at WebmasterWorld on the topic of the robots.txt file. One webmaster does not want his robots.txt file to be indexed by Google, but has no way of delisting it in Google.

The only ways of removing content from Google include:

  • Via meta tags
  • Via robots.txt command
  • Return a 404 server status
  • Use the Remove URLs feature in Webmaster Tools
  • Password protect the page
  • Some more ideas on how to remove content in Google can be found there.

But if you implement any of those, Google will likely remove your robots.txt, and it won't follow the rules you have implemented in that file, which can be very upsetting for webmasters. So if you block your robots.txt file in your robots.txt file, does Google really see the robots.txt file to block it? (Okay, that was a bit of a joke, but it makes the point.)
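The joke has a practical side: a crawler has to read robots.txt before it can apply any rule in it, so a file that disallows itself can still be parsed. Here is a minimal sketch using Python's standard urllib.robotparser. It is not Googlebot and says nothing about how Google behaves; the rules, the example.com URLs and the "ExampleBot" user agent are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that disallows itself and one private directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /robots.txt
Disallow: /private/
"""


def parse_rules(text: str) -> RobotFileParser:
    """Parse robots.txt content that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = parse_rules(ROBOTS_TXT)

# The file was read in full, so every rule in it is known to the parser,
# including the rule that disallows the file itself.
blocked_self = not parser.can_fetch("ExampleBot", "https://www.example.com/robots.txt")
blocked_private = not parser.can_fetch("ExampleBot", "https://www.example.com/private/page.html")
allowed_home = parser.can_fetch("ExampleBot", "https://www.example.com/index.html")

print(blocked_self, blocked_private, allowed_home)
```

In this sketch the self-referencing Disallow line is just one more rule: the parser reports /robots.txt as disallowed while still honoring the /private/ rule from the same file.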

That brings up the question, should Google list robots.txt files in the search results? In most cases, they do not contain any useful content for searchers. Well, with the exception of Brett Tabke's robots.txt blog, which is a hilarious idea. But outside of that, how is it useful?

As Tedster notes, Google has indexed plenty of robots.txt files. Should they?

Let me ask you in a poll: should Google display robots.txt files in the search results (even for searches like [inurl:robots.txt filetype:txt])?

There is an "other" option, but try not to use it. :)

Forum discussion at WebmasterWorld.

Update: Google's JohnMu commented below, explaining how to block your robots.txt file from showing up. This is news to me, and excellent news:

Hi guys, there are two ways to block your robots.txt from showing up in search results:

- disallow it in your robots.txt (don't worry, we'll still check it); you can then use the Webmaster Tools URL removal tool to have it taken out of the index if it's indexed.

- use the x-robots-tag HTTP header tag with "noindex"

On the other hand, robots.txt URLs generally would not show up in any search results where you have more relevant pages within your site, so this is probably not something you'd want to spend all too much effort on :-).
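The second option JohnMu mentions can be sketched as a tiny server that sends an X-Robots-Tag: noindex header along with robots.txt. This is only an illustration built on Python's standard http.server module. The file body, the serve() helper and port 8000 are assumptions, and on a real site you would normally set the header in your web server's configuration instead.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical robots.txt body, for illustration only.
ROBOTS_BODY = b"User-agent: *\nDisallow: /private/\n"


def robots_headers(body: bytes) -> list[tuple[str, str]]:
    """Build the response headers for robots.txt, including the noindex hint."""
    return [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
        # Asks search engines that support this header not to index the URL.
        ("X-Robots-Tag", "noindex"),
    ]


class RobotsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/robots.txt":
            self.send_response(200)
            for name, value in robots_headers(ROBOTS_BODY):
                self.send_header(name, value)
            self.end_headers()
            self.wfile.write(ROBOTS_BODY)
        else:
            self.send_error(404)


def serve(port: int = 8000) -> None:
    """Run the demo server locally; the port is an arbitrary choice."""
    HTTPServer(("127.0.0.1", port), RobotsHandler).serve_forever()
```

Calling serve() and requesting /robots.txt from the local machine should show the extra header next to the usual ones. The file stays fetchable, so crawlers can still read its rules.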


Comments:

Mark

05/15/2008 12:41 pm

Why are robot.txt relevant for Google? I think it is for webmasters, to see how the others do it.

JohnMu (Google)

05/15/2008 07:51 pm

Hi guys, there are two ways to block your robots.txt from showing up in search results:

- disallow it in your robots.txt (don't worry, we'll still check it); you can then use the Webmaster Tools URL removal tool to have it taken out of the index if it's indexed.

- use the x-robots-tag HTTP header tag with "noindex"

On the other hand, robots.txt URLs generally would not show up in any search results where you have more relevant pages within your site, so this is probably not something you'd want to spend all too much effort on :-).

lovekills_s

05/16/2008 06:57 am

Show it or do not.. those who need to see it can do so by adding /robots.txt to the domain URL.

Nikki

05/16/2008 01:47 pm

Why is webmasterworld cloaking http://www.webmasterworld.com/robots.txt if the user-agent is Googlebot?
