How Does Googlebot Find/Index Hidden FTP Logs?

Jul 14, 2006 - 3:16 pm
Filed Under Miscellaneous

Google's crawler constantly scours the Internet for pages to index, which is one of the reasons you should run away if someone offers to "submit your website to Google." On any page you do not want indexed, it is important to block Googlebot (the name of Google's spider), typically with a Disallow rule in robots.txt or a noindex robots meta tag. A prime example of pages you may not want indexed is new pages under construction, especially if they duplicate content you already have in the index on "live pages."
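As a rough illustration of how a Disallow rule is interpreted, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the `/logs/` and `/under-construction/` paths, the `example.com` URLs, and the helper name `is_crawlable` are all assumptions made up for this example. Keep in mind that robots.txt is itself publicly readable and only asks well-behaved crawlers to stay away, so it is not a substitute for protecting sensitive files.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; these paths are illustrative only.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /under-construction/
Disallow: /logs/
"""


def is_crawlable(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/index.html",
        "https://example.com/logs/ftp.log",
        "https://example.com/under-construction/new-page.html",
    ):
        print(url, is_crawlable(ROBOTS_TXT, "Googlebot", url))
```

Running a check like this against your own robots.txt before publishing it is a cheap way to confirm that the rules cover the paths you meant them to cover.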

A recent thread at WebmasterWorld Forums shows us another example of pages you probably don't want in the index: your FTP logs. The member complains:

My FTP log is cached by Google...and there has never been a link to it, ever!

The first response is fairly obvious: FTP logs and any other pages that you do not want indexed should be password protected, which makes it impossible for Googlebot to crawl them. So with no links pointing to the pages, and assuming they are protected, could Googlebot still find the URL and "accidentally" index it?

One member astutely reminds readers that:

When you use the Google toolbar and have the PageRank bar enabled, it sends url data to Google so.. so Google knows what urls exist out there, even if they are not linked to anywhere. So you have to be careful about what links you pull up when the PageRank bar is enabled.

The original poster comes back and thanks everyone for their responses, but claims it’s not that simple. The discussion continues at WebmasterWorld Forums.

 
