How Does Googlebot Find/Index Hidden FTP Logs?

Jul 14, 2006 - 3:16 pm
Filed Under Miscellaneous

Google's crawler constantly scours the Internet for pages to index, which is one reason to run away if someone offers to "submit your website to Google." On any page you do not want indexed, it is important to block Googlebot (the name of Google's spider), typically with a robots.txt Disallow rule or a robots "noindex" meta tag. A prime example is a new page still under construction, especially if it contains content you already have in the index on "live" pages.
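As a rough illustration of blocking Googlebot on pages you do not want crawled, the sketch below uses Python's standard `urllib.robotparser` to check whether a robots.txt rule set would stop Googlebot from fetching a given URL. The robots.txt contents, the paths, and the `is_allowed` helper are assumptions made up for this example; they do not come from the forum thread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only; the paths are assumptions.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /under-construction/
Disallow: /logs/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://example.com/logs/ftp.log",
                "https://example.com/index.html"):
        print(url, is_allowed(ROBOTS_TXT, "Googlebot", url))
```

Keep in mind that robots.txt only asks well-behaved crawlers to stay away, and the file itself is publicly readable, so listing a sensitive path there also reveals that the path exists.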

A recent thread at WebmasterWorld Forums shows us another example of pages you probably don't want in the index: your FTP logs. The member complains:

My FTP log is cached by Google...and there has never been a link to it, ever!

The first response is fairly obvious: FTP logs and any other pages you do not want indexed should be password protected, which makes it impossible for Googlebot to crawl them. But with no links pointing to the page, and assuming it is protected, could Googlebot still find the URL and "accidentally" index it?
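One way to sanity-check the password-protection advice is to request the URL without credentials and confirm the server refuses it. The sketch below is a minimal example under that assumption; the `is_access_restricted` helper and the example URL are hypothetical, and treating both 401 and 403 as "restricted" is a choice made for this illustration.

```python
import urllib.error
import urllib.request


def is_access_restricted(url: str, timeout: float = 10.0) -> bool:
    """Return True if an anonymous HEAD request is refused with 401 or 403.

    Connection failures (urllib.error.URLError) are not caught here, so an
    unreachable server raises instead of being reported as protected.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout):
            # The server answered successfully without credentials, so an
            # anonymous crawler could fetch this URL too.
            return False
    except urllib.error.HTTPError as err:
        return err.code in (401, 403)


if __name__ == "__main__":
    # Hypothetical URL; substitute the address of your own log file.
    print(is_access_restricted("https://example.com/logs/ftp.log"))
```

If this prints False for a log file, the file is readable by anyone who learns its URL, whether or not a link to it exists.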

One member astutely reminds readers:

When you use the Google toolbar and have the PageRank bar enabled, it sends url data to Google so.. so Google knows what urls exist out there, even if they are not linked to anywhere. So you have to be careful about what links you pull up when the PageRank bar is enabled.

The original poster comes back and thanks everyone for their responses, but claims it's not that simple. The discussion continues at WebmasterWorld Forums.

 
