A Google Webmasters Help thread has a webmaster worried that Google has not downloaded his XML Sitemap file in about five days. I went to check the status of my sitemap file in Google Webmaster Tools and Google has not...
A HighRankings Forum thread asks why some people use more than a single robots.txt file to control and instruct search spiders how to crawl and access their content. That is a good question. Typically, the spiders will only listen...
About a week ago, we ran a poll asking Who's To Credit For Faster Indexing? The options included Google Sitemaps or FeedBurner, due to the topic we were discussing. The results are now in and the majority said, Google Sitemaps,...
An SEOMoz post charts the positive impact having a Sitemap file can have on the speed of Google and Yahoo crawling and indexing your web pages. The report seems pretty impressive and I myself feel that Sitemaps are important to...
A WebmasterWorld thread asks why the site command in Google does not match the number of "indexed" URLs reported in Google Webmaster Tools. A very valid question; let me show you. A simple site command in Google for...
incrediBILL, moderator at WebmasterWorld, noticed that one of Live Search's bots was crawling through his JavaScript. The bot is named MSNBOT-MEDIA and he noticed that it was accessing JavaScript files and AJAX functions. He noticed that the bot was triggering...
Many SEOs use the site command to see how healthy their site is in a particular search engine. So you plug in site:www.mydomain.com in a search engine and the search engine will return the number of pages they have indexed...
I found an interesting tidbit while reading a somewhat detailed thread at Google Groups. The scenario is as follows. You have blocked Googlebot from accessing your site for a six-month period or so. Then you want to welcome Googlebot...
Michael Gray has composed a post that helps SEOs find out which pages of their site haven't been crawled, which is becoming increasingly important due to Google's removal of the supplemental index. He explains that you should put a timestamp...
There are threads at Google Groups and DigitalPoint Forums with multiple reports of Google not crawling Blogger-hosted blogs that are on custom or private domains (i.e. not on blogspot.com domains). Many have reported that the Googlebot crawling has stopped...
An unusual question came up at WebmasterWorld, asking if you can request a site to be completely removed from a country-specific Google search engine. For example, the site owner wants to remove his site from Google Netherlands, because the...
Yesterday I reported that GoogleBot is crawling fewer pages than it once was, based on a large WebmasterWorld thread. Now, I spotted a response from a Googler at a Google Groups thread with similar complaints. This time, I decided to...
A WebmasterWorld thread has reports from dozens of webmasters that GoogleBot, Google's web crawler, has not been crawling as many documents as it has in the past. Many webmasters are noticing reductions in crawl rates of as much as 90 percent, relative to...
Now that Google has admitted to crawling JavaScript and forms, SEOs and webmasters need to be aware of how to manage even more duplicate content issues. In the past, a good strategy was to build out filter pages (filter by color,...
A WebmasterWorld thread reports that new installations of the popular blogging software, WordPress, are by default blocking all search engines. The poster said that when you go to the Privacy Options section in the administration panel, by default, it is set to...
A Google Groups thread has a fairly simple but educational FAQ on how the "Set Crawl Rate" feature works in Google Webmaster Tools. In short, you can only set the crawl rate for a site on the domain or subdomain...
In August, Yahoo announced a new crawl behavior for Slurp, Yahoo's web crawler. The new crawl behavior was supposed to tame the crawler to go through your site in a more relaxed and efficient manner for both the crawler and...
Last week Tamar wrote about How to Stop Googlebot from Crawling Your Site Rapidly, so I thought I'd write about the opposite: how can you induce GoogleBot into crawling your site? Although there is no magic shot that guarantees inducement...
A WebmasterWorld thread discusses a more detailed issue with how Google's spider, GoogleBot, is crawling some pages. Let me quote the detailed explanation: I've tried: Checking for the HTTP_IF_MODIFIED_SINCE header and returns "304 Not Modified" if possible. Problem: Googlebot doesn't...
A detailed Google Groups thread has various reports from webmasters claiming GoogleBot is timing out before reaching their pages. First, these webmasters are noticing a drop in GoogleBot activity on their server. So they log in to Google Webmaster Tools...
A Cre8asite Forums thread asks how a webmaster can generate unique robots.txt files for each of his domains when all of those sites share the same local files through a form of IIS mirroring. There are several ways to do...
A very interesting Google Groups thread has many bloggers, including SEO Buzz Box, DaveN, and Search Engine Journal, voicing their reactions. The background is that the webmaster of AlkenMRS.com realized that his 10+ year old site had been delisted from...