Google: Sitemaps Do Not Guarantee Indexing

Dec 26, 2013 • 8:40 am | comments (11) | Filed Under Google Search Engine Optimization
 

This is likely obvious to most of the readers here, but it is simple: submitting an XML sitemap to Google does not mean that every page in that sitemap will be indexed.

A Google Webmaster Help thread has Google's Gary Illyes responding to a question about why a site that has submitted 40,000 pages only has 100 pages indexed in Google.

For example, here are two sites that have submitted their URLs to Google via an XML sitemap file. One has submitted 17,987 pages and Google has actually indexed all of them, plus one. :) The other has submitted over 7 million pages, but Google has only indexed about 4 million of them, which is about 53% of the pages submitted.

Why did Google index all the pages on one site but only about half on the other? Why did Google index only 100 of the 40,000 pages submitted by the site in the thread above?
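To put those numbers side by side, here is a minimal Python sketch that turns submitted and indexed counts into a percentage. The function name `indexed_ratio` is made up for illustration, and the counts are only the ones quoted above (the 7 million site is left out because its exact figures are not given).

```python
def indexed_ratio(submitted: int, indexed: int) -> float:
    """Return indexed pages as a percentage of submitted pages."""
    if submitted <= 0:
        raise ValueError("submitted must be a positive number of pages")
    return 100.0 * indexed / submitted


# The site from the help thread: 40,000 submitted, 100 indexed.
thread_site = indexed_ratio(40_000, 100)

# The site with 17,987 submitted and all of them indexed, plus one.
small_site = indexed_ratio(17_987, 17_988)

print(f"Help thread site: {thread_site:.2f}% indexed")
print(f"17,987 page site: {small_site:.2f}% indexed")
```

A ratio slightly above 100% simply means Google indexed a URL that was not listed in the sitemap.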

Gary from Google explains:

First and foremost, submitting a Sitemap doesn't guarantee the pages referenced in it will be indexed. Think of a Sitemap as a way to help Googlebot find your content: if the URLs weren't included in the Sitemap, the crawlers might have a harder time finding those URLs and thus they might be indexed slower.

Another thing you want to pay attention to is that our algorithms may decide not to index certain URLs at all. For instance, if the content is shallow, it may totally happen it will not be indexed at all.

Google may take a look and decide, based on the content or the PageRank, that the page is not worth indexing.
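For readers who have not built one by hand, a sitemap is just an XML list of URLs. The sketch below writes a minimal one with Python's standard library, assuming the sitemaps.org 0.9 schema; the `build_sitemap` helper and the example.com URLs are placeholders, not anything taken from the thread. Listing a URL this way only helps Googlebot discover it. It does not make Google index it.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Build a minimal XML sitemap string from an iterable of absolute URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical URLs, for illustration only.
sitemap_xml = build_sitemap([
    "https://example.com/",
    "https://example.com/about",
])
print(sitemap_xml)
```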

Forum discussion at Google Webmaster Help.


Comments:

Alexander Hemedinger

12/26/2013 02:47 pm

I agree with the content part with the new algorithms. However, even pages with little content but huge social media engagement (including comments) sure do get indexed quickly. ;)

Guest

12/26/2013 06:24 pm

http://mrwhatis.net/what-is-corrupted-google.html

Swayam Das

12/27/2013 02:38 am

Nice :)

Benjamin Burns

12/27/2013 05:21 am

Creating a sitemap is one of those technical items which need to be fundamentally included, but to your point, Barry, it's not a guarantee of indexation for all URLs submitted. If a URL is not indexed, it should be a signal to the site owner to take a closer look at the content and determine the value of that URL/page. Personally, I still see value in creating, segmenting and submitting sitemaps...especially if you have been lucky to get an early invite to the Android mobile app sitemap submission process. Again, I agree...if Google accepted and indexed everyone's sitemaps in their entirety, the web would be even more polluted with shallow and useless content.

Profesor de SEO

12/27/2013 02:40 pm

I don't use a sitemap and my site does fine.

Joseph Paulino

12/27/2013 04:25 pm

I find that the more accurate my sitemap is the more pages get indexed. No less than 90%, most of the time, get indexed after carefully excluding any URL that shouldn't belong in the sitemap.

Terry Van Horne

12/27/2013 04:59 pm

Sitemap is a discovery tool and like any other discovery tool Google will decide if the page gets cached. I think they at least sniff every page.... also keep in mind that the Panda Algo is now rolled into indexing so if a page is seen as low quality and it is "the HUB" that can block the spoke pages which means whole sections of the site are more or less invisible to Google. IMO, PageRank is a near non factor it's more about social signal verifiers.

Gracious Store

12/28/2013 02:41 am

Can someone please explain to me the difference between search engines crawling and indexing the pages of a website? Are there any benefits for the website if search engines crawl and index its pages? If there are benefits, why will Googlebot not crawl and index all the pages on a website to maximize the benefits for that website?

Ankit Das

01/03/2014 09:23 am

Hi Gracious, you might find this article from Google helpful: http://www.google.co.in/intl/en/insidesearch/howsearchworks/crawling-indexing.html

Gracious Store

01/03/2014 08:09 pm

Ankit, Thank you very much for this link. It answers my question. Thank you and have a blessing filled new year

Canada SEO

04/24/2014 03:27 pm

So who is the judge and jury on the content? WOW! Talk about control freaks.
