When To Use Canonical Tag For Paginated Results

Mar 10, 2011 • 9:02 am | comments (11) | Filed Under Google Search Engine Optimization
 

I learned a couple of things last night from the Your Toughest Technical SEO Questions Answered session at SMX West.

A lot of people use the rel=canonical tag for paginated results on their site, but they may be doing it the wrong way. A WebmasterWorld thread has one example of such use.

For example, say you have a category landing page with ten results per page and five pages of pagination. That gives you a total of 50 listings split across five pages.

Some webmasters may decide to use the canonical tag to tell Google to treat pages 2, 3, 4, and 5 as duplicates of page 1, much as a redirect would. But technically, as per Maile from Google in the panel last night, that is wrong and should not be done.

Maile explained that since the results on pages 2, 3, 4, and 5 are different from page 1, you should not use the canonical tag here.

Not only that, if you do, Google may ignore it, because Google checks whether the canonical tag is actually valid for that case. So if you canonical page 2 to page 1 and page 2 is not similar enough to page 1, Google may ignore your canonical tag.

Got that?

So what do you do? I suspect you create a "view all" page and have every paginated page (pages 1, 2, 3, 4, and 5) canonical to that "view all" page. I know that might not be feasible for many sites, and in that case you need to think at a higher level.
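To make the "view all" idea concrete, here is a minimal Python sketch of how a site might emit the same canonical tag on every page of a paginated category. The `?view=all` URL scheme, the example.com address, and the function names are assumptions for illustration only; nothing from the session specifies how the view-all URL should be built.

```python
from html import escape


def view_all_url(category_url: str) -> str:
    """Return the view-all URL for a category (hypothetical ?view=all scheme)."""
    return f"{category_url}?view=all"


def canonical_tag(category_url: str, page: int) -> str:
    """Build the canonical <link> tag for one paginated page (1-based).

    Every page in the series points at the same view-all URL, because
    each page is a subset of the view-all page.
    """
    if page < 1:
        raise ValueError("page numbers start at 1")
    target = view_all_url(category_url)
    return f'<link rel="canonical" href="{escape(target, quote=True)}">'


if __name__ == "__main__":
    # Five pages of ten results each, as in the example above.
    for page in range(1, 6):
        print(page, canonical_tag("https://example.com/widgets", page))
```

Note that the page number does not change the output: pages 1 through 5 all point to the view-all page, not to page 1.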

In any event, I know many sites and webmasters who have it set up the "wrong way" and I am not sure how it may hurt them.

Forum discussion at WebmasterWorld.

Comments:

Ryan Bradley

03/10/2011 02:58 pm

For the most part the canonical tag is a great tool. The pagination part seems to be the trickiest.

Shumisha

03/10/2011 03:40 pm

I suspect you'll be facing the same issue with your solution, as the view all page would be too different from each of page 1, page 2, etc.

Dave

03/10/2011 03:48 pm

For pagination I generally use meta noindex,follow. That way the pages aren't competing with each other in the rankings, and pagerank is still flowing to all the items listed.

Barry Schwartz

03/10/2011 03:50 pm

Kind of, but not really. I believe Google is okay with that, because page 1 items are on the view all page. Of course, I can be wrong.

Vanessa Fox

03/10/2011 05:27 pm

We talked about solutions at the session also. If you're able to do a view all page for this type of pagination, you can use rel=canonical to point all pages to that view all, as each page is a subset. Otherwise, as someone else in the comments mentioned, noindex,follow on all but page 1 gets all the links crawled but only the first page indexed.

Shumisha

03/10/2011 06:15 pm

My concern was sparked by an earlier statement, "..We allow slight differences, e.g., in the sort order of a table of products...", from http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html. I would consider that a 10-item subset of a global 200-item list has more than "..slight differences..." with said global list.

Aaron Bradley

03/11/2011 12:42 am

Interesting. I know all sorts of ecommerce sites that use canonical to point to page one of a category page from page two and below. I even know of a notable site that uses not only rel=canonical in this fashion, but - additionally - a 301 to permanently redirect page two and below to the first page of the category. If canonical redirects are frowned upon by Maile, I can only imagine her horror at this server-side redirect (it is, to boot, agent-based - which of course is the only way it could work). But it is spectacularly effective: only that first page appears in Google's index, and the site ranks extremely well for keywords relevant to each category (though not only for this reason).

While I don't advocate this, I can see the logic behind it, and even a benefit to Google. Most ecommerce sites have multiple spider paths to individual product detail pages (as well as XML sitemaps), and product short description snippets typically appear on multiple category pages. So pages two and below are really just that much thin, duplicate content - same meta data, and very similar grouping of products across pages. That Google doesn't think very highly of lower pages - when not controlled by any of the mechanisms discussed - is evidenced by the fact that hardly any lower paginated category pages appear when the site: command is used, and they are almost never returned by keyword queries. Does it benefit Google to spend its indexing resources crawling all these largely duplicative pages? I'd argue not. As per Vanessa's comment, that it may be beneficial to use canonical to point to page one seems to me fairly pedantic (and the typically parameter-laden URLs produced by show X, Y, Z product options cause their own canonicalization issues). I haven't observed many benefits of having all those product snippets crawled (again).

I guess what I'm trying to say is that to frame use of canonical for paginated URLs as something "that is wrong and should not be done" is rather unhelpfully vague. The Webmaster Tools help page on canonical also uses equivocal language like "should" and describes uses that "would not be appropriate." This suggests that improper use of canonical is akin to actual violations of Google's webmaster guidelines (like cloaking), but never comes right out and says so. Neither Maile's comments nor the guidelines say that a penalty might be invoked if you use canonical when you "shouldn't," but the suggestion is there. That Google may "ignore" an improper canonical tag is a far cry from being penalized for using one, and like Barry "I am not sure how it may hurt" webmasters that choose to do so.

An analogy to me is that it may not be appropriate to use the alt attribute to make an irrelevant textual association with an image, but there's nothing preventing you from doing so (akin to improper rel="canonical" use). However, Google is less equivocal when CSS is used to provide a coded representation of graphically rendered text, which they basically say is a violation when the accessible text doesn't match the image exactly, and they don't even like it when it is identical (akin to agent-based redirects for pagination). Thanks for the report, Barry.

Paul

03/11/2011 04:05 pm

I made the original post on WMW. noindex, follow appeared to be the best solution IMO, though it neither helped nor hindered my SERPs. It did have a huge impact on Adsense for some reason. Adsense isn't even on the gallery, it's on the following pages. I can only theorize that CPC is taking into account the referer, which is now seen as lower quality because it's not indexed? I'm 99% certain this was no coincidence. I reverted the change this morning, so we'll see. The view all method seems feasible, BUT what about cases where you've got 100+ pages and each page has 30+ links? Would it still be wise to have a view all page?

Alistair Lattimore

03/12/2011 01:09 pm

This was covered indirectly by Matt Cutts in a Google Webmaster Tools video recently talking about products, product reviews & whether it was appropriate to canonical each individual product review back to the product page. http://www.youtube.com/watch?v=AXnbBsRbKDA

Vanessa Fox

03/16/2011 12:28 pm

Shumisha, Maile from Google specifically said that it's fine to point a paginated page to a canonical view all page if it contains a subset of the view all page.

Shumisha

03/16/2011 02:45 pm

Hi Vanessa. Thanks, I didn't come across such a reference, at least in the introductory blog post I linked to and its comments. As a matter of fact, I found a comment by Maile which looks to me like it hints at the opposite (in response to "vizualbod.com"): "...so it should only be used in situations where the content is identical or nearly identical. In a paginated series, each page contains entirely different content/items so they shouldn't be grouped as one URL. Thanks for asking, though!" I agree that the context is not the same in that case, as the user wanted to aggregate quality signals on the first page instead of a view all, but Maile seems to reiterate the "nearly identical" idea that's part of the main blog post text.
