Google On Potential Issues With Canonicals & JavaScript

Jan 24, 2019 • 8:22 am | Filed Under Google Search Engine Optimization
 

Google's John Mueller posted a detailed response on Reddit about how Google treats canonicals and how JavaScript-based sites that use them might run into issues. JavaScript itself isn't the problem, but when Google can't render it properly, different pages can end up looking like the same content.

Here is what John said:

With canonicals, the first question on Google's side is always: Are these URLs for the same content? And if they're not meant to be the same, why might Google think that they are? The rel-canonical plays more into the "which of these URLs is shown" decision later on.

If these are different URLs that Google thinks are for the same content, then usually it either falls into "they return mostly the same content to Google" or "the URL structure is so messy that Google can't efficiently check them all and has to guess". With JavaScript based sites, the content side is a common reason for this: for example, if you're using a SPA-type setup where the static HTML is mostly the same, and JavaScript has to be run in order to see any of the unique content, then if that JS can't be executed properly, then the content ends up looking the same.

There are multiple reasons why JS might not be executed properly, sometimes it's just because it's flakey code, sometimes it doesn't degrade gracefully (eg, doesn't support ES5, or requires features which a crawler wouldn't use, such as local storage or service workers), sometimes resources (JS-files) or server responses (API requests, etc) are blocked by robots.txt, or it can be that it just takes too long to be processed. If you're sure that your code works for Googlebot (a quick way to test is the mobile-friendly test), then it's worth estimating if speed of processing might be an issue. The hard part here is that there's no absolute guideline or hard cut-off point that you can focus on or test for, partially also because a page rarely loads exactly the same way across separate tests. My way of eyeballing it is to see how long the mobile-friendly test roughly takes, and to check with webpagetest.org to see how long a page might take to load the critical / unique content, and how many resources are required to get there. The more resources required, the longer time until the critical content is visible, the more likely Google will have trouble indexing the critical content.

I think that's what's happening here. Google sees those pages as serving mostly the same content, which is a sign that Google can't pick up the unique content on the pages properly. Going back from there, it's probably a sign that it's too hard to get to your unique content -- it takes too many requests to load, the responses (in sum across the required requests) take too long to get back, so the focus stays on the boilerplate HTML rather than the JS-loaded content. Reducing dependencies & latency can help there.
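To make the "mostly the same content" point concrete, here is a rough sketch (mine, not from John's reply) of how you might compare the static HTML of two URLs before any JavaScript runs. It is not how Google decides on duplicates. The function names `visible_text` and `static_similarity` and the sample markup are made up for illustration, and the similarity ratio is only a crude proxy.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def visible_text(html: str) -> str:
    """Return the text a client would see without executing any JavaScript."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def static_similarity(html_a: str, html_b: str) -> float:
    """Similarity of the un-rendered text of two pages, from 0.0 to 1.0."""
    return SequenceMatcher(None, visible_text(html_a), visible_text(html_b)).ratio()


# Hypothetical SPA shell: every URL returns this until the JS has run.
SPA_SHELL = (
    "<html><head><title>My Shop</title></head><body>"
    '<div id="app">Loading...</div><script src="/app.js"></script>'
    "</body></html>"
)

# Hypothetical server-rendered pages with unique content in the HTML itself.
PAGE_SHOES = (
    "<html><head><title>Red Shoes</title></head><body>"
    "<h1>Red Shoes</h1><p>Leather, size 42.</p></body></html>"
)
PAGE_KETTLE = (
    "<html><head><title>Blue Kettle</title></head><body>"
    "<h1>Blue Kettle</h1><p>Stainless steel, 1.7 litres.</p></body></html>"
)

print(static_similarity(SPA_SHELL, SPA_SHELL))      # 1.0: the shells are identical
print(static_similarity(PAGE_SHOES, PAGE_KETTLE))   # lower: content differs without JS
```

If two of your URLs score close to 1.0 on their raw HTML, everything that distinguishes them depends on the JavaScript running successfully, which is the situation John describes.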

As you can see, it is a complex topic, but it is something that should work if done correctly.
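One of the causes John lists, JS files or API responses blocked by robots.txt, is easy to check on your own. Below is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt rules, the example.com URLs, and the helper name `blocked_resources` are assumptions for illustration, and Python's parser may not match Googlebot's rule handling in every edge case.

```python
from urllib.robotparser import RobotFileParser


def blocked_resources(robots_txt: str, urls, user_agent: str = "Googlebot"):
    """Return the URLs that the given robots.txt text disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt that accidentally blocks scripts and API calls.
ROBOTS_TXT = """\
User-agent: *
Disallow: /api/
Disallow: /static/js/
"""

# Hypothetical resources a page needs in order to show its unique content.
REQUIRED = [
    "https://example.com/index.html",
    "https://example.com/static/js/app.js",
    "https://example.com/api/products/1",
]

for url in blocked_resources(ROBOTS_TXT, REQUIRED):
    print("blocked for Googlebot:", url)
```

In this made-up setup the script and the API call are both blocked, so a crawler would only ever see the boilerplate HTML.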

Forum discussion at Reddit.
