
Managing Duplicate Content In a World Where Google Can Crawl JavaScript

Now that Google has admitted to crawling JavaScript and forms, SEOs and Webmasters need to be aware of how to manage even more duplicate content issues.

In the past, a good strategy was to build out filter pages (filter by color, size, price, etc.) using JavaScript pull-down menus. Google would typically stay away from such forms, so you did not have to worry much about Google seeing the same content filtered or sorted by color, price, size and so on.

But now with Google crawling JavaScript and forms, Webmasters need to take an extra step towards preventing Google from crawling and indexing such content. Why? Duplicate content.
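To make the duplication concrete, here is a minimal sketch of how several filtered or sorted URLs can collapse to the same underlying page. The parameter names (`color`, `size`, `price`, `sort`) and the example.com URLs are assumptions for illustration only; they do not come from the WebmasterWorld thread or from any particular site.

```javascript
// Hypothetical filter/sort parameter names; adjust to the site's own URLs.
const FILTER_PARAMS = ["color", "size", "price", "sort"];

// Return the URL with filter and sort parameters removed.
function stripFilterParams(rawUrl) {
  const url = new URL(rawUrl);
  for (const name of FILTER_PARAMS) {
    url.searchParams.delete(name);
  }
  return url.toString();
}

// Group URLs by the page they reduce to once filters are stripped.
// Any group with more than one entry is a set of likely duplicates.
function groupDuplicates(urls) {
  const groups = new Map();
  for (const u of urls) {
    const key = stripFilterParams(u);
    if (!groups.has(key)) {
      groups.set(key, []);
    }
    groups.get(key).push(u);
  }
  return groups;
}
```

Running `groupDuplicates` over a list of crawled URLs is one rough way to see how many addresses a crawler that follows the filter menus could end up treating as separate pages.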

A WebmasterWorld thread discusses this topic and offers tips for dealing with the problem. Some of the advice includes:

  • Put the duplicate content in an external JS file, assign it to variables, and write it into divs with innerHTML.
  • Use XMLHttpRequest (GET) to retrieve the data in XML format and then put it into the page.
  • Use an Ajax POST and retrieve the XML content that way.
  • Use robots.txt to block specific files and/or page naming conventions.

There are many ways to tackle the issue, but using JavaScript alone is no longer the best answer.
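As an illustration of the robots.txt tip, the sketch below generates Disallow rules for filter parameters. The function name `buildRobotsRules` and the parameter names passed to it are hypothetical, and the rules rely on the `*` wildcard, which Google supports but which not every crawler is guaranteed to honor. Treat it as a starting point rather than a recommended configuration.

```javascript
// Build robots.txt rules that block URLs carrying the given query parameters.
// The parameter names are assumptions; use the ones your filter pages emit.
function buildRobotsRules(paramNames, userAgent = "*") {
  const lines = [`User-agent: ${userAgent}`];
  for (const name of paramNames) {
    // Match the parameter when it is first in the query string (?name=)...
    lines.push(`Disallow: /*?${name}=`);
    // ...and when it follows another parameter (&name=).
    lines.push(`Disallow: /*&${name}=`);
  }
  return lines.join("\n") + "\n";
}
```

For example, `buildRobotsRules(["color", "size"])` returns a block with one `User-agent` line followed by four `Disallow` lines. Keep in mind that robots.txt stops crawling of the matching URLs; it does not by itself consolidate them with the unfiltered page.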

Forum discussion at WebmasterWorld.

Posted by rustybrick in Dynamic Site Topics on April 30, 2008 8:01 AM Comments (2)

Comments

We have been using JavaScript to add 'hover over text': when your mouse is positioned over an image, a small pop-up definition appears. I'm concerned that if the crawlers are now reading this JavaScript along with the ALT tag definitions, it will be considered duplicate content within a single web page. Any advice?

 

Google has been crawling JavaScript for years, a fact that Matt Cutts confirms in his blog comment here.

I know I've discussed their ability to crawl JavaScript on both Spider-food and Highrankings many times through the years. Google has made no secret of the fact that it's been able to parse escaped JavaScript redirects for years, either.

I reported the first JavaScript crawler from Stanford University back in 2001/2002, as I recall (that post has long been deleted). No one should be surprised at Google's ability to parse JavaScript. This is very old news.

Yahoo! has been doing the same for years.

Google only seems to be treating JavaScript content a little differently now.

 
