Google's Wildcard Subdomain Handling Issues?

Jan 2, 2009 • 8:15 am | comments (2) | Filed Under Google Search Engine Optimization
 

A Google Webmaster Help thread points out an interesting bug of some sort in Google. A webmaster asks about a particular site that seems to have flooded Google's index with tons and tons of subdomains. Since the site is already "outed," I figure I'll show you this example, because I find it pretty interesting.

I am going to jump you to page 33 of a set of search results that shows handster.com in all ten spots. You will notice that every one of the handster.com results starts with a unique subdomain. Let me show you a quick video of the results:

As pointed out in the thread, it seems this domain supports wildcard subdomains, so any hostname placed in front of handster.com will return a page. Yes, that is a major duplicate content issue. For example, try http://spam.handster.com/ or http://google.handster.com/ or http://duplicatecontent.handster.com/ and so on. They all return the same page, and it also works for specific pages, such as http://barryschwartzwuzhere.handster.com/software.php?id=203&for=Yakumo+PDA.
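One common way a site owner can avoid this kind of duplication is to send any unrecognized subdomain to a single canonical host with a permanent redirect. Here is a minimal sketch of that mapping in Python. The canonical host, the list of known hosts, and the function name are all assumptions made for illustration; nothing here reflects how handster.com is actually configured.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

# Assumptions for illustration only: one canonical host and a short list of
# hostnames that are meant to serve content. Neither comes from the article.
CANONICAL_HOST = "www.handster.com"
KNOWN_HOSTS = {"www.handster.com", "handster.com"}


def canonical_redirect(url: str) -> Optional[str]:
    """Return the URL to redirect to, or None if the host is already known."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host in KNOWN_HOSTS:
        return None
    # Keep the path and query string so a deep link on a wildcard subdomain
    # maps to the same page on the canonical host.
    return urlunsplit((parts.scheme, CANONICAL_HOST, parts.path, parts.query, ""))


if __name__ == "__main__":
    for example in (
        "http://spam.handster.com/",
        "http://www.handster.com/",
    ):
        print(example, "->", canonical_redirect(example))
```

A web server or application would call something like this on each request and answer with a 301 whenever it returns a URL, so only one hostname per page is left for a crawler to index.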

I think you get the point. This looks like a major oversight by the web administrator, but it also shows an issue with how Google handled the problem.

JohnMu at Google commented in the thread saying that he has "passed it on here for someone to look at."

Forum discussion at Google Webmaster Help.

 

Comments:

Stephan Miller

01/02/2009 04:10 pm

Shareware sites do this a lot. Just search for a specific type of software and you should find all sorts of examples. And with them, it's not an oversight.

No Name

01/09/2009 06:39 am

I have also seen the same kind of issue with some websites that generate links dynamically from searches. I am not sure how they do this, but they are the ones who actually spam the search result pages. Let's see whether any of the search engines takes the necessary action to manage these websites in future.
