
Google Spidering Encoded HTML in URLs?

Maybe so, or maybe it's just a spidering glitch, a weird link, or a sitemap error. This morning I was searching in Google, running some tests on its new Whois feature. When I plugged in a domain, the first page of results included a weird URL for the website (aboutus.org), with an encoded ampersand and other characters in front of the displayed URL. When I clicked on the link, I got a 404 error.

[Screenshot: Google spidering encoded URLs]

Link to Result.

Thoughts? Comments?




posted Phoenix in Other Google Topics at April 24, 2008 2:13 PM Comments (4)

Comments

Looks like someone fubar'd an href and didn't encode things properly. That's supposed to be a br tag (yeah, a line break), but it ended up being part of the URL.
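A rough sketch of how a leaked tag could turn into the kind of URL described in the post. It assumes the tag was HTML-escaped before landing in the href and that the result was then percent-encoded as part of the path. Both steps are guesses about what happened, and `leaked_tag_path` is a hypothetical helper, not anything Google or aboutus.org actually runs.

```python
from html import escape
from urllib.parse import quote


def leaked_tag_path(tag: str, target: str) -> str:
    """Return the URL path produced when `tag` leaks into an href before `target`."""
    # Assumption: the tag was HTML-escaped first, so "<" becomes "&lt;" and so on.
    escaped = escape(tag)
    # Percent-encode everything outside the unreserved set, keeping "/" as a separator.
    return "/" + quote(escaped + target, safe="/")


print(leaked_tag_path("<br>", "www.aboutus.org/EasyJournal.com"))
# /%26lt%3Bbr%26gt%3Bwww.aboutus.org/EasyJournal.com
```

The `%26` at the start is the encoded ampersand, which would match the "encoded ampersand and other characters" in front of the real URL.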

 

Try cutting and pasting the URL into your browser:

<br>www.aboutus.org/EasyJournal.com

 

Haha, nice find John :)

This is certainly weird. Google sometimes seems to crawl an entire website through a URL that contains a mistake. For example, a while ago someone linked to one of our sites with a space between http:// and www. In most browsers that throws an error when you click on it (some actually handle it!). But Google crawled that one link and then, based off of it, crawled an insane number of pages, all with the space in the URL. Lots of these were then indexed, even though they were clearly duplicate content. Users clicking on the links nearly always got a browser error (except those using the browsers that managed to handle it).

There's potential there for some serious damage to a competitor's site. Pump through some links with spaces in the URL to competitive pages and then watch as Google drops the regular link for the one with the space, which users can't actually use... nice!
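For anyone who wants to check their own link lists for this sort of thing, here is a minimal sketch that flags hrefs containing raw whitespace or angle brackets, which covers both the space-after-http:// case and the leaked br tag. The function name and the example.com links are made up for illustration, and it only catches these two kinds of mistake.

```python
import re

# Characters that should never appear unencoded inside an href value.
_SUSPECT = re.compile(r"[\s<>]")


def find_malformed_links(hrefs):
    """Return the hrefs that contain raw whitespace or angle brackets."""
    # Leading and trailing whitespace is ignored, since browsers strip it anyway.
    return [h for h in hrefs if _SUSPECT.search(h.strip())]


links = [
    "http://www.example.com/page",
    "http:// www.example.com/page",  # space after the scheme
    "<br>www.example.com/page",      # leaked tag
]
print(find_malformed_links(links))
```

Running something like this over inbound links or an uploaded spreadsheet of destination URLs would at least surface the broken ones before a crawler finds them.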

 

Hi Ben,

Good catch and great article! I suppose whoever is uploading those Excel sheets of destination URLs is half asleep?

 
