URL Normalization: Is a Trailing Slash the Same Page?

Dec 28, 2004 - 3:00 pm

There is a very interesting thread brewing at Search Engine Watch Forums named "Is A Trailing / On A Directory Seen As A Different File By Google?". In this thread, a member gives an example of the same page, reachable at two URLs that differ only in the trailing slash, showing different PageRank values. His example is:

http://www.avismauritius.com/en/locations/ PR=3
http://www.avismauritius.com/en/locations PR=0

In the thread, Orion, the resident search technology guru at SEW forums, discusses how search engines normalize URLs in order to give each URL a unique identifier. I hope that I explain this correctly. It is my understanding that the unique identifier is a hash string, possibly a 64- or 128-bit hash. Before an identifier can be assigned, the URL needs to be stripped down and normalized, so that equivalent spellings of the same address produce the same hash. The process is roughly as Orion stated:

- Removal of the protocol prefix (http://) if present
- Removal of a :80 port number specification if present (however, non-standard port number specifications are retained)
- Conversion of the server name to lower case
- Removal of all trailing slashes ("/")
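The four steps above can be sketched in a few lines of Python. This is only an illustration of the steps as listed, not a description of what any search engine actually runs. The function names `normalize_url` and `url_id` are made up for this example, and the choice of SHA-256 truncated to 64 bits is an assumption, since the thread does not say which hash function is used.

```python
import hashlib


def normalize_url(url: str) -> str:
    """Apply the four listed normalization steps (illustrative sketch only)."""
    # 1. Remove the protocol prefix (http://) if present.
    if url.lower().startswith("http://"):
        url = url[len("http://"):]

    # Separate the server part from the rest of the URL.
    server, slash, path = url.partition("/")
    host, _, port = server.partition(":")

    # 2. Remove a :80 port specification; keep non-standard ports.
    if port == "80":
        port = ""

    # 3. Convert the server name to lower case (the path is left as-is).
    host = host.lower()
    server = host + (":" + port if port else "")

    # 4. Remove all trailing slashes.
    return (server + slash + path).rstrip("/")


def url_id(url: str) -> int:
    """Hypothetical 64-bit identifier: first 8 bytes of SHA-256 (an assumption)."""
    digest = hashlib.sha256(normalize_url(url).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


with_slash = "http://www.avismauritius.com/en/locations/"
without_slash = "http://www.avismauritius.com/en/locations"
print(normalize_url(with_slash))
print(url_id(with_slash) == url_id(without_slash))
```

Under these steps, both forms of the example URL reduce to the same key and therefore the same identifier. An engine that applied every step would have no way to store a separate PageRank for each form.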

However, this does not really explain whether Google does all, some, or none of this. Moderator Chris_D referenced an old WebmasterWorld thread where GoogleGuy sheds some more light on this topic. He talks a lot about HTTP responses and URL requests, but the important line to take from the thread is "I would always recommend the trailing slash. If you know the exact right url, it's often best to give it directly and save everyone that extra redirect." You also might want to check out message #6 in that thread.

PageOneResults from the SEO Consultants Directory explains that this is more a matter of "content negotiation". He goes on to explain:

The W3C and other large website structures are now utilizing content negotiation. That means that this...

www.example.com/sub

...could be different than this...

www.example.com/sub/

With the use of content negotiation, there are no file extensions. Basically you are cleaning the URI of all underlying identifying technologies.

Bottom line, the same URL with and without a trailing slash can be, and is, considered different by most search engines. Most such duplicates are weeded out through the use of duplicate content filters, and most sites do not have this problem because of the built-in way the server handles these URL requests.
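To make the server-side handling concrete: a common default on web servers (Apache's directory handling is one example) is to answer a request for a directory without the trailing slash with a redirect to the slash form. The sketch below imitates that behavior in plain Python. The `respond` function and the `DIRECTORIES` set are hypothetical stand-ins, not any server's real API, and lookups for missing pages (404) are left out for brevity.

```python
from typing import Optional, Set, Tuple

# Hypothetical set of paths that are directories on this imaginary site.
DIRECTORIES: Set[str] = {"/en/locations"}


def respond(path: str, directories: Set[str] = DIRECTORIES) -> Tuple[int, Optional[str]]:
    """Return (status code, Location header) for a requested path."""
    # A directory requested without its trailing slash gets a permanent
    # redirect to the slash form, so only one URL ever serves the content.
    if not path.endswith("/") and path in directories:
        return 301, path + "/"
    # Anything else is served directly in this simplified sketch.
    return 200, None


print(respond("/en/locations"))
print(respond("/en/locations/"))
```

With a redirect like this in place, crawlers only ever receive content at the slash form, which is the "extra redirect" GoogleGuy suggests avoiding by linking to the trailing-slash URL in the first place.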

 
