URL Normalization: Is a Trailing Slash the Same Page?

Dec 28, 2004 - 3:00 pm

There is a very interesting thread brewing at Search Engine Watch Forums named "Is A Trailing / On A Directory Seen As A Differnet File By Google?" In this thread, a member gives an example of the same page reachable at two URLs that differ only by the trailing slash, and the two URLs show different PageRank values. His example is:

http://www.avismauritius.com/en/locations/ PR=3
http://www.avismauritius.com/en/locations PR=0

In the thread, Orion, the resident search technology guru at SEW forums, discusses how search engines normalize URLs in order to give each URL a unique identifier. I hope that I explain this correctly. It is my understanding that the unique identifier is a hash, possibly 64 or 128 bits long. Before the identifier can be assigned, the URL needs to be stripped down and normalized. The process is roughly as Orion stated:

Removal of the protocol prefix (http://) if present
Removal of a :80 port number specification if present (however, non-standard port number specifications are retained)
Conversion of the server name to lower case
Removal of all trailing slashes ("/")
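The four steps above can be sketched in a few lines of Python. This is only an illustration of the steps as Orion listed them, not a description of what Google or any other engine actually runs. The function names `normalize_url` and `url_id` are made up for this example, and the choice of a truncated SHA-256 digest as the 64-bit identifier is an assumption, since the thread does not name a hash function.

```python
import hashlib


def normalize_url(url: str) -> str:
    """Apply the four normalization steps listed above (illustrative only)."""
    rest = url
    # Step 1: remove the protocol prefix (http://) if present.
    if rest.lower().startswith("http://"):
        rest = rest[len("http://"):]

    # Separate "host[:port]" from the path at the first slash.
    host_port, slash, path = rest.partition("/")
    host, _, port = host_port.partition(":")

    # Step 3: convert the server name to lower case.
    host = host.lower()

    # Step 2: drop a :80 port, but keep any non-standard port.
    if port and port != "80":
        host = f"{host}:{port}"

    # Step 4: remove all trailing slashes.
    return (host + slash + path).rstrip("/")


def url_id(url: str, bits: int = 64) -> str:
    """Hypothetical identifier: the first `bits` bits of a SHA-256 digest (an assumption)."""
    digest = hashlib.sha256(normalize_url(url).encode("utf-8")).digest()
    return digest[: bits // 8].hex()


if __name__ == "__main__":
    with_slash = "http://www.avismauritius.com/en/locations/"
    without_slash = "http://www.avismauritius.com/en/locations"
    # Both lines print: www.avismauritius.com/en/locations
    print(normalize_url(with_slash))
    print(normalize_url(without_slash))
    # Under this scheme the two URLs share one identifier.
    print(url_id(with_slash) == url_id(without_slash))
```

Under this scheme the two URLs from the example would collapse to one identifier. The differing PageRank values reported in the thread suggest that at least the last step is not applied that way in practice.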

However, this does not really explain whether Google does all, some or none of this. Moderator Chris_D referenced an old WebmasterWorld thread where GoogleGuy sheds some more light on this topic. He talks a lot about HTTP responses and URL requests, but the important line to take from the thread is "I would always recommend the trailing slash. If you know the exact right url, it's often best to give it directly and save everyone that extra redirect." You also might want to check out message #6 in that thread.

PageOneResults from the SEO Consultants Directory explains that this is more a matter of "content negotiation". He goes on to explain:

The W3C and other large website structures are now utilizing content negotiation. That means that this...

www.example.com/sub

...could be different than this...

www.example.com/sub/

With the use of content negotiation, there are no file extensions. Basically you are cleaning the URI of all underlying identifying technologies.

Bottom line: the same URL with and without a trailing slash can be, and is, treated as two different URLs by most search engines. Most of the resulting duplicates are weeded out by duplicate content filters, and most sites do not have this problem because of the built-in way the server handles these URL requests.

 
