Pound Signs (#) In URL

Feb 1, 2006 • 9:43 am | comments (14) | Filed Under SEO - Search Engine Optimization
 

Every now and then I like to write about a basic SEO thread, and here is one at WebmasterWorld named How Google read ULR with "#".

The member asks this question because of all the duplicate content issues people have these days.

I can think of two ways to have URLs with the # (anchors) in them.

The basic method is deployed here on some individual archive entries. Kim Krause posted an entry here the other day at http://www.seroundtable.com/archives/003204.html but you can also get to this entry by going to http://www.seroundtable.com/archives/003204.html#more. The latter URL anchors you down to a place in the source code that looks like this:

<a name="more"></a>

Same page, same source code, and, once the # part is dropped, the same URL as above. It is fairly easy for a search engine to assume that it can strip off the # and anything after it from the URL.
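As a rough illustration of that stripping step, here is a minimal Python sketch built on the standard library's `urllib.parse.urldefrag`. The helper name `strip_fragment` is made up for this example, and nothing here claims to show how any search engine actually normalizes URLs.

```python
from urllib.parse import urldefrag


def strip_fragment(url: str) -> str:
    """Return url without the '#' and anything after it (hypothetical helper)."""
    # urldefrag splits a URL into (url_without_fragment, fragment).
    return urldefrag(url).url


plain = "http://www.seroundtable.com/archives/003204.html"
anchored = "http://www.seroundtable.com/archives/003204.html#more"

# Both forms reduce to the same string once the fragment is removed.
print(strip_fragment(anchored) == strip_fragment(plain))
```

With this kind of normalization, the two archive URLs above collapse into one.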

The second type of URL is found a lot on dynamically driven sites, including many forums. I often link directly to a post that way. For example, in my coverage of Expanded Broad Match Hurting AdWords Advertisers I link to a thread at http://forums.searchenginewatch.com/showthread.php?t=9817 but I also link within that thread directly to a member's post at http://forums.searchenginewatch.com/showthread.php?p=72232#post72232. Two URLs, same content on the page, same source code, duplicate content. Stripping off the # and whatever is after it doesn't help with the duplicate content issue here, because the two URLs still differ in their query strings (t=9817 versus p=72232). I assume these forums have built-in methods to tell the search engines to index the primary thread URL rather than the post URL, but I didn't look.
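To make the forum case concrete, here is a small Python sketch using only the standard library's `urllib.parse`. The helper name `without_fragment` is hypothetical. It only shows that the two forum URLs quoted above stay different after the # part is dropped; it says nothing about what the forum software or a crawler really does.

```python
from urllib.parse import parse_qs, urldefrag, urlsplit

thread_url = "http://forums.searchenginewatch.com/showthread.php?t=9817"
post_url = "http://forums.searchenginewatch.com/showthread.php?p=72232#post72232"


def without_fragment(url: str) -> str:
    """Drop the '#' and everything after it (hypothetical helper)."""
    return urldefrag(url).url


stripped_thread = without_fragment(thread_url)
stripped_post = without_fragment(post_url)

# The fragment is gone, but the URLs are still not equal.
print(stripped_post)
print(stripped_thread == stripped_post)

# What remains different is the query string: one URL uses t=, the other p=.
print(parse_qs(urlsplit(stripped_thread).query))
print(parse_qs(urlsplit(stripped_post).query))
```

Any de-duplication for this kind of URL would therefore need to know that both query parameters point at the same thread, which is something only the forum software can say for sure.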

Forum discussion at WebmasterWorld.


Comments:

Jiří Sekera

02/01/2006 03:44 pm

You can use <div id="more"></div> as well.

Tyson

02/01/2006 05:24 pm

I think this has big ramifications for SE-friendly affiliate systems. See http://zen-sem.blogspot.com/2006/02/urls-with-anchors-affiliate.html for my thoughts.

Barry Schwartz

02/01/2006 05:35 pm

Seems logical to me. But can it be that easy?

Jaimie Sirovich

02/01/2006 06:09 pm

Barry, that won't work. The article contradicts itself, since it mentions # isn't sent with the URI, then says it can be tracked by the server. While the browser does know, the server doesn't. You're better off with using the REFERER if you want to do this. J

Shawn Hogan

02/01/2006 07:56 pm

Search engines ignore the # and anything after it. Has anyone actually seen a search result with it in it? Although from a technical standpoint, you could serve different content based on it if you wanted (with PHP for example). The web server does get the full request (with the fragment) in it and server side scripting languages are able to see them. For example: http://www.php.net/manual/en/function.parse-url.php

Jaimie Sirovich

02/01/2006 08:46 pm

Shawn, I do not believe this is correct. The function exists in PHP, but $_SERVER['REQUEST_URI'] does not contain the internal link because I do not believe the browser sends it in the GET. This makes sense, because only the browser needs to know it anyway. J

Dennis Pallett

02/01/2006 09:01 pm

Jaimie's right, and it doesn't show up at all. The parse_url() function does support it, but that parses any URL you send it.

Tyson

02/01/2006 09:03 pm

Shawn, from what I can tell, the parse_url function can parse anchors from a given url string, but PHP has no way of getting the anchor from the current page. It doesn't look like the anchor is included in any of the server variables like PHP_SELF, QUERY_STRING, etc. I would love to do this, though, do you know of a way? (We're probably way off-topic now...)

Shawn Hogan

02/01/2006 09:31 pm

Okay... I should have been more specific. Technically it's possible, but most browsers don't send the fragment in the GET request. But if it does send it, PHP can see it and it's logged to the http logs. For example... fragment.php is a PHP script that just does "echo $_SERVER['REQUEST_URI'];"...

-----
Titan:~ shawn$ telnet www.digitalpoint.com 80
Trying 216.9.35.56...
Connected to crystalmethod.digitalpoint.com.
Escape character is '^]'.
GET http://www.digitalpoint.com/~shawn/jungle_gym/fragment.php?something=1234#blahblah HTTP/1.1
Host: www.digitalpoint.com

HTTP/1.1 200 OK
Date: Wed, 01 Feb 2006 21:26:24 GMT
Server: Apache/1.3.20 Sun Cobalt (Unix) mod_ssl/2.8.4 OpenSSL/0.9.6 mod_auth_pam_external/0.1 mod_perl/1.26
Transfer-Encoding: chunked
Content-Type: text/html

http://www.digitalpoint.com/~shawn/jungle_gym/fragment.php?something=1234#blahblah

Connection closed by foreign host.
-----
[root httpd]# grep blahblah access
www.digitalpoint.com ##.###.###.### - - [01/Feb/2006:13:26:25 -0800] "GET http://www.digitalpoint.com/~shawn/jungle_gym/fragment.php?something=1234#blahblah HTTP/1.1" 200 94 "-" "-"
-----

Nacho Hernandez

02/01/2006 09:48 pm

Tim Berners-Lee comments on his blog: "On the web of [x]HTML documents, the links are critical. Links are references to 'anchors' in other documents, and they use URIs which are formed by taking the URI of the document and adding a # sign and the local name of the anchor. This way, local anchors get a global name. On the Semantic Web, links are also critical. Here, the local name, and the URI formed using the hash, refer to arbitrary things. When a semantic web document gives information about something, and uses a URI formed from the name of a different document, like foo.rdf#bar, then that's an invitation to look up the document, if you want more information about. I'd like people to use them more, and I think we need to develop algorithms which for deciding when to follow Semantic Web links as a function of what we are looking for." Read more here: http://dig.csail.mit.edu/breadcrumbs/node/62 Comments on that blog are very interesting too, so be sure to read those as well. Saludos, Nacho

Sebastian

02/01/2006 10:36 pm

Fragment identifiers in links do not "create a new URL", thus SEs will not treat the sheer URI and its variants with fragment identifiers as duplicates. Do *not* use the outdated [A NAME="fragment-identifier"][/A] syntax to define anchors. Better add the ID attribute to [x]HTML elements like P/Hx/UL/IMG..., for example [p id="fragment-identifier"]. http://www.smart-it-consulting.com/article.htm?node=155&page=90#href-fragment-identifier

beth

11/26/2007 05:31 pm

How does Mediapartners see the FI?

Gary

07/22/2008 03:57 pm

The (#) is often used in AJAX development for a number of reasons. If you've got methods set up to handle it, you can even have AJAX application history using the (#). It can also serve as a link from external websites into AJAX-heavy developments. For instance, you can essentially execute functions in an app through it if the author has that in place. Here's an example: http://www.customrigsmag.com/?page=Videos#video_5

bip

11/25/2009 05:51 pm

Yes, how does Mediapartners see the FI?
