Does Google Index Content in "The Cloud" (Amazon S3, etc.)?

May 14, 2008 • 7:56 am | Filed Under Google Search Engine Optimization
 

Cloud computing is becoming more and more popular among webmasters and site owners. In short, companies like Amazon, Rackspace, Google and others offer hosting services where you upload your content (HTML, images, videos, PDFs, etc.) to a web server, and that server then replicates the content onto other web servers. So if you think about it, your content is not on just one server with limited resources and bandwidth, but on dozens (or more) of servers with virtually unlimited bandwidth and resources.

Duplicate content issue? Nope. Even though the content lives on many servers, there is only one URL for it (unless you generate multiple URLs for the same content yourself), so Amazon S3, for example, doesn't create a duplicate content issue.

One webmaster at WebmasterWorld is complaining that Google Image Search doesn't seem to be indexing the images he hosts at Amazon S3. But honestly, I think it is just a timing issue for him: his images may simply not have been picked up yet.

If you run a site command for site:s3.amazonaws.com, the location of the S3 content, Google returns hundreds of thousands of results. If you run the same site command in Google Image Search, you will find many images from S3 included in the image index.
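If you want to repeat that check for another host, the site command above can be turned into a search URL. This is only a rough sketch: the helper name `site_query_url` is made up for illustration, and the `tbm=isch` parameter for switching to image results is an assumption about Google's URL format that may change over time.

```python
from urllib.parse import urlencode


def site_query_url(host: str, images: bool = False) -> str:
    """Build a Google search URL for a `site:` query against `host`.

    Hypothetical helper for illustration; it only builds the URL and
    does not fetch or parse any results.
    """
    params = {"q": f"site:{host}"}
    if images:
        # Assumption: `tbm=isch` selects image results.
        params["tbm"] = "isch"
    return "https://www.google.com/search?" + urlencode(params)


if __name__ == "__main__":
    # Web results for content served from the S3 host.
    print(site_query_url("s3.amazonaws.com"))
    # The same site command against image search.
    print(site_query_url("s3.amazonaws.com", images=True))
```

Paste either URL into a browser to see what is currently indexed for that host.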

So, it does appear Google is indexing content in the cloud, specifically from Amazon S3. Does something have to happen on the Amazon side for Google to index your content? I personally cannot find any hint in the technical docs or the FAQs that Amazon blocks any content from search engines. So maybe it is just a timing thing?
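One common way a host blocks search engines is through robots.txt rules, so that is one thing you could check yourself. The sketch below uses Python's standard `urllib.robotparser` to test a URL against a set of rules. The two sample rule sets and the bucket URL are made up for illustration; they are not what Amazon actually serves, and the helper name `is_crawl_allowed` is hypothetical.

```python
from urllib.robotparser import RobotFileParser


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if `robots_txt` rules let `user_agent` fetch `url`."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rule sets, for illustration only.
BLOCK_EVERYTHING = "User-agent: *\nDisallow: /\n"
ALLOW_EVERYTHING = "User-agent: *\nDisallow:\n"

# Hypothetical bucket and object name.
EXAMPLE_URL = "https://s3.amazonaws.com/example-bucket/photo.jpg"

if __name__ == "__main__":
    print(is_crawl_allowed(BLOCK_EVERYTHING, "Googlebot", EXAMPLE_URL))
    print(is_crawl_allowed(ALLOW_EVERYTHING, "Googlebot", EXAMPLE_URL))
```

To test a real host, you would paste in the text of its robots.txt file, if it has one, in place of the sample rules.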

Forum discussion at WebmasterWorld.

Comments:

No Name

05/14/2008 08:29 pm

Interesting Post... Glad to hear it doesnt create duplicate content... we all hate that!