Google Indexes PDF Documents But Refreshes Them Slowly

Feb 25, 2016

We know Google indexes PDFs; we can see them in the index. We also know Google will follow links found within PDF documents and pass PageRank through them.

But did you know Google refreshes the content and links within PDFs slowly? The likely reason is that PDFs themselves do not get updated that often. It is similar to how images get refreshed more slowly because the images themselves are not updated that often.

In fact, I suspect that PDFs are updated even less often than images are.

The question came up in a Google+ hangout at the 18:50 mark:

Q) I can't seem to get alot of my pdfs indexed on my product pages should I just add the content on my product tab as well so it's in both places. Will that cause duplication issues and any idea why they wont index?

A) So in general we index PDF files like we would normal pages on a web site. What probably would happen with PDFs is that we don’t refresh them as quickly as normal HTML pages because we assume that a PDF file generally kind of stays stable.

As you can see, Google's John Mueller said that PDFs aren't refreshed as quickly as normal HTML pages.

