Can A Search Engine Like Google.com Index My PDF Files?

Nov 10, 2006 - 6:51 am

A Search Engine Watch Forums thread has a member asking why Google has not indexed his PDF files. To make a long story short, Google did index his PDF files, but I thought it would make a nice quick post to explain which types of PDF documents search engines can and cannot index well.

As with any document (HTML, PDF, Word file, etc.), search engines love text. Say you write a document that is 100% text in Word and then convert it to a PDF file. Some PDF converters will carry the text over as real text in the PDF document. Others will take an image capture of the Word file and embed that image in the PDF document.

Now, images may look fine, but just as you cannot copy and paste text out of an image into a text editor, you cannot copy it out of this kind of PDF document either.

If a search engine cannot read the text because it is a graphic and not text, then it won't be able to fully index the words in the document.

I assume that eventually, if not already, search engines will use OCR technology to read those PDF files that appear to be text-driven but are in reality graphics.

So how do you know if your PDFs are search engine friendly? Try to copy and paste the body text from the PDF into a text editor like Word or Notepad. If that works, then it is most likely that Google, Yahoo!, MSN (Live.com), and Ask.com will be able to index those PDFs.
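If you have many PDFs to check, the copy-and-paste test can be roughly approximated in code. The sketch below is only a crude heuristic, not how any search engine actually works: it assumes the PDF is unencrypted and that its page content is either uncompressed or Flate-compressed, and it simply looks for PDF text-drawing operators (`BT`, `Tj`, `TJ`) in the file's streams. The function name `has_text_layer` is made up for this illustration.

```python
import re
import zlib

# Raw bytes between the "stream" and "endstream" keywords of each PDF object.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)
# A string or array operand followed by a text-showing operator (Tj, TJ, ', ").
TEXT_OP_RE = re.compile(rb"[)\]>]\s*(?:Tj|TJ|'|\")")


def has_text_layer(pdf_bytes: bytes) -> bool:
    """Heuristic: True if any stream appears to draw real text.

    Assumption: streams are plain or Flate-compressed. Encrypted PDFs and
    other filters are not handled, so False can mean "could not tell".
    """
    for match in STREAM_RE.finditer(pdf_bytes):
        data = match.group(1)
        try:
            # decompressobj tolerates the trailing newline before "endstream".
            data = zlib.decompressobj().decompress(data)
        except zlib.error:
            pass  # not Flate data; fall back to inspecting the raw bytes
        # BT opens a text object; the operators above actually show text.
        if b"BT" in data and TEXT_OP_RE.search(data):
            return True
    return False
```

To try it, read a file in binary mode and pass the bytes in, for example `has_text_layer(open("my-document.pdf", "rb").read())`, where the filename is a placeholder. A result of `False` suggests the pages may be image-only, but the copy-and-paste test above remains the more reliable check.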

Forum discussion at Search Engine Watch Forums.

 
