About a year ago, Google went on record about phone spam web pages and how they don't want them in their index. Google's Matt Cutts said "when people search for a phone number and land on a page like the one below, it's not really useful and a bad user experience." Matt added, "we [Google] do consider it to be keyword stuffing to put so many phone numbers on a page."
Since then, finding phone spam has been less of an issue because Google has not been allowing much of it in their index. Matt also did a video on this earlier this year.
That being said, a Google Webmaster Help thread has one person who invested hundreds of thousands of dollars into a caller ID site and is upset Google didn't index it. Currently, Google has not indexed a single page.
Looking at the site and clicking through it makes you scratch your head and say, "wow, this looks spammy." But reading this guy's posts in the forum makes you think he really believes his site is awesome. He wrote:
We had seen Matt Cutts' video well before we began developing our site. This misses the point. He was referring to the spam sites that list phone numbers but have absolutely no useful information. These were sites that listed every prime number, every harshad number, etc etc in order to get traffic of people searching for numbers.
We have studied the search results for phone numbers closely, and don't believe this to be such a simple matter.
You stated you often find useful results -- our estimates are that about 15-25% of phone numbers currently have "useful results" on google. We were providing useful results for about 80% of all US phone numbers. Again, to reiterate, we STUDIED the phone search market/results on google and feel we made a substantial, honest, and ethical improvement to the results. Nonetheless, we are being flagged as spam. This was my initial question. Hope this clarified any confusion.
Basically, we feel we made a substantially better directory, and shouldn't be flagged as spam.
He is so upset that he even said he will take Google to court over this. I kid you not.
Then John Mueller of Google comes in and asks him to step back. John wrote:
In general, it's important to us that a site is not just autogenerated thin content, but rather that it provides unique and compelling value of its own. Looking through a few of the pages on your site, I'm a bit worried that - at least what's indexable - is just a collection of links and numbers. Assuming this were a service for a different kind of lookup (say error-codes), then it would be a collection of pages listing all possible error codes, without actually providing insights on that error code. Clicking through, after completing a captcha, you might see the bare-bone error-text, but even past that there wouldn't be any more insight there.
Instead of focusing on what you think Google needs for indexing, I'd recommend taking a step back and focusing on the user instead. Kevin mentioned a reasonable approach -- work on becoming the absolute best website like this, such that users go to it directly for this information, not that you need over a billion pages indexed that only vary by a handful of digits and a tiny snippet of text. I can see that you're passionate about providing a useful & compelling service, but you really need to think past trying to map all of those database entries 1:1 (or rather 1:many -- considering the "category" pages) to pages that need to be indexed in search with minimal unique & compelling content individually. That's not impossible, but it's not trivial either :) - and that's where you can make your passion & knowledge shine.
None of this advice is new, but sometimes after investing hundreds of thousands of dollars into something, you lose sight of what the purpose is.
Forum discussion at Google Webmaster Help.