At the Seattle Conference on Scalability, Google's Marissa Mayer revealed, in answer to an audience question, that Google has 10,000 human evaluators who manually review search results and rate them.
Dare Obasanjo's blog has coverage of the keynote, including this exchange:
Q: How do they tell if they have bad results?
A: They have a bunch of watchdog services that track uptime for various servers to make sure a bad one isn't causing problems. In addition, they have 10,000 human evaluators who are always manually checking the relevance of various results.
Most people agree that the 10,000 figure sounds large. The question is, does it include internal Google staff who have the rater hub evaluation software installed on their machines?
Forum discussion at WebmasterWorld.