Google Reveals Spam Scores Accidentally?

Jul 5, 2006

Spotted via PeterD: a DigitalPoint Forums thread that reveals some interesting data. A member clicked on the cache link of a page and was presented with the following from Google.

pacemaker-alarm-delay-in-ms-overall-sum 2341989
pacemaker-alarm-delay-in-ms-total-count 7776761
cpu-utilization 1.28
cpu-speed 2800000000
timedout-queries_total 14227
num-docinfo_total 10680907
avg-latency-ms_total 3545152552
num-docinfo_total 10680907
num-docinfo-disk_total 2200918
queries_total 1229799558
e_supplemental=150000 --pagerank_cutoff_decrease_per_round=100 --pagerank_cutoff_increase_per_round=500 --parents=12,13,14,15,16,17,18,19,20,21,22,23 --pass_country_to_leaves --phil_max_doc_activation=0.5 --port_base=32311 --production --rewrite_noncompositional_compounds --rpc_resolve_unreachable_servers --scale_prvec4_to_prvec --sections_to_retrieve=body+url+compactanchors --servlets=ascorer --supplemental_tier_section=body+url+compactanchors --threaded_logging --nouse_compressed_urls --use_domain_match --nouse_experimental_indyrank --use_experimental_spamscore --use_gwd --use_query_classifier --use_spamscore --using_borg"

Look at some of those values: "e_supplemental," "pagerank_cutoff_decrease_per_round," "supplemental_tier_section," "use_experimental_spamscore," "use_spamscore" and "using_borg."
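To make a flag list like this easier to read, here is a minimal Python sketch that splits such a string into names and values. It is only an illustration, not anything Google published: the function name `parse_flags` is hypothetical, the sample is an excerpt of the dump above, and it assumes a leading "no" on a valueless flag means a boolean switched off, as in common command-line flag conventions. The truncated first token ("e_supplemental=150000") is skipped rather than guessed at, since its real name is cut off.

```python
def parse_flags(flag_string):
    """Parse a '--name=value --bool --nobool' style string into a dict."""
    flags = {}
    for token in flag_string.split():
        token = token.strip('"')  # the dump ends with a stray quote
        if not token.startswith("--"):
            continue  # skip fragments such as the truncated first token
        name, sep, value = token[2:].partition("=")
        if sep:
            flags[name] = value  # keep values as strings; types are unknown
        elif name.startswith("no"):
            # Assumption: "--nofoo" means boolean flag "foo" is off.
            # A flag whose real name starts with "no" would be misread here.
            flags[name[2:]] = False
        else:
            flags[name] = True
    return flags


# Excerpt of the dump quoted in the post.
dump = (
    'e_supplemental=150000 --pagerank_cutoff_decrease_per_round=100 '
    '--pagerank_cutoff_increase_per_round=500 --production '
    '--nouse_experimental_indyrank --use_experimental_spamscore '
    '--use_spamscore --using_borg"'
)

flags = parse_flags(dump)
for name in sorted(flags):
    print(f"{name} = {flags[name]}")
```

Read this way, the excerpt shows "use_spamscore" and "use_experimental_spamscore" switched on and "use_experimental_indyrank" switched off. That tells us which features are enabled, not how any score is computed.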

Very interesting but no confirmation from Google, of course.
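The counters at the top of the dump also invite some quick arithmetic. The sketch below is speculative: it assumes each "_total" or "sum" value is a running sum that can be divided by the matching count, which Google has not confirmed. The variable names are my own.

```python
# Counter values copied from the dump quoted in the post.
counters = {
    "pacemaker-alarm-delay-in-ms-overall-sum": 2341989,
    "pacemaker-alarm-delay-in-ms-total-count": 7776761,
    "timedout-queries_total": 14227,
    "avg-latency-ms_total": 3545152552,
    "queries_total": 1229799558,
}


def ratio(numerator, denominator):
    """Divide one counter by another (assumes both are running totals)."""
    return counters[numerator] / counters[denominator]


# Assumption: sum / count gives a mean delay in milliseconds.
mean_alarm_delay_ms = ratio(
    "pacemaker-alarm-delay-in-ms-overall-sum",
    "pacemaker-alarm-delay-in-ms-total-count",
)
# Assumption: the latency total is summed over all queries.
latency_per_query_ms = ratio("avg-latency-ms_total", "queries_total")
# Fraction of all queries that timed out, under the same reading.
timeout_fraction = ratio("timedout-queries_total", "queries_total")

print(f"mean alarm delay: {mean_alarm_delay_ms:.3f} ms")
print(f"latency per query: {latency_per_query_ms:.2f} ms")
print(f"timed-out fraction: {timeout_fraction:.2e}")
```

If that reading is right, these are ordinary server health statistics, and they say nothing about ranking or spam scoring.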

Forum discussion at DigitalPoint Forums.

Michael Martinez

07/05/2006 04:01 pm

Those look like configuration parameters, possibly used by whatever software is communicating with the query processor. I doubt there is any information in there that is relevant to reverse engineering the spam scoring process. The page in question looks like it's a forum post (in Arabic, so I did not click on the link to look at it).


07/05/2006 08:09 pm

Hi Barry, I would like to hear about what happened on June 27 :-)

Barry Schwartz

07/05/2006 08:30 pm

This is what happened on June 27th :-)


07/06/2006 09:48 am

Wow. That's very useful. Things like a tiered supplemental make sense as a way to cope with basic site errors, webmaster mistakes, etc. Capping PageRank growth and decline at different rates is very interesting. I just wonder whether some of these are site-specific or default settings. If only we had another test case for comparison. Oh Matt... where are you?

