Is The Google Search Console Page Filter Throwing Off Accuracy?

Sep 27, 2018 • 8:42 am | Filed Under Google Search Engine Optimization
 

Some savvy SEOs and webmasters are noticing a possible bug in the Search Analytics/Performance reports in Google Search Console. Specifically, when you use the page filter option, the numbers drop significantly, both in the Google Search Console interface and in the Search Console API.

John Mueller from Google did respond on Twitter, calling the issue "an edge-case," but it isn't clear if that is the case. John wrote:

We looked into this w/the team but don't see any bigger patterns there. Subtle side-effects might be visible in certain scenarios, but given the lack of other complaints, this seems more of an edge-case (I wish I had a better answer, but I don't think there's much we can do here).

That is an interesting response from John.

Here is how the issue was explained by Mark Chalcraft:

I'm seeing the same issue across multiple sites within both the API and the main GSC interface (I assume this is the same issue). As soon as "page" is added as a dimension within the API query, the totals drop. I've based this on clicks, but I assume the same is the case with impressions. Using the API explorer, the following two query requests return very different totals.

This request returns what I assume to be the 'complete' number:

However, this request returns a far lower number, I assume due to sampling, once all clicks for individual pages have been summed together. This happens whenever page is added as a dimension:

This has been happening within the API since 19th August, the same date that the change to 'keyword not containing' filters was announced. Things to note here:

  • I'm querying data aggregated 'byPage' here, and there's nothing to indicate this dataset would be impacted by the change on the 19th
  • I'm not using any 'not containing' filter at all, yet I still see drops in the data when the page dimension is added
  • This does not happen for dates prior to 19th August - e.g. I ran checks on 10th August and found the total clicks from each API call matched
  • I checked each API response to ensure that I had extracted all URL records which recorded at least one click
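The comparison Mark describes can be sketched roughly as follows. This is not his actual request, which is not reproduced here. It is a minimal Python illustration that assumes the standard Search Analytics query body fields (startDate, endDate, dimensions, aggregationType, rowLimit) and the usual response shape with a "rows" list carrying "clicks". The helper names, the dates and the sample responses are hypothetical, and no API call is made.

```python
from typing import Any, Dict


def build_request(start_date: str, end_date: str, by_page: bool) -> Dict[str, Any]:
    """Build a Search Analytics query body, with or without the page dimension."""
    body: Dict[str, Any] = {
        "startDate": start_date,
        "endDate": end_date,
        "aggregationType": "byPage",  # mirrors the 'byPage' aggregation Mark mentions
        "rowLimit": 25000,
    }
    if by_page:
        body["dimensions"] = ["page"]
    return body


def total_clicks(response: Dict[str, Any]) -> float:
    """Sum clicks across every row in a query response."""
    return sum(row.get("clicks", 0) for row in response.get("rows", []))


def missing_share(total: float, page_sum: float) -> float:
    """Fraction of the undimensioned total that is absent from the per-page sum."""
    if total == 0:
        return 0.0
    return (total - page_sum) / total


# Hypothetical responses, invented purely for illustration (not real data).
no_dimension_response = {"rows": [{"clicks": 1000, "impressions": 20000}]}
page_dimension_response = {
    "rows": [
        {"keys": ["https://example.com/p/a"], "clicks": 400},
        {"keys": ["https://example.com/p/b"], "clicks": 250},
    ]
}

gap = missing_share(
    total_clicks(no_dimension_response),
    total_clicks(page_dimension_response),
)
print(f"Share of clicks missing once 'page' is added: {gap:.0%}")
```

Running the same check for a date before and a date after August 19th would show whether the two totals stopped matching on that date, which is the pattern Mark reports.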

In addition, I believe this can be replicated in the main Search Console interface - below are two screenshots from one of my clients' properties - the first with no page-level element, the second with a 'page contains' filter - I opted for the single letter 'p' here because this GSC property is for a subfolder beginning with that letter - so the reverse 'page not containing' filter returns no data.

Without page filter:


With page filter:


Mark added "The impact of this is extremely severe - it means that any time-series analysis conducted on a range that begins prior to 19th August and ends after that date is distorted. This means that any attempt to understand trends in the data is fatally undermined. In addition, it means that data after August 19th is heavily sampled - >35% in one example I've seen - to the point where it's value is reduced."

After John responded:

So this is all very interesting, especially after all the confusion with the anonymous queries change in the reporting there.

Forum discussion at Twitter and Google Webmaster Help.
