How To Stop Google Preview From Being Counted In SiteCatalyst

Google Instant Preview, designed to show you a visual preview of your search results, rolled out in early November 2010. You now have the ability to click a small magnifying glass icon next to each search result to get a snapshot of what the page looks like.
Google Instant Preview

Seems like a pretty helpful feature, but how do they do it? Well, it would appear that Google has a new spider that crawls the web and takes snapshots of each page in its results. To get an accurate look at the page, this new bot needs to be able to execute JavaScript. Here is the problem: since it executes JavaScript, it also fires off the SiteCatalyst code, so it is counted as another visitor and registers page views.

How can you tell if this new Google Web Preview bot hit your site? If you are capturing User Agent you can see it show up in that report:
User Agent Report

NOTE: If you are not capturing user agent and would like to, a super simple way would be to use the SiteCatalyst Dynamic Variable functionality and include s.eVarX="D=User-Agent"; in your s_code.js file. Just insert the number of the eVar you would like to use (an s.prop would work too) and you are all set.
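As a rough sketch of where that line sits, assuming eVar5 happens to be the free variable (the number is only a placeholder, and the bare `s` object below just stands in for the one s_code.js creates):

```javascript
// Stand-in for the s object that s_code.js normally creates (assumption for this sketch).
var s = s || {};

// eVar5 is a placeholder; use whichever eVar (or s.prop) is free in your report suite.
// The "D=" prefix asks SiteCatalyst to fill the variable from the User-Agent request header.
s.eVar5 = "D=User-Agent";
```

In a real implementation you would only add the last line, somewhere after the `s` object has been set up.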

Another way to see if your report suite is being affected by spider traffic from the Google Preview bot is to check a Browser report (Visitor Profile > Technology > Browsers), filter it to only show visitors using Safari 3.1, and then trend it.
Browser Report
We can see that this report suite has recorded about 15,000 additional visitors over the last week, all attributed to Safari 3.1. As the User Agent we saw earlier shows, the Google Web Preview bot registers itself as Safari 3.1.

Now that we can see that the Google Web Preview bot is having an effect on our traffic, how do we get rid of it? We could block that bot in our robots.txt file, but I like having that additional functionality available for my visitors in the Google search results. I just don’t want it to execute my SiteCatalyst code. Well, here is how to do it.

I call this my bot detection code (real catchy title, right?). I currently have it set to look only for the Google Web Preview bot, but it could easily be modified to exclude other bots that can execute JavaScript. Here is how you implement it. At the top of your s_code file you will have an s_account variable that contains your report suite id. It will look something like this:

var s_account="dead"

To implement the bot detection code you will want to change that line to include the function call. It should look like this:

var s_account=botCheck("dead")

Pretty simple so far, right? We just added the function call and included our report suite id in it. Next we have a block of code that needs to be added to the plug-ins section of the s_code file:

function botCheck(b){var c=navigator.userAgent.toLowerCase(),a="";a+=c.indexOf("google web preview")!=-1?"":b;return a};

And that’s all there is to it. So how does it work, you ask? It removes your report suite id if it is the Google Web Preview bot that is accessing the page. The SiteCatalyst code will still fire off, but the request will not include the report suite id, so SiteCatalyst will discard it and it will not affect your metrics.
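For anyone who finds the one-liner hard to read, here is an expanded sketch of the same idea. It is not a drop-in replacement I have run in production: the `BOT_SIGNATURES` list and the optional second argument are my own additions for illustration, and only the "google web preview" entry comes from the code above.

```javascript
// Hypothetical list of lowercase user-agent fragments to screen out.
// Only "google web preview" comes from the original one-liner.
var BOT_SIGNATURES = ["google web preview"];

// Returns the report suite id for normal visitors and an empty string for listed bots.
// The optional userAgent argument exists only so the function can be tried outside a browser.
function botCheck(reportSuiteId, userAgent) {
  var ua = userAgent;
  if (ua === undefined) {
    // Fall back to the browser's user agent, or an empty string when there is no navigator.
    ua = (typeof navigator !== "undefined" && navigator.userAgent) ? navigator.userAgent : "";
  }
  ua = ua.toLowerCase();
  for (var i = 0; i < BOT_SIGNATURES.length; i++) {
    if (ua.indexOf(BOT_SIGNATURES[i]) !== -1) {
      return ""; // bot detected: blank out the report suite id
    }
  }
  return reportSuiteId;
}
```

Calling `botCheck("dead")` in a browser behaves like the one-liner. Passing a made-up string such as `"ExampleBrowser/1.0 (Google Web Preview)"` as the second argument lets you confirm that the id comes back empty without switching user agents.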

Want to see it in action? I thought you’d never ask! Check out the page http://webanalyticsland.com/test.php. On this page I have a basic SiteCatalyst implementation, one line of code that displays your user agent, and then I print the results of the SiteCatalyst debugger right to the screen. Opening this page in a standard Firefox browser we can see that the SiteCatalyst code has fired off properly, it has displayed the correct user agent and the report suite id is contained within the image request string.
Test 1
So far so good. Using the User Agent Switcher plug-in for Firefox, we can switch our user agent to the one that we found in the SiteCatalyst report to mimic the Google Web Preview bot.
Test 2
We can now see that when we use that bot’s user agent string, the report suite id is missing from the image request call. Any action that happens now will not be recorded in my report suite, and when SiteCatalyst receives this request it will be discarded. I’ve had this running for a few days now and have not found any issues, but since this is a pretty new chunk of code, be sure to test it out before using it on your production site.

Enjoy!

Google Indexes Omniture SiteCatalyst Tracking codes

Tracking codes are an important part of Web Analytics. They are used by every single analytics tool. I recently came across a problem with how Google handles these tracking codes.

Here is what I have set up. We run the site RV Trader. The site lives at rvtraderonline.com. We also own the domain rvtrader.com, which we have set up with a 301 redirect to rvtraderonline.com. Recently we wanted to know how many visitors were accessing our site by going to rvtrader.com and being redirected to our main site. What I did was set up a tracking code on the redirected URL, using the query string parameter ?zmc=rvt. Now whenever anyone lands on the url rvtraderonline.com with that tracking code I will know 100% that they came from rvtrader.com since this is the only place where this tracking code exists. Pretty simple stuff.
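To make the idea concrete, here is a small sketch of reading that parameter off a landing URL. It assumes an environment with the standard `URL` API (current browsers or Node); `getTrackingCode` is a hypothetical helper name, not something SiteCatalyst provides, and the `http://` scheme in the example is my assumption.

```javascript
// Hypothetical helper: returns the value of the zmc tracking parameter,
// or null when the URL does not carry one.
function getTrackingCode(pageUrl) {
  return new URL(pageUrl).searchParams.get("zmc");
}

// A visitor redirected from rvtrader.com lands with the tracking code attached.
var code = getTrackingCode("http://rvtraderonline.com/?zmc=rvt"); // "rvt"
```

A visitor who types rvtraderonline.com directly has no `zmc` parameter, so the helper returns `null` for them.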

I have been using ?zmc= for a long time now, and have it set in Google Webmaster Tools to ignore this parameter. For those who don’t know, there is a place in Webmaster Tools where you can tell Google which tracking codes it should ignore. I have it set up to tell Google that it should not index any query strings that contain ?zmc=. I just double-checked, and the setting is still there.
Google Webmaster Tools

Then today I took a look in Google, and what do I see:
Google Indexing Tracking Codes

How many of my other tracking codes have been indexed? Can I trust the reporting of any of my Omniture SiteCatalyst tracking code reports now? How many other analytics customers are going to have this problem? Is Google indexing tracking codes from any other Web Analytics tools?

If you are noticing any abnormalities in your tracking code reporting this may be why.