Google Instant Preview, designed to show you a visual preview of your search results, rolled out in early November 2010. You now have the ability to click a small magnifying glass icon next to each search result to get a snapshot of what the page looks like.

Seems like a pretty helpful feature, but how do they do it? Well it would appear that Google has a new spider that crawls the web and takes snapshots of each page in its results. In order for them to get an accurate look at what the page looks like, this new bot needs to able to execute JavaScript. Here is the problem. Since it is executing JavaScript that means it is also firing off the SiteCatalyst code and is being counted as another visitor and is registering page views.
How can you tell if this new Google Web Preview bot hit your site? If you are capturing User Agent you can see it show up in that report:

NOTE: If you are not capturing user agent and would like to, a super simple way would be to use the SiteCatalyst Dynamic Variable functionality and include s.eVarX=”D=User-Agent”; in your s_code.js file. Just insert the number of the eVar you would like to use (a s.prop would work too) and you are all set.
Another way to see if you are being affected with spider traffic in your report suite from the Google Preview Bot would be to check out a Browser report (Visitor Profile > Technology > Browsers) and filter it to only show visitors using Safari 3.1 and then trend it.

We can see that this report suite has recorded about an additional 15,000 visitors over the last week that is just attributed to Safari 3.1. Checking the User Agent we saw earlier, the Google Web Preview bot is registering itself as Safari 3.1.
Now that we can see that the Google Web Preview bot is having an effect on our traffic how do we get rid of it? We could block that bot in our robots.txt file, but I like having that additional functionality available for my visitors in the Google search results. I just don’t want it to execute my SiteCatalyst code. Well here is how to do it.
I call this my bot detection code (real catchy title, right?). I currently have it just set to look for the Google Web Preview bot, but it could easily be modified to exclude other bots that can execute JavaScript. Here is how you implement it. In your s_code file, at the top you will have a s_account variable that contains your report suite id. It will look something like this:
var s_account="dead"
To implement the bot detection code you will want to change that line to include the function call. It should look like this:
var s_account=botCheck("dead")
Pretty simple so far, right? We just added the function call and included our report suite id in it. Next we have a block of code that needs to be added to the plug-ins section of the s_code file:
function botCheck(b){var c=navigator.userAgent.toLowerCase(),a="";a+=c.indexOf("google web preview")!=-1?"":b;return a};
And that’s all there is to it. So how does it work you ask? What it does is it removes your report suite id if it is the Google Web Preview bot that is accessing the page. The SiteCatalyst code will still fire off, but it will not include the report suite id so it will be discarded by SiteCatalyst and it will not affect your metrics.
Want to see it in action? I thought you’d never ask! Check out the page http://webanalyticsland.com/test.php. On this page I have a basic SiteCatalyst implementation, one line of code that displays your user agent, and then I print the results of the SiteCatalyst debugger right to the screen. Opening this page in a standard Firefox browser we can see that the SiteCatalyst code has fired off properly, it has displayed the correct user agent and the report suite id is contained within the image request string.

So far so good. Using the User Agent switcher plug-in for Firefox, we can switch out user agent to the one that we found in the SiteCatalyst report to mimic the Google Web Preview bot.

We can now see that when we use that bot’s user agent string, the report suite id is missing from the image request call. Any action that happens now will not be recorded in my report suite, and when SiteCatalyst receives this request it will be discarded. I’ve had this running for a few days now and have not found any issues, but since this is a pretty new chunk of code be sure to test it out before using it on your production site.
Enjoy!


