Google Changes Referrer Values Again For Secure Searches

Over the past six months, Google has made changes to its search experience in an attempt to increase the privacy and security of its signed-in users. For analytics tools, this has meant that the referring URL for those signed-in users is stripped of any searched keywords when they click on Google organic search results.

Here’s what has been happening behind the scenes. All signed-in users are now on a secure version of Google (https), and a redirect has been added to each search result click. That redirect goes to a non-secure page (http), where the referring URL is changed before the visitor arrives at the page they requested. The new referring URL value has had its keywords removed, but still contains enough information to determine that it came from a Google Secure search. Workarounds were created to help identify a Google Secure search in SiteCatalyst keyword reporting, and Omniture made a change to try to account for those searches.

Since making that change, Google has determined that the additional redirect is unnecessary and potentially slows down the user’s experience, so they have decided to eliminate it (unfortunately, that does not mean analytics tools will be able to see those keywords again).

Today Google announced a change to the way they plan to handle referring URLs starting in April 2012. Google will now begin to use the referrer meta tag for browsers that support it, instead of the redirect to the non-secure page. Currently the only major browser that supports it is Google Chrome.

If you are not familiar with the referrer meta tag, it lets each web page decide how referrer information from it should be handled. For example, here’s what a meta referrer tag looks like:

<meta name="referrer" content="never">

This tag tells the browser to never pass any referring information from the page it’s on. The browser should then set the referrer header value to a blank string for referrers from that page.

Fortunately Google is not going to that extreme. They have decided to use the “origin” value:

<meta name="referrer" content="origin">

This is the referrer meta tag value that Google will begin to use in April 2012. When the change goes live, all search clicks from signed-in users will have only the referrer value https://www.google.com/. There will be no other information in the referring URL, so there is no way to determine that it was specifically a Google secure search other than the URL being simply that host value. Non-secure searches, those made by a user not logged into a Google account, will continue to function the same way they do now.
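
To make the difference between the two values concrete, here is a minimal sketch of what a supporting browser would send as the referrer under each one. The function name referrerFor and the sample search URL are made up for illustration, and the no-meta-tag case is simplified; this only models the behavior described above, not how any browser actually implements it.

```javascript
// Hypothetical model of the referrer a supporting browser would send
// from a page that carries <meta name="referrer" content="...">.
function referrerFor(policy, pageUrl) {
  if (policy === "never") {
    return ""; // no referring information is passed at all
  }
  if (policy === "origin") {
    // only scheme + host survive; path and query string are dropped
    return new URL(pageUrl).origin + "/";
  }
  return pageUrl; // no meta tag: simplified here to "send the full URL"
}

var searchPage = "https://www.google.com/search?q=rv+trader"; // made-up search URL
console.log(JSON.stringify(referrerFor("never", searchPage)));  // ""
console.log(JSON.stringify(referrerFor("origin", searchPage))); // "https://www.google.com/"
```

With the “origin” value, the keyword in the q= parameter never leaves Google, which is why the landing page only ever sees the bare host.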

The referrer meta tag is not currently supported in all browsers. I tried it in Chrome 17 and it works. In Safari 5.1.4 and Firefox 11, the referrer meta tag has no impact.

So what does this mean for SiteCatalyst reporting? According to the Knowledge Base answer #5329, “If the domain of the referrer corresponds to that of a recognized search engine (e.g. “google.com”) and contains the recognized search keyword query parameter for the given search engine domain (e.g. for Google this is “q=”), then the referrer is considered a search engine, and the value of the keyword query parameter is taken as the search keyword.” So no search keyword query parameter, no search is counted. Currently for a Google secure search, the parameter is still there, but it’s unpopulated. Now Google plans to remove it altogether.
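
The rule in that Knowledge Base quote can be sketched roughly as follows. This is only my reading of the quoted text, not Adobe’s actual implementation: the lookup table, the function name classifyReferrer, and the sample URLs are all hypothetical, and treating an empty q= as a counted search is an assumption based on the behavior described above.

```javascript
// Hypothetical lookup table: recognized search engine domain -> keyword parameter.
// Only the Google entry comes from the Knowledge Base quote.
var SEARCH_ENGINES = [
  { domain: "google.com", param: "q" }
];

// Returns { isSearch, keyword } for a referring URL.
function classifyReferrer(referrer) {
  var url;
  try {
    url = new URL(referrer);
  } catch (e) {
    return { isSearch: false, keyword: null }; // empty or malformed referrer
  }
  for (var i = 0; i < SEARCH_ENGINES.length; i++) {
    var engine = SEARCH_ENGINES[i];
    var host = url.hostname;
    var domainMatches = host === engine.domain || host.endsWith("." + engine.domain);
    // The keyword parameter must be present, even if its value is empty.
    if (domainMatches && url.searchParams.has(engine.param)) {
      return { isSearch: true, keyword: url.searchParams.get(engine.param) };
    }
  }
  return { isSearch: false, keyword: null };
}

console.log(classifyReferrer("http://www.google.com/search?q=rv+trader")); // search, keyword "rv trader"
console.log(classifyReferrer("http://www.google.com/url?q="));             // search, empty keyword
console.log(classifyReferrer("https://www.google.com/"));                  // not counted as a search
```

The last call is the case that matters here: once the referrer is just the host, there is no q= parameter left to match.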

Hopefully before Google rolls this out publicly Adobe will come up with a solution for SiteCatalyst so there are no interruptions in the Search Engine reporting. If Adobe does not get to it in time, or if Google decides to push the change out before April, then a couple of lines of code added to your s_code.js file will keep the impact to a minimum while Adobe works out a solution.

if(document.referrer=="https://www.google.com/"){
	s.referrer="https://www.google.com/?q=google%20secure%20search";
}

This looks for a referrer with the exact value of https://www.google.com/ and appends a q= value to it with the keyword google secure search. If the .com version of Google is not the one most used by your visitors, just replace it with the TLD version applicable to your site. This snippet will make sure the search is still counted, and you will keep the same level of reporting that you have now.

UPDATE: If your visitors are coming to your site from multiple country-specific versions of Google, then I have you covered. Just include this plugin in your s_code file and all Google secure searches from every Google domain will automatically be handled. All you need to do is cut and paste.

s.getGoogle=new Function(""
+"var s=this,a=document.referrer,b=a.split('/')[3],c=a.substring(0,19"
+");a&&26>=a.length&&(b||c=='https://www.google.'&&(this.s.referrer=a"
+"+'?q=google%20secure%20search'));return this.s.referrer")();
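
If the compressed plugin is hard to read, here is a rough, expanded sketch of the same checks. It is not a drop-in replacement: the function name googleSecureReferrer is hypothetical, and it takes the referrer as an argument and returns it unchanged when nothing matches, which makes it easy to test outside a browser.

```javascript
// Readable sketch of the checks the plugin makes. Takes the referrer as an
// argument (instead of reading document.referrer) so it can be tested.
function googleSecureReferrer(referrer) {
  var prefix = "https://www.google.";
  var path = referrer.split("/")[3];          // text after the host's slash, empty if none
  var isBareGoogleHost =
    referrer.length > 0 &&
    referrer.length <= 26 &&                  // same length cap the plugin uses
    !path &&                                  // nothing after the trailing slash
    referrer.substring(0, prefix.length) === prefix;
  return isBareGoogleHost
    ? referrer + "?q=google%20secure%20search"
    : referrer;
}

// In the s_code file this would be used along the lines of:
//   s.referrer = googleSecureReferrer(document.referrer);
console.log(googleSecureReferrer("https://www.google.co.uk/"));
console.log(googleSecureReferrer("https://www.google.com/search?q=rv+trader"));
```

The first call gets the placeholder keyword appended; the second already has a path and query string, so it is left alone.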

enjoy!

How To Stop Google Preview From Being Counted In SiteCatalyst – Updated

UPDATE: I have a much better way to block the Google Web Preview bot from being tracked as a visitor in SiteCatalyst. The original solution I had posted here required a block of code to be placed at the very top of the s_code file, and your account ID was put into a function call. Then when the s_code fired, a function would first check the user agent of the visitor to see if it was the Google Web Preview bot, and if so, swap out the account ID with a blank value. The SiteCatalyst code would still fire, but when Omniture received the image request it would be discarded because of the missing account ID.

The more I thought about this, the more I figured there had to be a better way. There is no reason to execute all the s_code JavaScript and fire off the beacon call when I don’t want that visitor (Google) to be tracked. So after a little brainstorming I came up with a new and improved way to do this. Now when the user agent is determined to be the Google Web Preview bot, the SiteCatalyst code is prevented from even firing (how it should have been originally). Even better, this can now be done by simply adding a tiny bit of code to the plugins section of your s_code file, right next to all your other plugin code. That’s it. No other changes need to happen. No code at the top of the page, no adding calls to functions in the account variable. Just cut, paste, and done.

Here is the code. Just add this right next to all of your other plugins.

/*
 *  Block the Google Web Preview Bot from firing SiteCatalyst code
 */
if(s.u.toLowerCase().indexOf('google web preview')!=-1){s.t=function(){}}
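
To see why that one line is enough, here is a small stand-alone sketch that can be run outside a browser. The tracker object is only a mock standing in for the real s object, and both user agent strings are illustrative values rather than something taken from a live report.

```javascript
// Stand-in tracker: t() plays the role of the s.t() call that sends the beacon.
// This object is a mock for illustration, not the real SiteCatalyst s object.
function makeTracker(userAgent) {
  var tracker = {
    u: userAgent,
    sent: 0,
    t: function () { this.sent++; }
  };
  // Same check as the one-liner above: swap t() for a no-op when the bot is seen.
  if (tracker.u.toLowerCase().indexOf("google web preview") !== -1) {
    tracker.t = function () {};
  }
  return tracker;
}

var human = makeTracker("Mozilla/5.0 (Windows NT 6.1; rv:11.0) Gecko/20100101 Firefox/11.0");
var bot = makeTracker("Mozilla/5.0 (en-us) AppleWebKit/525.13 (KHTML, like Gecko; Google Web Preview) Version/3.1 Safari/525.13");
human.t();
bot.t();
console.log(human.sent, bot.sent); // 1 0
```

Because t() itself becomes a no-op for the bot, nothing is sent at all, rather than a request being sent and then discarded.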

If you are using the original version from below make sure you remove it. And as with any code, make sure you fully test it before deploying to a live site.
~kevin

Google Instant Preview, designed to show you a visual preview of your search results, rolled out in early November 2010. You now have the ability to click a small magnifying glass icon next to each search result to get a snapshot of what the page looks like.
[Image: Google Instant Preview]

Seems like a pretty helpful feature, but how do they do it? Well, it would appear that Google has a new spider that crawls the web and takes snapshots of each page in its results. In order for them to get an accurate look at what the page looks like, this new bot needs to be able to execute JavaScript. Here is the problem. Since it is executing JavaScript, it is also firing off the SiteCatalyst code, being counted as another visitor, and registering page views.

How can you tell if this new Google Web Preview bot hit your site? If you are capturing User Agent you can see it show up in that report:
[Image: User Agent Report]

NOTE: If you are not capturing user agent and would like to, a super simple way would be to use the SiteCatalyst Dynamic Variable functionality and include s.eVarX="D=User-Agent"; in your s_code.js file. Just replace the X with the number of the eVar you would like to use (an s.prop would work too) and you are all set.

Another way to see if your report suite is being affected by spider traffic from the Google Preview Bot would be to check out a Browser report (Visitor Profile > Technology > Browsers), filter it to only show visitors using Safari 3.1, and then trend it.
[Image: Browser Report]
We can see that this report suite has recorded about 15,000 additional visitors over the last week attributed to Safari 3.1 alone. Checking the user agent we saw earlier, the Google Web Preview bot is registering itself as Safari 3.1.

Now that we can see that the Google Web Preview bot is having an effect on our traffic, how do we get rid of it? We could block that bot in our robots.txt file, but I like having that additional functionality available for my visitors in the Google search results. I just don’t want it to execute my SiteCatalyst code. Well, here is how to do it.

I call this my bot detection code (real catchy title, right?). I currently have it set to look only for the Google Web Preview bot, but it could easily be modified to exclude other bots that can execute JavaScript. Here is how you implement it. At the top of your s_code file you will have an s_account variable that contains your report suite id. It will look something like this:

var s_account="dead"

To implement the bot detection code you will want to change that line to include the function call. It should look like this:

var s_account=botCheck("dead")

Pretty simple so far, right? We just added the function call and included our report suite id in it. Next we have a block of code that needs to be added to the plug-ins section of the s_code file:

function botCheck(b){var c=navigator.userAgent.toLowerCase(),a="";a+=c.indexOf("google web preview")!=-1?"":b;return a};

And that’s all there is to it. So how does it work, you ask? It removes your report suite id if the Google Web Preview bot is the one accessing the page. The SiteCatalyst code will still fire off, but it will not include the report suite id, so the request will be discarded by SiteCatalyst and will not affect your metrics.
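
As mentioned, the same idea can be extended to other bots that execute JavaScript. A minimal sketch of one way to do that is below. The names botCheckList and BOT_SIGNATURES are hypothetical, only the first signature comes from this post, and the user agent is passed in as a parameter so the function can be exercised outside a browser.

```javascript
// Substrings to look for in the lowercased user agent. Only the first entry
// comes from this post; add others you have confirmed in your own reports.
var BOT_SIGNATURES = ["google web preview"];

// Returns "" for a listed bot, otherwise the report suite id unchanged.
function botCheckList(reportSuiteId, userAgent) {
  var ua = userAgent.toLowerCase();
  for (var i = 0; i < BOT_SIGNATURES.length; i++) {
    if (ua.indexOf(BOT_SIGNATURES[i]) !== -1) {
      return "";
    }
  }
  return reportSuiteId;
}

// In the s_code file this would be used along the lines of:
//   var s_account = botCheckList("dead", navigator.userAgent);
console.log(JSON.stringify(botCheckList("dead", "Mozilla/5.0 (KHTML, like Gecko; Google Web Preview) Safari/525.13"))); // ""
console.log(JSON.stringify(botCheckList("dead", "Mozilla/5.0 Firefox/11.0")));                                          // "dead"
```

Keeping the signatures in one array means a new bot is a one-line addition rather than another chained condition.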

Want to see it in action? I thought you’d never ask! Check out the page http://webanalyticsland.com/test.php. On this page I have a basic SiteCatalyst implementation, one line of code that displays your user agent, and then I print the results of the SiteCatalyst debugger right to the screen. Opening this page in a standard Firefox browser we can see that the SiteCatalyst code has fired off properly, it has displayed the correct user agent and the report suite id is contained within the image request string.
[Image: Test 1]
So far so good. Using the User Agent Switcher plug-in for Firefox, we can switch our user agent to the one that we found in the SiteCatalyst report to mimic the Google Web Preview bot.
[Image: Test 2]
We can now see that when we use that bot’s user agent string, the report suite id is missing from the image request call. Any action that happens now will not be recorded in my report suite, and when SiteCatalyst receives this request it will be discarded. I’ve had this running for a few days now and have not found any issues, but since this is a pretty new chunk of code be sure to test it out before using it on your production site.

Enjoy!

Google Indexes Omniture SiteCatalyst Tracking codes

Tracking codes are an important part of Web Analytics. They are used by every single analytics tool. I recently came across a problem with how Google handles these tracking codes.

Here is what I have set up. We run the site RV Trader. The site lives at rvtraderonline.com. We also own the domain rvtrader.com, which we have set up with a 301 redirect to rvtraderonline.com. Recently we wanted to know how many visitors were accessing our site by going to rvtrader.com and being redirected to our main site. What I did was set up a tracking code on the redirected URL, using the query string parameter ?zmc=rvt. Now whenever anyone lands on the URL rvtraderonline.com with that tracking code, I will know 100% that they came from rvtrader.com, since this is the only place where this tracking code exists. Pretty simple stuff.
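
For anyone unfamiliar with how a tracking code like this gets read on the landing page, here is a minimal sketch. The function name getTrackingCode is hypothetical and this is not how the post’s actual s_code file does it; it only shows the idea of pulling the zmc value out of the landing URL.

```javascript
// Returns the value of a tracking-code parameter from a landing URL,
// or null when the parameter is not present.
function getTrackingCode(landingUrl, paramName) {
  return new URL(landingUrl).searchParams.get(paramName);
}

console.log(getTrackingCode("http://rvtraderonline.com/?zmc=rvt", "zmc")); // rvt
console.log(getTrackingCode("http://rvtraderonline.com/", "zmc"));         // null
```

Any visit whose landing URL yields rvt is attributed to the rvtrader.com redirect, which is why the same URL showing up in a search index would skew the report.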

I have been using ?zmc= for a long time now, and have it set in Google Webmaster Tools to ignore this parameter. For those who don’t know, there is a place in Webmaster Tools where you can tell Google which tracking codes it should ignore. I have it set up to tell Google that it should not index any query string parameters that contain ?zmc=. I just double-checked, and the setting is still there.
[Image: Google Webmaster Tools]

Then today I took a look in Google, and what do I see:
[Image: Google Indexing Tracking Codes]

How many of my other tracking codes have been indexed? Can I trust the reporting of any of my Omniture SiteCatalyst tracking code reports now? How many other analytics customers are going to have this problem? Is Google indexing tracking codes from any other Web Analytics tools?

If you are noticing any abnormalities in your tracking code reporting this may be why.