Improve Accuracy & Identify Traffic That SiteCatalyst Can’t

I’ve been doing a lot of work recently with my Traffic Sources reports. My goals have been to clean up messy data that could come in, and to make it easier to look at traffic from different sections of the same referrer. Now I would like to see what I can do to make the standard Referrer and Referring Domains reports a little more accurate, and try to fill in some of the holes they create which prevent me from getting a really good summary view of my traffic.

Overall the standard Referrer and Referring Domains reports do a pretty good job at telling me where my visitors came from, but there is one item that is a major problem for me. That one item is called “Typed/Bookmarked”.

According to the SiteCatalyst Knowledge Base, “Typed/Bookmarked line items occur in reporting where a referrer for an image request is not present.” In other words, if SiteCatalyst does not see a referrer value, then it simply cannot tell you where that visitor came from, so they get dropped into the Typed/Bookmarked bucket. Typically that’s fine. There is no way to know completely where every single one of your visitors came from. That’s just the nature of the beast. But one problem I have is that even though SiteCatalyst may not know where that visitor came from, I possibly do. So how do I know but SiteCatalyst doesn’t, you ask? Tracking codes.

Yesterday Omniture shared a study that was done by BtoB Magazine which said “email is used by 88% of marketers surveyed and ranked as their No. 1 form of digital outreach”. It’s been no secret that running email campaigns is a great way to get more visits to your site, and ultimately more conversions. Judging by my inbox there are a lot of marketers out there that agree. Email marketing best practices recommend that tracking codes are included on all of the URLs in the email. This is typically the best way to determine the effectiveness of those email campaigns, and hopefully it’s something you’ve been doing with your own email campaigns. The problem with this is that most email applications are not going to pass a referrer value to the site. So even though we are able to track the performance of these emails in our campaign reporting, when looking at our Referrer reports we are not able to see that traffic credited to the correct source. No referrer value means the Traffic Sources reports consider it to be Typed/Bookmarked traffic, when we know it isn’t. Our Typed/Bookmarked values get over-inflated, and the email campaign traffic doesn’t get properly credited. So what can be done about it?

Here’s what I like to do. I add a tiny bit of code to the doPlugins section of my s_code.js file that checks to see if the image request has no referring URL, and if the current URL has a tracking code associated with one of my email campaigns. If those criteria are met, it injects a specific referring domain value into my Traffic Sources reports, correctly attributing that visit as being from one of my email campaigns. The code to do this is quite simple:

	var s_eml = s.getQueryParam('eml');
	if(!window.document.referrer && s_eml){ s.referrer = "mail://email.campaign/"+s_eml; }

Now let’s say I’m running an email campaign which contains links to my home page. I made sure that the URLs for those links have the query string parameter eml=56789. The parameter eml is the tracking code I use specifically for my email campaigns, and 56789 is the identifier for that specific campaign. When a visitor accesses a page of my site using one of those URLs containing my email tracking code and they do not have a referring URL value, my normal campaign tracking does its job, and this new snippet of code injects the value of mail://email.campaign/56789 as the referring domain. If the visitor was using an email application that did pass a referrer value, then that passed value will always take precedence. Injecting that new value as the referring URL will accomplish a couple of things. First, that whole value will now appear in my Referrers report. With that I’m able to compare the traffic generated from specific email campaigns to other traffic sources. Comparing traffic generated from an email campaign to traffic generated from an organic source wasn’t always the easiest thing to do in a single SiteCatalyst report.
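Tagging the email links themselves is the other half of this. Here’s a hypothetical helper for building those URLs (the eml parameter matches the convention above; the helper itself is just an illustration, not part of SiteCatalyst):

```javascript
// Hypothetical helper: append the email tracking code to each link
// that goes into a campaign email, handling URLs that already have
// a query string.
function tagEmailLink(url, campaignId) {
  var separator = url.indexOf('?') === -1 ? '?' : '&';
  return url + separator + 'eml=' + encodeURIComponent(campaignId);
}
```

Run every link destined for the email through something like this and the landing pages will always carry the tracking code.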
Email Referrer

Next in my Referring Domains report I will get the value of email.campaign, and more importantly I won’t register another instance of Typed/Bookmarked. With this I can get a look at the traffic generated from the email campaign compared to all my other known referrers as a whole to see how it stacks up.
Email Referring Domains

Here’s an additional bonus I get from doing this. If you take a look at that URL value I used as the referrer value, it does not begin with http://, but with mail://. In the Traffic Sources reports you will find a report called Referrer Type. This report is basically a glorified SAINT classification that looks at each referring URL and assigns it to a different bucket. When SiteCatalyst sees a referring URL beginning with the value of mail:// or the value of imap://, it gets classified to the Mail bucket in the Referrer Type report. I’m now starting to get a better view of all my traffic sources in one pretty graph.
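To make that bucketing concrete, here is a rough approximation of the classification by URL scheme (my own simplification; the real report logic is Adobe’s, and it also breaks out search engines and other categories):

```javascript
// Simplified sketch of how the Referrer Type report buckets a
// referring URL by its scheme. Not Adobe's actual implementation.
function referrerTypeBucket(referrer) {
  if (!referrer) return "Typed/Bookmarked";            // no referrer at all
  if (/^(mail|imap):\/\//i.test(referrer)) return "Mail";
  if (/^file:\/\//i.test(referrer)) return "Hard Drive";
  return "Other Web Sites";                            // everything else, simplified
}
```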

Email Referrer Types

Another source of traffic for some sites that is also not being accurately represented in the Traffic Sources reports is when a visitor comes to the site by clicking a link in a pdf. Last week after the latest iPad was announced, a friend sent me a pdf that came from Apple about using the iPad at work. It contained good information, so I passed it along to a couple of other friends. Looking at the pdf a little closer, there was one thing that caught my eye. It contained 25 individual links back to Apple’s site. I wasn’t viewing this pdf in a web browser but in the simple Preview app on my Mac, yet every single one of those links was clickable. I thought that was great, another opportunity to drive traffic to the site. But this was also not so great because it was another opportunity to take a visitor and classify them as Typed/Bookmarked, even though Apple could easily know specifically where they were coming from.

Much like with emails, all it takes is a tiny couple of lines of code to identify that traffic. I like to use the value of pdf= as the query string parameter for links embedded in pdfs, but you can obviously change it to whatever you like.

	var s_pdf = s.getQueryParam('pdf');
	if(!window.document.referrer && s_pdf){ s.referrer = "file://pdf.document/"+s_pdf; }

Just like before with the emails, the current URL needs to contain that tracking code and there must be no referrer value present for this to work.

Taking a look at that snippet of code, I used the referring URL domain value of file://pdf.document combined with a unique identifier for that pdf. Unlike with the emails, this time I started the value with file://. This will now get assigned a new value in the Referrer Type report, the value of Hard Drive. Not the best description of what’s going on, but that’s the only value left. The Referrer Type report is set to classify all referrers into 6 different buckets, and there is no way to add any additional categories to it.

All Referrer Types

Another popular source of traffic I’ve seen that does not get proper credit in the Traffic Sources reports is visits that come from a mobile application. There are tons of mobile apps which have some component that can take the visitor from the app to a website. Like all the other links, you hopefully have the ones in your mobile app tagged with a tracking code. If so, then just one more snippet of code and that traffic can be accounted for:

	var s_app = s.getQueryParam('app');
	if(!window.document.referrer && s_app){ s.referrer = "file://mobile.application/"+s_app; }

I used the tracking code of app, but you can use whatever you would like. This app-referred traffic will also be listed in the Referrer Type report as Hard Drive. Since I used up all the options in that report, all the extras such as mobile app traffic and pdf traffic will need to share the Hard Drive bucket.

I like to keep all three snippets of code wrapped up in one clean package:

	var s_eml = s.getQueryParam('eml');
	var s_pdf = s.getQueryParam('pdf');
	var s_app = s.getQueryParam('app');

	if(!window.document.referrer){
		if(s_eml){
			s.referrer = "mail://email.campaign/"+s_eml;
		}else if(s_pdf){
			s.referrer = "file://pdf.document/"+s_pdf;
		}else if(s_app){
			s.referrer = "file://mobile.application/"+s_app;
		}
	}

Now in my Referrer and Referring Domains reporting, I get a better look at how my visitors are arriving at my site.
All New Referrers

All New Referring Domains

Maybe you have some other kind of web app, or some kind of shared widget, or some other totally different way for visitors to follow a link to get to your site. So long as you can add a tracking code to that link, you can get that traffic correctly represented in your Traffic Sources reports, and stop over-inflating the Typed/Bookmarked metric. This is not meant to replace the Marketing Channels reports, the Channel Manager plugin, or the Unified Sources VISTA rule, but to improve the functionality and accuracy of some of the core SiteCatalyst reports.

I hope this helps.

How To Improve Referring Domain Reporting In SiteCatalyst

While going over my Traffic Sources reports recently, I realized that I had a problem with how my referrers were reported. Not a problem with bad data coming into SiteCatalyst, like the recent issue I fixed regarding the Google Plus referrers, but this was a problem with the Referrers and Referring Domains reports themselves.

Both reports are working as they were designed, but the problem is that they sit on opposite ends of the spectrum from one another. I feel that at times the Referrers report gives me too much information, while the Referring Domains report doesn’t give me enough. I need something a bit more in the middle.

Here’s an example of the problem. Checking the Referring Domain report in SiteCatalyst, this is a snippet of what I see:
Referring Domains Report in SiteCatalyst

I can see that some of my visitors came from Google, and some from Yahoo, and that’s all this report tells me. When I see something like that, I start to wonder which sections of the referring site led the visitors to decide to come to my site. The first thing I could do in an attempt to answer my question would be to click the magnifying glass icon next to each line item in the Referring Domains report to get a quick view of which referrers specifically made up that value. Unfortunately the little popup window doesn’t break out the referrers in any kind of a useful way, and I can’t do anything with that popup report myself because there is no export option in that window, so that option’s out. I could do a standard sub-relation breakdown of the two reports, then export the results into Excel, then do some sorting and filtering, and could probably get the answer I need for a single Referring Domain value in roughly 5 to 10 minutes. But then I’m stuck having to go back and do that whole process again for the other Referring Domains. This method could end up taking a while, so this option is also out. I’m sure I could whip something up in Data Warehouse, but I won’t see that report anytime soon either. I could go set up a ton of Processing Rules, but that won’t work because I would end up using all the rules just to help me populate one variable. So how do I get the report I need? Looks like it’s time for some more s_code magic.

And voilà. I give to you the s.getFullReferringDomains plugin. What this will do is look at that referrer value and grab not only the domain name, but any subdomains it may have as well. This uses the same s.linkInternalFilters values that you’ve already set up to ensure that this will only set its value when there is an external referrer. So now I will get a better, quicker, easier look at my referring domains:
Full Referring Domains Report in SiteCatalyst

All that needs to be done is to choose an unused variable, and call the plugin from it. This should be placed inside of the doPlugins section of the s_code.js file.

s.eVar1 = s.getFullReferringDomains();

Then just add the plugin itself to the plugins section of your s_code file.

/* Plugin: Get Full Referring Domains */
s.getFullReferringDomains=new Function(""
+"var s=this,dr=window.document.referrer,n=s.linkInternalFilters.spli"
+"t(',');if(dr){var r=dr.split('/')[2],l=n.length;for(var i=0;i<l;i++)"
+"{if(r.indexOf(n[i])!=-1){r='';break;}}return r}");
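If the packed string is hard to read, here is a rough un-minified sketch of the same logic, written as a hypothetical standalone function that takes the referrer and the filter list as arguments for clarity:

```javascript
// Un-minified sketch of getFullReferringDomains: return the full
// referring host (domain plus any subdomains), '' for internal
// referrers, or undefined when there is no referrer at all.
function getFullReferringDomains(referrer, linkInternalFilters) {
  if (!referrer) return undefined;
  var host = referrer.split('/')[2];            // e.g. "images.google.com"
  var filters = linkInternalFilters.split(','); // same format as s.linkInternalFilters
  for (var i = 0; i < filters.length; i++) {
    if (host.indexOf(filters[i]) !== -1) return ''; // internal referrer: set nothing
  }
  return host;
}
```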

One last thing I like to do with this new report is to go into the Report Suite Manager in the Admin Console and move it to the Traffic Sources menu. This way all of your referrer-related reports will live in one spot. It looks like it should have lived there all along.

Full Referring Domains Traffic Sources Report in SiteCatalyst


Capture Mobile Device Screen Orientation In SiteCatalyst

Recently I was speaking to someone who was in the process of creating a tablet experience for their visitors. At one point they asked the question “of my iPad visitors, how do I find out which format they view my site in the most, landscape or portrait?”. I started going through all of the reports in SiteCatalyst and tried to find the answer, but that information just was not available. So I decided to whip up a little bit of code that would figure this out for us. I call this the screenOrientation plug-in.

Basically what this will do is check whether the mobile visitor is viewing the site in a portrait or a landscape view when the page loads, and capture that value into a SiteCatalyst variable.

To implement this plug-in you first need to take this code, and add it to your s_code file near the rest of your plug-ins.

function screenOrientation(){switch(window.orientation){case 0:case 180:return "Portrait";case 90:case -90:return "Landscape";}}
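Since window.orientation only exists on devices that report one, a pure version of the same mapping is handy for sanity-checking the logic outside a browser (a hypothetical variant, not part of the plug-in itself):

```javascript
// Pure mapping from a window.orientation angle to the report label.
function orientationLabel(angle) {
  if (angle === 0 || angle === 180) return "Portrait";
  if (angle === 90 || angle === -90) return "Landscape";
  return ""; // no orientation value (e.g. desktop browsers)
}
```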

Next, in the doPlugins section of the s_code file, add the call to the plug-in to whatever SiteCatalyst variable you want this data captured in. In the example here you can see I am capturing it in s.prop1:

s.prop1 = screenOrientation();
That’s all it takes. Once the code is implemented, if the device does not have an orientation value the variable will not capture anything, but if the visitor is on a mobile device with an orientation value, the value of Landscape or Portrait will be captured on each page load. You will end up with a report that looks something like this:

Mobile Screen Orientation Report

To make it easier to access the report I also moved it into the Mobile report menu by using the customize menus option in the admin console of SiteCatalyst.

SiteCatalyst Mobile Reports


Video Introduction To SiteCatalyst Target Reports

The SiteCatalyst Target report is one of my favorite reports that I don’t see used enough. This video gives you a brief introduction to the report, walks you through setting one up, and shows how to view the results.


How To Stop Google Preview From Being Counted In SiteCatalyst – Updated

UPDATE: I have a much better way to block the Google Web Preview bot from being tracked as a visitor in SiteCatalyst. The original solution I had posted here required a block of code to be placed at the very top of the s_code file, and your account ID was put into a function call. Then when the s_code was fired, a function would first check the user agent of the visitor to see if it was the Google Web Preview bot, and if so, swap out the account ID with a blank value. The SiteCatalyst code would still fire, but when Omniture received the image request it would get discarded because of the missing account ID. The more I thought about this, I figured there had to be a better way. There is no reason to execute all the s_code JavaScript and fire off the beacon call when I don’t want that visitor (Google) to be tracked. So after a little brainstorming I came up with a new and improved way to do this. Now when the user agent is determined to be the Google Web Preview bot, the SiteCatalyst code is prevented from even firing (how it should have been originally). Even better, this can now be done by simply adding a tiny bit of code to the plugins section of your s_code file, right next to all your other plugin code. That’s it. No other changes need to happen. No code at the top of the page, no adding calls to functions in the account variable. Just cut, paste, and done.

Here is the code. Just add this right next to all of your other plugins.

/* Block the Google Web Preview bot from firing SiteCatalyst code */
if(s.u.toLowerCase().indexOf('google web preview')!=-1){s.t=function(){}}

If you are using the original version from below make sure you remove it. And as with any code, make sure you fully test it before deploying to a live site.

Google Instant Preview, designed to show you a visual preview of your search results, rolled out in early November 2010. You now have the ability to click a small magnifying glass icon next to each search result to get a snapshot of what the page looks like.
Google Instant Preview

Seems like a pretty helpful feature, but how do they do it? It would appear that Google has a new spider that crawls the web and takes snapshots of each page in its results. In order for them to get an accurate look at what the page looks like, this new bot needs to be able to execute JavaScript. Here is the problem. Since it is executing JavaScript, that means it is also firing off the SiteCatalyst code, being counted as another visitor, and registering page views.

How can you tell if this new Google Web Preview bot hit your site? If you are capturing User Agent you can see it show up in that report:
User Agent Report

NOTE: If you are not capturing user agent and would like to, a super simple way would be to use the SiteCatalyst Dynamic Variable functionality and include s.eVarX="D=User-Agent"; in your s_code.js file. Just insert the number of the eVar you would like to use (an s.prop would work too) and you are all set.

Another way to see if your report suite is being affected by spider traffic from the Google Preview bot would be to check out a Browser report (Visitor Profile > Technology > Browsers), filter it to only show visitors using Safari 3.1, and then trend it.
Browser Report
We can see that this report suite has recorded about an additional 15,000 visitors over the last week attributed just to Safari 3.1. Checking the User Agent report we saw earlier, the Google Web Preview bot identifies itself as Safari 3.1.

Now that we can see that the Google Web Preview bot is having an effect on our traffic how do we get rid of it? We could block that bot in our robots.txt file, but I like having that additional functionality available for my visitors in the Google search results. I just don’t want it to execute my SiteCatalyst code. Well here is how to do it.

I call this my bot detection code (real catchy title, right?). I currently have it set to look only for the Google Web Preview bot, but it could easily be modified to exclude other bots that can execute JavaScript. Here is how you implement it. In your s_code file, at the top, you will have an s_account variable that contains your report suite id. It will look something like this:

var s_account="dead"

To implement the bot detection code you will want to change that line to include the function call. It should look like this:

var s_account=botCheck("dead")

Pretty simple so far, right? We just added the function call and included our report suite id in it. Next we have a block of code that needs to be added to the plug-ins section of the s_code file:

function botCheck(b){var c=navigator.userAgent.toLowerCase(),a="";a+=c.indexOf("google web preview")!=-1?"":b;return a};

And that’s all there is to it. So how does it work you ask? What it does is it removes your report suite id if it is the Google Web Preview bot that is accessing the page. The SiteCatalyst code will still fire off, but it will not include the report suite id so it will be discarded by SiteCatalyst and it will not affect your metrics.
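As mentioned, the check could easily be extended to other JavaScript-executing bots. A hypothetical variant that takes the user agent as an argument (so the logic can be exercised outside the browser) might look like this:

```javascript
// Hypothetical extension of botCheck: compare the user agent against
// a list of bots and return '' (no report suite id) on any match.
function botCheckList(rsid, userAgent) {
  var bots = ["google web preview"]; // add other JS-executing bots here
  var ua = userAgent.toLowerCase();
  for (var i = 0; i < bots.length; i++) {
    if (ua.indexOf(bots[i]) !== -1) return "";
  }
  return rsid;
}
```

In the real s_code you would drop the second argument and read navigator.userAgent directly, as the original botCheck does.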

Want to see it in action? I thought you’d never ask! On my test page I have a basic SiteCatalyst implementation, one line of code that displays your user agent, and then I print the results of the SiteCatalyst debugger right to the screen. Opening this page in a standard Firefox browser, we can see that the SiteCatalyst code has fired off properly, it has displayed the correct user agent, and the report suite id is contained within the image request string.
Test 1
So far so good. Using the User Agent Switcher plug-in for Firefox, we can switch our user agent to the one that we found in the SiteCatalyst report to mimic the Google Web Preview bot.
Test 2
We can now see that when we use that bot’s user agent string, the report suite id is missing from the image request call. Any action that happens now will not be recorded in my report suite, and when SiteCatalyst receives this request it will be discarded. I’ve had this running for a few days now and have not found any issues, but since this is a pretty new chunk of code be sure to test it out before using it on your production site.


4 Things I Would Change About Omniture’s Data Warehouse

Omniture’s SiteCatalyst can give you a tremendous amount of data. You can slice and dice tons of different metrics at the drop of a hat. Let’s say one day the CMO comes to you and asks for one of those left-field ad-hoc reports you randomly get, something like “how much revenue was generated on August 3rd between 1 AM and 4 AM by first time visitors who came to the site from our latest campaign on Bing, visited pages A, B, and C, using a Safari browser….” and on and on. You will soon find, as you get requests for reports that are so far out there, that you need to visit one of Omniture’s most powerful tools, the Data Warehouse. Here are a few changes I wish Omniture would make to give us a better Data Warehouse user experience.

I request a ton of reports. I’ve been requesting them for some time now, so I am pretty in tune with how long they will take to come back to me. So I request my report, wait the amount of time I feel it should take, check my inbox, and nada. Hmmmmm. Let’s log into the Data Warehouse request manager and take a look at what’s going on.

Houston, we have a problem. Looks like there was an error with the report. Wouldn’t it have been nice to get an email letting me know there was an error? I had to log into the tool and find out myself that there was a problem. Well, now that I’m here, let’s click on the report name and see if we can find a little more info on what went wrong.

Ok, from here we can see that the report was “Not Delivered”. I figured that much out when it didn’t show up. Maybe clicking on that link will tell me why it was not delivered.

Well that’s just not very helpful at all. Did I enter some incorrect FTP information? Was there no information to return? Did someone trip over the cord and unplug the server? We will never know.

So now what? I obviously wanted that report, so I am forced to go and request it again. If I didn’t want it, I wouldn’t have requested it in the first place. How about being a little proactive and automatically re-requesting the report for me? A simple email stating that there was an error but my report is being processed again would have saved me a bunch of time. Even an email saying the error was caused by mistakes I made in the request would have been extremely helpful.

There are a handful of reports that I have scheduled that have top priority. I need to make sure these reports are run before anything else that is scheduled at the same time. For this, I turn to the priority settings.

Seems pretty simple: just click the arrows and my report moves up or down. The problem arises when you have 60+ scheduled reports and moving one to the top of the list turns into an hour of clicking arrows. How could this be improved? If this was a little more like the queue system used on Netflix, it would be handled in a few seconds.

Another gripe I have with the priority arrows is that anyone can go in there and fiddle with them. After spending an hour moving my mission critical reports to the top of the list, any other person with access can go in there and just push my reports back to the bottom of the list. The priority arrows should be an admin feature only.

There are times when I have a lot of reports that I need all at one time. I just scheduled 24 reports last week that come daily all at the same time. I usually try to space out the reports as much as possible when I can, but the smallest granularity that is offered to me is just hourly.

If I could schedule reports on a quarter hour, or even half hour granularity, it may ease a little bit of the load on the Data Warehouse servers and I may get my reports delivered a little quicker.

Metric Additions
As business requirements change, so do the metrics delivered in reports. I have some complex reports scheduled, and it would be great if I could open up the report and add a metric instead of canceling the report and starting from scratch.

If someone comes along and asks if we can add 2 more events to this report, I should be able to just click some kind of “edit metrics” button, add in what I need and be all set. Too bad it isn’t that easy.

The Data Warehouse is a very powerful tool that does work extremely well. Most casual users may never even notice these items, but as you get more advanced in using the tool, as your business requirements change over time, you too may feel the need for these enhancements.

Get More From Your Campaign Tracking using the clickThruQuality Plugin

NOTE: UPDATED 7-22-09. Many of us are using the s.getQueryParam plugin matched with the s.campaign variable to track our paid search. You have your paid search click-thru URL with your tracking code right on the end, and the query string parameter value gets dumped right into your campaign variable. This, as we know, is the preferred method of tracking paid search. You can pull up your Campaigns report, add in your conversion events, and there you go. You can see which campaigns converted and which ones didn’t. Unfortunately that gives you an all-or-nothing view of things. What if you wanted a little more? This sounds like a job for the s.clickThruQuality plugin.

What this plugin does is set an event each time a visitor clicks through to your site. Nothing exciting yet. But then when the visitor makes it one page past that landing page, it sets a second event. Now you can see which ad group, landing page or campaign engaged the visitor a little more than the rest. You will end up with a report that looks a little something like this:
Click Thru Quality Report
This example is pretty high level, but by using individual tracking codes for your keywords, this report can really give you a good look at your paid search campaigns.
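Conceptually, the plugin’s logic boils down to the following sketch (my own simplification, not the actual plugin code; the real plugin uses a cookie to remember that the visit started with a tracking code, which is reduced to plain arguments here):

```javascript
// Conceptual sketch of clickThruQuality: which events fire on a
// given page view of the visit.
function clickThruEvents(hasTrackingCode, pagesViewedThisVisit) {
  var events = [];
  if (hasTrackingCode && pagesViewedThisVisit === 1) {
    events.push("Click Through"); // visitor landed via the campaign
  }
  if (pagesViewedThisVisit === 2) {
    events.push("Click Past");    // visitor went one page deeper
  }
  return events;
}
```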

The first thing you need to do is set one variable; I have it right after the s.usePlugins=true call.

/* CTQ variables */
var i=1;

Next you will need to use two events. A call to the plugin needs to be added right after your campaign tracking code:


In the call to the plugin you need to add in which tracking code you are looking to track, and the two events you want to use. Name one event Click Through and the other Click Past. Then add the actual plugin code into the plugins section of the s_code.js file:

/* Plugin: clickThruQuality 0.8 */
s.clickThruQuality=new Function("scp","tcth_ev","cp_ev","cff_ev","cf_th", ""
+"if(i<=1){var ev=(s.events?s.events+',':'');if(s.getQueryParam(scp)){"
+"tcth_ev;if(s.c_r('cf')){var tct=parseInt(s.c_r('cf'))+1;s.c_w('cf',tct


Quick Tip: Correlating Servers with 404 Error Pages in SiteCatalyst

Looking for a quick win? Try correlating your 404 pages with your Servers report. On larger sites, multiple servers are used to deliver the pages with some kind of load balancing. The problem that can come of this is when there is a code push that does not make it out to all of the servers, and errors ensue. Being able to break down your 404 error pages by the server that is delivering them can help you spot troublesome machines, or code pushes that didn’t quite make it out to every machine.
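On the 404 template itself, the tagging could be as simple as this (the page name, server value, and how the server name gets onto the page are all hypothetical; the server value would normally be written in server-side):

```javascript
// Hypothetical 404 page tagging: record the page name and the
// machine that served the page so the two can be correlated.
var s = s || {};               // the SiteCatalyst object on a real page
s.pageName = "404 Error Page";
s.server = "web-03";           // e.g. injected by the backend per machine
```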

Not using the s.server variable yet? Get on it!

Tip: Finding Missing Revenue Opportunities With Web Analytics Data

Revenue. The goal of every Website. As long as it’s up, everyone is happy. When it’s down, there are a few less smiling faces. But what can you as a Web Analyst do to help that? I have all these cool numbers, but how can I help turn them into revenue? This tip is one of my favorites.

The majority of the sites I work on are classified advertising sites. They are quite simple in nature, at their core consisting of a home page with search functionality, search results pages, and listing pages. Now there are actually many more pages than these, but this is the real bread and butter of the site. One neat item I like to track is the number of results returned when someone does a search.

Taking a look here at eBay, I do a search for Apple iPhone and it shows me how many results were returned for that search.
Number of Search Results
Now I like to grab that number and record it into an s.prop in SiteCatalyst. Ok, so now you end up with a report full of numbers. How is this useful to me? Let’s open SiteCatalyst and take a look at what we have in this new Results Returned report.
Results Returned Report
Look what this tells us. For the time period selected, we had just under 195k times where someone did a search on the site and got nothing. The marketers did what they needed to do to drive traffic to the site, only for these people to not get what they wanted. Now let’s take it a step further. What were these people looking for? That’s where Data Correlations come into play. If you are capturing all the other elements of that visitor’s search on your site, you are now able to break down that 0 Results Returned line item, and find out where they were searching and for what item. You know the exact make, model, city, state and whatever else about what gave this visitor nothing from your site.
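Capturing the result count in the first place could be done with a small helper like this (hypothetical; the on-page wording the count is parsed from, and the prop number, are both site-specific):

```javascript
// Hypothetical helper: pull the numeric result count out of the text
// the results page already displays.
function parseResultCount(text) {
  var match = text.replace(/,/g, '').match(/(\d+)\s+results/i);
  return match ? parseInt(match[1], 10) : null;
}
// e.g. in doPlugins: s.prop7 = String(parseResultCount(resultsText));
```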

You have now pinpointed where you have demand with no supply. Once you know this, you can work to tailor your site to give your users what they want and expect out of your site, and generate the revenue you are missing out on by having visitors leave your site unable to find what they want.