When the numbers just don't add up, what can you do? The image at right illustrates this perfectly. Notice the two blue dots? Is one bigger or smaller? Look carefully. The answer: it's all about perspective.

I recently received an email from a reader that went like this:

"For 8 years we've been using Urchin 6, provided by our web hosting company. Now that we've switched hosts, we started using Google Analytics, and instead of reporting 50,000 total 'sessions' a month, we now see 10,000 total 'visits' a month. How is it possible for the numbers to be so different, when both companies are owned by Google? Which numbers are correct?"

This is a great question, and not an uncommon one. I want to delve into the differences that commonly arise between web analytics tools, even ones from the same company with a shared past (Urchin and Google Analytics being a case in point).
Know Your Tools
With this question in particular, the products being compared (Urchin vs. Google Analytics) are likely not very close, despite both being owned by Google. Given that the asker noted using "Urchin" for 8 years and that it was provided by their hosting company, my guess is that they are actually on Urchin 5, or even 4, since Urchin 6 wasn't released until April 2008. The hosting company may have upgraded since, but even so, that is a far cry from saying the tools are the same.
Know What You're Comparing
There are two key ways to get web analytics data:
From web server log files
From JavaScript tags that use cookies and tracking pixels
I like to think of the difference between these two methods as looking at the top side versus the underside of the same rug. The underside shows a complex mishmash of threads, while the top side shows a beautiful pattern. Web server logfiles are literally the server's perspective on what happened, while the tag-based method is literally the user's perspective on what they did while on the site.
This is the most critical aspect to consider. If you want to answer questions about what the servers were doing, look at server logs. If you want to answer questions about what people were doing, look at tag-based data. (Brian Clifton goes into more detail on this topic in his book.) Back to the question at hand: I've found hosting companies are notorious for providing "plain vanilla" stats packages. A "stock" Urchin profile reports on the web server logfiles, NOT the JavaScript tags + cookies that Google Analytics and Urchin's UTM reporting are based on. That is a totally different way to analyze usage; it is useful for server operators, but has little to do with the reality of website traffic from actual people.
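To make the server-side perspective concrete, here's a minimal sketch (in Python, with a made-up sample line) of what a single "hit" looks like in an Apache-style combined log file, which is the raw material a stock, log-based Urchin profile works from:

```python
import re

# Parse one line of an Apache "combined" log file.
# The sample line below is fabricated for illustration.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)

line = ('66.249.66.1 - - [10/Oct/2011:13:55:36 -0700] '
        '"GET /index.html HTTP/1.1" 200 2326 '
        '"-" "Googlebot/2.1 (+http://www.google.com/bot.html)"')

hit = LOG_PATTERN.match(line).groupdict()
print(hit['ip'], hit['status'], hit['user_agent'])
```

Note that this hit is from Googlebot: a perfectly valid log entry, recorded by the server exactly like a human pageview. Nothing in the raw log distinguishes it as non-human except the user-agent string.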
Understanding Data Sources
The server log contains ALL hits, whether from humans or non-humans. In my experience, roughly 60% of a typical log comes from non-humans: search engine robots, content-scraping bots, and so on. An improperly configured Urchin profile will report it all the same way, identifying a "session" as a group of hits sharing the same IP address and user agent within a 30-minute window. As a result, you usually get a much larger number of "sessions" reported than is actually happening. Google Analytics (and Urchin using UTM tags) will report only people visiting your site, since the reporting mechanism depends on the visitor's modern web browser executing JavaScript on each page load. It also uses cookies, so it is precise to the computer/browser, rather than the less precise IP address. These numbers are almost always lower than what you see in server-log-based data, usually at least 50% lower; I've seen gaps as high as 90% depending on the factors involved.
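The sessionization rule described above can be sketched in a few lines. This is an illustration of the general IP + user-agent + 30-minute-window approach, not Urchin's actual code, and the bot markers are simplistic placeholders:

```python
from datetime import datetime, timedelta

# Log-based sessionization sketch: group hits by (IP, User-Agent) and
# start a new session after 30 minutes of inactivity for that pair.
SESSION_TIMEOUT = timedelta(minutes=30)

# Crude bot detection by user-agent substring -- illustration only.
BOT_MARKERS = ("bot", "crawler", "spider")

def count_sessions(hits, filter_bots=False):
    """hits: list of (ip, user_agent, timestamp) tuples, sorted by time."""
    last_seen = {}   # (ip, ua) -> timestamp of that pair's last hit
    sessions = 0
    for ip, ua, ts in hits:
        if filter_bots and any(m in ua.lower() for m in BOT_MARKERS):
            continue
        key = (ip, ua)
        if key not in last_seen or ts - last_seen[key] > SESSION_TIMEOUT:
            sessions += 1
        last_seen[key] = ts
    return sessions

hits = [
    ("1.2.3.4", "Mozilla/5.0", datetime(2011, 10, 10, 9, 0)),
    ("1.2.3.4", "Mozilla/5.0", datetime(2011, 10, 10, 9, 10)),   # same session
    ("1.2.3.4", "Mozilla/5.0", datetime(2011, 10, 10, 11, 0)),   # new session
    ("66.249.66.1", "Googlebot/2.1", datetime(2011, 10, 10, 9, 5)),
]

print(count_sessions(hits))                    # 3: the bot counts as a "session"
print(count_sessions(hits, filter_bots=True))  # 2: human sessions only
```

Even in this tiny example, the naive log-based count is inflated by the bot hit; a tag-based tool would never have recorded it at all, because bots don't execute the JavaScript.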
Truth is Rather Gray
It's not that IP + user-agent data is "wrong" per se, but it must be interpreted in context. If it were up to me, hosting companies would be required to make this clearly known, because you may have been thinking you were reporting "people who visit the site" when in reality you were vastly over-reporting that number, since the bots were probably included. All this to say: a disparity is common.
The Solution
If you have questions about the quality of your data, I recommend conducting a full audit. Note that the error can run the other way, too: if your GA/UTM tags aren't placed on all parts of your site, the tag-based data will be falsely low because it is incomplete. To unravel the knot and explain it to executives, you'll need to show why the numbers have changed, explain what goes into the numbers from each system, and help guide the transition to better data. If you still have your old server logfiles, and your Urchin profile allows filtering and reprocessing, you can take measures to filter out bots, but the numbers still won't line up well. Think of it like measuring the height of your desk in centimeters vs. inches: same desk, different scale. A few tools that you can use to help audit your data:
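On the tag-coverage point: one quick, admittedly naive way to audit whether a page carries a GA/Urchin tag is to scan its HTML for the usual snippet markers. The marker strings below are assumptions drawn from the ga.js/urchin.js-era setups; adjust them to match your own tagging:

```python
# Naive tag-audit helper: does this page's HTML contain a GA/Urchin snippet?
# Marker strings are assumptions from the ga.js/urchin.js era, not exhaustive.
GA_MARKERS = (
    "google-analytics.com/ga.js",
    "urchin.js",
    "_gaq.push",
    "urchinTracker",
)

def has_ga_tag(html):
    return any(marker in html for marker in GA_MARKERS)

tagged = ('<html><head><script src='
          '"http://www.google-analytics.com/ga.js"></script></head></html>')
untagged = "<html><body>No analytics here</body></html>"

print(has_ga_tag(tagged))    # True
print(has_ga_tag(untagged))  # False
```

Run something like this across every template or page type on the site; any page that comes back untagged is a place your tag-based numbers are silently undercounting.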