Implement Usage Analysis that Works easily (aka. Google Analytics)

Now this is a post I’ve been meaning to write for about a year (there are quite a few on my todo list). The good thing about this one being so long in the works is that the proposed method of usage analysis is thoroughly tested and works great 😉

For (almost) all the SharePoint sites I’m more or less responsible for, we use the same small piece of code to enable Google Analytics (GA). Coupled with a few custom settings on the GA side of things, it makes it very easy to get extremely powerful reporting for nothing. Most of the stuff here relates equally well to non-SharePoint sites.

So what is it?

  1. A fairly simple web control to insert into your master page that includes support for SharePoint user groups
  2. A rather complicated setup in Google Analytics

Why? (or: The value propositions of this solution are)

  • The same code is deployed to all tiers in a farm and can indeed be shared between disparate farms
  • Filters in GA split up the usage data into whatever categories are needed (e.g. dev, test and prod tiers)
  • The actual tracking is done by the client browser, so
    • Even though your intranet servers have no internet connection you can still track your visitors as long as they do
    • You get detailed info about client capabilities that many other analytics tools (those that crawl the server logs) miss
    • You only get info about page views, not document or image upload/download and some other actions. To get that you need something to crawl the server logs (Nintex Reporting would do the trick for SharePoint) 🙁
    • Your data is not polluted by the crawler (it does not run JavaScript)
  • You are not allowed to track individual users (though it is perfectly possible) due to the usage terms (section 7 Privacy) of GA
    • But I’m using some code to track what SharePoint groups the user is a member of, which is very useful. Even though I cannot identify individual users (and generally don’t need it) I can track how many of, say, editors in Sweden are using the site (segment by SharePoint group “editors” and use the map overlay report in GA)
  • You get a very powerful front-end for free (I hate the inflated “enterprise grade” term)
    • Track visitors by almost anything you can imagine
    • Even track the SharePoint search engine (but not the results)
    • Combine every report with segments, so you can limit it to certain SharePoint user groups
    • Schedule reports and distribute to key-users (I love everything that automates some aspects of my job)
    • Finally, remember that reports don’t equal insights; you need some time to experiment before you get anything but fancy pie charts out of it 😉
    • The front-end is very powerful and useful but I won’t call it easy to use

I believe that GA is superior to most usage trackers (in particular the one built into SharePoint 2007 or 2010). However, it has its limitations:

  1. No knowledge of what happens on the server side of things, i.e.
    1. No tracking of page speed (load time, generation time)
    2. No tracking of server load
    3. No tracking of resource (document, image, etc.) actions
    4. No personally identifiable information
  2. Reliant on the client to provide the data
    1. JavaScript and cookies must be enabled
    2. Vulnerable to ad blockers
    3. Internet connection must be available (that is actually not always a given even in this day – different countries, different cultures)

If you really need those things you can always use SharePoint’s usage analysis (at least the search usage analysis reports are very useful) or a third party tool like Nintex Reporting (others do exist).

As of today neither the features of Nintex Reporting nor the advances made with SharePoint 2010 replace the value of GA. To get the best data you need both.

The Code

This is quite simply a web control that emits the required JavaScript to enable GA. I’ve used the latest version of the tracking scripts, but if/when Google changes it the control will need to be updated. As far as I know that has only happened once so far and they were kind enough to maintain backwards compatibility.

There is one twist: it will dump all the SharePoint user groups that the current user is a member of into the custom variable and ship it off to GA, where it can then be used to segment the users. The field is called “User Defined” in GA. If you are not using this for SharePoint then remove or change that piece of the code.
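To make the idea concrete, here is a rough Python sketch of what such a control might produce. It is not the actual control (that one is a .NET web control, linked below). The function names are made up for illustration, and I am assuming the asynchronous ga.js command queue, where the legacy `_setVar` call is what fills the “User Defined” field. The loader script that fetches ga.js is left out.

```python
import json


def user_defined_value(groups):
    """Join group names into one semicolon-separated string."""
    # Dropping blanks and duplicates is an illustrative choice, not
    # something the original control is known to do.
    seen = []
    for name in groups:
        name = name.strip()
        if name and name not in seen:
            seen.append(name)
    return ";".join(seen)


def tracking_commands(ua_key, groups):
    """Return the JavaScript queue commands a control like this might emit."""
    value = user_defined_value(groups)
    lines = [
        "var _gaq = _gaq || [];",
        # json.dumps yields a double-quoted, escaped string literal.
        "_gaq.push(['_setAccount', %s]);" % json.dumps(ua_key),
    ]
    if value:
        # Assumption: legacy ga.js call that populates "User Defined".
        lines.append("_gaq.push(['_setVar', %s]);" % json.dumps(value))
    lines.append("_gaq.push(['_trackPageview']);")
    return "\n".join(lines)


print(tracking_commands("UA-XXXXXX-1", ["Editors", "Sweden Visitors"]))
```

Note that a group name containing a semicolon would clash with the separator, so a real implementation would probably want to replace or strip it.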

Inject it anywhere into your masterpage using something like (remember a tagprefix for your assembly):

<WebParts:UsageAnalysis runat="server" id="tracking" UAKey="UA-XXXXXX-1" />

Download the code here (no need to inject 100 lines of code here).

GA setup

My primary goal is to be able to use the same GA control unchanged through all tiers, possibly through all farms. That is, I don’t want to change the tracking tag, because then I would need different master pages or some other logic in the web control, which defeats the purpose of a staging environment just a little bit.

That means that all data goes into one big pile and I then use filters based on the host name to group them into different profiles.

Get the Account and Master Profile

To start it all, go to GA and get a new account and a master profile with a tracking tag (the term is “Web Property ID” in GA). You only need one.

(Disclaimer: Don’t expect this to be a point and click guide all the way through)

  1. Create a new account with a default “master profile” for all data. In the Create New Website profile page choose “Add a profile for new domain” and use the URL for your production site. It does not matter what you use here, it is just a reference used later. Copy the tracking id (“UA-XXXXXX-1”) to the master page or web control above
  2. As different domains go into this profile we need to add the host name to the URI being tracked through a filter. Set up a filter for this profile like this (I usually apply this filter to ALL my profiles as it makes it easier to spot errors in the filtering):

  3. GA treats URIs as case sensitive in some reports (in most it doesn’t matter), so I like to convert them all to lowercase. Set up and attach a new filter like this (apply to all profiles):
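The net effect of those two filters on a tracked page can be sketched in a few lines of Python. This only illustrates the intended result, not how GA implements it. The function names are hypothetical, and I am assuming the hostname is simply put in front of the request URI.

```python
def append_hostname(hostname, request_uri):
    """Mimic the "Append hostname" filter: put the host in front of the URI."""
    return hostname + request_uri


def to_lowercase(request_uri):
    """Mimic the "Request URI to lowercase" filter."""
    return request_uri.lower()


def apply_master_filters(hostname, request_uri):
    """Apply both filters in the order described in the steps above."""
    return to_lowercase(append_hostname(hostname, request_uri))


# Hits from different hosts stay distinguishable in the same profile.
print(apply_master_filters("Intranet.Example.com", "/Pages/Default.aspx"))
# prints "intranet.example.com/pages/default.aspx"
```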

Setup Profile for Dev, Test, QA, Prod etc.

Now you are ready to create a couple of additional profiles for whatever specialised reporting you might need. I usually have one for DEV, TEST, PROD tiers and possibly for subareas on PROD. There’s nothing here you couldn’t get from the master profile but it does make reporting a lot easier and makes it possible to limit/allow your business administrators to access reports for only their sites.

To create a new profile click the “Add new profile” button on the overview page:

You can see that I’ve created one for DEV, TEST and PROD. More will follow.

Now you need to create a bunch of filters that allow you to extract data from the Master Profile (that contains everything) into the specialised profiles. The trick is to create include or exclude filters based on the host header for each page hit.

The filters are based on regular expressions on the hostnames. I’ve been fond of regular expressions for many years – if you are not then GA has some syntax help. I generally find that it’s fairly easy to distinguish the test and dev environments. Prod is then everything excluding dev and test.

I often use the following filters:

Name           Type     Regular expression
Test env only  Include  ^test\..*
Dev env only   Include  .*dev.*|.*udv.*|^[a-zA-Z0-9]*$
Exclude test   Exclude  ^test\..*
Exclude dev    Exclude  .*dev.*|.*udv.*|^[a-zA-Z0-9]*$

  • The test expression matches anything that starts with “test.”
  • The dev expression matches anything that includes “dev” or “udv” anywhere in the hostname, or any single-word hostname without periods (e.g. “extranet” or “intranet”)

Note that especially the dev filter will need to be adjusted to suit your individual needs.

For the prod profile I then use “Exclude test”, “Exclude dev”, “Append hostname” and “Request URI to lowercase” filters.
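Before attaching filters like these it can be worth checking the expressions against your own hostnames. Below is a small Python sketch that does this with the two expressions from the table. The hostnames and the `profiles_for` helper are made up for illustration, and I am assuming the patterns are applied as a partial (unanchored) match, which is why `re.search` is used.

```python
import re

# The two expressions from the filter table above.
TEST_ENV = re.compile(r"^test\..*")
DEV_ENV = re.compile(r".*dev.*|.*udv.*|^[a-zA-Z0-9]*$")


def profiles_for(hostname):
    """Return the profiles a hit from this hostname would end up in."""
    is_test = TEST_ENV.search(hostname) is not None
    is_dev = DEV_ENV.search(hostname) is not None
    profiles = ["master"]  # the master profile receives everything
    if is_test:
        profiles.append("test")  # "Test env only" include filter
    if is_dev:
        profiles.append("dev")  # "Dev env only" include filter
    if not is_test and not is_dev:
        profiles.append("prod")  # "Exclude test" plus "Exclude dev"
    return profiles


# Hypothetical hostnames, including one likely false positive.
for host in ["test.example.com", "sp-udv01.example.com", "intranet",
             "www.example.com", "devon.example.com"]:
    print(host, profiles_for(host))
```

The last hostname shows why the dev filter needs adjusting: any production host that happens to contain “dev” would be classified as a dev environment.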

Note: It takes a little while for results to show in GA, usually hours and certainly less than a day.

Useful reports

Three simple reports that I find useful for showing off GA to new people are

  • The Content / Content Drilldown report that answers questions like how many visits there are on a given page and how many on its subpages (if looking at visits)
  • The Visitors / Map Overlay report that correlates number of visitors to countries and presents a nice colour-coded map
  • The Visitors / Browser Capabilities / Browsers report that shows the browser statistics (and yes, marketing and developers are the ones that don’t use IE)

Segments

Segments are a sort of filtering that you can apply on top of (almost) any given report. That in itself is quite useful. I’ve used the function to be able to filter reports on what SharePoint user groups the visitors belong to (see the code linked above).

For instance the geographic distribution of visitors on one of my sites that is not part of the HQ

To set it up I click the segment button / “Create a new advanced segment”. Drag the “User defined Value” to the box below.

The web control dumps the list of SharePoint user groups into the “user defined value” with a semicolon as separator. It is therefore just a question of creating matching text filters, possibly by combining a number of criteria (even regular expressions 🙂).
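Because the value is one semicolon-separated string, a plain “contains” condition can match more than intended (“Editors” would also hit a group called “Sub Editors”). A regular expression that respects the separators avoids that. Here is a Python sketch for trying such a pattern out. The group names and helper functions are hypothetical, and the exact regex dialect GA accepts may differ slightly from Python’s.

```python
import re


def segment_pattern(group):
    """Build a regex that matches one whole group name in the list."""
    # The name must be delimited by the start/end of the value or by ";".
    return r"(^|;)" + re.escape(group) + r"(;|$)"


def in_segment(user_defined, group):
    """True when the semicolon-separated value contains exactly this group."""
    return re.search(segment_pattern(group), user_defined) is not None


print(segment_pattern("Editors"))  # prints "(^|;)Editors(;|$)"
print(in_segment("Visitors;Editors;Owners", "Editors"))
print(in_segment("Visitors;Sub Editors", "Editors"))
```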

Happy reporting!

About Søren Nielsen
Long time SharePoint Consultant.

6 Responses to Implement Usage Analysis that Works easily (aka. Google Analytics)

  1. Bill Richardson says:

    Nicely done. One issue though – it’s perfectly acceptable to track individual users by username. The usage terms for GA that you refer to (Section 7 – Privacy), say that you can’t track PERSONALLY IDENTIFIABLE INFORMATION. That’s not what most people think it is – that is, username is NOT PII. Here’s a wikipedia article that does a great job of discussing what it is.

  2. Thanks Bill

    I think it’s a fair point that it’s debatable what is PII and what is not. The wikipedia link was gobbled by wordpress (is it this one? http://en.wikipedia.org/wiki/Personally_identifiable_information)

    I still think however that a username is PII as it very easily correlates to your company’s HR system and it does identify you personally.
    In that regard it is just as much a piece of PII as the social security number is.

    Still… I doubt google really cares (until the day someone complains)

  3. Danco says:

    Have to write business requirements for Tracking and A/B testing (Google Analytics/Web Optimizer). CMS platform will run on Sharepoint 2010.

    Pretty new in this so every help is more than welcomed

    • Søren Nielsen says:

      Use what you can from the post (feel free to link to it).

      I would also add that a new feature has been added to ga and html5 that allows client rendering speeds to be tracked. The next time I install ga tracking I’ll be sure to add the few extra lines required.

      Extremely useful as you will be able to see how your users experience the page speed – usually varies greatly between continents and it will give you very good data about what is worth to spend time on optimising.

