Asynchronous Google Analytics is Better but Not Faster

calendarJuly 29, 2010 in Firefox , HTTP , HttpWatch , Internet Explorer , Javascript , Optimization

In December 2009, Google launched the asynchronous version of the Google Analytics script. The update aimed to address potential script blocking problems that have been extensively researched and reported by Steve Souders at Google.

Steve’s blog post about the new asynchronous loading of Google Analytics identified three potential benefits:

  1. Your pages should load faster
  2. Availability or performance problems at Google should have less impact on your site
  3. Analytics data is more likely to be collected if a user leaves a page early

Before applying the change to our web site we decide to compare the new and old versions of the Google Analytics scripts using HttpWatch 7.0 .

The following sections describe what has changed in the asynchronous Google Analytics script and the tests we performed to compare it to the traditional synchronous version.

What has Changed?

Google Analytics collects data using two components on a page:

  • A javascript file script file that is loaded from http://www.google-analytics.com/ga.js
  • A 1×1 pixel image beacon (http://www.google-analytics.com/__utm.gif) that passes data back to Google in query string parameters

You enable Google Analytics on a web page by calling a page tracking function in ga.js. This function automatically gathers analytics data, such as operating system and browser versions, then generates the call to the image beacon.

The difference between the two ways of loading Google Analytics is in the way that the script file is loaded. In the traditional synchronous version, two small script tags are added at the end the page’s <body> tag:

...
  <script type="text/javascript">
    var gaJsHost = (("https:" == document.location.protocol) ?
    "https://ssl." : "http://www.");
    document.write(unescape("%3Cscript src='" + gaJsHost +
    "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));
  </script>
 
  <script type="text/javascript">
    var pageTracker = _gat._getTracker("XX-XXXXXX-X"); // Your GA id
    pageTracker._trackPageview();
  </script>
</body>
...

The first script tag ensures that the correct HTTP or HTTPS version of the qa.js file is loaded. The second script tag then calls into the javascript file triggering the beacon image download.

These two script tags are placed at the end of the body tag to ensure that they don’t hold up the download of any other resources on the page. The disadvantage of doing this is that the analytics call may not be triggered if the user exits the page before it has completely downloaded.

The asynchronous version of the Google Analytics loading code uses a single script tag at the end of the page’s <head> :

...
 
  <script>
    var _gaq = _gaq || [];
    _gaq.push(['_setAccount', 'XX-XXXXXX-X']);
    _gaq.push(['_trackPageview']);
 
    (function() {
      var ga = document.createElement('script');
      ga.type = 'text/javascript';
      ga.async = true;
      ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www')
        + '.google-analytics.com/ga.js';
      var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
 
    })();
  </script>
 
</head>
...

It sets up the parameters required to make the call into ga.js but doesn’t invoke the function directly. A script element is added to the document allowing asynchronous download without blocking other elements of the page. The HTML 5 async attribute is also set on the script tag for browsers that support it.

Once the ga.js file is loaded and executed it looks for an array variable called _gaq and executes the function with the previously specified arguments.

Do Pages Load Faster With Asynchronous Google Analytics?

To try this out we created a version of our Download page using both versions of the Google Analytics loading code. We first tried the synchronous version with an empty cache in IE to simulate a new visitor to the page:

Synchronous GA Test With Empty cache in IE

And then with the asynchronous version:

Asynchronous GA Test With Empty cache in IE

The page load times were dominated by other components on the page. Using the asynchronous version of Google Analytics didn’t really make any difference.

We tried the same tests with a primed cache to see if there would be a greater impact when the page was loaded during a repeat visit. First the synchronous version:

Synchronous GA Test With Primed cache in IE

and then the asynchronous version:

Asynchronous GA Test With Primed cache in IE

Again there was practically no difference in the page load time of the page allowing for variability in our tests.

The reason for that is that calls to Google Analytics are incredibly fast (usually around a 100ms or less) and the download of the image beacon doesn’t block other components because it uses a different hostname.

We tried Firefox 3.6 and got almost the same results.

Conclusion: Google Analytics is so fast that you won’t see any significant improvement with the asynchronous loading version.

Would Google Performance Problems Have Less Impact With Asynchronous Google Analytics?

For this test we needed to simulate performance problems at Google. We did this by modifying the standard script snippets to call our own ASP.NET versions of the Google files that served the same content but added a 5 second delay.

For example, here’s the ASPX file we used to serve up a local copy of ga.js:

<%@ Page Language="C#" Debug="true" %>
<%@ Import Namespace=System.IO %>
<%
  System.Threading.Thread.Sleep(6000);
  Response.AddHeader("Content-Type", "text/javascript");
  Response.WriteFile( ".\\ga.js");
  Response.End();
%>

We did something similar with the Google __utm.gif file so that we could add delays to either component.

Using our slow simulation of ga.js in IE 8 we found that the page’s onload event was delayed with the synchronous version of the Google Analytics:

Slow ga.js in IE with Synchronous Google Analytics

This caused the IE 8 to display the spinning icon in the current tab:

Spinning Tab Icon in IE 8

and a “Waiting…” message in the status bar:

Waiting message in IE 8 Status bar

indicating to the user that the page was not fully downloaded.

We then tried using the slow version of the __utm.gif image file. This didn’t cause a problem in IE 8 as the page loaded successfully but continued to download the image beacon in the background even with the synchronous version of Google Analytics:

Slow __utm.gif with synchronous Google Analytics

Firefox 3.6 didn’t cope as well. Delaying the download of the beacon file with the synchronous loading code had the same effect as delaying the ga.js file – the page load was delayed, the spinning tab icon was displayed and the status line indicated that it was waiting for data.

We then tried the asynchronous version of the analytics loading code. Delaying either the ga.js or utm.gif file had no effect on the loading of the page in IE; effectively hiding the issue from the web site visitor:

Slow ga.js in IE with asynchronous Google Analytics

Surprisingly, the asynchronous version of the analytics code made no difference to Firefox 3.6 when we simulated slow downloads of the Google components:

Slow ga.js loading in Firefox with asynchronous Google Analytics

Conclusion: Using the asynchronous load made IE 8 more robust to performance problems in the loading of qa.js. Otherwise it made no difference. In reality, ga.js is often cached anyway making it the less likely of the two components to be subject to performance problems.

Is Data More Likely to be Recorded by Asynchronous Google Analytics During Early Page Exits?

To test this potential benefit we changed our test pages to include a slow loading script file at the top of the body tag. The idea was to emulate what might happen on your page if a third party component, such as an ad script, started to slow down.

The script tag we added called an ASPX file than delayed 6 seconds before returning an empty script block:

<body>
  <script type="text/javascript" src='http://veryslowdownload/ad.aspx'></script>

This stopped our page being displayed for six seconds.

Using the synchronous version of the loading code in IE 8 and Firefox 3.6, we found that the Google Analytics beacon image was not downloaded if the user gave up and went elsewhere:

Early page exit in IE 8 with synchronous Google Analytics

The asynchronous loading of Google Analytics solved this problem in both browsers. The image beacon was downloaded almost immediately even though the page was blocked by the slow script tag:

Early page exit in Firefox 3.6 with asynchronous Google Analytics

Conclusion: The asynchronous version of Google Analytics helps to ensure that analytics data is gathered in IE and Firefox when the user leaves a page early.

Should I Use the Asynchronous Version of Google Analytics?

Yes, but not for the reasons you might expect. It’s unlikely to make any difference to how quickly your pages load.

The main reason to use it is that you are more likely to get analytics data if a user leaves a page early.

Four Tips for Setting up HTTP File Downloads

calendarMarch 24, 2010 in Firefox , HTTP , HTTPS , HttpWatch , Internet Explorer , Javascript , Optimization

Web sites don’t just contain pages; sometimes you need to provide files that users can download. Putting a file on your web server and linking to it from an HTML page is just the first step. You also need to be aware of the HTTP response headers that affect file downloads.

These four tips cover some of the issues you may run into:

Tip #1: Forcing a Download and Controlling the File Name

Providing a download link in the HTML is easy:

...
<a href="http://download.httpwatch.com/httpwatch.exe">Download</a>
...

It works well for binary files like setup programs and ZIP archives that the browser doesn’t know how to display. A dialog is displayed allowing the user to save the file locally:

IE File Save Dialog

The trouble is that the browser behaves differently if the file is something that it can display itself. For example, if you link to a plain text file the browser just opens it and doesn’t prompt to save the download:

Plain Text in IE

You can force the use of the file download dialog by adding the following response header:

Content-Disposition: attachment; filename=<file name.ext>

The header also allows you to control the default file name. This can be handy if you’re generating the content in something like getfile.aspx but you want to supply a more meaningful file name to the user.

For static content you can manually configure the additional header in your web server. For example, here’s the setting in IIS:

content_disposition_header

For dynamically generated content you would need to add this header in the page’s server side code.

After adding the header, the browser will always prompt the user to download the file:

plain_text_download

Tip #2: Use Effective HTTP Caching

Like any other content, it’s worth setting up HTTP caching to maximize the speed of download and minimize your bandwidth costs. Usually content needs to expire immediately or be cached forever.

Our example download of the HTTP spec (RFC2616) could be cached forever because it is not expected to change. You can see here in HttpWatch we have set up a far futures Expires value and set Cache-Control to public :

effective_caching

This allows future downloads of the file to be delivered from the local browser cache or an intermediate proxy. If the file is subject to frequent changes, you may want to expire it immediately so that a fresh copy is always downloaded. You can do this by setting Expires to -1 or any date in the past.

Tip #3: Don’t break HTTPS downloads in IE

It’s tempting to use the no-store and no-cache directives with the Cache-Control response header to prevent any caching of a file that is often updated:

Cache-Control: no-store, no-cache

This works in Firefox, but watch out for Internet Explorer. It interprets these flags as meaning that the content should never be saved to the disk when HTTPS is being used and causes the file download dialog to hang at 0% for several minutes:

https_ie_hang

It eventually displays an error message:

https_ie_error

There’s more information about this problem and other possible causes in a post on Eric Lawrence’s IEInternals blog.

Tip #4: Don’t Forget to Setup Analytics

You’ll probably want to track file downloads along with other metrics from your web site. Javascript based solutions such as Google Analytics are very popular, but will not show file downloads by default. This is because downloading a file does not cause any Javascript to be executed.

With Google Analytics you need to add an onlick handler to enable download tracking:

...
<a onclick="pageTracker._trackPageview('/httpwatch.exe');" href="...">Download</a>
...

You can see the Google Analytics call being made just before the file download starts:

ga_download

Using Protocol Relative URLs to Switch between HTTP and HTTPS

calendarFebruary 10, 2010 in HTTPS , HttpWatch , Internet Explorer

Attempting to use HTTP resources on an secure web page is a guaranteed way to annoy IE users, because it generates a confusing security warning for anyone running with the default IE settings. Our previous post on fixing this message in IE 8 is responsible for more than 50% of the traffic and comments on this blog. Hopefully, Microsoft will take note and change this in a future version of IE.

In the meantime, it’s important to avoid this warning by ensuring that every image, CSS and Javscript file on a secure page is accessed using HTTPS. For content on the same domain it’s quite straightforward – you just need to use relative URLs. A relative URL contains the ‘offset’ URL that needs to be applied to the page’s absolute URL in order to find a resource.

For example, here’s a screen shot from HttpWatch accessing our home page over HTTPS:

HttpWatch Home Page Access with HTTPS

Searching for one of the images in HttpWatch shows that it was specified with a relative URL and the switch to HTTPS happened automatically:

Relative URL

A problem arises though, if you attempt to access a resource from a different domain because you can’t use the simple path-relative URL to access the resource. This often happens when you attempt to use a third party service such as Google Analytics or a third party Ajax library CDN.

Google Analytics solves the problem with its external javascript file by recommending the use of this code to dynamically switch protocols:

var gaJsHost = (("https:" == document.location.protocol) ?
    "https://ssl." : "http://www.");
document.write(unescape("%3Cscript src='" + gaJsHost +
    "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));

Another solution is to hard code the external resource with an HTTPS based URL regardless of how the containing page is accessed. This avoids any security warnings in IE, but does mean that the HTTPS resource download will consume more CPU resources in cases where the HTTP download would have been sufficient.

The solution we prefer to use is to specify a Protocol Relative URL. It’s just like a regular URL except that you leave out the protocol prefix. For example, when Microsoft recently announced SSL support for their Ajax CDN the recommended script tag for secure pages was:

<script src=”https://ajax.microsoft.com/ajax/jquery/jquery-1.3.2.min.js” type=”text/javascript”></script>

But if you change this to a protocol relative URL:

<script src=”//ajax.microsoft.com/ajax/jquery/jquery-1.3.2.min.jstype=”text/javascript”></script>

You get the automatic use of HTTPS on secure pages and avoid the overhead of HTTPS on non-secure pages.

We put together a simple jQuery demo page using this technique. In HttpWatch, you can see that the resource is automatically downloaded with HTTP for non-secure access:

and HTTPS with secure access:

One downside of this approach is that there will be separate cached copies for both the HTTP and HTTPS based URLs. If a vistor to your site first uses a non-secure HTTP based URL and then switches to HTTPS they would have to download fresh copies of resources that had already been accessed over HTTP.

UPDATE: Steve Souders pointed out another downside to using protocol relative URLs; they cause a double download of CSS files in IE 7 & 8.

You can check SSL/TLS configuration our new SSL test tool SSLRobot . It will also look for potential issues with the certificates, ciphers and protocols used by your site. Try it now for free!

Ready to get started? TRY FOR FREE Buy Now