Just in Time Connection Reuse in Firefox 6

August 10, 2011 in Firefox, HTTP, HttpWatch

Firefox 6 is almost ready for release and we have updated HttpWatch to work with the latest beta versions. While doing this we noticed some unexpected behavior in the way that Firefox 6 creates new connections.

Normally, in HttpWatch you can see when a browser uses a new TCP connection by looking for the yellow Connect block in each request. You can also confirm this by adding a Client Port column to the main grid. Here’s a screenshot from HttpWatch showing new client ports being used as new connections are made to the web server:

In Firefox 6 we noticed that existing connections were sometimes reused even though a request had a Connect phase:

On closer examination we found that Firefox 6 may reuse an existing connection even though it has already started to connect a new socket. The new actions on the Overview tab show exactly what is happening:

Initially, there was no idle connection available for the CSS file download because the first connection was still being used by the request for the page’s HTML. Firefox 6 therefore started to create a new TCP connection to the host. In older versions of Firefox it would simply have waited for the new connection to be completed and then would use that connection to dispatch the second request.

However, in this case Firefox 6 reused the initial connection (port 51384) when it became available even though it was still in the process of setting up a new connection. This new connection (port 51385) wasn’t wasted though. Its setup was completed in the background and the connection was reused by another request further down the waterfall time chart.
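The dispatch behavior described above can be modeled with a small sketch. This is a hypothetical model written for illustration, not Firefox source code: the function name `dispatchRequest`, its parameters and the timings are all assumptions.

```javascript
// Hypothetical model of the dispatch decision; not taken from Firefox.
// existingFreeAtMs: when the busy existing connection becomes idle
// newReadyAtMs:     when the new TCP connection finishes connecting
// justInTime:       true models Firefox 6, false models older versions
function dispatchRequest(existingFreeAtMs, newReadyAtMs, justInTime) {
  if (justInTime && existingFreeAtMs < newReadyAtMs) {
    // The existing connection freed up first, so use it immediately.
    // The new connection still completes and is kept for later requests.
    return { connection: 'existing', sentAtMs: existingFreeAtMs };
  }
  // Otherwise wait for the new connection to finish connecting.
  return { connection: 'new', sentAtMs: newReadyAtMs };
}

const older = dispatchRequest(40, 90, false);
const firefox6 = dispatchRequest(40, 90, true);
console.log(older.connection, older.sentAtMs);       // new 90
console.log(firefox6.connection, firefox6.sentAtMs); // existing 40
```

With these assumed timings the request goes out 50 ms earlier under the just in time model.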

Sometimes, you’ll even see a case where the new connection is correctly set up but Firefox 6 still reuses an existing connection:

So why does Firefox 6 aggressively reuse existing connections instead of new connections? There are two main reasons:

  1. If an existing connection becomes available before the new connection is set up, the HTTP request can be sent off sooner.
  2. Existing connections will usually have a larger TCP congestion window and allow greater throughput.
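
The second reason can be illustrated with a deliberately simplified model of TCP slow start, in which the congestion window doubles every round trip and no packets are lost. The function name, the 1460 byte segment size and the window sizes of 3 and 24 segments are assumptions chosen for illustration, not values measured from Firefox.

```javascript
// Simplified slow-start model: the congestion window (in segments)
// doubles after every round trip and no packets are lost.
function roundTripsToSend(bytes, initialWindowSegments, segmentSize = 1460) {
  let windowSegments = initialWindowSegments;
  let remaining = Math.ceil(bytes / segmentSize); // segments still to send
  let roundTrips = 0;
  while (remaining > 0) {
    remaining -= windowSegments; // send one full window per round trip
    windowSegments *= 2;         // window grows for the next round trip
    roundTrips += 1;
  }
  return roundTrips;
}

// A 60,000 byte response on a fresh connection vs. a warmed-up one
console.log(roundTripsToSend(60000, 3));  // 4
console.log(roundTripsToSend(60000, 24)); // 2
```

Under this model the warmed-up connection delivers the same response in half the round trips.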

 

Asynchronous Google Analytics is Better but Not Faster

July 29, 2010 in Firefox, HTTP, HttpWatch, Internet Explorer, Javascript, Optimization

In December 2009, Google launched the asynchronous version of the Google Analytics script. The update aimed to address potential script blocking problems that have been extensively researched and reported by Steve Souders at Google.

Steve’s blog post about the new asynchronous loading of Google Analytics identified three potential benefits:

  1. Your pages should load faster
  2. Availability or performance problems at Google should have less impact on your site
  3. Analytics data is more likely to be collected if a user leaves a page early

Before applying the change to our web site we decided to compare the new and old versions of the Google Analytics scripts using HttpWatch 7.0.

The following sections describe what has changed in the asynchronous Google Analytics script and the tests we performed to compare it to the traditional synchronous version.

What has Changed?

Google Analytics collects data using two components on a page:

  • A JavaScript file that is loaded from http://www.google-analytics.com/ga.js
  • A 1×1 pixel image beacon (http://www.google-analytics.com/__utm.gif) that passes data back to Google in query string parameters

You enable Google Analytics on a web page by calling a page tracking function in ga.js. This function automatically gathers analytics data, such as operating system and browser versions, then generates the call to the image beacon.
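
As a rough sketch of the beacon mechanism, the snippet below builds a beacon URL that carries data in its query string. The function name `buildBeaconUrl` is hypothetical, and the three parameters shown (`utmac`, `utmp`, `utmn`) are only a small subset chosen for illustration; the real ga.js gathers and sends many more values.

```javascript
// Sketch only: builds an image beacon URL carrying data in its query string.
function buildBeaconUrl(accountId, pagePath, requestId) {
  const params = new URLSearchParams({
    utmac: accountId,        // analytics account id
    utmp: pagePath,          // page being tracked
    utmn: String(requestId), // varies per request so the image is not served from cache
  });
  return 'http://www.google-analytics.com/__utm.gif?' + params.toString();
}

console.log(buildBeaconUrl('XX-XXXXXX-X', '/download.html', 12345));
```

In a browser, assigning such a URL to the `src` of a `new Image()` is enough to trigger the request; nothing needs to be added to the page.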

The difference between the two ways of loading Google Analytics is in the way that the script file is loaded. In the traditional synchronous version, two small script tags are added at the end of the page’s <body> tag:

...
  <script type="text/javascript">
    var gaJsHost = (("https:" == document.location.protocol) ?
    "https://ssl." : "http://www.");
    document.write(unescape("%3Cscript src='" + gaJsHost +
    "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));
  </script>
 
  <script type="text/javascript">
    var pageTracker = _gat._getTracker("XX-XXXXXX-X"); // Your GA id
    pageTracker._trackPageview();
  </script>
</body>
...

The first script tag ensures that the correct HTTP or HTTPS version of the ga.js file is loaded. The second script tag then calls into the JavaScript file, triggering the beacon image download.

These two script tags are placed at the end of the body tag to ensure that they don’t hold up the download of any other resources on the page. The disadvantage of doing this is that the analytics call may not be triggered if the user exits the page before it has completely downloaded.

The asynchronous version of the Google Analytics loading code uses a single script tag at the end of the page’s <head>:

...
 
  <script>
    var _gaq = _gaq || [];
    _gaq.push(['_setAccount', 'XX-XXXXXX-X']);
    _gaq.push(['_trackPageview']);
 
    (function() {
      var ga = document.createElement('script');
      ga.type = 'text/javascript';
      ga.async = true;
      ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www')
        + '.google-analytics.com/ga.js';
      var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
 
    })();
  </script>
 
</head>
...

It sets up the parameters required to make the call into ga.js but doesn’t invoke the function directly. A script element is added to the document allowing asynchronous download without blocking other elements of the page. The HTML 5 async attribute is also set on the script tag for browsers that support it.

Once the ga.js file is loaded and executed it looks for an array variable called _gaq and executes the function with the previously specified arguments.
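
The queueing pattern can be sketched as follows. This is a simplified illustration, not the real ga.js implementation: `createTracker` and `drainQueue` are hypothetical names, and the tracker here only records what it was asked to do.

```javascript
// Hypothetical stand-in for the tracker object inside ga.js.
function createTracker(log) {
  return {
    _setAccount: (id) => log.push('account ' + id),
    _trackPageview: (path) => log.push('pageview ' + (path || '(current page)')),
  };
}

// Runs every queued command, then returns a replacement object whose
// push() executes commands immediately instead of queueing them.
function drainQueue(queue, tracker) {
  const run = (command) => tracker[command[0]].apply(tracker, command.slice(1));
  queue.forEach(run);
  return { push: run };
}

const log = [];
let _gaq = [];
_gaq.push(['_setAccount', 'XX-XXXXXX-X']); // queued before the script loads
_gaq.push(['_trackPageview']);
_gaq = drainQueue(_gaq, createTracker(log)); // what happens once the script has loaded
_gaq.push(['_trackPageview', '/httpwatch.exe']); // now runs immediately
console.log(log);
```

Because the page only ever calls `push`, the same calling code works both before and after the script has arrived.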

Do Pages Load Faster With Asynchronous Google Analytics?

To try this out we created a version of our Download page using both versions of the Google Analytics loading code. We first tried the synchronous version with an empty cache in IE to simulate a new visitor to the page:

Synchronous GA Test With Empty cache in IE

And then with the asynchronous version:

Asynchronous GA Test With Empty cache in IE

The page load times were dominated by other components on the page. Using the asynchronous version of Google Analytics didn’t really make any difference.

We tried the same tests with a primed cache to see if there would be a greater impact when the page was loaded during a repeat visit. First the synchronous version:

Synchronous GA Test With Primed cache in IE

and then the asynchronous version:

Asynchronous GA Test With Primed cache in IE

Again, allowing for variability in our tests, there was practically no difference in the page load time.

The reason for that is that calls to Google Analytics are incredibly fast (usually around 100 ms or less) and the download of the image beacon doesn’t block other components because it uses a different hostname.

We tried Firefox 3.6 and got almost the same results.

Conclusion: Google Analytics is so fast that you won’t see any significant improvement with the asynchronous loading version.

Would Google Performance Problems Have Less Impact With Asynchronous Google Analytics?

For this test we needed to simulate performance problems at Google. We did this by modifying the standard script snippets to call our own ASP.NET versions of the Google files that served the same content but added a 6 second delay.

For example, here’s the ASPX file we used to serve up a local copy of ga.js:

<%@ Page Language="C#" Debug="true" %>
<%@ Import Namespace="System.IO" %>
<%
  // Simulate a slow response from Google by pausing for 6 seconds
  System.Threading.Thread.Sleep(6000);
  Response.AddHeader("Content-Type", "text/javascript");
  // Send the local copy of ga.js
  Response.WriteFile( ".\\ga.js");
  Response.End();
%>

We did something similar with the Google __utm.gif file so that we could add delays to either component.
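
The same kind of delay could be simulated outside ASP.NET. The sketch below is not the setup used for the tests in this post: `slowHandler` is a hypothetical helper written for Node.js, and the port and file name in the commented wiring are assumptions.

```javascript
// Hypothetical helper: returns a Node.js-style request handler that waits
// delayMs milliseconds before sending the given body.
function slowHandler(delayMs, contentType, body) {
  return function (request, response) {
    setTimeout(function () {
      response.writeHead(200, { 'Content-Type': contentType });
      response.end(body);
    }, delayMs);
  };
}

// Example wiring (left commented out so nothing starts listening):
// const http = require('http');
// const fs = require('fs');
// http.createServer(slowHandler(6000, 'text/javascript', fs.readFileSync('ga.js')))
//   .listen(8080);
```

A second handler with a content type of `image/gif` would cover the beacon file in the same way.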

Using our slow simulation of ga.js in IE 8 we found that the page’s onload event was delayed with the synchronous version of Google Analytics:

Slow ga.js in IE with Synchronous Google Analytics

This caused IE 8 to display the spinning icon in the current tab:

Spinning Tab Icon in IE 8

and a “Waiting…” message in the status bar:

Waiting message in IE 8 Status bar

indicating to the user that the page was not fully downloaded.

We then tried using the slow version of the __utm.gif image file. This didn’t cause a problem in IE 8 as the page loaded successfully but continued to download the image beacon in the background even with the synchronous version of Google Analytics:

Slow __utm.gif with synchronous Google Analytics

Firefox 3.6 didn’t cope as well. Delaying the download of the beacon file with the synchronous loading code had the same effect as delaying the ga.js file – the page load was delayed, the spinning tab icon was displayed and the status line indicated that it was waiting for data.

We then tried the asynchronous version of the analytics loading code. Delaying either the ga.js or __utm.gif file had no effect on the loading of the page in IE, effectively hiding the issue from the web site visitor:

Slow ga.js in IE with asynchronous Google Analytics

Surprisingly, the asynchronous version of the analytics code made no difference to Firefox 3.6 when we simulated slow downloads of the Google components:

Slow ga.js loading in Firefox with asynchronous Google Analytics

Conclusion: Using the asynchronous load made IE 8 more robust to performance problems in the loading of ga.js. Otherwise it made no difference. In reality, ga.js is often cached anyway, making it the less likely of the two components to be subject to performance problems.

Is Data More Likely to be Recorded by Asynchronous Google Analytics During Early Page Exits?

To test this potential benefit we changed our test pages to include a slow loading script file at the top of the body tag. The idea was to emulate what might happen on your page if a third party component, such as an ad script, started to slow down.

The script tag we added called an ASPX file that delayed 6 seconds before returning an empty script block:

<body>
  <script type="text/javascript" src='http://veryslowdownload/ad.aspx'></script>

This stopped our page being displayed for six seconds.

Using the synchronous version of the loading code in IE 8 and Firefox 3.6, we found that the Google Analytics beacon image was not downloaded if the user gave up and went elsewhere:

Early page exit in IE 8 with synchronous Google Analytics

The asynchronous loading of Google Analytics solved this problem in both browsers. The image beacon was downloaded almost immediately even though the page was blocked by the slow script tag:

Early page exit in Firefox 3.6 with asynchronous Google Analytics

Conclusion: The asynchronous version of Google Analytics helps to ensure that analytics data is gathered in IE and Firefox when the user leaves a page early.

Should I Use the Asynchronous Version of Google Analytics?

Yes, but not for the reasons you might expect. It’s unlikely to make any difference to how quickly your pages load.

The main reason to use it is that you are more likely to get analytics data if a user leaves a page early.

Four Tips for Setting up HTTP File Downloads

March 24, 2010 in Firefox, HTTP, HTTPS, HttpWatch, Internet Explorer, Javascript, Optimization

Web sites don’t just contain pages; sometimes you need to provide files that users can download. Putting a file on your web server and linking to it from an HTML page is just the first step. You also need to be aware of the HTTP response headers that affect file downloads.

These four tips cover some of the issues you may run into:

Tip #1: Forcing a Download and Controlling the File Name

Providing a download link in the HTML is easy:

...
<a href="http://download.httpwatch.com/httpwatch.exe">Download</a>
...

It works well for binary files like setup programs and ZIP archives that the browser doesn’t know how to display. A dialog is displayed allowing the user to save the file locally:

IE File Save Dialog

The trouble is that the browser behaves differently if the file is something that it can display itself. For example, if you link to a plain text file the browser just opens it and doesn’t prompt to save the download:

Plain Text in IE

You can force the use of the file download dialog by adding the following response header:

Content-Disposition: attachment; filename=<file name.ext>

The header also allows you to control the default file name. This can be handy if you’re generating the content in something like getfile.aspx but you want to supply a more meaningful file name to the user.

For static content you can manually configure the additional header in your web server. For example, here’s the setting in IIS:

Content-Disposition header setting in IIS

For dynamically generated content you would need to add this header in the page’s server side code.
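
As a minimal sketch of what that server side code needs to produce, the hypothetical helper below builds the header value. The function name is an assumption, and it only handles plain ASCII file names; names containing other characters need additional encoding that is not shown here.

```javascript
// Builds a Content-Disposition value that forces a download and suggests
// a file name. The name is quoted so that spaces are handled correctly.
function attachmentHeader(fileName) {
  // Escape backslashes and double quotes inside the quoted string
  const escaped = fileName.replace(/(["\\])/g, '\\$1');
  return 'attachment; filename="' + escaped + '"';
}

console.log(attachmentHeader('rfc2616.txt'));
console.log(attachmentHeader('HTTP spec.txt'));
```

The returned string would be sent as the value of the `Content-Disposition` response header by whatever page generates the content.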

After adding the header, the browser will always prompt the user to download the file:

Plain text file download prompt

Tip #2: Use Effective HTTP Caching

Like any other content, it’s worth setting up HTTP caching to maximize the speed of download and minimize your bandwidth costs. Usually content needs to expire immediately or be cached forever.

Our example download of the HTTP spec (RFC2616) could be cached forever because it is not expected to change. You can see here in HttpWatch we have set up a far future Expires value and set Cache-Control to public:

Caching headers for the download shown in HttpWatch

This allows future downloads of the file to be delivered from the local browser cache or an intermediate proxy. If the file is subject to frequent changes, you may want to expire it immediately so that a fresh copy is always downloaded. You can do this by setting Expires to -1 or any date in the past.
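
The two policies can be sketched as a small helper that returns the response headers to send. The function name, the policy names and the choice of one year as the "far future" are assumptions made for this example.

```javascript
// Returns caching headers for a download.
// policy 'forever': far future Expires plus Cache-Control: public
// any other policy: expire immediately so a fresh copy is always fetched
function cachingHeaders(policy, now = new Date()) {
  if (policy === 'forever') {
    const oneYearMs = 365 * 24 * 60 * 60 * 1000; // assumed "far future"
    return {
      'Cache-Control': 'public',
      'Expires': new Date(now.getTime() + oneYearMs).toUTCString(),
    };
  }
  return { 'Expires': '-1' };
}

console.log(cachingHeaders('forever'));
console.log(cachingHeaders('immediate'));
```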

Tip #3: Don’t break HTTPS downloads in IE

It’s tempting to use the no-store and no-cache directives with the Cache-Control response header to prevent any caching of a file that is often updated:

Cache-Control: no-store, no-cache

This works in Firefox, but watch out for Internet Explorer. It interprets these flags as meaning that the content should never be saved to the disk when HTTPS is being used and causes the file download dialog to hang at 0% for several minutes:

HTTPS file download hanging at 0% in IE

It eventually displays an error message:

HTTPS file download error message in IE

There’s more information about this problem and other possible causes in a post on Eric Lawrence’s IEInternals blog.
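
A simple check for this combination could look like the sketch below. The function name is hypothetical, and it only covers the case described in this tip: an HTTPS download served with no-store or no-cache in Cache-Control.

```javascript
// Returns true if a download URL and its response headers match the
// combination described above: HTTPS plus no-store or no-cache.
function mayBreakIeHttpsDownload(url, headers) {
  const cacheControl = (headers['Cache-Control'] || '').toLowerCase();
  const preventsCaching = /\bno-(store|cache)\b/.test(cacheControl);
  return url.toLowerCase().startsWith('https:') && preventsCaching;
}

console.log(mayBreakIeHttpsDownload(
  'https://download.httpwatch.com/httpwatch.exe',
  { 'Cache-Control': 'no-store, no-cache' }
)); // true
```

A check like this could be run against download responses before a release to catch the problem early.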

Tip #4: Don’t Forget to Set Up Analytics

You’ll probably want to track file downloads along with other metrics from your web site. JavaScript-based solutions such as Google Analytics are very popular, but will not show file downloads by default. This is because downloading a file does not cause any JavaScript to be executed.

With Google Analytics you need to add an onclick handler to enable download tracking:

...
<a onclick="pageTracker._trackPageview('/httpwatch.exe');" href="...">Download</a>
...

You can see the Google Analytics call being made just before the file download starts:

Google Analytics call made before the file download
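
If a page uses the asynchronous loading code instead, there is no pageTracker variable, and the equivalent call is pushed onto the `_gaq` queue. The helper below is a sketch under that assumption; `trackDownload` is a hypothetical name, and it records the path portion of the link as a virtual pageview.

```javascript
// Sketch: queue a virtual pageview for a downloaded file.
// `queue` stands in for the page's _gaq array.
function trackDownload(queue, href) {
  const path = new URL(href).pathname; // e.g. '/httpwatch.exe'
  queue.push(['_trackPageview', path]);
  return path;
}

const _gaq = [];
trackDownload(_gaq, 'http://download.httpwatch.com/httpwatch.exe');
console.log(_gaq[0]);
```

In a page, this could be called from a download link's onclick handler, passing `_gaq` and `this.href`.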
