Blocked time and IE 8

March 31, 2008 in HTTP, HttpWatch, Internet Explorer, Optimization

A common question we hear from our customers is “What is the Blocked time in HttpWatch and why are we seeing so much of it?”

The Blocked time in HttpWatch is shown as a gray block at the start of a request:

Blocked Time

We measure this time by looking at the time interval between these two events:

  1. The point at which IE determines that it requires the resource at a certain URL. For example, it may have downloaded some HTML and encountered an <img> tag that refers to a GIF file.
  2. At some later time IE performs a network action required to download or validate a cached copy of the resource.

During this time interval, IE will check the cache to see if the resource is stored locally, determine what headers and cookies would have to be sent in a GET request to the server, and if necessary, wait for an existing connection to become available.

Although IE is multi-threaded and could start downloading many resources in parallel, it is subject to a connection limit per host name. The HTTP 1.1 specification (RFC2616) recommends that HTTP clients should have no more than two connections active per host name. Therefore, a web page that has all its embedded images, CSS and JavaScript files on the same host name (e.g. www.example.com) will only be able to use a maximum of two connections to download content.
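
The effect of the per-host limit can be sketched with a small simulation (the resource names and timings below are made up; real connection handling in IE is more involved than a simple semaphore):

```python
import threading
import time

# Hypothetical simulation of IE 7's two-connections-per-host limit:
# each resource must acquire one of two "connections" before it can
# download, so the remaining requests sit in the blocked state.

CONNECTIONS_PER_HOST = 2          # IE 7 default, per RFC 2616 guidance
connections = threading.Semaphore(CONNECTIONS_PER_HOST)
results = []
lock = threading.Lock()

def download(name, seconds):
    queued = time.monotonic()
    with connections:             # blocked time is spent waiting here
        blocked = time.monotonic() - queued
        time.sleep(seconds)       # simulated transfer time
    with lock:
        results.append((name, round(blocked, 1)))

threads = [threading.Thread(target=download, args=(f"img{i}.gif", 0.2))
           for i in range(6)]
for t in threads: t.start()
for t in threads: t.join()

# With 6 resources and 2 connections, later requests show roughly
# 0.2s or 0.4s of blocked time while the first pairs download.
print(sorted(results, key=lambda r: r[1]))
```

Six resources on one host take three "rounds" of two downloads each, which is exactly the staircase pattern of gray blocks you see in the HttpWatch time chart.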

Here’s a screenshot from HttpWatch that shows what happens when you force a refresh (Ctrl+F5) of our home page using IE 7:

Two connections per host in IE 7

At any one time, only two HTTP requests are actively downloading content. The rest are in the blocked state waiting for their turn to use one of the two connections to the web server. The two yellow segments in the first two requests indicate that two TCP connections were made to the server (a yellow segment in an HttpWatch time chart indicates the TCP connect time).

For many web pages this has a major performance impact and is a strong motivator for reducing the number of round trips whenever possible.

Microsoft has just released IE 8 Beta 1. One of the most significant changes compared to IE 7 is that the maximum number of connections per host name has been changed. If you are on a fast, broadband connection it now uses a maximum of six rather than two connections per host name.

BTW, if you are going to use HttpWatch with IE 8 Beta 1 please make sure that you download and install version 5.3.

This screenshot shows the six connections being made with IE 8 and the increased concurrency during download:

Six connections per host in IE 8

The increase in the number of connections leads to a major reduction in blocked time. We found that the load time for our home page, with an empty cache, is on average 50% less in IE 8 Beta 1 compared to IE 7.

Steve Souders, author of ‘High Performance Web Sites’, has written a blog post entitled ‘Roundup on Parallel Connections’ that discusses the connection limits of other browsers and the possible effect on existing web servers.

Image Caching in Internet Explorer

February 27, 2008 in Caching, HttpWatch, Internet Explorer, Optimization

If you build, maintain or tune web sites you’ll know about the browser cache and how to control caching using HTTP response headers. We’ve talked about caching in several previous posts.

However, you may not be aware that IE uses two caches for holding images. First, there is the regular browser cache that keeps a copy of downloaded image files in your Temporary Internet Files folder. It’s this cache that can be controlled by HTTP response headers such as Cache-Control and Expires.

There is also an image cache that IE uses to hold recently used images in memory. The main difference between the image cache and the browser cache is:

  • The image cache is never written to disk and is always emptied when IE closes
  • The image cache contains an expanded, Windows bitmap version of GIF, JPG or PNG files
  • HttpWatch does not record access to the image cache, unlike the 304 and (Cache) responses that you’ll see when IE reads content from the browser cache

The point of the image cache is that it can be used to quickly render images without taking the CPU hit required to convert them from their native compressed format into a Windows bitmap.

The only documentation about the image cache is the “Image Cache Limits” section of the registry settings for IE on Windows CE:

http://msdn2.microsoft.com/en-us/library/aa908131.aspx

Here’s the relevant section:

Image Caching in Windows CE

Although these settings don’t work with IE 6 or 7, they do show that the image cache has limits on the size of images, the number of images cached and the maximum amount of memory that can be used. Limits like this have to be set because the expanded images can take up to 500% more memory than the original PNG, GIF or JPG format.
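
The expansion factor is easy to estimate with some back-of-envelope arithmetic (the 4 bytes per pixel and the example dimensions are assumptions for illustration, not measured IE behavior):

```python
# A decoded image is stored as an uncompressed bitmap, roughly
# width * height * 4 bytes at 32 bits per pixel.

def expanded_size_kb(width, height, bytes_per_pixel=4):
    """Approximate in-memory bitmap size of a decoded image."""
    return width * height * bytes_per_pixel / 1024

# A hypothetical 400x320 PNG that is ~80 KB on disk:
bitmap_kb = expanded_size_kb(400, 320)
print(f"{bitmap_kb:.0f} KB in memory")           # 500 KB
print(f"{bitmap_kb / 80:.1f}x the 80 KB file")   # ~6.3x
```

A well-compressed PNG can therefore easily balloon to several times its file size once decoded, which is why the cache needs per-image and total-memory limits.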

Let’s see the effect of the image cache by starting a new instance of IE and visiting www.httpwatch.com:

Images loaded from browser cache

The (Cache) responses shown in HttpWatch indicate that the images on our home page were read from the browser cache.  The image cache would have been empty at this point because IE had just started.

If you click on the Download tab and then click back on the Home page tab you’ll see that only two images are shown in the resulting HttpWatch trace:

Image Caching on HttpWatch home page

Clearly, all the images for the home page must have been loaded from somewhere because the page was correctly rendered, but only two image requests from the browser cache are seen in HttpWatch. This is because the other images were read from the image cache.

Why were two of the PNG files still read from the browser cache? These two images are the largest image files on the page and they would expand to approximately 500 KB and 300 KB if they were both placed in the image cache. They probably breached the maximum image size limit in the IE 7 image cache and were therefore not stored in their expanded form.

It’s also possible to see that the image cache has a limit on the number of images it stores. Try visiting ebay.com and then yahoo.com in the same browser window to load up the image cache. If you then go back to www.httpwatch.com you’ll see that all the images have been flushed out of the image cache and had to be reloaded from the browser cache.

So, to make effective use of the image cache in IE you should:

  1. Minimize the number of images that your site uses
  2. Avoid single images that might expand to more than approximately 200 KB

Of course, you should always try to minimize the number of images to reduce the network round-trips when a browser first loads a page. A popular technique for doing this is to use CSS sprites to merge several images together and there are some great tools to help you create the compound image.

Be careful, though, that you don’t run into item 2) by creating a single large image that is not loaded into the image cache. You would get the benefit of fewer round trips on an initial visit, but rendering in IE might actually be slower because the image would have to be expanded from the browser cache whenever it was re-displayed.

The Performance Impact of Uploaded Data

January 18, 2008 in Caching, HTTP, HttpWatch, Optimization

Web developers are becoming more aware of the performance penalties of page bloat, and as we covered in previous posts, there are ways to mitigate it, compression being just one.

However, one cause of poor performance that is often overlooked is the time taken to upload data to the server. Although HTTP request messages are typically smaller than HTTP response messages, the performance cost per byte can be an order of magnitude higher. This is caused by the asymmetric nature of many consumer broadband connections.

For example, the results of a speed test on a UK broadband cable connection are shown here:

Broadband Speed Test

The upload speed is only about 6% of the download speed. That means, byte for byte, uploaded data takes about 16 times as long to transmit as the equivalent amount of downloaded data.

To put it another way, if you upload 4 KB of data in an HTTP request message it may take the same length of time as downloading a 64 KB page.
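
Here's the arithmetic behind that comparison (the 4 Mbps down / 0.25 Mbps up figures are assumed values chosen to match the roughly 6% ratio above):

```python
# Time to transfer a payload over an asymmetric broadband link.

def transfer_seconds(size_kb, speed_mbps):
    """Seconds to move size_kb of data at speed_mbps (megabits/sec)."""
    bits = size_kb * 1024 * 8
    return bits / (speed_mbps * 1_000_000)

down_mbps, up_mbps = 4.0, 0.25          # assumed asymmetric speeds

upload_time = transfer_seconds(4, up_mbps)        # 4 KB request
download_time = transfer_seconds(64, down_mbps)   # 64 KB response

print(f"4 KB up:    {upload_time:.2f} s")
print(f"64 KB down: {download_time:.2f} s")
# Both take ~0.13 s: byte for byte, upload costs 16x as much here.
```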

You can easily see the size of the HTTP request message by looking at the Sent column in HttpWatch:

 Sent Column

The value shown in the Sent column is made up of the size of the following items:

  • The HTTP GET or POST request line
  • HTTP request headers
  • Form fields and uploaded files sent with POST requests
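
As a rough sketch, you can tally those items yourself (the header values below are hypothetical, and real requests include a few more details than this):

```python
# Approximate the Sent figure: request line + headers + blank line
# separator + any POST body.

def sent_bytes(method, path, headers, body=b""):
    request_line = f"{method} {path} HTTP/1.1\r\n"
    header_block = "".join(f"{k}: {v}\r\n" for k, v in headers.items())
    return (len(request_line.encode()) + len(header_block.encode())
            + 2                       # blank line ending the headers
            + len(body))

headers = {
    "Host": "www.example.com",
    "User-Agent": "Mozilla/4.0 (compatible; MSIE 7.0)",
    "Accept": "*/*",
    "Cookie": "session=abc123; prefs=compact",
}
print(sent_bytes("GET", "/index.html", headers), "bytes sent")
```

Even a simple GET with a modest cookie runs to well over a hundred bytes, and every one of those bytes travels over the slow upload side of the link.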

Unfortunately, request data is never compressed because there is no equivalent of the Accept-Encoding request header that would allow a server to indicate that it supports compression of uploaded content.

For a typical site, you might be surprised to know that the request data can be up to 50% of the size of the response data. Since many broadband connections are asymmetric, this can have a substantial impact on performance. Here’s an example of a flight search page on Expedia:

Upload / Download Ratio

The ratio increases as the downloaded content is cached by the browser, often making uploaded data the most significant factor in the performance of a web page.

So what can be done to reduce the amount of uploaded data?

Step 1: Minimize the size of Cookies

Cookies are simply part of the request headers. Expedia uses around 9 cookies, which are fortunately quite small, but it’s easy to end up with a lot of cookie data, particularly if you’re using third-party web frameworks. RFC2109 specifies that browsers should support at least 20 cookies per domain and at least 4 KB of data per cookie.

The problem with cookies is that they need to be sent with every single HTTP request where the URL is in the domain and path to which they apply. That includes requests for style-sheets, images and scripts. So in most cases the amount of cookie data uploaded is effectively multiplied by the number of requests per page.

One way to reduce the amount of cookie data (apart from making them as small as possible and using them less) is to use different domains or paths for your content. For instance, you probably need cookies in your page processing code but you don’t need them for static content such as images. If you put static content in a different location, cookie data will not be sent because cookies are domain and path specific.
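
Some back-of-envelope arithmetic shows why this matters (the cookie size and request counts are assumed, illustrative values):

```python
# Cookie bytes are re-sent with every request to the matching
# domain and path, so the cost is per request, not per page.

cookie_bytes = 500          # assumed total Cookie header payload
requests_per_page = 40      # assumed images, CSS, scripts on one host

upload_per_page = cookie_bytes * requests_per_page
print(f"{upload_per_page / 1024:.1f} KB of cookies per page view")

# If 35 of those requests are static content served from a
# cookie-free host (e.g. static.example.com), only 5 still carry
# the cookies:
print(f"{cookie_bytes * 5 / 1024:.1f} KB after moving static content")
```

Nearly 20 KB of repeated cookie data on a slow upload link can dwarf the page's own form data, which is why a separate static-content domain pays off.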

Another way to reduce cookie data is to look at server-side storage, such as using the session state management of a framework like ASP.NET. You can then use a single cookie that just contains a session id and look up any session related data on the server when required. Of course, this may have an impact on server side performance, but it is a useful way of minimizing the amount of cookie data that a site requires.

Step 2: Avoid Excessive use of Hidden Form Fields

There are two sets of data in a typical HTML form:

  1. Fields you want the user to fill out with information.
  2. Hidden fields.

You can’t really do much about the first type, except reducing the size of your field names. This may be difficult depending on your implementation framework.

Hidden fields are used to maintain page-scoped variables that will be required by the server when the form is submitted, e.g. a user ID. They may also be injected by various web frameworks or server-side controls. One such example is the __VIEWSTATE field used by ASP.NET as shown below:

ASP.NET hidden fields

There are two reasons for doing this. Firstly, it’s easier to have the state to hand (so to speak) in your page logic and, secondly, it often scales better across a web farm by keeping page-scoped state within the page rather than fetching it from somewhere else such as a database.

One way to reduce the amount of data in hidden fields is to use a single key in a hidden form field. The key value is then used on the server to retrieve the data required to process the submitted page. This is exactly like the approach used to reduce the amount of cookie data. And as with cookies, reducing request transmission time in this way may have an impact on server-side performance and scalability.
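
A minimal sketch of the single-key idea, with illustrative names (a real implementation would use your framework’s session or cache store and handle expiry):

```python
import uuid

# Stand-in for a real server-side store (database, memcached, ...).
page_state = {}

def save_state(state: dict) -> str:
    """Store page-scoped state server-side; return the key to embed."""
    key = uuid.uuid4().hex
    page_state[key] = state
    return key

def load_state(key: str) -> dict:
    """Look the state back up when the form is submitted."""
    return page_state[key]

key = save_state({"search": "LON-BOS", "results_page": 3, "sort": "price"})

# The form now uploads a 32-character key instead of the full
# serialized state:
print(f'<input type="hidden" name="state_key" value="{key}">')
```

The trade-off is exactly the one described above: less data on the wire, but an extra server-side lookup on every postback.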

Step 3: Avoid Verbose URLs

This is not the most important issue on the list, but it is still worth considering. In practice there are no ubiquitous limits on URL length, but most browsers will struggle with URLs longer than 4 KB – some may struggle at 1 KB or less. Remember that your URL will also end up in the referrer header of images and other embedded resources.

It’s usually the query string parameters that make a URL overly long. Again using Expedia as an example, you can see the sort of query string parameters that many sites use:

http://www.expedia.com/pub/agent.dll?qscr=fexp&flag=q&city1=lon&citd1=bos&date1=1/22/2008&time1=362&
date2=1/22/2008&time2=362&cAdu=1&cSen=&cChi=&cInf=&infs=2&
tktt=&trpt=2&ecrc=&eccn=&qryt=8&load=1&airp1=&dair1=&rdct=1&
rfrr=-429

In this case the URL is used to quickly pinpoint a results page for flights between ‘city1’ London (LON) and ‘citd1’ Boston (BOS) on specific dates.

Encoding data like this in URLs does have certain advantages. If you bookmark or share the URL the same results will be displayed when the URL is next used. However, if the URL contains unused or redundant data it may be causing a significant increase in the amount of uploaded data.
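
As a rough experiment, you can parse the query string from the Expedia example above and drop the empty parameters to see how much of the URL carries no data (the trimming policy is an assumption; some servers may require even empty parameters to be present):

```python
from urllib.parse import urlsplit, parse_qsl, urlencode, urlunsplit

url = ("http://www.expedia.com/pub/agent.dll?qscr=fexp&flag=q"
       "&city1=lon&citd1=bos&date1=1/22/2008&time1=362"
       "&date2=1/22/2008&time2=362&cAdu=1&cSen=&cChi=&cInf=&infs=2"
       "&tktt=&trpt=2&ecrc=&eccn=&qryt=8&load=1&airp1=&dair1=&rdct=1"
       "&rfrr=-429")

parts = urlsplit(url)
params = parse_qsl(parts.query, keep_blank_values=True)

# Keep only the parameters that actually have a value.
kept = [(k, v) for k, v in params if v != ""]
trimmed = urlunsplit(parts._replace(query=urlencode(kept)))

print(len(url), "->", len(trimmed), "bytes")
print(trimmed)
```

Remember that this URL is also re-sent in the Referer header of every embedded image, script and style-sheet request on the results page, so the savings are multiplied just as with cookies.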
