The Surprising Effect of Distance on Download Speed

August 14, 2008 in HTTP, Optimization

Let’s start with a question. What download speed would you expect in this scenario?

Download Scenario (100 Mbps server and 20 Mbps client)

If you just think of network connections as a simple pipe, then you might expect the download speed to be approximately the same as the slowest network connection, i.e. 20 Mbps. When we tested this out using a local UK-based website with a ping time of 13 ms, we saw this:

Local Download Speed

The download speed of 1.57 MB/s or 12.56 Mbps (i.e. 1.57 x 8 for 8 bits per byte) was over 60% of the theoretical maximum for the internet connection. That’s quite respectable if you allow for the overhead of the IP and TCP headers on each network packet.

However, the situation was quite different with our own web site that’s located in Dallas. It has a ping time of 137 ms from our office in the UK:

Remote Download Speed

This time the download speed was 372 KB/s or 2.91 Mbps – less than 15% of the advertised internet connection speed.

At first we thought this was some sort of bandwidth throttling of transatlantic traffic by our ISP. However, when we tried using other networks to perform transatlantic downloads we could never get more than about 3 – 4 Mbps. The reason for this behavior became clear when we saw Aladdin Nassar’s talk about Hotmail Performance Tuning at Velocity 2008:

Slide 7 of his talk shows that with standard TCP connections, the round trip time between client and server (i.e. ping time) imposes an upper limit on maximum throughput.

This upper limit is caused by TCP flow control, which requires each block of data, known as the TCP window, to be acknowledged by the receiver. The sender will not start transmitting the next block of data until it receives the acknowledgement from the receiver for the previous block. The reasons for doing this are:

  • It avoids the receiver being swamped with data that it cannot process quickly enough. This is particularly important for memory-challenged mobile devices
  • It allows the receiver to request re-transmission of the last data block if it detects data corruption or packet loss

The ping time determines how many TCP windows can be acknowledged per second and is often the limiting factor on high bandwidth connections as the maximum window size in standard TCP/IP is only 64 KB.

Let’s try putting some numbers into the bandwidth equation to see if it matches our results. For large downloads in IE the TCP window starts at 16 KB but is automatically increased to 64 KB.

So for our remote download test:

Maximum throughput = 8,000 x 64 / (1.5 x 137) = 2,491 Kbps = 2.43 Mbps

This value agrees fairly well with the measured figure. The 1.5 factor in the equation represents the overhead of the IP/TCP headers on each network packet, and it may be a little too high, which would explain some of the difference.
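As a quick sanity check, the equation can be written as a small JavaScript function (just a sketch of the calculation above; the function name is ours, not from any library):

// Approximate upper limit on TCP throughput imposed by the round trip time.
// windowKB : TCP window size in KB (at most 64 KB for standard TCP/IP)
// rttMs    : round trip (ping) time in milliseconds
// overhead : IP/TCP header overhead factor (1.5 used above)
function maxThroughputKbps(windowKB, rttMs, overhead) {
    return (8000 * windowKB) / (overhead * rttMs);
}

maxThroughputKbps(64, 13, 1.5);   // local UK server: ~26,256 Kbps - above the 20 Mbps connection, so RTT isn't the bottleneck
maxThroughputKbps(64, 137, 1.5);  // Dallas server:   ~2,491 Kbps = ~2.43 Mbps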

In the past, with dialup connections, this maximum throughput limit was much higher than the 56 Kbps of available bandwidth, so it was never reached. But with today’s high bandwidth connections, the limit becomes much more significant. You really need to position your web server(s) or content geographically close to your users if you want to offer the best possible performance over high bandwidth connections.

There are two solutions to this issue but neither is in widespread use:

  • IPv6 allows window sizes over 64 KB
  • RFC1323 allows larger window sizes over IPv4 and is enabled by default in Windows Vista. Unfortunately, many routers and modems do not support it (see the note below)
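On older versions of Windows, such as XP, the RFC1323 options can be switched on manually via the Tcp1323Opts registry value. Here’s a sketch of the relevant .reg entry, assuming the standard meaning of that value (1 enables window scaling, 3 enables window scaling plus timestamps); apply with care:

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters]
; 1 = enable TCP window scaling, 3 = window scaling + timestamps
"Tcp1323Opts"=dword:00000001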

In the case of interplanetary spacecraft, the round trip time may be colossal. For example, the ping time to Mars can be up to 40 minutes, which would limit TCP throughput to approximately 142 bits/sec! With this in mind, an interplanetary version of TCP/IP has been designed:

http://www.cs.ucsb.edu/~ebelding/courses/595/s04_dtn/papers/TCP-Planet.pdf
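Plugging the Mars round trip time into the JavaScript sketch above confirms the figure:

maxThroughputKbps(64, 40 * 60 * 1000, 1.5);  // ~0.142 Kbps, i.e. ~142 bits/sec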

Conclusions

  1. Distance doesn’t just increase round trip times; it can also reduce download speeds
  2. Don’t blame your ISP for poor downloads from distant web sites
  3. Consider using a Content Delivery Network (CDN) or geo-locating your web servers

Google uses HttpWatch to Speed Up Gmail

May 16, 2008 in HttpWatch, Optimization

There’s a post over on the Gmail blog by Wiltse Carpenter, the Tech Lead for Gmail Performance, about how they used HttpWatch and other tools to speed up the login for Gmail.

Here’s what they said about HttpWatch:

“The Httpwatch plug-in for Internet Explorer was one that proved easy to use and provided us with most of the information we needed. It really helps that we can capture and save browser traces with it too.”

To reduce the time required to log in, they looked for ways to minimize the number of requests and to reduce the overhead of each request:

“We spent hours poring over these traces to see exactly what was happening between the browser and Gmail during the sign-in sequence, and we found that there were between fourteen and twenty-four HTTP requests required to load an inbox and display it. To put these numbers in perspective, a popular network news site’s home page required about a 180 requests to fully load when I checked it yesterday. But when we examined our requests, we realized that we could do better. We decided to attack the problem from several directions at once: reduce the number of overall requests, make more of the requests cacheable by the browser, and reduce the overhead of each request.”

The Performance Benefits of Ajax

April 18, 2008 in HTTP, HttpWatch, Javascript, Optimization

Web 2.0 is a term often used to describe next-generation web sites that have moved beyond the simple page request->process->response cycle and instead use services on the web server to return data that can be rendered without making page transitions. The result is often a more responsive user interface that closely mimics a desktop application.

The technology for making HTTP calls from JavaScript embedded in an HTML page was first introduced for general use by Microsoft in IE5, way back in 1999, to support Outlook Web Access. However, the XmlHttpRequest object was not widely used until it was adopted by Mozilla in 2002. In 2005 the programming model was given the name Ajax, which stands for Asynchronous JavaScript + XML and is now widely recognised.

From a performance perspective Ajax has two main benefits:

  1. It can reduce the number of round trips by minimizing the number of page transitions in a web application
  2. It can reduce the size of uploaded and downloaded data because it allows the web programmer to control exactly what is transferred. For example, if a user performs a search, the results can be returned in a compact data format (e.g. JSON) rather than HTML, as illustrated below.
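To make the size difference concrete, here’s a hypothetical airport search result expressed both ways (the data values and markup are invented for illustration):

// Compact JSON response - just the data the page needs:
[ { "code": "LHR", "name": "London Heathrow" },
  { "code": "LGW", "name": "London Gatwick" } ]

// The equivalent full-page response would wrap the same data in
// HTML markup, plus all the headers, navigation and styling that
// surround it:
// <tr><td class="code">LHR</td><td class="name">London Heathrow</td></tr>
// <tr><td class="code">LGW</td><td class="name">London Gatwick</td></tr>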

In order to illustrate how Ajax can increase the speed and usability of a site let’s first look at a traditional web site and then compare it to an Ajax example.

This is what happens on Expedia.com when we try to book a flight but specify a partial or misspelled city code. In this case we typed in “LOND” instead of “LON” or “London”. We expectantly press the submit button hoping for a list of possible flights, but instead we’re directed to an error page:

Expedia Error Page

Using HttpWatch we can see that Expedia took 4.4 seconds, 30 round trips and 156K of data (of which 41K was uploaded) to display the error page:

Expedia Error Page Summary

Even if we make no further mistakes, we will still have to go through additional page transitions before we get the results we’re expecting.

Let’s look at a similar example on BA.COM where Ajax has been used:

BA Cities List

Here we did exactly the same thing – typed “LOND” instead of London. Instead of having to press submit, fetch a new choices page (with all linked page components) and drag the user through an additional form, the page used background processing to query the server and display a list of possible choices.

The beauty of this approach is that it didn’t require any extra user interaction, it was very fast, and it didn’t leave the search page. Here’s the Ajax request in HttpWatch:

BA Ajax Request

BA.COM uses the DWR framework for Ajax to request the possible airports. The framework returns a JavaScript list used to render the content into the dropdown list control. The single HTTP call required is much smaller and faster than the Expedia model where the whole page must be fetched and rendered.
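As a rough sketch of this pattern (the service and method names here are invented, not BA.COM’s actual code), a DWR-generated JavaScript proxy might be used like this:

// AirportService would be a DWR-generated proxy for a server-side
// Java class. DWR makes the call asynchronously and passes the
// result to the callback function.
AirportService.getMatchingAirports("LOND", function(airports) {
    var select = document.getElementById("airportList");
    select.options.length = 0;  // clear any existing entries
    for (var i = 0; i < airports.length; i++) {
        select.options[select.options.length] =
            new Option(airports[i].name, airports[i].code);
    }
});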

Due to the inconsistencies between the programmable object models of different browsers, most developers now choose a framework to ease JavaScript development, and most frameworks provide Ajax helper functions that wrap XmlHttpRequest. Java’s DWR has RPC-style support. ASP.NET’s Ajax library, Dojo, Yahoo’s YUI and Google’s GWT support a whole suite of UI and control extensions as well as offering some kind of Ajax support. Similarly, Prototype and jQuery are two extremely lightweight and popular JavaScript libraries.

Let’s say you wanted to populate a list box with city codes based on the country a user selects. In Web 1.0 you would have to submit a form making a round-trip to the server with your choice of country so that the server could render the appropriate list of choices on the resulting HTML page.

With Ajax, this can all be done in the background. A lot of services use the XML data format, but there’s no restriction on the content type that XmlHttpRequest can send or receive. It could be HTML, JSON or any other format that the page’s JavaScript can parse. Here’s an example of using the jQuery library to populate a list control from an Ajax request:

<script type="text/javascript" src="jQuery.js"></script>
...
<h2>Products</h2>
<div id="products">(fetching product list ...)</div>
<script>
...
jQuery("#products").load("productListHTML.aspx");
...
</script>

The products <div> is initially empty, but the load() method wraps XmlHttpRequest and fetches some HTML from the server and uses it to populate the <div>.
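Returning to the country and city example above, a sketch of the same technique with a JSON response might look like this (the cityCodes.aspx URL and the shape of the JSON are assumptions for illustration):

// When the country selection changes, ask the server for the
// matching cities as JSON and rebuild the city list box.
jQuery("#country").change(function() {
    jQuery.getJSON("cityCodes.aspx", { country: jQuery(this).val() },
        function(cities) {
            var options = "";
            jQuery.each(cities, function(i, city) {
                options += "<option value='" + city.code + "'>" +
                           city.name + "</option>";
            });
            jQuery("#cities").html(options);
        });
});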

One of the drawbacks people found with Ajax was that it ended up breaking the back button. Clicking the Back button in some Ajax applications did not go to the previous logical operation. For example, if you were viewing the results of a search you would expect Back to take you to the search page so that you could modify the search criteria. Instead, the user would be taken to whatever happened to be the previous page in the browser’s history list. In some cases, that may even have been the previous web site that the user had visited.

More recently though, browsers have exposed object models that help Ajax applications manipulate the browser history artificially, so that the forward and back buttons follow the logical flow of operations; a minimal sketch of one common approach is shown below.
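The usual trick is to record state in the URL fragment (location.hash), which adds a history entry without reloading the page. A minimal sketch, assuming a hypothetical runSearch() function that performs the Ajax search:

// Record each search in the URL fragment so that Back/Forward
// step through logical operations instead of whole pages.
function searchFor(term) {
    window.location.hash = "search=" + encodeURIComponent(term);
    runSearch(term);  // hypothetical function that fires the Ajax search
}

// Restore state when the user navigates with Back/Forward.
// (onhashchange is only available in the newest browsers; older
// ones have to poll location.hash on a timer instead.)
window.onhashchange = function() {
    var match = /search=([^&]*)/.exec(window.location.hash);
    if (match) {
        runSearch(decodeURIComponent(match[1]));
    }
};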
